PairBoost: Gradient Boosted Classification from Pairwise Data
Open Access
Author:
Ashtekar, Neil
Area of Honors:
Computer Science
Degree:
Bachelor of Science
Document Type:
Thesis
Thesis Supervisors:
Mehrdad Mahdavi, Thesis Supervisor
Rebecca Jane Passonneau, Thesis Honors Advisor
Keywords:
Machine Learning, Ensemble Learning, Gradient Boosting, Data Science, Artificial Intelligence
Abstract:
Supervised binary classification requires access to a fully labeled dataset. In many applications, gathering labels is costly and difficult, and may even be impossible due to privacy concerns. However, it is often feasible to obtain alternative feedback such as pairwise comparisons. This thesis proposes PairBoost, a gradient boosted binary classification algorithm capable of learning from pairwise comparisons and unlabeled data. Specifically, we consider instance pairs with labels indicating which of the two instances is more likely to be positive. Our algorithm consists of two decoupled steps: first, learn a pairwise ranker to transfer the knowledge from pairwise comparisons to unlabeled instances, and second, learn a boosted binary classifier where the labels are adaptively assigned based on the discrepancy between the current classifier's predictions and the confidence of the pairwise ranker. We evaluate PairBoost on several real-world datasets, showing the practical usefulness of our approach and demonstrating significant performance improvements over existing methods.
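The two decoupled steps described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not specify the ranker, the weak learner, or the label-assignment rule, so everything below is an assumption made for illustration. The sketch assumes a linear ranker trained with a logistic loss on pairs, a confidence obtained by squashing ranker scores around their median (which presumes roughly balanced classes), regression stumps as weak learners, and the residual between ranker confidence and current prediction as the adaptive target. All function names and the synthetic data are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, z))))

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def fit_pairwise_ranker(pairs, dim, lr=0.1, epochs=200):
    """Step 1 (assumed form): linear scorer with a logistic loss on pairs.
    Each pair (a, b) means instance a is more likely positive than b."""
    w = [0.0] * dim
    for _ in range(epochs):
        grad = [0.0] * dim
        for a, b in pairs:
            diff = [ai - bi for ai, bi in zip(a, b)]
            g = sigmoid(dot(w, diff)) - 1.0  # derivative of log(1 + exp(-z))
            for j in range(dim):
                grad[j] += g * diff[j]
        w = [wj - lr * gj / len(pairs) for wj, gj in zip(w, grad)]
    return w

def fit_stump(X, r):
    """Least-squares regression stump fitted to the targets r."""
    best = None
    for j in range(len(X[0])):
        for t in sorted(set(x[j] for x in X)):
            left = [ri for x, ri in zip(X, r) if x[j] <= t]
            right = [ri for x, ri in zip(X, r) if x[j] > t]
            if not left or not right:
                continue
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            err = (sum((ri - ml) ** 2 for ri in left)
                   + sum((ri - mr) ** 2 for ri in right))
            if best is None or err < best[0]:
                best = (err, j, t, ml, mr)
    return best[1:]

def fit_boosted_classifier(X, scores, rounds=30, lr=1.0):
    """Step 2 (assumed form): boosting on unlabeled data, guided by the ranker."""
    # Ranker confidence: scores centered at their median, then squashed.
    med = sorted(scores)[len(scores) // 2]
    spread = (sum((s - med) ** 2 for s in scores) / len(scores)) ** 0.5 or 1.0
    conf = [sigmoid(2.0 * (s - med) / spread) for s in scores]
    F = [0.0] * len(X)  # current additive model output per instance
    stumps = []
    for _ in range(rounds):
        # Discrepancy between ranker confidence and current prediction.
        # Its sign plays the role of an adaptively assigned label.
        resid = [c - sigmoid(f) for c, f in zip(conf, F)]
        j, t, ml, mr = fit_stump(X, resid)
        stumps.append((j, t, lr * ml, lr * mr))
        F = [f + (lr * ml if x[j] <= t else lr * mr) for f, x in zip(F, X)]
    return stumps

def predict(stumps, x):
    f = sum(vl if x[j] <= t else vr for j, t, vl, vr in stumps)
    return 1 if f > 0 else 0

# Synthetic demo (hypothetical data): the hidden label is 1 when x0 + x1 > 0.
random.seed(0)
def point():
    return [random.uniform(-1, 1), random.uniform(-1, 1)]

pairs = []
for _ in range(300):
    a, b = point(), point()
    if sum(a) < sum(b):
        a, b = b, a  # order the pair so the first element is "more positive"
    pairs.append((a, b))

unlabeled = [point() for _ in range(120)]
w = fit_pairwise_ranker(pairs, dim=2)
stumps = fit_boosted_classifier(unlabeled, [dot(w, x) for x in unlabeled])
hidden = [1 if sum(x) > 0 else 0 for x in unlabeled]
accuracy = sum(predict(stumps, x) == y
               for x, y in zip(unlabeled, hidden)) / len(unlabeled)
print(f"accuracy against hidden labels: {accuracy:.2f}")
```

The classifier never sees the hidden labels; it sees only the ordered pairs and the unlabeled points. The median threshold is the weakest assumption here, since a skewed class prior would shift the decision boundary.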