자료유형 | 학위논문 |
---|---|
서명/저자사항 | Algorithms for Query-Efficient Active Learning. |
개인저자 | Yan, Songbai. |
단체저자명 | University of California, San Diego. Computer Science and Engineering. |
발행사항 | [S.l.]: University of California, San Diego., 2019. |
발행사항 | Ann Arbor: ProQuest Dissertations & Theses, 2019. |
형태사항 | 211 p. |
기본자료 저록 | Dissertations Abstracts International 81-05B. Dissertation Abstract International |
ISBN | 9781088341681 |
학위논문주기 | Thesis (Ph.D.)--University of California, San Diego, 2019. |
일반주기 |
Source: Dissertations Abstracts International, Volume: 81-05, Section: B.
Advisor: Chaudhuri, Kamalika |
이용제한사항 | This item must not be sold to any third party vendors. |
요약 | Recent decades have witnessed great success of machine learning, especially for tasks where large annotated datasets are available for training models. However, in many applications, raw data, such as images, are abundant, but annotations, such as descriptions of images, are scarce. Annotating data requires human effort and can be expensive. Consequently, one of the central problems in machine learning is how to train an accurate model with as few human annotations as possible. Active learning addresses this problem by bringing the annotator to work together with the learner in the learning process. In active learning, a learner can sequentially select examples and ask the annotator for labels, so that it may require fewer annotations if the learning algorithm avoids querying less informative examples.This dissertation focuses on designing provable query-efficient active learning algorithms. The main contributions are as follows. First, we study noise-tolerant active learning in the standard stream-based setting. We propose a computationally efficient algorithm for actively learning homogeneous halfspaces under bounded noise, and prove it achieves nearly optimal label complexity. Second, we theoretically investigate a novel interactive model where the annotator can not only return noisy labels, but also abstain from labeling. We propose an algorithm which utilizes abstention responses, and analyze its statistical consistency and query complexity under different conditions of the noise and abstention rate. Finally, we study how to utilize auxiliary datasets in active learning. We consider a scenario where the learner has access to a logged observational dataset where labeled examples are observed conditioned on a selection policy. We propose algorithms that effectively take advantage of both auxiliary datasets and active learning. We prove that these algorithms are statistically consistent, and achieve a lower label requirement than alternative methods theoretically and empirically. |
일반주제명 | Computer science. Artificial intelligence. |
언어 | 영어 |
바로가기 |
: 이 자료의 원문은 한국교육학술정보원에서 제공합니다. |