MARC View
LDR00000nam u2200205 4500
001000000433679
00520200225152506
008200131s2019 ||||||||||||||||| ||eng d
020 ▼a 9781088341681
035 ▼a (MiAaPQ)AAI22583693
040 ▼a MiAaPQ ▼c MiAaPQ ▼d 247004
0820 ▼a 001
1001 ▼a Yan, Songbai.
24510 ▼a Algorithms for Query-Efficient Active Learning.
260 ▼a [S.l.]: ▼b University of California, San Diego, ▼c 2019.
260 1 ▼a Ann Arbor: ▼b ProQuest Dissertations & Theses, ▼c 2019.
300 ▼a 211 p.
500 ▼a Source: Dissertations Abstracts International, Volume: 81-05, Section: B.
500 ▼a Advisor: Chaudhuri, Kamalika.
5021 ▼a Thesis (Ph.D.)--University of California, San Diego, 2019.
506 ▼a This item must not be sold to any third party vendors.
520 ▼a Recent decades have witnessed great success of machine learning, especially for tasks where large annotated datasets are available for training models. However, in many applications, raw data, such as images, are abundant, but annotations, such as descriptions of images, are scarce. Annotating data requires human effort and can be expensive. Consequently, one of the central problems in machine learning is how to train an accurate model with as few human annotations as possible. Active learning addresses this problem by bringing the annotator into the learning process to work together with the learner. In active learning, a learner can sequentially select examples and ask the annotator for labels, so that fewer annotations may be required when the learning algorithm avoids querying less informative examples. This dissertation focuses on designing provably query-efficient active learning algorithms. The main contributions are as follows. First, we study noise-tolerant active learning in the standard stream-based setting. We propose a computationally efficient algorithm for actively learning homogeneous halfspaces under bounded noise, and prove that it achieves nearly optimal label complexity. Second, we theoretically investigate a novel interactive model in which the annotator can not only return noisy labels but also abstain from labeling. We propose an algorithm that utilizes abstention responses, and analyze its statistical consistency and query complexity under different conditions on the noise and abstention rates. Finally, we study how to utilize auxiliary datasets in active learning. We consider a scenario where the learner has access to a logged observational dataset in which labeled examples are observed conditioned on a selection policy. We propose algorithms that effectively take advantage of both auxiliary datasets and active learning. We prove that these algorithms are statistically consistent, and achieve a lower label requirement than alternative methods, both theoretically and empirically.
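[Illustrative note] The abstract describes the stream-based active learning protocol, in which the learner sees unlabeled examples one at a time and asks the annotator for a label only when an example looks informative. The Python sketch below is a minimal, hypothetical illustration of that protocol for a homogeneous halfspace, using a simple margin-based query rule; it is not the dissertation's algorithm, and the names stream_active_halfspace, oracle, margin, and lr are illustrative assumptions.

import numpy as np

def stream_active_halfspace(stream, oracle, margin=0.1, lr=0.5):
    """Sketch of a query-efficient stream-based active learner.

    stream: iterable of feature vectors x (np.ndarray).
    oracle: callable x -> label in {-1, +1} (the human annotator).
    Returns the learned weight vector and the number of label queries.
    """
    w = None
    queries = 0
    for x in stream:
        if w is None:
            # Bootstrap: query the first example to initialize the halfspace.
            y = oracle(x)
            queries += 1
            w = y * x
            continue
        # Normalized signed distance of x from the current decision boundary.
        score = np.dot(w, x) / (np.linalg.norm(w) + 1e-12)
        if abs(score) > margin:
            continue  # Confident prediction: skip and save a label query.
        y = oracle(x)  # Uncertain region: pay for a label.
        queries += 1
        if y * np.dot(w, x) <= 0:
            w = w + lr * y * x  # Perceptron-style correction on mistakes.
    return w, queries

As a usage example, stream could be the rows of a NumPy array and oracle a function that looks up ground-truth labels; when most points fall far from the boundary, the number of queries returned is much smaller than the stream length, which is the label-saving effect the abstract refers to.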
590 ▼a School code: 0033.
650 4 ▼a Computer science.
650 4 ▼a Artificial intelligence.
690 ▼a 0984
690 ▼a 0800
71020 ▼a University of California, San Diego. ▼b Computer Science and Engineering.
7730 ▼t Dissertations Abstracts International ▼g 81-05B.
773 ▼t Dissertations Abstracts International
790 ▼a 0033
791 ▼a Ph.D.
792 ▼a 2019
793 ▼a English
85640 ▼u http://www.riss.kr/pdu/ddodLink.do?id=T15492804 ▼n KERIS ▼z The full text of this item is provided by the Korea Education and Research Information Service (KERIS).
980 ▼a 202002 ▼f 2020
990 ▼a ***1008102
991 ▼a E-BOOK