MARC보기
LDR00000nam u2200205 4500
001000000435012
00520200227113443
008200131s2019 ||||||||||||||||| ||eng d
020 ▼a 9781392428009
035 ▼a (MiAaPQ)AAI27548298
040 ▼a MiAaPQ ▼c MiAaPQ ▼d 247004
0820 ▼a 621
1001 ▼a Guo, Yuan .
24510 ▼a Experimental Design under Comparisons.
260 ▼a [S.l.]: ▼b Northeastern University., ▼c 2019.
260 1 ▼a Ann Arbor: ▼b ProQuest Dissertations & Theses, ▼c 2019.
300 ▼a 100 p.
500 ▼a Source: Dissertations Abstracts International, Volume: 81-05, Section: B.
500 ▼a Advisor: Ioannidis, Stratis.
5021 ▼a Thesis (Ph.D.)--Northeastern University, 2019.
506 ▼a This item must not be sold to any third party vendors.
506 ▼a This item must not be added to any third party search indexes.
520 ▼a Labels generated by human experts via comparisons exhibit smaller variance compared to traditional sample labels. Collecting comparison labels is challenging over large datasets, as the number of comparisons grows quadratically with the dataset size. We study the following experimental design problem: given a budget of expert comparisons, and a set of existing sample labels, we determine the comparison labels to collect that lead to the highest classification improvement. We study several experimental design objectives motivated by the Bradley-Terry model. The resulting optimization problems amount to maximizing submodular functions.We especially study a natural experimental design objective, namely, D-optimality. This objective is known to perform well in practice, and is submodular, making the selection approximable via the greedy algorithm. A naive greedy implementation has O(N2d2K) complexity, where N is the dataset size, d is the feature space dimension, and K is the number of generated comparisons. We show that, by exploiting the inherent geometry of the dataset namely, that it consists of pairwise comparison's the greedy algorithms complexity can be reduced to O(N2(K + d) + N(dK + d2) + d2K). We apply the same acceleration also to the so-called lazy greedy algorithm. When combined, the above improvements lead to an execution time of less than 1 hour for a dataset with 108 comparisons
590 ▼a School code: 0160.
650 4 ▼a Electrical engineering.
650 4 ▼a Computer engineering.
690 ▼a 0544
690 ▼a 0464
71020 ▼a Northeastern University. ▼b Electrical and Computer Engineering.
7730 ▼t Dissertations Abstracts International ▼g 81-05B.
773 ▼t Dissertation Abstract International
790 ▼a 0160
791 ▼a Ph.D.
792 ▼a 2019
793 ▼a English
85640 ▼u http://www.riss.kr/pdu/ddodLink.do?id=T15494518 ▼n KERIS ▼z 이 자료의 원문은 한국교육학술정보원에서 제공합니다.
980 ▼a 202002 ▼f 2020
990 ▼a ***1008102
991 ▼a E-BOOK