Material type | Dissertation |
---|---|
Title/Author | Experimental Design under Comparisons. |
Personal author | Guo, Yuan. |
Corporate author | Northeastern University. Electrical and Computer Engineering. |
Publication | [S.l.]: Northeastern University, 2019. |
Publication | Ann Arbor: ProQuest Dissertations & Theses, 2019. |
Physical description | 100 p. |
Host item entry | Dissertations Abstracts International 81-05B. |
ISBN | 9781392428009 |
Dissertation note | Thesis (Ph.D.)--Northeastern University, 2019. |
General note |
Source: Dissertations Abstracts International, Volume: 81-05, Section: B.
Advisor: Ioannidis, Stratis. |
Use restrictions | This item must not be sold to any third party vendors. This item must not be added to any third party search indexes. |
Abstract | Labels generated by human experts via comparisons exhibit smaller variance than traditional sample labels. Collecting comparison labels is challenging over large datasets, as the number of comparisons grows quadratically with the dataset size. We study the following experimental design problem: given a budget of expert comparisons and a set of existing sample labels, we determine the comparison labels to collect that lead to the highest classification improvement. We study several experimental design objectives motivated by the Bradley-Terry model. The resulting optimization problems amount to maximizing submodular functions. We especially study a natural experimental design objective, namely, D-optimality. This objective is known to perform well in practice, and is submodular, making the selection approximable via the greedy algorithm. A naive greedy implementation has O(N^2 d^2 K) complexity, where N is the dataset size, d is the feature space dimension, and K is the number of generated comparisons. We show that, by exploiting the inherent geometry of the dataset (namely, that it consists of pairwise comparisons), the greedy algorithm's complexity can be reduced to O(N^2 (K + d) + N(dK + d^2) + d^2 K). We also apply the same acceleration to the so-called lazy greedy algorithm. When combined, the above improvements lead to an execution time of less than 1 hour for a dataset with 10^8 comparisons
Subject | Electrical engineering. Computer engineering. |
Language | English |
Link |
: The full text of this item is provided by the Korea Education and Research Information Service (KERIS). |
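
The greedy D-optimal selection described in the abstract can be illustrated with a small sketch. It is not the dissertation's algorithm; it only shows, under stated assumptions, why the pairwise structure helps. The assumptions are: the objective is log det(lam * I + sum of (x_i - x_j)(x_i - x_j)^T) over the chosen pairs, each pair is picked at most once, and the names `greedy_d_optimal_pairs`, `X`, `K`, and `lam` are hypothetical. Because every candidate vector is a difference x_i - x_j, the quadratic form that determines the marginal gain can be read from an N x N matrix G = X A^{-1} X^T in constant time, and G itself can be refreshed with a rank-one update after each pick, with no d x d work inside the loop.

```python
def greedy_d_optimal_pairs(X, K, lam=1.0):
    """Greedily pick K distinct pairs (i, j), i < j, that maximize
    log det(lam * I + sum over chosen pairs of (x_i - x_j)(x_i - x_j)^T).

    X is a list of N feature vectors (lists of floats). Illustrative only.
    """
    N = len(X)
    if K > N * (N - 1) // 2:
        raise ValueError("K exceeds the number of distinct pairs")

    # G[a][b] = x_a^T A^{-1} x_b, where A = lam * I before any pair is chosen.
    G = [[sum(p * q for p, q in zip(X[a], X[b])) / lam for b in range(N)]
         for a in range(N)]

    chosen, taken = [], set()
    for _ in range(K):
        best_pair, best_s = None, float("-inf")
        for i in range(N):
            for j in range(i + 1, N):
                if (i, j) in taken:
                    continue
                # s = z^T A^{-1} z for z = x_i - x_j, read off G in O(1).
                # The marginal gain is log(1 + s), which is increasing in s.
                s = G[i][i] + G[j][j] - 2.0 * G[i][j]
                if s > best_s:
                    best_pair, best_s = (i, j), s

        i, j = best_pair
        chosen.append(best_pair)
        taken.add(best_pair)

        # Sherman-Morrison on A^{-1} becomes a rank-one update of G:
        # u = X A^{-1} z equals column i of G minus column j of G.
        u = [G[a][i] - G[a][j] for a in range(N)]
        scale = 1.0 + best_s
        for a in range(N):
            for b in range(N):
                G[a][b] -= u[a] * u[b] / scale
    return chosen
```

In this sketch each of the K rounds costs O(N^2), after an O(N^2 d) setup, whereas evaluating every candidate's quadratic form directly against a d x d inverse would cost O(N^2 d^2) per round. A call such as `greedy_d_optimal_pairs(X, 3)` returns the selected index pairs in the order they were picked.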