Daegu Haany University Hyangsan Library

Advances in Machine Learning: Nearest Neighbour Search, Learning to Optimize and Generative Modelling

Details
Material Type: Dissertation
Title/Author: Advances in Machine Learning: Nearest Neighbour Search, Learning to Optimize and Generative Modelling.
Personal Author: Li, Ke.
Corporate Author: University of California, Berkeley. Electrical Engineering & Computer Sciences.
Publication: [S.l.]: University of California, Berkeley., 2019.
Publication: Ann Arbor: ProQuest Dissertations & Theses, 2019.
Physical Description: 126 p.
Source Item: Dissertations Abstracts International 81-06B.
ISBN: 9781392717806
Dissertation Note: Thesis (Ph.D.)--University of California, Berkeley, 2019.
General Note: Source: Dissertations Abstracts International, Volume: 81-06, Section: B.
Advisor: Malik, Jitendra.
Use Restriction: This item must not be sold to any third party vendors.
Abstract: Machine learning is the embodiment of an unapologetically data-driven philosophy that has increasingly become one of the most important drivers of progress in artificial intelligence and beyond. Existing machine learning methods, however, entail making trade-offs in terms of computational efficiency, modelling flexibility and/or formulation faithfulness. In this dissertation, we will cover three different ways in which limitations along each axis can be overcome, without compromising on other axes.

Computational Efficiency: We start with limitations on computational efficiency. Many modern machine learning methods require performing large-scale similarity search under the hood. For example, classifying an input into one of a large number of classes requires comparing the weight vector associated with each class to the activations of the penultimate layer, attending to particular memory cells of a neural net requires comparing the keys associated with each memory cell to the query, and sparse recovery requires comparing each dictionary element to the residual. Similarity search in many cases can be reduced to nearest neighbour search, which is both a blessing and a curse. On the plus side, the nearest neighbour search problem has been extensively studied for more than four decades. On the minus side, no exact algorithm developed over the past four decades can run faster than naive exhaustive search when the intrinsic dimensionality is high, which is almost certainly the case in machine learning. Given this state of affairs, should we give up any hope of doing better than the naive approach of exhaustively comparing each element one by one?

It turns out this pessimism, while tempting, is unwarranted. We introduce a new family of exact randomized algorithms, known as Dynamic Continuous Indexing (DCI), which overcomes both the curse of ambient dimensionality and the curse of intrinsic dimensionality: more specifically, DCI simultaneously achieves a query time complexity with a linear dependence on ambient dimensionality, a sublinear dependence on intrinsic dimensionality and a sublinear dependence on dataset size. The key insight is that the curse of intrinsic dimensionality in many cases arises from space partitioning, a divide-and-conquer strategy used by most nearest neighbour search algorithms. While space partitioning makes intuitive sense and works well in low dimensions, we argue that it fundamentally fails in high dimensions, because it requires distances between each point and every possible query to be approximately preserved in the data structure. We develop a new indexing scheme that only requires the ordering of nearby points relative to distant points to be approximately preserved, and show that the number of out-of-place points after projecting to just a single dimension is sublinear in the intrinsic dimensionality. In practice, our algorithm achieves a 14-116x speedup and a 21x reduction in memory consumption compared to locality-sensitive hashing (LSH). (An illustrative sketch of this indexing idea follows the record below.)

Modelling Flexibility: Next we move on to probabilistic modelling, which is critical to realizing one of the central objectives of machine learning: modelling the uncertainty that is inherent in prediction. The community has wrestled with the problem of how to strike the right balance between modelling flexibility and computational efficiency. Simple models can often be learned straightforwardly and efficiently but are not expressive
Subject: Computer science.
Artificial intelligence.
Language: English
Link: URL : The full text of this item is provided by the Korea Education and Research Information Service (KERIS).
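
As the abstract describes, DCI indexes points by their ordering along random one-dimensional projections rather than by partitioning space, so a query only needs nearby points to rank ahead of distant ones in a few 1D orderings. The Python sketch below illustrates that core idea under simplifying assumptions: the class name SimpleDCI, the parameters num_indices and num_candidates, and the vote-based candidate selection are hypothetical simplifications for illustration, not the dissertation's actual algorithm or its complexity guarantees.

```python
import numpy as np

class SimpleDCI:
    """Minimal sketch of the DCI idea: sorted 1D random projections.

    Hypothetical simplification for illustration; the published algorithm
    uses composite indices and a principled stopping rule instead of the
    fixed budgets used here.
    """

    def __init__(self, data, num_indices=10, seed=0):
        rng = np.random.default_rng(seed)
        self.data = data                                   # (n, d) dataset
        n, d = data.shape
        # One random unit direction per 1D index.
        self.directions = rng.standard_normal((num_indices, d))
        self.directions /= np.linalg.norm(self.directions, axis=1, keepdims=True)
        # Each index stores only the ordering of points along its direction,
        # not pairwise distances.
        self.projections = data @ self.directions.T       # (n, num_indices)
        self.order = np.argsort(self.projections, axis=0) # sorted point ids

    def query(self, q, k=1, num_candidates=100):
        q_proj = self.directions @ q                      # query's 1D positions
        n = self.data.shape[0]
        votes = {}
        for i in range(self.directions.shape[0]):
            col = self.projections[self.order[:, i], i]   # sorted projections
            pos = int(np.searchsorted(col, q_proj[i]))
            lo, hi = pos - 1, pos
            # Walk outward from the query's rank; points whose ordering
            # relative to the query is preserved are encountered early.
            for _ in range(num_candidates):
                if lo < 0 and hi >= n:
                    break
                take_left = hi >= n or (
                    lo >= 0 and q_proj[i] - col[lo] <= col[hi] - q_proj[i])
                if take_left:
                    pid = int(self.order[lo, i]); lo -= 1
                else:
                    pid = int(self.order[hi, i]); hi += 1
                votes[pid] = votes.get(pid, 0) + 1
        # Keep the points retrieved early by the most indices, then verify
        # with true distances; the result is exact whenever the true
        # neighbour survives the candidate stage.
        cand = sorted(votes, key=votes.get, reverse=True)[:num_candidates]
        dists = np.linalg.norm(self.data[cand] - q, axis=1)
        return [cand[j] for j in np.argsort(dists)[:k]]

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((10000, 64))
    index = SimpleDCI(X)
    q = X[42] + 0.01 * rng.standard_normal(64)
    print(index.query(q, k=1))  # typically [42]
```

Note the design choice the abstract argues for: the data structure stores only sorted 1D orderings (no cells or trees), so nothing requires distances to every possible query to be preserved, only the relative order of near versus far points along each projection.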
