LDR | | 00000nam u2200205 4500 |
001 | | 000000433308 |
005 | | 20200225140210 |
008 | | 200131s2019 ||||||||||||||||| ||eng d |
020 | |
▼a 9781085792608 |
035 | |
▼a (MiAaPQ)AAI13885989 |
040 | |
▼a MiAaPQ
▼c MiAaPQ
▼d 247004 |
082 | 0 |
▼a 004 |
100 | 1 |
▼a Jin, Chi. |
245 | 10 |
▼a Machine Learning: Why Do Simple Algorithms Work So Well?. |
260 | |
▼a [S.l.]:
▼b University of California, Berkeley.,
▼c 2019. |
260 | 1 |
▼a Ann Arbor:
▼b ProQuest Dissertations & Theses,
▼c 2019. |
300 | |
▼a 158 p. |
500 | |
▼a Source: Dissertations Abstracts International, Volume: 81-04, Section: B. |
500 | |
▼a Advisor: Jordan, Michael I. |
502 | 1 |
▼a Thesis (Ph.D.)--University of California, Berkeley, 2019. |
506 | |
▼a This item must not be sold to any third party vendors. |
520 | |
▼a While state-of-the-art machine learning models are deep, large-scale, sequential and highly nonconvex, the backbone of modern learning algorithms remains a set of simple algorithms such as stochastic gradient descent, gradient descent with momentum, or Q-learning (in the case of reinforcement learning tasks). A basic question endures: why do simple algorithms work so well even in these challenging settings? To answer the above question, this thesis focuses on four concrete and fundamental questions: (1) In nonconvex optimization, can (stochastic) gradient descent or its variants escape saddle points efficiently? (2) Is gradient descent with momentum provably faster than gradient descent in the general nonconvex setting? (3) In nonconvex-nonconcave minmax optimization, what is a proper definition of local optima, and is gradient descent ascent game-theoretically meaningful? (4) In reinforcement learning, is Q-learning sample efficient? This thesis provides the first line of provably positive answers to all of the above questions. In particular, this thesis shows that although the standard versions of these classical algorithms do not enjoy good theoretical properties in the worst case, simple modifications are sufficient to grant them desirable behaviors, which explains the underlying mechanisms behind their favorable performance in practice. |
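The abstract above refers to "simple modifications" of classical algorithms without naming them in this record. As an illustration only, the sketch below shows one well-known modification of plain gradient descent for escaping strict saddle points: injecting a small random perturbation whenever the gradient is nearly zero. Function names, constants, and the test function here are illustrative assumptions, not details taken from the thesis.

```python
# Illustrative sketch only: perturbed gradient descent, one standard way to let
# plain gradient descent escape strict saddle points. The abstract in this
# record does not state which modification the thesis analyzes; all names and
# constants below are assumptions for illustration.
import numpy as np

def perturbed_gradient_descent(grad, x0, step=0.01, g_thresh=1e-3,
                               radius=1e-2, max_iter=2000, seed=0):
    """Run gradient descent, adding a small uniform perturbation whenever
    the gradient is nearly zero (a possible saddle point)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < g_thresh:
            # Near a stationary point: perturb within a small ball so the
            # iterate can slide off a saddle along a descent direction.
            x = x + rng.uniform(-radius, radius, size=x.shape)
            g = grad(x)
        x = x - step * g
    return x

# Example: f(x, y) = x^2 + (y^2 - 1)^2 has a strict saddle at the origin and
# minima at (0, +1) and (0, -1). Started exactly at the origin, plain gradient
# descent stays put; the perturbed variant drifts off and reaches a minimum.
saddle_grad = lambda v: np.array([2.0 * v[0], 4.0 * v[1] ** 3 - 4.0 * v[1]])
print(perturbed_gradient_descent(saddle_grad, np.zeros(2)))  # approx. (0, +/-1)
```

A practical version (e.g. the perturbed gradient descent analyzed in the nonconvex optimization literature) would also limit how often perturbations are added and stop once no sufficient function decrease is observed; the loop above is deliberately minimal.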
590 | |
▼a School code: 0028. |
650 | 4 |
▼a Computer science. |
690 | |
▼a 0984 |
710 | 20 |
▼a University of California, Berkeley.
▼b Computer Science. |
773 | 0 |
▼t Dissertations Abstracts International
▼g 81-04B. |
773 | |
▼t Dissertations Abstracts International |
790 | |
▼a 0028 |
791 | |
▼a Ph.D. |
792 | |
▼a 2019 |
793 | |
▼a English |
856 | 40 |
▼u http://www.riss.kr/pdu/ddodLink.do?id=T15491479
▼n KERIS
▼z The full text of this material is provided by the Korea Education and Research Information Service (KERIS). |
980 | |
▼a 202002
▼f 2020 |
990 | |
▼a ***1816162 |
991 | |
▼a E-BOOK |