Material Type | Thesis/Dissertation |
---|---|
Title/Author | Ray: A Distributed Execution Engine for the Machine Learning Ecosystem. |
Personal Author | Moritz, Philipp C. |
Corporate Author | University of California, Berkeley. Electrical Engineering & Computer Sciences. |
Publication | [S.l.]: University of California, Berkeley, 2019. |
Publication | Ann Arbor: ProQuest Dissertations & Theses, 2019. |
Physical Description | 103 p. |
Host Item Entry | Dissertations Abstracts International 81-06B. |
ISBN | 9781392404805 |
Dissertation Note | Thesis (Ph.D.)--University of California, Berkeley, 2019. |
General Note | Source: Dissertations Abstracts International, Volume: 81-06, Section: B. Advisor: Jordan, Michael I. |
Restrictions on Use | This item must not be sold to any third party vendors. This item must not be added to any third party search indexes. |
Abstract | In recent years, growing data volumes and more sophisticated computational procedures have greatly increased the demand for computational power. Machine learning and artificial intelligence applications, for example, are notorious for their computational requirements. At the same time, Moore's law is ending and processor speeds are stalling. As a result, distributed computing has become ubiquitous. While the cloud makes distributed hardware infrastructure widely accessible and therefore offers the potential of horizontal scale, developing these distributed algorithms and applications remains surprisingly hard. This is due to the inherent complexity of concurrent algorithms, the engineering challenges that arise when communicating between many machines, the requirements like fault tolerance and straggler mitigation that arise at large scale, and the lack of a general-purpose distributed execution engine that can support a wide variety of applications. In this thesis, we study the requirements for a general-purpose distributed computation model and present a solution that is easy to use yet expressive and resilient to faults. At its core, our model takes familiar concepts from serial programming, namely functions and classes, and generalizes them to the distributed world, therefore unifying stateless and stateful distributed computation. This model not only supports many machine learning workloads like training or serving, but is also a good fit for cross-cutting machine learning applications like reinforcement learning and data processing applications like streaming or graph processing. We implement this computational model as an open-source system called Ray, which matches or exceeds the performance of specialized systems in many application domains, while also offering horizontal scalability and strong fault tolerance properties. |
Subject | Computer science. |
Language | English |
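The abstract above describes Ray's central idea: generalizing functions (stateless tasks) and classes (stateful actors) to the distributed setting. The sketch below illustrates that idea with Ray's public Python API (`ray.init`, `@ray.remote`, `.remote()`, `ray.get`); the specific function and class names are illustrative examples, not taken from the thesis.

```python
import ray

ray.init()  # start a local Ray runtime (connects to a cluster if one is configured)

# A stateless task: an ordinary function promoted to a remote function.
@ray.remote
def square(x):
    return x * x

# A stateful actor: an ordinary class promoted to a remote class.
@ray.remote
class Counter:
    def __init__(self):
        self.total = 0

    def add(self, value):
        self.total += value
        return self.total

# Task invocations return futures (object references) immediately.
futures = [square.remote(i) for i in range(4)]
print(ray.get(futures))  # [0, 1, 4, 9]

# Actor method calls are also asynchronous and preserve the actor's state.
counter = Counter.remote()
results = [counter.add.remote(v) for v in ray.get(futures)]
print(ray.get(results))  # running totals: [0, 1, 5, 14]
```

Tasks and actors share the same futures-based interface, which is how the model unifies stateless and stateful computation behind a single programming abstraction.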