대구한의대학교 향산도서관

상세정보

부가기능

Multi-tenant Geo-distributed Data Analytics

상세 프로파일

상세정보
자료유형학위논문
서명/저자사항Multi-tenant Geo-distributed Data Analytics.
개인저자Jonathan, Albert.
단체저자명University of Minnesota. Computer Science.
발행사항[S.l.]: University of Minnesota., 2019.
발행사항Ann Arbor: ProQuest Dissertations & Theses, 2019.
형태사항146 p.
기본자료 저록Dissertations Abstracts International 81-02B.
Dissertation Abstract International
ISBN9781085763684
학위논문주기Thesis (Ph.D.)--University of Minnesota, 2019.
일반주기 Source: Dissertations Abstracts International, Volume: 81-02, Section: B.
Advisor: Chandra, Abhishek
이용제한사항This item must not be sold to any third party vendors.
요약Geo-distributed data analytics has gained much interest in recent years due to the need for extracting insights from geo-distributed data. Traditionally, data analytics has been done within a cluster/data center environment. However, analyzing geo-distributed data using existing cluster-based systems typically cannot satisfy the timeliness requirement of most applications and result in wasteful resource consumption due to the fundamental differences of the environments, especially due to the scarce, highly heterogeneous, and dynamic nature of the wide-area resources: compute power and network bandwidth. This thesis addresses the challenges faced by geo-distributed data analytics systems in ensuring high-performance and reliable execution of multiple data analytics applications/queries. Specifically, the focus is on sharing resources across multiple users, applications, and computing frameworks. Sharing resources is attractive as it increases resource utilization and reduces operational cost. However, ensuring high-performance execution of multiple applications in a shared environment is challenging as they may compete for the same resources, especially in a wide-area environment with scarce resources. Furthermore, dynamics such as workload variation, resource variation, stragglers, and failures are inevitable in large-scale distributed systems. These can cause large resource perturbation that significantly affect the performance of query executions.This thesis makes the following contributions. First, we present a resource sharing technique across multiple geo-distributed data analytics frameworks. The main challenge here is how to elastically partition resources while allowing high locality scheduling to each individual framework, which is critical to the execution performance of geo-distributed analytics queries. We then address the problem of how to identify and exploit common executions across multiple queries to mitigate wasteful resource consumption. We demonstrate that traditional multi-query optimization may degrade the overall query execution performance due to its lack of support for network awareness. Finally, we highlight the importance of adaptability in ensuring reliable query execution in the presence of dynamics, both for single and multiple query executions. We propose a systematic approach that can selectively determine which queries to adapt and how to adapt them based on the types of queries, dynamics, and optimization goals.
일반주제명Computer science.
언어영어
바로가기URL : 이 자료의 원문은 한국교육학술정보원에서 제공합니다.

서평(리뷰)

  • 서평(리뷰)

태그

  • 태그

나의 태그

나의 태그 (0)

모든 이용자 태그

모든 이용자 태그 (0) 태그 목록형 보기 태그 구름형 보기
 
로그인폼