Daegu Haany University Hyangsan Library

Distribution-based Summarization for Large Scale Simulation Data Visualization and Analysis

Detailed record
Material type: Dissertation
Title/Author statement: Distribution-based Summarization for Large Scale Simulation Data Visualization and Analysis.
Personal author: Wang, Ko-Chih.
Corporate author: The Ohio State University. Computer Science and Engineering.
Publication: [S.l.]: The Ohio State University., 2019.
Publication: Ann Arbor: ProQuest Dissertations & Theses, 2019.
Physical description: 194 p.
Host item entry: Dissertations Abstracts International 81-02B.
Dissertation Abstract International
ISBN: 9781085658799
Dissertation note: Thesis (Ph.D.)--The Ohio State University, 2019.
General note: Source: Dissertations Abstracts International, Volume: 81-02, Section: B.
Advisor: Shen, Han-Wei.
Use restrictions: This item must not be sold to any third party vendors.
Summary: The advent of high-performance supercomputers enables scientists to perform extreme-scale simulations that generate millions of cells and thousands of time steps. By exploring and analyzing the simulation outputs, scientists can gain a deeper understanding of the modeled phenomena. When the simulation output is small, the common practice is simply to move the data to the machines that perform post analysis. However, as the data grows, the limited bandwidth and capacity of the networking and storage devices that connect the supercomputers to the analysis machine become a major bottleneck. Visualizing and analyzing large-scale simulation datasets therefore poses significant challenges. This dissertation addresses the big data challenge and proposes distribution-based in-situ techniques. These techniques use the same supercomputer resources to analyze the raw data and generate compact data proxies, which use distributions to statistically summarize the raw data. Only the compact data proxies are moved to the post-analysis machine, which overcomes the bottleneck. Because the distribution-based data representation keeps the statistical properties of the data, it has the potential to facilitate flexible post-hoc data analysis and enable uncertainty quantification. We first focus on the problem of volume rendering large data on resource-limited post-analysis machines. To tackle the limited I/O bandwidth and storage space, distributions are used to summarize the data. When visualizing the data, importance sampling is proposed to draw a small number of samples and minimize the demand for computational power. The error of the proxies is quantified and presented visually to scientists through uncertainty animation. We also tackle the problem of error reduction when approximating the spatial information in distribution-based representations. This error can cause low visualization quality and hinder data exploration. The basic distribution-based approach is augmented by our proposed spatial distribution, which is represented by a three-dimensional Gaussian Mixture Model (GMM). The new representation not only improves the visualization quality but can also be used in various visualization techniques, such as volume rendering, uncertain isosurfaces, and salient feature exploration. Then, a technique is developed to tackle the problem of large-scale time-varying datasets. This representation stores the time-varying datasets at a lower temporal resolution and utilizes temporal coherence to reconstruct the data at non-sampled time steps. Each pixel ray of a view at a non-sampled time step is decoupled into a value distribution and the samples' location information. Our representation utilizes data coherence to recover the samples' location information and store less data. In addition, similar value distributions from multiple rays are represented by one distribution to save more storage. Finally, a statistics-based super-resolution technique is proposed to solve the big data problem caused by a huge parameter space. Simulation runs with a few parameter samples output full-resolution data, which is used to create the prior knowledge. Data from the rest of the simulation runs in the parameter space is statistically down-sampled in situ to a compact representation to reduce the data size. These compact data representations can be reconstructed to high resolution by combining them with the prior knowledge for data analysis.
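The core idea in the summary, replacing blocks of raw simulation values with distributions and drawing samples from them later, can be illustrated with a minimal sketch. This is not the dissertation's implementation: the function names `summarize_blocks` and `sample_from_histogram`, the use of fixed-width histograms as the distribution model, and the one-dimensional block layout are all assumptions chosen for illustration.

```python
import random


def summarize_blocks(values, block_size, bins, lo, hi):
    """Summarize each consecutive block of raw values as a histogram.

    Hypothetical helper: assumes every value lies in [lo, hi].
    Returns one list of bin counts per block.
    """
    width = (hi - lo) / bins
    hists = []
    for start in range(0, len(values), block_size):
        counts = [0] * bins
        for v in values[start:start + block_size]:
            # Clamp so that v == hi falls into the last bin.
            idx = min(int((v - lo) / width), bins - 1)
            counts[idx] += 1
        hists.append(counts)
    return hists


def sample_from_histogram(counts, lo, hi, n, rng):
    """Draw n values from a histogram proxy.

    A bin is picked with probability proportional to its count,
    then a value is drawn uniformly inside that bin.
    """
    width = (hi - lo) / len(counts)
    picked = rng.choices(range(len(counts)), weights=counts, k=n)
    return [lo + (b + rng.random()) * width for b in picked]


if __name__ == "__main__":
    rng = random.Random(0)
    raw = [rng.gauss(0.0, 1.0) for _ in range(4096)]  # stand-in for simulation output
    lo, hi = min(raw), max(raw)
    proxies = summarize_blocks(raw, block_size=512, bins=16, lo=lo, hi=hi)
    approx = sample_from_histogram(proxies[0], lo, hi, n=8, rng=rng)
    print(len(proxies), "blocks summarized;", len(approx), "values sampled from block 0")
```

In this sketch each block of 512 values is reduced to 16 bin counts, so only the counts would need to leave the simulation machine. The sampled values follow the block's value distribution but carry no information about where in the block each value was located, which is the loss of spatial information that the summary describes addressing with a spatial GMM.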
Subject: Computer engineering.
Computer science.
Language: English
Link: URL: The full text of this item is provided by the Korea Education and Research Information Service (KERIS).
