MARC View
LDR00000nam u2200205 4500
001000000433262
00520200225132827
008200131s2019 ||||||||||||||||| ||eng d
020 ▼a 9781085704748
035 ▼a (MiAaPQ)AAI13885922
040 ▼a MiAaPQ ▼c MiAaPQ ▼d 247004
0820 ▼a 004
1001 ▼a Chen, Xilun.
24510 ▼a Efficient Incremental Model Learning on Data Streams.
260 ▼a [S.l.]: ▼b Arizona State University, ▼c 2019.
260 1 ▼a Ann Arbor: ▼b ProQuest Dissertations & Theses, ▼c 2019.
300 ▼a 185 p.
500 ▼a Source: Dissertations Abstracts International, Volume: 81-04, Section: B.
500 ▼a Advisor: Candan, K. Selcuk.
5021 ▼a Thesis (Ph.D.)--Arizona State University, 2019.
506 ▼a This item must not be sold to any third party vendors.
520 ▼a With the development of modern technological infrastructures, such as social networks or the Internet of Things (IoT), data is being generated at a speed never seen before. Analyzing the content of this data helps us further understand underlying patterns and discover relationships among different subsets of data, enabling intelligent decision making. In this thesis, I first introduce the Low-rank, Windowed, Incremental Singular Value Decomposition (SVD) framework to incrementally maintain SVD factors over streaming data. Then, I present the Group Incremental Non-Negative Matrix Factorization framework, which leverages redundancies in the data to speed up incremental processing. These two frameworks primarily tackle the challenges of using factorization models in scenarios with streaming textual data. To improve the effectiveness and efficiency of generative models in this streaming environment, I introduce the Incremental Dynamic Multiscale Topic Model framework, which identifies multi-scale patterns and their evolution within streaming datasets. While latent factor models assume linear independence among the latent factors, generative models assume that each observation is generated from a set of latent variables with various distributions. Furthermore, some models may not be accessible, or their underlying structures may be too complex to understand. Simulation ensembles are one example: there may be thousands of parameters spanning a huge parameter space, and the only way to learn from them is to execute real simulations. When performing knowledge discovery and decision making through data- and model-driven simulation ensembles, it is expensive to operate these ensembles continuously at large scale, due to the high computational cost.
Consequently, given a relatively small simulation budget, it is desirable to identify a sparse ensemble that includes the most informative simulations, while still permitting effective exploration of the input parameter space. Therefore, I present the Complexity-Guided Parameter Space Sampling framework, an intelligent, top-down sampling scheme that selects the most salient simulation parameters to execute, given a limited computational budget. Moreover, I present the Pivot-Guided Parameter Space Sampling framework, which incrementally maintains a diverse ensemble of models of the simulation ensemble space and uses a pivot-guided mechanism for future sample selection.
590 ▼a School code: 0010.
650 4 ▼a Computer science.
690 ▼a 0984
71020 ▼a Arizona State University. ▼b Computer Science.
7730 ▼t Dissertations Abstracts International ▼g 81-04B.
773 ▼t Dissertation Abstracts International
790 ▼a 0010
791 ▼a Ph.D.
792 ▼a 2019
793 ▼a English
85640 ▼u http://www.riss.kr/pdu/ddodLink.do?id=T15491472 ▼n KERIS ▼z The full text of this item is provided by the Korea Education and Research Information Service (KERIS).
980 ▼a 202002 ▼f 2020
990 ▼a ***1816162
991 ▼a E-BOOK