MARC View
LDR00000nam u2200205 4500
001000000433303
00520200225135940
008200131s2019 ||||||||||||||||| ||eng d
020 ▼a 9781085567510
035 ▼a (MiAaPQ)AAI13812570
040 ▼a MiAaPQ ▼c MiAaPQ ▼d 247004
0820 ▼a 001
1001 ▼a Aggour, Kareem Sherif.
24510 ▼a Intelligent and Scalable Algorithms for Canonical Polyadic Decomposition.
260 ▼a [S.l.]: ▼b Rensselaer Polytechnic Institute., ▼c 2019.
260 1 ▼a Ann Arbor: ▼b ProQuest Dissertations & Theses, ▼c 2019.
300 ▼a 168 p.
500 ▼a Source: Dissertations Abstracts International, Volume: 81-02, Section: B.
500 ▼a Advisor: Yener, Bulent
5021 ▼a Thesis (Ph.D.)--Rensselaer Polytechnic Institute, 2019.
506 ▼a This item must not be sold to any third party vendors.
520 ▼a Over the past two decades, there has been a dramatic increase in the volume and variety of data generated in almost every scientific discipline. To enable the efficient storage and processing of these massive datasets, a variety of fault-tolerant, scalable distributed storage and processing platforms have been popularized, most famously Hadoop MapReduce and Spark. Novel distributed algorithms are being developed to take full advantage of these platforms, including scalable variants of algorithms such as Canonical Polyadic Decomposition (CPD), an unsupervised learning technique frequently used in data mining and machine learning applications to discover latent factors in a class of multimodal datasets referred to as tensors. Current research in scalable CPD algorithms has focused almost exclusively on the analysis of large sparse tensors, however. This research addresses the complementary need for efficient, scalable algorithms to decompose the large dense tensors that arise in many signal processing and anomaly detection applications. To that end, we developed a progression of algorithms designed for MapReduce settings that incorporate combinations of regularization and sketching to operate efficiently on dense, skewed tensors. The first MapReduce CPD algorithm utilizes an Alternating Least Squares (ALS) strategy that is mathematically equivalent to the classical sequential CPD-ALS algorithm. A second algorithm was then developed that features regularization and sketching working in tandem to accelerate and stabilize tensor decompositions. Prior research had demonstrated the benefits of applying either regularization or sketching to CPD-ALS, but to our knowledge this work is the first to demonstrate the utility of using both together, outperforming the use of either technique alone.
However, this algorithm requires the manual selection of the sketching and regularization hyperparameter values. We next developed two novel algorithms that employ online learning-based approaches to dynamically select the sketching rate and regularization parameters at runtime, further optimizing CP decompositions while eliminating the burden of manual hyperparameter selection. This work is the first to intelligently choose the sketching rate and regularization parameters at each iteration of a CPD algorithm to balance the trade-off between minimizing runtime and maximizing decomposition accuracy. On both synthetic and real data, we observed that for noisy tensors, our intelligent CPD algorithm produces decompositions of accuracy comparable to the exact distributed CPD-ALS algorithm in less time, often half the time. For ill-conditioned tensors, given the same time budget, the intelligent CPD algorithm produces decompositions with significantly lower relative error, often yielding an order-of-magnitude improvement.
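The CPD-ALS strategy the abstract refers to can be sketched as follows. This is a minimal, single-machine NumPy illustration for a 3-way tensor, assuming the standard alternating-least-squares formulation; the `reg` parameter adds the Tikhonov-style regularization described above, while the distributed MapReduce formulation, sketching, and online hyperparameter selection are omitted. Function and parameter names are illustrative, not taken from the dissertation.

```python
# Hypothetical minimal sketch of CPD-ALS for a 3-way tensor (not the
# author's distributed implementation). Each pass updates one factor
# matrix at a time by solving a regularized least-squares problem.
import numpy as np

def unfold(T, mode):
    # Mode-n matricization: move the chosen axis first, flatten the rest.
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def cpd_als(T, rank, n_iter=50, reg=0.0, seed=0):
    rng = np.random.default_rng(seed)
    A = [rng.standard_normal((T.shape[m], rank)) for m in range(3)]
    I = np.eye(rank)
    for _ in range(n_iter):
        for m in range(3):
            others = [A[k] for k in range(3) if k != m]
            # Khatri-Rao product of the other two factor matrices.
            kr = np.einsum('ir,jr->ijr', others[0], others[1]).reshape(-1, rank)
            # Gram matrix via the Hadamard product of the factor Grams.
            G = (others[0].T @ others[0]) * (others[1].T @ others[1])
            # Regularized normal-equations solve for the current factor.
            A[m] = np.linalg.solve(G + reg * I, (unfold(T, m) @ kr).T).T
    return A
```

On a noiseless low-rank tensor this recovers the latent factors up to permutation and scaling; a small positive `reg` trades a slight bias for stability on the ill-conditioned tensors discussed above.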
590 ▼a School code: 0185.
650 4 ▼a Computer science.
650 4 ▼a Artificial intelligence.
690 ▼a 0984
690 ▼a 0800
71020 ▼a Rensselaer Polytechnic Institute. ▼b Computer Science.
7730 ▼t Dissertations Abstracts International ▼g 81-02B.
790 ▼a 0185
791 ▼a Ph.D.
792 ▼a 2019
793 ▼a English
85640 ▼u http://www.riss.kr/pdu/ddodLink.do?id=T15490755 ▼n KERIS ▼z The full text of this material is provided by the Korea Education and Research Information Service (KERIS).
980 ▼a 202002 ▼f 2020
990 ▼a ***1816162
991 ▼a E-BOOK