Daegu Haany University Hyangsan Library

Detailed Information

Chemical Process Data Analytics via Text Mining and Machine Learning

Detailed Profile

Material type: Dissertation
Title/Author: Chemical Process Data Analytics via Text Mining and Machine Learning.
Personal author: Zhang, Tong.
Corporate author: Carnegie Mellon University. Chemical Engineering.
Publication: [S.l.]: Carnegie Mellon University, 2019.
Publication: Ann Arbor: ProQuest Dissertations & Theses, 2019.
Physical description: 185 p.
Host item entry: Dissertations Abstracts International 81-05B.
Dissertation Abstract International
ISBN: 9781088389799
Dissertation note: Thesis (Ph.D.)--Carnegie Mellon University, 2019.
General note: Source: Dissertations Abstracts International, Volume: 81-05, Section: B.
Advisor: Sahinidis, Nikolaos V.
Use restrictions: This item must not be sold to any third party vendors. This item must not be added to any third party search indexes.
Abstract: Today, chemical engineers have access to enormous amounts of data from a variety of sources. Decision makers are frequently tasked with manipulating and analyzing complex datasets. This data can be generated in different forms, such as numerical data, text data, and graphical data. Although numerous studies explore numerical data analysis, only a very small number explore text data and graphical data.
In this dissertation, we develop different methodologies that integrate text mining techniques with optimization algorithms to automatically extract information from text and graphical data in chemical engineering applications. We use graphical data in Chapter 2, and text data in Chapters 3, 4, and 5.
In Chapter 2, we address the problem of mining chemical flowsheets for process patterns. We propose a systematic methodology for mining structural patterns in chemical process flowsheets using sequence comparison algorithms. Our proposed methodology consists of three major steps. First, we generate graphical representations of general process flowsheets. Second, we use a depth-first search algorithm to traverse the graph of a flowsheet and convert it into a string. Finally, we use sequence alignment algorithms to mine flowsheet strings for process patterns. Depending on which alignment algorithm is used, the identified process patterns may or may not have inserted gaps. In addition, we conduct several case studies and present many resulting flowsheet patterns, which we are able to relate to heuristic rules in the literature.
In Chapter 3, we address the problem of evaluating chemical patents. We propose the simultaneous use of eight criteria for patent ranking and evaluation. We also develop an intuitive linear optimization model that determines how to weigh different criteria. Our proposed methodology has been implemented in a web-based decision support system, and tested for its ability to identify the most important patents in the production of 22 chemicals.
In Chapter 4, we analyze a collection of scientific literature using a technique in unsupervised data analytics, called "topic modeling." We use a state-of-the-art topic model to study the topic coverage in Computers & Chemical Engineering. This topic model uses the nonnegative matrix factorization technique to uncover the latent semantic structure (topics) in the documents of the journal. The results show that the journal has expanded its original four topics to 18 topics nowadays. Since 2000, the supply chain topic has grown rapidly and become a popular research area.
In Chapter 5, we tackle a supervised learning task. We propose a modeling framework that uses derivative-free optimization to optimize document classification models with imbalanced datasets. Document classification models are considered to be black-box systems due to their hyperparameters. Derivative-free optimization is a well-suited technique for optimizing the performance of black-box systems. The nature of data imbalance affects a model's performance in two ways, both of which we address in our proposed modeling framework. To address the first effect, we maximize the smallest F1 prediction accuracy, and to address the second effect, we maximize the model prediction accuracy. Applied to a real dataset from Linde, our methodology resulted in up to 61% improvements of manual classification schemes.
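Illustrative note (not part of the catalog record): the three steps that the abstract lists for Chapter 2 (graph representation, depth-first traversal into a string, sequence alignment) can be sketched roughly as follows. Everything concrete here is an assumption made for illustration: the one-letter unit codes, the two toy flowsheets, the scoring values, and the choice of a Smith-Waterman style local alignment. The abstract does not state which encoding or alignment algorithms the dissertation actually uses.

```python
from typing import Dict, List, Tuple


def flowsheet_to_string(graph: Dict[str, List[str]], start: str,
                        codes: Dict[str, str]) -> str:
    """Depth-first traversal that emits one letter per unit, in visit order."""
    visited, out, stack = set(), [], [start]
    while stack:
        node = stack.pop()
        if node in visited:          # recycle streams lead back to seen units
            continue
        visited.add(node)
        out.append(codes[node])
        # Reverse so that successors are visited in their listed order.
        stack.extend(reversed(graph.get(node, [])))
    return "".join(out)


def local_align(a: str, b: str, match: int = 2, mismatch: int = -1,
                gap: int = -1) -> Tuple[int, str, str]:
    """Smith-Waterman style local alignment; '-' marks an inserted gap."""
    def sub(i: int, j: int) -> int:
        return match if a[i - 1] == b[j - 1] else mismatch

    H = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    best, pos = 0, (0, 0)
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            H[i][j] = max(0, H[i - 1][j - 1] + sub(i, j),
                          H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, pos = H[i][j], (i, j)

    # Trace back from the best cell until the score drops to zero.
    i, j = pos
    top, bottom = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        if H[i][j] == H[i - 1][j - 1] + sub(i, j):
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))


# Hypothetical unit codes: F feed, U pump, M mixer, R reactor,
# H heat exchanger, D distillation column, P product.
codes = {"feed": "F", "pump": "U", "mixer": "M", "reactor": "R",
         "cooler": "H", "column": "D", "product": "P"}

# Toy flowsheet A, with a recycle stream from the column back to the mixer.
graph_a = {"feed": ["mixer"], "mixer": ["reactor"], "reactor": ["cooler"],
           "cooler": ["column"], "column": ["product", "mixer"]}
# Toy flowsheet B, with a pump before the mixer and no cooler.
graph_b = {"feed": ["pump"], "pump": ["mixer"], "mixer": ["reactor"],
           "reactor": ["column"], "column": ["product"]}

string_a = flowsheet_to_string(graph_a, "feed", codes)
string_b = flowsheet_to_string(graph_b, "feed", codes)
score, aligned_a, aligned_b = local_align(string_a, string_b)
print(string_a, string_b, score, aligned_a, aligned_b)
```

With these toy inputs the shared pattern is the mixer, reactor, column sequence, and the two gap characters mark the units that appear in only one flowsheet. A gapless variant, such as a longest-common-substring search, would instead report only contiguous patterns, which is one way to read the abstract's remark that patterns may or may not contain inserted gaps.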
Subject: Chemical engineering.
Language: English
Link URL: The full text of this item is provided by the Korea Education and Research Information Service (KERIS).
