Material type | Dissertation |
---|---|
Title/Author | Statistical and Computational Methods for Biological Data. |
Personal author | Hao, Yuning. |
Corporate author | Michigan State University. Statistics - Doctor of Philosophy. |
Publication | [S.l.]: Michigan State University, 2019. |
Publication | Ann Arbor: ProQuest Dissertations & Theses, 2019. |
Physical description | 101 p. |
Host item entry | Dissertations Abstracts International 81-03B. Dissertation Abstract International
ISBN | 9781085696005 |
Dissertation note | Thesis (Ph.D.)--Michigan State University, 2019. |
General note |
Source: Dissertations Abstracts International, Volume: 81-03, Section: B.
Includes supplementary digital materials. Advisor: Xie, Yuying. |
Restrictions on use | This item must not be sold to any third party vendors. This item must not be added to any third party search indexes. |
Summary | This dissertation develops machine learning and statistical methods for biological data. In immunotherapy, gene-expression deconvolution is used to quantify the different types of cells in a mixed population. It offers a highly promising way to rapidly characterize the tumor-infiltrating immune landscape and identify cold cancers. A major challenge, however, is that gene-expression data are frequently contaminated by many outliers that decrease estimation accuracy. It is therefore imperative to develop a robust deconvolution method that automatically decontaminates data by reliably detecting and removing outliers. We develop an algorithm called adaptive Least Trimmed Square (aLTS) that identifies outliers in regression models, allowing us to detect and omit them effectively and to obtain robust estimates of the coefficients. We also include theoretical results that guarantee convergence and parameter recovery. A second topic is the association between phenotype responses and the intricate patterns of transcription factor binding sites in DNA sequences. To address it, we propose a deep learning-based framework. Convolution and pooling layers capture regulatory motifs, while position embedding and multi-head self-attention layers capture the long-term dependencies among motifs. We improve the model's overall efficacy by integrating transfer learning and multi-task learning. Finally, we interpret our DNA quantification model to identify confirmed and novel transcription factor binding motifs (TFBMs), along with the relationships among them. |
Subject | Statistics. Computer science. |
Language | English |
Link |
: The full text of this item is provided by the Korea Education and Research Information Service (KERIS). |
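
The summary describes detecting and omitting outliers in a regression model by trimming. As a rough illustration of that idea only, the sketch below fits a straight line with classical least trimmed squares: it keeps the `h` points with the smallest squared residuals, refits, and repeats. This is not the dissertation's aLTS algorithm. The fixed trimming size `h`, the single-predictor model, the exhaustive two-point starts, and the function names `fit_line` and `lts_line` are all assumptions made for the example.

```python
from itertools import combinations


def fit_line(xs, ys, idx):
    """Ordinary least squares for y = a + b*x using only the points in idx."""
    n = len(idx)
    mean_x = sum(xs[i] for i in idx) / n
    mean_y = sum(ys[i] for i in idx) / n
    sxx = sum((xs[i] - mean_x) ** 2 for i in idx)
    sxy = sum((xs[i] - mean_x) * (ys[i] - mean_y) for i in idx)
    slope = sxy / sxx  # assumes the selected x values are not all equal
    return mean_y - slope * mean_x, slope


def lts_line(xs, ys, h, max_iter=50):
    """Least trimmed squares line fit (illustrative, fixed trimming size h).

    Returns (intercept, slope, indices of the points left out of the fit).
    """
    n = len(xs)
    best = None
    # Start from every pair of points. This is only practical for small
    # samples; larger problems would sample starting subsets at random.
    for start in combinations(range(n), 2):
        if xs[start[0]] == xs[start[1]]:
            continue  # a vertical pair does not define a line
        subset = list(start)
        for _ in range(max_iter):
            a, b = fit_line(xs, ys, subset)
            # Concentration step: keep the h points closest to the current fit.
            ranked = sorted(range(n), key=lambda i: (ys[i] - a - b * xs[i]) ** 2)
            new_subset = sorted(ranked[:h])
            if new_subset == subset:
                break
            subset = new_subset
        a, b = fit_line(xs, ys, subset)
        loss = sum((ys[i] - a - b * xs[i]) ** 2 for i in subset)
        if best is None or loss < best[0]:
            best = (loss, a, b, subset)
    _, a, b, subset = best
    kept = set(subset)
    trimmed = [i for i in range(n) if i not in kept]
    return a, b, trimmed
```

For example, with twelve points on the line y = 1 + 2x where two of the responses are replaced by gross errors, calling `lts_line(xs, ys, h=10)` should recover an intercept near 1 and a slope near 2 and report the two corrupted indices as trimmed. In this sketch the caller has to choose `h` in advance; the summary indicates that the adaptive method avoids relying on such a fixed choice, but its details are not given in this record.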