대구한의대학교 향산도서관

상세정보

부가기능

Bayesian Approaches for Modeling Variation

상세 프로파일

상세정보
자료유형학위논문
서명/저자사항Bayesian Approaches for Modeling Variation.
개인저자Moran, Gemma E.
단체저자명University of Pennsylvania. Statistics.
발행사항[S.l.]: University of Pennsylvania., 2019.
발행사항Ann Arbor: ProQuest Dissertations & Theses, 2019.
형태사항144 p.
기본자료 저록Dissertations Abstracts International 81-02B.
Dissertation Abstract International
ISBN9781085576550
학위논문주기Thesis (Ph.D.)--University of Pennsylvania, 2019.
일반주기 Source: Dissertations Abstracts International, Volume: 81-02, Section: B.
Advisor: George, Edward I.
이용제한사항This item must not be sold to any third party vendors.This item must not be added to any third party search indexes.This item must not be added to any third party search indexes.This item must not be sold to any third party vendors.
요약A core focus of statistics is determining how much of the variation in data may be attributed to the signal of interest, and how much to noise. When the sources of variation are many and complex, a Bayesian approach to data analysis offers a number of advantages. In this thesis, we propose and implement new Bayesian methods for modeling variation in two general settings. The first setting is high-dimensional linear regression where the unknown error variance is also of interest. Here, we show that a commonly used class of conjugate shrinkage priors can lead to underestimation of the error variance. We then extend the Spike-and-Slab Lasso (SSL, Rockova and George, 2018) to the unknown variance case, using an alternative, independent prior framework. This extended procedure outperforms both the fixed variance approach and alternative penalized likelihood methods on both simulated and real data.For the second setting, we move from univariate response data where the predictors are known, to multivariate response data in which potential predictors are unobserved. In this setting, we first consider the problem of biclustering, where a motivating example is to find subsets of genes which have similar expression in a subset of patients. For this task, we propose a new biclustering method called Spike-and-Slab Lasso Biclustering (SSLB). SSLB utilizes the SSL prior to find a doubly-sparse factorization of the data matrix via a fast EM algorithm. Applied to both a microarray dataset and a single-cell RNA-sequencing dataset, SSLB recovers biologically meaningful signal in the data.The second problem we consider in this setting is nonlinear factor analysis. The goal here is to find low-dimensional, unobserved "factors'' which drive the variation in the high-dimensional observed data in a potentially nonlinear fashion. For this purpose, we develop factor analysis BART (faBART), an MCMC algorithm which alternates sampling from the posterior of (a) the factors and (b) a functional approximation to the mapping from the factors to the data. The latter step utilizes Bayesian Additive Regression Trees (BART, Chipman et al., 2010). On a variety of simulation settings, we demonstrate that with only the observed data as the input, faBART is able to recover both the unobserved factors and the nonlinear mapping.
일반주제명Statistics.
언어영어
바로가기URL : 이 자료의 원문은 한국교육학술정보원에서 제공합니다.

서평(리뷰)

  • 서평(리뷰)

태그

  • 태그

나의 태그

나의 태그 (0)

모든 이용자 태그

모든 이용자 태그 (0) 태그 목록형 보기 태그 구름형 보기
 
로그인폼