Daegu Haany University Hyangsan Library

Detailed Information

Clinical Information Extraction from Unstructured Free-Texts

Detailed Profile
Material type: Thesis/Dissertation
Title / Statement of responsibility: Clinical Information Extraction from Unstructured Free-Texts.
Personal author: Tao, Mingzhe.
Corporate author: State University of New York at Albany. Information Science.
Publication: [S.l.]: State University of New York at Albany, 2018.
Publication: Ann Arbor: ProQuest Dissertations & Theses, 2018.
Physical description: 142 p.
Host item entry: Dissertation Abstracts International 79-12A(E).
ISBN: 9780438255500
Dissertation note: Thesis (Ph.D.)--State University of New York at Albany, 2018.
General note: Source: Dissertation Abstracts International, Volume: 79-12(E), Section: A.
Adviser: Ozlem Uzuner
Abstract: Information extraction (IE) is a fundamental component of natural language processing (NLP) that provides a deeper understanding of the texts. In the clinical domain, documents prepared by medical experts (e.g., discharge summaries, drug labels,
In the past decade, there have been many efforts focused on extraction of clinical information, i.e., clinical IE. In this dissertation, we present novel extensions to IE methods for automatically identifying clinically-relevant information from
(1) Knowledge representations that utilize real-valued word embeddings outperform their categorical counterparts. Categorical embeddings eliminate word-to-word distances in the high-dimensional space when converting words into discrete labels. R
(2) Introducing pseudo-sequences from unannotated data can improve extraction of entity categories that are sparsely represented in the training data. We use a supervised model trained on annotated data to predict pseudo-sequences from unannotat
(3) We can address lack of available annotated data through pseudo-data generation. We experiment with three different methods of pseudo-data generation. The first method is based on professional gazetteers. It replaces entities in the annotated
(4) A sequence labeling approach to relation extraction can benefit this task. Sequence labeling can identify textual excerpts that contain entities and enables subsequent extraction of sequences of related entities from these excerpts.
Cross-validated results across multiple clinical IE tasks show overall significant performance improvement from the knowledge representations, pseudo-sequences, pseudo-data, and relation extraction models we proposed in our study. The generalize
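Subject: Information science.
Subject: Computer science.
Subject: Bioinformatics.
Language: English
Link URL: The full text of this item is provided by the Korea Education and Research Information Service (KERIS).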
