MARC보기
LDR00000nam u2200205 4500
001000000433338
00520200225140529
008200131s2019 ||||||||||||||||| ||eng d
020 ▼a 9781085746366
035 ▼a (MiAaPQ)AAI13884770
040 ▼a MiAaPQ ▼c MiAaPQ ▼d 247004
0820 ▼a 004
1001 ▼a Pervaiz, Fahad.
24510 ▼a Understanding Challenges in the Data Pipeline for Development Data.
260 ▼a [S.l.]: ▼b University of Washington., ▼c 2019.
260 1 ▼a Ann Arbor: ▼b ProQuest Dissertations & Theses, ▼c 2019.
300 ▼a 165 p.
500 ▼a Source: Dissertations Abstracts International, Volume: 81-03, Section: B.
500 ▼a Advisor: Anderson, Richard.
5021 ▼a Thesis (Ph.D.)--University of Washington, 2019.
506 ▼a This item must not be sold to any third party vendors.
506 ▼a This item must not be added to any third party search indexes.
520 ▼a The developing world is relying more and more on data driven policies. Numerous development agencies have pushed for on-ground data collection to support the development work they pursue. Many governments have launched efforts for more frequent information gathering. Overall, the amount of data collected is tremendous, yet we face significant issues in doing useful analysis. Most of these barriers are around data cleaning and merging, and they require a data engineer to support some parts of the analysis. This thesis aims to understand the pain points of cleaning development data. It also proposes solutions that harness the thought process of a data engineer to reduce the manual workload of the tedious process of cleaning such data. To achieve these goals, two research areas are critical: (1) to discern current data usage patterns and to build a taxonomy of data cleaning in the developing world
590 ▼a School code: 0250.
650 4 ▼a Computer science.
690 ▼a 0984
71020 ▼a University of Washington. ▼b Computer Science and Engineering.
7730 ▼t Dissertations Abstracts International ▼g 81-03B.
773 ▼t Dissertation Abstract International
790 ▼a 0250
791 ▼a Ph.D.
792 ▼a 2019
793 ▼a English
85640 ▼u http://www.riss.kr/pdu/ddodLink.do?id=T15491392 ▼n KERIS ▼z 이 자료의 원문은 한국교육학술정보원에서 제공합니다.
980 ▼a 202002 ▼f 2020
990 ▼a ***1816162
991 ▼a E-BOOK