Daegu Haany University Hyangsan Library

Embodied Visual Perception Models for Human Behavior Understanding

Detailed Information
Material Type: Thesis/Dissertation
Title/Author: Embodied Visual Perception Models for Human Behavior Understanding.
Personal Author: Bertasius, Gediminas.
Corporate Author: University of Pennsylvania. Computer and Information Science.
Publication: [S.l.]: University of Pennsylvania, 2019.
Publication: Ann Arbor: ProQuest Dissertations & Theses, 2019.
Physical Description: 258 p.
Source Record: Dissertations Abstracts International 81-02B.
ISBN: 9781085560221
Dissertation Note: Thesis (Ph.D.)--University of Pennsylvania, 2019.
General Note: Source: Dissertations Abstracts International, Volume: 81-02, Section: B.
Advisor: Shi, Jianbo.
Restrictions on Use: This item must not be sold to any third party vendors.
Abstract: Many modern applications require extracting the core attributes of human behavior, such as a person's attention, intent, or skill level, from visual data. There are two main challenges related to this problem. First, we need models that can represent visual data in terms of object-level cues. Second, we need models that can infer the core behavioral attributes from the visual data. We refer to these two challenges as "learning to see" and "seeing to learn," respectively. In this PhD thesis, we have made progress towards addressing both challenges.
We tackle the problem of "learning to see" by developing methods that extract object-level information directly from raw visual data. These include two top-down contour detectors, DeepEdge and HfL, which can be used to aid high-level vision tasks such as object detection. Furthermore, we also present two semantic object segmentation methods, Boundary Neural Fields (BNFs) and Convolutional Random Walk Networks (RWNs), which integrate low-level affinity cues into an object segmentation process. We then shift our focus to video-level understanding and present a Spatiotemporal Sampling Network (STSN), which can be used for video object detection, along with a method for discriminative motion feature learning.
Afterwards, we transition into the second subproblem of "seeing to learn," for which we leverage first-person GoPro cameras that record what people see during a particular activity. We aim to infer core behavioral attributes such as a person's attention, intention, and skill level from such first-person data. To do so, we first propose the concept of action-objects: the objects that capture a person's conscious visual (watching a TV) or tactile (taking a cup) interactions. We then introduce two models, EgoNet and Visual-Spatial Network (VSN), which detect action-objects in supervised and unsupervised settings, respectively. Finally, we focus on a behavior understanding task in a complex basketball activity. We present a method for evaluating players' skill level from their first-person basketball videos, and also a model that predicts a player's future motion trajectory from a single first-person image.
Subject: Computer science.
Artificial intelligence.
Language: English
Quick Link (URL): The full text of this item is provided by the Korea Education and Research Information Service (KERIS).
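
The abstract above describes segmentation models (BNFs, RWNs) that integrate low-level pairwise affinity cues into the segmentation process. As a rough illustration of that general idea only, and not the thesis's actual models, the following minimal sketch diffuses coarse per-pixel class scores along a row-normalized affinity matrix in a random-walk style refinement. Every name and value here (random_walk_refine, the toy scores and affinities) is hypothetical.

```python
# Illustrative sketch (not the thesis code): refining a coarse segmentation
# by diffusing per-pixel class scores over pixel-pairwise affinities,
# the general idea behind affinity-based refinement mentioned in the abstract.
import numpy as np

def random_walk_refine(scores, affinity, steps=10):
    """Diffuse per-pixel class scores along a row-stochastic affinity matrix.

    scores:   (n_pixels, n_classes) initial segmentation scores
    affinity: (n_pixels, n_pixels) nonnegative pairwise affinities
    """
    # Row-normalize affinities into a random-walk transition matrix.
    transition = affinity / affinity.sum(axis=1, keepdims=True)
    refined = scores.copy()
    for _ in range(steps):
        # One diffusion step: each pixel averages its neighbors' scores,
        # weighted by affinity, pulling labels toward coherent regions.
        refined = transition @ refined
    return refined

# Toy usage: 4 pixels in a row, 2 classes; adjacent pixels have high affinity.
scores = np.array([[0.9, 0.1], [0.6, 0.4], [0.4, 0.6], [0.1, 0.9]])
affinity = np.array([[1.0, 0.8, 0.1, 0.0],
                     [0.8, 1.0, 0.8, 0.1],
                     [0.1, 0.8, 1.0, 0.8],
                     [0.0, 0.1, 0.8, 1.0]])
print(random_walk_refine(scores, affinity, steps=3))
```

In the actual RWN, as described in the associated publication, the affinities are themselves predicted by a convolutional branch and the refinement is trained end-to-end with the segmentation network; this sketch shows only the diffusion step with fixed, hand-picked affinities.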
