MARC View
LDR00000nam u2200205 4500
001000000434641
00520200226161751
008200131s2019 ||||||||||||||||| ||eng d
020 ▼a 9781687955777
035 ▼a (MiAaPQ)AAI22621413
040 ▼a MiAaPQ ▼c MiAaPQ ▼d 247004
0820 ▼a 629.8
1001 ▼a Byravan, Arunkumar.
24510 ▼a Structured Deep Visual Dynamics Models for Robot Manipulation.
260 ▼a [S.l.]: ▼b University of Washington., ▼c 2019.
260 1 ▼a Ann Arbor: ▼b ProQuest Dissertations & Theses, ▼c 2019.
300 ▼a 178 p.
500 ▼a Source: Dissertations Abstracts International, Volume: 81-05, Section: B.
500 ▼a Advisor: Fox, Dieter.
5021 ▼a Thesis (Ph.D.)--University of Washington, 2019.
506 ▼a This item must not be sold to any third party vendors.
506 ▼a This item must not be added to any third party search indexes.
520 ▼a The emergence of deep learning, access to large amounts of data, and powerful computing hardware have led to great strides in the state of the art in robotics, computer vision, and AI. Unlike traditional methods that are strongly model-based with priors and explicit structural constraints, these newer learning approaches tend to be data-driven and often neglect the underlying problem structure. As a consequence, while they usually outperform their traditional counterparts on many problems, achieving good generalization, interpretability, task transfer, and data-efficiency has been challenging. Combining the strengths of the two paradigms (the flexibility of modern learning techniques and the domain knowledge and structure of traditional methods) should help bridge this gap. In this thesis, we present work that combines these two paradigms, specifically in the context of learning visual dynamics models for robot manipulation tasks. The thesis is divided into two parts. In the first part, we discuss a structured approach to designing visual dynamics models for manipulation tasks. We propose a specific class of deep visual dynamics models (SE3-Nets) that explicitly encode strong physical and 3D geometric priors (specifically, rigid body physics) in their structure. As opposed to deep models that reason about motion at the pixel level, SE3-Nets model the dynamics of observed scenes at the object level - they identify objects in the scene and predict a rigid body rotation and translation per object. This leads to an interpretable architecture that can robustly model the dynamics of complex interactions. Next, we discuss SE3-Pose-Nets, an extension of SE3-Nets that additionally learns to estimate a latent, globally-consistent pose representation for objects and uses this representation for real-time closed-loop visuomotor control of a Baxter robot. We show that the structure inherent in SE3-Pose-Nets allows them to be robust to visual perturbations and noise, generalizing to settings significantly different from those seen during training. We also briefly discuss Dynamics-Nets, a recurrent extension to SE3-Pose-Nets that can be used for the control of dynamical systems. In the second part of the thesis, we present an approach towards solving long-horizon manipulation tasks, using reinforcement learning
590 ▼a School code: 0250.
650 4 ▼a Robotics.
690 ▼a 0771
71020 ▼a University of Washington. ▼b Computer Science and Engineering.
7730 ▼t Dissertations Abstracts International ▼g 81-05B.
773 ▼t Dissertations Abstracts International
790 ▼a 0250
791 ▼a Ph.D.
792 ▼a 2019
793 ▼a English
85640 ▼u http://www.riss.kr/pdu/ddodLink.do?id=T15493810 ▼n KERIS ▼z The full text of this material is provided by KERIS (Korea Education and Research Information Service).
980 ▼a 202002 ▼f 2020
990 ▼a ***1008102
991 ▼a E-BOOK