Daegu Haany University Hyangsan Library

Detailed information

Geometric Deep Learning for Monocular Object Orientation Estimation

Material type: Dissertation
Title/Author: Geometric Deep Learning for Monocular Object Orientation Estimation.
Personal author: Mahendran, Siddharth.
Corporate author: The Johns Hopkins University. Electrical and Computer Engineering.
Publication: [S.l.]: The Johns Hopkins University, 2019.
Publication: Ann Arbor: ProQuest Dissertations & Theses, 2019.
Physical description: 238 p.
Host item entry: Dissertations Abstracts International 81-06B.
Dissertation Abstract International
ISBN: 9781687987136
Dissertation note: Thesis (Ph.D.)--The Johns Hopkins University, 2019.
General note: Source: Dissertations Abstracts International, Volume: 81-06, Section: B.
Advisor: Vidal, Rene.
Restrictions on use: This item must not be sold to any third party vendors.
Abstract: Monocular object orientation estimation, i.e., estimating the 3D orientation of an object given a single 2D image of the object, is an important component of traditional computer vision problems like scene understanding and 3D reconstruction, as well as modern vision challenges like autonomous driving, augmented reality and robot manipulation. A main challenge of the object orientation estimation problem is that the task of estimating 3D orientation from a single 2D image is ill-posed. It requires a 3D object model in the loop, and a key disadvantage of prior work using geometric models is that these models are hand-crafted. Recent work proposes the use of powerful deep learning features via Convolutional Neural Networks (CNNs) that learn appropriate features and models from the data. Prior to the work described in this thesis, deep learning models for object orientation estimation formulated the problem as one of pose classification on the Euler angles. However, this ignores the geometry of the problem and the inherent structure in the orientation space, the set of all rotation matrices, SO(3).
This thesis uses Geometric Deep Learning models for the orientation estimation task, which incorporate the geometry of the orientation space into the deep learning pipeline by carefully choosing and designing representations, loss functions and network architectures well suited for this application. We first consider the problem of estimating the orientation of an object in an image assuming a known object category and a bounding box containing the object in the image. We show that modeling the orientation space correctly by designing Riemannian CNNs, i.e., regression and classification CNNs that use axis-angle or quaternion representations of rotation matrices and geodesic loss functions, leads to good performance on a challenging benchmarking dataset. We also propose a family of Bin & Delta models that combine pose classification CNNs (bin model), which give a coarse estimate of the object orientation, with pose regression CNNs (delta model), which refine the coarse orientation estimate. Such models achieve state-of-the-art performance on benchmark datasets. Additionally, we have extended these models to the scenarios of unknown categorization and unknown localization by designing novel Integrated Networks to solve these multi-task problems.
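Illustrative note: the abstract refers to axis-angle and quaternion representations and to geodesic loss functions on SO(3). The sketch below is not taken from the thesis. It is a minimal, assumed example of one standard geodesic distance between two rotations, computed from unit quaternions as 2·arccos(|⟨q1, q2⟩|). The function names `axis_angle_to_quaternion` and `geodesic_distance` are hypothetical and chosen only for illustration.

```python
import math


def axis_angle_to_quaternion(axis, angle):
    """Convert an axis (3 numbers, any non-zero length) and an angle in
    radians to a unit quaternion (w, x, y, z)."""
    norm = math.sqrt(sum(a * a for a in axis))
    if norm == 0.0:
        raise ValueError("axis must be non-zero")
    half = angle / 2.0
    s = math.sin(half) / norm  # also normalizes the axis
    return (math.cos(half), axis[0] * s, axis[1] * s, axis[2] * s)


def geodesic_distance(q1, q2):
    """Angle (radians) of the relative rotation between two unit quaternions.

    The absolute value accounts for q and -q describing the same rotation.
    """
    dot = abs(sum(a * b for a, b in zip(q1, q2)))
    dot = min(1.0, dot)  # guard against rounding slightly above 1
    return 2.0 * math.acos(dot)


# Rotations of 350 and 10 degrees about the z-axis are only 20 degrees
# apart on SO(3), although the raw angle values differ by 340 degrees.
q_a = axis_angle_to_quaternion((0.0, 0.0, 1.0), math.radians(350.0))
q_b = axis_angle_to_quaternion((0.0, 0.0, 1.0), math.radians(10.0))
distance_deg = math.degrees(geodesic_distance(q_a, q_b))
```

In this example the distance follows the shortest path on the rotation group, which is the kind of structure a plain classification over Euler-angle bins does not capture.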
Subject: Computer science.
Artificial intelligence.
Robotics.
Language: English
Link URL: The full text of this item is provided by the Korea Education and Research Information Service (KERIS).
