Material Type | Thesis/Dissertation |
---|---|
Title/Author | Design of a Scalable, Configurable, and Cluster-based Hierarchical Hardware Accelerator for a Cortically Inspired Algorithm and Recurrent Neural Networks. |
Personal Author | Dey, Sumon.
Corporate Author | North Carolina State University.
Publication | [S.l.]: North Carolina State University, 2019.
Publication | Ann Arbor: ProQuest Dissertations & Theses, 2019.
Physical Description | 128 p.
Source Record | Dissertations Abstracts International 81-05B.
ISBN | 9781392766576 |
Dissertation Note | Thesis (Ph.D.)--North Carolina State University, 2019.
General Note |
Source: Dissertations Abstracts International, Volume: 81-05, Section: B.
Advisor: Laber, Eric |
Use Restrictions | This item must not be sold to any third party vendors.
Abstract | Machine learning algorithms based on deep learning have achieved enormous success, delivering high performance in applications ranging from object recognition to defeating human experts at complex games. Deep learning can further broaden this horizon by processing massive amounts of multimodal natural data (video, audio) and learning useful joint representations for applications. In addition to deep learning, cortical learning algorithms can learn representations in a way much closer to how the human brain works. Unlike the dense data used in deep learning, cortical learning uses binary data to model sparse distributed memory for different representations. Both families of algorithms are modeled in hardware as artificial neural networks. However, implementing such networks in hardware depends on the throughput and memory bandwidth of the hardware architecture, which must handle massive amounts of data. To advance research on these rapidly evolving techniques, there is a direct need to design and implement specialized hardware that accelerates these algorithms. In this work, a scalable, configurable, and cluster-based hierarchical hardware accelerator is designed and implemented as an application-specific integrated circuit (ASIC) for Sparsey, a cortical learning algorithm. In addition, an application-specific instruction set processor (ASIP) is designed and implemented for recurrent neural networks (RNNs). A distributed on-chip memory organization is designed and implemented in the ASIC to improve memory bandwidth and accelerate read and write operations on synaptic weight matrices. Bit-level data processing from memory, bit-level storage, and special multiply-accumulate hardware are implemented for multiply-accumulate operations. Fixed-point arithmetic and fixed-point storage are also adopted in the ASIC implementation. 
At 16 nm, the Sparsey ASIC achieved an overall speedup of 25.24x, a 353.12x reduction in energy per frame, and a 1.43x reduction in silicon area compared with a GPU. In the ASIP, emerging 3D-stacked memory is used to increase off-chip memory bandwidth, and the on-chip memory is sized to improve data locality inside the processor. A set of short instructions is also implemented in the ASIP architecture by decomposing complex, time-consuming special operations into high-level functional blocks, and look-up-table-based special-function operations further improve its performance. State-of-the-art mixed-precision training and inference are also adopted in this architecture. A high-level programming environment is developed to generate Very Long Instruction Word (VLIW) instructions for the ASIP to process variants of RNNs. At 16 nm, the ASIP achieved 1.5x - 5.6x faster processing, a 4.3x - 40.8x reduction in energy per sequence, and a 1.5x area benefit over a GPU. |
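As an illustrative aside (not code from the dissertation), the abstract's combination of fixed-point storage with bit-level multiply-accumulate over binary cortical-learning data can be sketched as follows. The Q4.12 format, bit widths, and function names here are assumptions chosen for illustration only; they show why binary activations let a MAC unit replace multipliers with conditional adds.

```python
# Hypothetical sketch: fixed-point MAC with binary (0/1) activations.
# Q4.12 format is an assumption, not the dissertation's actual format.

FRAC_BITS = 12          # assumed fractional bits (Q4.12)
SCALE = 1 << FRAC_BITS

def to_fixed(x: float) -> int:
    """Quantize a real-valued weight to signed fixed point."""
    return int(round(x * SCALE))

def to_float(q: int) -> float:
    """Convert a fixed-point value back to a float."""
    return q / SCALE

def mac_binary(weights, activations):
    """MAC where activations are single bits: each multiply reduces
    to a conditional add, which is what makes the hardware cheap."""
    acc = 0
    for w, a in zip(weights, activations):
        if a:               # bit-level gating instead of a multiplier
            acc += w
    return acc

weights_q = [to_fixed(w) for w in [0.25, -0.5, 0.75, 0.125]]
acts = [1, 0, 1, 1]
result = to_float(mac_binary(weights_q, acts))
# 0.25 + 0.75 + 0.125 = 1.125
```

Because the accumulator only ever adds pre-quantized integers, no rounding occurs inside the loop; precision is fixed once at quantization time, mirroring the fixed-point storage described in the abstract.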
Subject | Computer engineering. Artificial intelligence.
Language | English
Link |
: The full text of this material is provided by the Korea Education and Research Information Service (KERIS).