Efficient Embedded Segmentation Network for a Wearable Ultrasound Bladder Monitoring System
- Keywords: Deep learning, medical ultrasound, image processing, image segmentation, AI accelerator, hardware accelerator
- Institution: Sogang University Graduate School
- Advisor: Yangmo Yoo (유양모)
- Year of publication: 2021
- Degree conferred: February 2021
- Degree: Master's
- Department and major: Graduate School, Department of Electronic Engineering
- UCI I804:11029-000000065879
- Language of text: English
- Copyright: Sogang University theses are protected by copyright.
Abstract
Medical ultrasound imaging is safe and simple enough to implement on a single chip, giving it great potential as a next-generation connected healthcare device. One obstacle, however, is that a doctor's interpretation is required to extract clinically useful information from ultrasound images. To address this, many studies are applying deep learning to diagnosis, but deep learning is difficult to deploy on edge devices because of its high computational complexity. Deep learning workloads are typically run with general-purpose computing on graphics processing units (GPGPU), which handles parallel tasks well but is inefficient on edge devices and is therefore not a suitable architecture for this setting. In this thesis, a new AI accelerator architecture for a system on chip (SoC) is proposed that performs deep learning tasks efficiently without a GPU. The proposed accelerator combines an FPGA module with an ARM core processor and computes separable convolutions quickly with low hardware utilization, enabling efficient deep learning inference. In addition, inference is implemented with a lightweight CNN-based segmentation network that follows a U-Net-based architecture and is designed specifically for the accelerator. Finally, ultrasound bladder image segmentation is implemented with the AI accelerator and the lightweight segmentation network in a wearable ultrasound bladder monitoring system, a connected healthcare device application, showing only about a 10% reduction in performance compared with a commercial CPU while using far fewer resources.
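The computational primitive the accelerator targets is the separable convolution, which factorizes a standard convolution into a per-channel (depthwise) spatial filter followed by a 1x1 (pointwise) channel-mixing filter, sharply reducing the number of weights and multiply-accumulate operations. The sketch below illustrates this factorization in PyTorch; the framework choice, channel sizes, and block layout are illustrative assumptions, not the thesis's actual implementation.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Separable convolution block: a depthwise 3x3 conv (one filter per
    input channel) followed by a pointwise 1x1 conv that mixes channels."""
    def __init__(self, in_channels, out_channels):
        super().__init__()
        # Depthwise: groups=in_channels applies a single 3x3 filter per channel
        self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size=3,
                                   padding=1, groups=in_channels, bias=False)
        # Pointwise: 1x1 conv combines channels to produce out_channels
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1,
                                   bias=False)
        self.bn = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.bn(self.pointwise(self.depthwise(x))))

# Weight-count comparison for a 64 -> 128 channel, 3x3 convolution:
#   standard conv:  128 * 64 * 3 * 3            = 73,728 weights
#   separable conv: 64 * 3 * 3 + 128 * 64        =  8,768 weights (plus 256 BatchNorm params)
standard = nn.Conv2d(64, 128, kernel_size=3, padding=1, bias=False)
separable = DepthwiseSeparableConv(64, 128)
print(sum(p.numel() for p in standard.parameters()))   # 73728
print(sum(p.numel() for p in separable.parameters()))  # 9024 (8768 conv + 256 BN)
```

Because the depthwise and pointwise stages are small, regular operations, they map well onto an FPGA datapath with modest on-chip memory, which is consistent with the abstract's claim of fast separable-convolution computation at low hardware utilization.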