
Optimization of neural network accelerators for edge device computation

Abstract

This paper introduces a set of optimization strategies for neural network accelerators that enhance computation on edge devices. First, we present Teleport, a hardware accelerator designed for a lightweight neural network called ShiftNet. Shift Convolution operations suffer low utilization on conventional hardware, whereas Teleport uses an Address Translator to process Shift Convolution more efficiently. Second, we propose NexusCIM, which performs DNN computations while minimizing communication bottlenecks in a multi-CIM architecture. In traditional multi-CIM architectures, simultaneous data transfers from each CIM unit during DNN computations lead to communication bottlenecks. To address this problem, NexusCIM replaces the routers with hub cores and implements a C-Mesh NoC.
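
As background for Teleport, the following is a minimal NumPy sketch of a ShiftNet-style shift convolution: each input channel is moved by a fixed spatial offset (pure data movement, no multiplies), and a 1x1 convolution then mixes the channels. The function name, the zero-padding at the borders, and the shift encoding are illustrative assumptions, not the thesis's implementation; the comment marks the data-movement step that an Address Translator can fold into feature-map address generation.

```python
import numpy as np

def shift_conv(x, shifts, weights):
    """Sketch of a shift convolution block (ShiftNet-style).
    x:       input feature map, shape (C_in, H, W)
    shifts:  one (dy, dx) spatial offset per input channel
    weights: 1x1 convolution weights, shape (C_out, C_in)
    """
    c_in, h, w = x.shape
    shifted = np.zeros_like(x)
    for c, (dy, dx) in enumerate(shifts):
        # A shift is pure data movement: reading channel c at an
        # offset address. This is the part that dedicated hardware
        # can fold into address translation at no arithmetic cost.
        src = x[c,
                max(0, -dy):h - max(0, dy),
                max(0, -dx):w - max(0, dx)]
        shifted[c,
                max(0, dy):h - max(0, -dy),
                max(0, dx):w - max(0, -dx)] = src
    # Channel mixing via a 1x1 convolution (a matmul per pixel).
    return np.einsum('oc,chw->ohw', weights, shifted)
```

For the second contribution, the toy model below (function names and grouping factor are assumptions for illustration, not NexusCIM's design) shows why hub-style aggregation in a concentrated mesh eases the communication bottleneck: partial sums are reduced locally within each group, so only one packet per group enters the inter-group network instead of one per CIM unit.

```python
import numpy as np

def accumulate_partial_sums(partials, group_size):
    """Compare long-range NoC transfers with and without hub-style
    aggregation. `partials` holds one partial-sum vector per CIM
    unit; the final output is their elementwise sum.
    """
    n_units = len(partials)
    # Flat mesh: every CIM unit injects its own packet toward the
    # global accumulator, so n_units transfers compete at once.
    flat_transfers = n_units
    # Hub-based (C-Mesh-like) scheme: each group of `group_size`
    # units reduces locally at its hub; only hubs inject packets
    # into the inter-group network.
    hubs = [sum(partials[i:i + group_size])
            for i in range(0, n_units, group_size)]
    hub_transfers = len(hubs)
    return sum(hubs), flat_transfers, hub_transfers

# Example: 16 CIM units in groups of 4 -> 16 vs. 4 transfers.
parts = [np.random.rand(8) for _ in range(16)]
out, flat, hub = accumulate_partial_sums(parts, group_size=4)
print(flat, hub)  # 16 4
```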


Table of Contents

I. Research Overview
II. Introduction
III. Hardware Accelerator for ShiftNet (Computation Bottleneck)
3.1 Preliminaries
3.1.1 Shift Convolution
3.1.2 Systolic Array
3.1.3 Previous Work
3.2 Teleport Architecture
3.2.1 Address Translator
3.2.2 Low-Cost Systolic Loader
3.2.3 Top-Level Architecture
3.2.4 Network Mapping and Dataflow
3.3 Results
3.3.1 Experimental Setup
3.3.2 Results
3.4 Summary
IV. Multi-CIM Architecture for DNN (Communication Bottleneck)
4.1 Preliminaries
4.1.1 DNN on Multi-CIM
4.1.2 Challenges of DNN on Multi-CIM
4.1.3 Previous Work
4.2 NexusCIM Architecture
4.2.1 Top-Level Architecture
4.2.2 Nexus Block Dataflow
4.2.3 Hub Core
4.2.4 Reconfigurable CIMU Group Modes
4.2.5 Mapping Strategy
4.3 Results
4.3.1 Experimental Setup
4.3.2 Results
4.4 Summary
V. Conclusion
References
