Accelerating Diffusion Transformers by Dynamically Skipping Redundant Operations
- Institution: Sogang University Graduate School
- Advisor: 류성주
- Year of publication: 2026
- Degree conferral date: February 2026
- Degree: Master's
- Department/Major: Department of Semiconductor Engineering, Graduate School
- URI: http://www.dcollection.net/handler/sogang/000000082308
- UCI: I804:11029-000000082308
- Language of text: English
- Copyright: This thesis is protected by copyright.
Abstract
Diffusion Transformers (DiTs) have demonstrated outstanding performance as generative models, but their computationally expensive iterative sampling process incurs significant latency and energy costs. We propose a novel software-hardware co-optimized acceleration framework that addresses these computational challenges by leveraging the inherent temporal redundancy in the DiT inference process. We introduce a redundancy-aware computing mechanism that selectively skips redundant operations and reuses computation results from the previous timestep. To minimize potential accuracy degradation caused by cumulative approximation errors, a dynamic threshold scaling (DTS) method is employed to adaptively adjust the similarity criteria. Furthermore, we design dedicated units capable of efficient low-bit compression and comparison to reduce hardware overhead. We design an accelerator architecture based on this dynamic skipping method, and experiments confirm that it achieves substantial performance and energy gains while maintaining output quality.
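The skip-and-reuse mechanism described above can be sketched in a few lines. The following is a minimal, hypothetical illustration only, not the thesis's actual algorithm: it caches a block's input and output from the previous timestep, skips the block when the new input is sufficiently similar, and tightens the threshold as consecutive skips accumulate (a stand-in for dynamic threshold scaling). The similarity metric, threshold formula, and `scale` parameter are all assumptions for the sake of the example.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two flattened activation tensors."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def run_block_with_reuse(block, x, cache, base_tau=0.9, scale=0.99):
    """Run `block` on input `x`, or reuse the cached output from the
    previous timestep when the inputs are similar enough.

    The effective threshold `tau` moves toward 1.0 as consecutive skips
    accumulate, so long skip streaks become harder to extend -- a crude
    stand-in for the thesis's dynamic threshold scaling (DTS).
    """
    skips = cache.get("skips", 0)
    tau = 1.0 - (1.0 - base_tau) * (scale ** skips)  # stricter after each skip
    if "x_prev" in cache and cosine_similarity(x, cache["x_prev"]) >= tau:
        cache["skips"] = skips + 1
        return cache["y_prev"]          # reuse previous timestep's output
    y = block(x)                        # full computation
    cache.update(x_prev=x, y_prev=y, skips=0)
    return y
```

Per-block caches would let each transformer block decide independently, which matches the block-level redundancy detection named in the table of contents.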
Table of Contents
Ⅰ. Introduction 1
1.1. Background 1
1.2. Problem Statement 2
Ⅱ. Preliminaries 4
2.1. Diffusion Model 4
2.2. Diffusion Transformer 7
Ⅲ. RADiT: Redundancy-Aware Diffusion Transformer 9
3.1. Observation 9
3.2. Overview 13
3.3. Redundancy-Aware Algorithms 15
3.3.1. Block-Level Redundancy Detection 16
3.3.2. Dynamic Threshold Scaling 19
3.3.3. Efficient Bit Compression 23
3.4. Hardware Architecture 27
3.5. Summary 29
Ⅳ. Experiments and Results 31
4.1. Experimental Setup 31
4.2. Evaluations 33
4.2.1. Accuracy Evaluation 33
4.2.2. Performance Evaluation 36
4.3. Summary 40
Ⅴ. Conclusion 42
References 43

