
Accelerating Diffusion Transformers by Dynamically Skipping Redundant Operations

Abstract

Diffusion Transformers (DiTs) have demonstrated outstanding performance as generative models, but their computationally expensive iterative sampling process incurs significant latency and energy costs. We propose a software-hardware co-optimized acceleration framework that addresses these challenges by exploiting the inherent temporal redundancy in the DiT inference process. We introduce a redundancy-aware computing mechanism that selectively skips redundant operations and reuses computational results from the previous timestep. To minimize the accuracy degradation caused by cumulative approximation errors, a dynamic threshold scaling (DTS) method adjusts the similarity criteria across timesteps. Furthermore, we design dedicated units capable of efficient low-bit compression and comparison to reduce hardware overhead. We design an accelerator architecture based on this dynamic skipping method, and experiments confirm that it achieves substantial performance and energy gains while maintaining output quality.
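The skip-and-reuse mechanism described above can be sketched in a few lines. The following is a minimal, illustrative sketch, not the thesis's actual implementation: `transformer_block`, `maybe_skip_block`, and `dts_threshold` are hypothetical names, the relative L2 distance is one plausible similarity criterion, and the linear threshold decay is an assumed form of dynamic threshold scaling.

```python
import numpy as np

# Illustrative stand-in for an expensive DiT transformer block
# (a real block would be self-attention + MLP).
def transformer_block(x):
    return 2.0 * x

def maybe_skip_block(x, cache, block_id, threshold):
    """Redundancy-aware execution sketch: if this block's input is
    sufficiently similar to its input at the previous timestep, skip
    the computation and reuse the cached output."""
    entry = cache.get(block_id)
    if entry is not None:
        prev_x, prev_out = entry
        # Relative L2 distance as the similarity criterion (one plausible choice).
        dist = np.linalg.norm(x - prev_x) / (np.linalg.norm(prev_x) + 1e-8)
        if dist < threshold:
            return prev_out, True          # skip: reuse the previous result
    out = transformer_block(x)             # recompute and refresh the cache
    cache[block_id] = (x, out)
    return out, False

def dts_threshold(base, step, num_steps):
    """Hypothetical dynamic threshold scaling: tighten the similarity
    threshold as denoising progresses so that approximation errors
    cannot keep accumulating across timesteps."""
    return base * (1.0 - step / num_steps)
```

In a sampling loop, each block would call `maybe_skip_block` with the threshold returned by `dts_threshold` for the current timestep; the low-bit compression units in the hardware design would make the input comparison itself cheap.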


Table of Contents

Ⅰ. Introduction 1
1.1. Background 1
1.2. Problem Statement 2
Ⅱ. Preliminaries 4
2.1. Diffusion Model 4
2.2. Diffusion Transformer 7
Ⅲ. RADiT: Redundancy-Aware Diffusion Transformer 9
3.1. Observation 9
3.2. Overview 13
3.3. Redundancy-Aware Algorithms 15
3.3.1 Block-Level Redundancy Detection 16
3.3.2 Dynamic Threshold Scaling 19
3.3.3 Efficient Bit Compression 23
3.4. Hardware Architecture 27
3.5. Summary 29
Ⅳ. Experiments and Results 31
4.1. Experimental Setup 31
4.2. Evaluations 33
4.2.1 Accuracy Evaluation 33
4.2.2 Performance Evaluation 36
4.3. Summary 40
Ⅴ. Conclusion 42
References 43
