
Beyond Noise Suppression: Dynamics-driven Distortion Control Loss for Speech Enhancement and Robust Automatic Speech Recognition

Table of Contents

1 Introduction
2 Proposed Method
2.1 Speech Enhancement Model
2.2 Proposed Loss Function
2.2.1 Magnitude Distortion Control (L_mag)
2.2.2 Temporal Dynamics Preservation (L_dyn)
2.2.3 Residual Suppression (L_rs)
2.2.4 Final Loss Formulation
3 Experiments
3.1 Experimental Setup
3.1.1 Dataset
3.1.2 Training and Model Configuration
3.1.3 Baseline Loss Functions
3.2 Experimental Results
3.2.1 Signal-Level Performance Evaluation
3.2.2 Spectrogram Analysis
3.2.3 Impact of Various Loss Functions on ASR Performance
3.2.4 Generalization to Unseen SNR Conditions
3.2.5 Ablation Study on Loss Components
4 Conclusion
4.1 Summary
4.2 Limitations and Future Works
Bibliography
