Improved Multiple Sound Source Localization and Separation Using Online Convolutional Beamformer Guided by Microphone Array Geometry
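- Keywords: acoustic transfer vector, blind source separation, convolutional beamformer, microphone array, multiple sound source localization, robust speech recognition, source separation, SVD-PHAT
- Publisher: Sogang University Graduate School
- Advisor: 박형민 (Hyung-Min Park)
- Year of publication: 2025
- Degree conferred: February 2025
- Degree: Doctorate
- Department and major: Graduate School, Department of Electronic Engineering
- Permanent URI: http://www.dcollection.net/handler/sogang/000000079379
- UCI: I804:11029-000000079379
- Language of text: English
- Copyright: Sogang University theses are protected by copyright.
Table of Contents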
1 Introduction
1.1 Research Background
1.1.1 Challenges of ASR in Cocktail Party Effect
1.1.2 Multiple Sound Source Localization
1.1.3 Guided Blind Source Separation
1.2 Overview of the Proposed Method
1.3 Outline of the Thesis
2 Convolutional Beamformer
2.1 Microphone Array Signal Model
2.2 Source-Wise Factorization and Probabilistic Model
2.2.1 Source-Wise Factorization
2.2.2 Probabilistic Model
2.3 Offline Convolutional Beamformer
2.3.1 Negative Log-likelihood Function
2.3.2 Parameters Optimization
2.4 Online Convolutional Beamformer
2.4.1 Negative Log-likelihood Function with Forgetting Factor
2.4.2 Parameters Optimization
3 Multiple Sound Source Localization Based on Convolutional Beamformer
3.1 Direction of Arrival Vector
3.2 Conventional Multiple Sound Source Localization Methods
3.2.1 The Acoustic Transfer Vector
3.2.2 Beam Pattern-based Multiple Sound Source Localization
3.2.3 TDOA-based Multiple Sound Source Localization
3.3 Proposed Multiple Sound Source Localization
3.3.1 SRP-PHAT-ATV
3.3.2 SVD-PHAT-ATV
3.3.3 Online SVD-PHAT-ATV
3.4 Experimental Results
3.4.1 Experiments with Simulation Data
3.4.2 Experiments with LOCATA Dataset
4 Sound Source Separation Based on Guided Convolutional Beamformer
4.1 Conventional Guided Separation Methods
4.1.1 Sound Source Location Guided Separation
4.1.2 DNN Guided Separation
4.2 Proposed Guided Separation Methods
4.2.1 AG-oCBF
4.2.2 AG-oCBF-Spatial
4.2.3 AG-oCBF-DNN
4.2.4 Practical Considerations
4.3 Experimental Results
4.3.1 Experiments on Simulation Data
4.3.2 Experiments on REVERB 2-MIX
4.3.3 Experiments on LibriCSS
5 Conclusion and Future Works
5.1 Conclusion
5.2 Future Works
Bibliography

