An Audio Event Detection Method Robust to Inaccurate Timestamps by Limiting Event Boundary Intervals
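- Subject (keywords): Audio Event Detection, Event Detection, Label noise
- Publisher: Sogang University Graduate School
- Advisor: Park Hyung-Min (박형민)
- Year of publication: 2021
- Degree conferral date: February 2021
- Degree: Master's
- Department and major: Graduate School, Department of Electronic Engineering
- UCI: I804:11029-000000065939
- Language of text: English
- Copyright: Sogang University theses are protected by copyright.
Abstract/Summary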
This thesis addresses label noise in audio event detection (AED) by refining strong labels with inaccurate timestamps into sequential labels. In AED, a strong label specifies that a particular event occurs in an audio clip, together with timestamps for the start and end of the event. The timestamps are valuable for training a model, but label noise is inevitable because event boundaries are ambiguous or depend on the subjective judgment of each annotator. To avoid the performance degradation caused by this label noise, we propose an AED scheme that converts the strong labels into sequential labels and trains with the sequential labels in addition to the given strong labels. In particular, to fully exploit the information in the available strong labels when calculating the sequential loss, we also propose a sequential loss calculation method that takes the error-prone time information into account. Because sequential labels retain only the sequence information refined from the strong labels, using strong and sequential labels together emphasizes the accurate information in the strong labels and reduces the effect of the label noise. In addition, by using the timestamps of the strong labels to limit the frame interval in which event boundaries can occur, we train the model more efficiently. Experimental results on DCASE 2019 Task 4 demonstrate that the proposed method successfully mitigates the label noise.
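The two ideas in the abstract, converting strong labels into sequential labels and limiting the frame interval in which a boundary may occur, can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the token names (`<event>_start`, `<event>_end`), the symmetric tolerance `margin_sec`, and the frame rate `frame_rate_hz` are all assumptions made for illustration, and the abstract does not state how the interval is actually chosen.

```python
from typing import List, Tuple

# A strong label: (event name, onset in seconds, offset in seconds).
StrongLabel = Tuple[str, float, float]


def strong_to_sequential(labels: List[StrongLabel]) -> List[str]:
    """Order event boundaries in time and discard the timestamps.

    The "<name>_start" / "<name>_end" token format is hypothetical.
    """
    boundaries = []
    for name, onset, offset in labels:
        boundaries.append((onset, f"{name}_start"))
        boundaries.append((offset, f"{name}_end"))
    # Sort by time only; Python's sort is stable, so ties keep input order.
    boundaries.sort(key=lambda b: b[0])
    return [token for _, token in boundaries]


def boundary_window(time_sec: float, margin_sec: float,
                    frame_rate_hz: int, num_frames: int) -> Tuple[int, int]:
    """Inclusive frame range in which an annotated boundary may lie.

    Assumes a symmetric tolerance of margin_sec around the annotated
    timestamp, clipped to the valid frame indices of the clip.
    """
    lo = max(0, round((time_sec - margin_sec) * frame_rate_hz))
    hi = min(num_frames - 1, round((time_sec + margin_sec) * frame_rate_hz))
    return lo, hi


labels = [("Speech", 0.5, 3.0), ("Dog", 2.0, 4.5)]
sequence = strong_to_sequential(labels)
window = boundary_window(2.0, margin_sec=0.5, frame_rate_hz=10, num_frames=100)
```

In this sketch the sequential label keeps only the order of the boundaries, so a small shift in an annotated timestamp leaves it unchanged, while the window restricts where the model may place each boundary to frames near the annotated time.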