In this contribution, we present a sound anomaly detection system based on a fully convolutional network which exploits image spatial filtering and an Atrous Spatial Pyramid Pooling module. To cope with the lack of datasets specifically designed for sound event detection, a dataset for the specific application of noisy bus environments has been designed. The dataset has been obtained by mixing background audio files, recorded in a real environment, with anomalous events extracted from monophonic collections of labelled audio clips. The performance of the proposed system has been evaluated through segment-based metrics such as error rate, recall, and F1-score. Moreover, robustness and precision have been evaluated through four different tests. The analysis of the results shows that the proposed sound event detector outperforms both state-of-the-art methods and general-purpose deep learning solutions.
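To make the segment-based metrics mentioned above concrete, here is a minimal sketch of how error rate, recall, and F1-score are commonly computed from per-segment binary activity matrices. It follows the widely used segment-based definitions (substitutions, deletions, and insertions counted per segment) and is not the authors' evaluation code. The function name `segment_based_metrics`, the input layout, and the toy data are assumptions made for illustration.

```python
import numpy as np


def segment_based_metrics(reference, estimate):
    """Compute segment-based error rate, recall, and F1-score.

    Both inputs are binary arrays of shape (n_segments, n_classes),
    where a 1 means the class is active in that segment.
    (Hypothetical helper, not taken from the paper's code.)
    """
    ref = np.asarray(reference, dtype=bool)
    est = np.asarray(estimate, dtype=bool)

    # Per-segment counts of true positives, false positives, false negatives.
    tp = np.sum(ref & est, axis=1)
    fp = np.sum(~ref & est, axis=1)
    fn = np.sum(ref & ~est, axis=1)

    # Within a segment, a false positive paired with a false negative
    # counts as one substitution; the remainder are deletions or insertions.
    substitutions = np.minimum(fp, fn)
    deletions = np.maximum(0, fn - fp)
    insertions = np.maximum(0, fp - fn)

    n_ref = int(ref.sum())  # total number of active reference entries
    tp_total, fp_total, fn_total = int(tp.sum()), int(fp.sum()), int(fn.sum())

    errors = int(substitutions.sum() + deletions.sum() + insertions.sum())
    error_rate = errors / n_ref if n_ref else 0.0
    recall = tp_total / (tp_total + fn_total) if (tp_total + fn_total) else 0.0
    denom = 2 * tp_total + fp_total + fn_total
    f1 = 2 * tp_total / denom if denom else 0.0

    return {"error_rate": float(error_rate), "recall": float(recall), "f1": float(f1)}


# Toy example: 4 segments, 2 event classes (made-up data).
reference = [[1, 0], [0, 1], [0, 0], [1, 0]]
estimate = [[1, 0], [1, 0], [0, 1], [0, 0]]
metrics = segment_based_metrics(reference, estimate)
print(metrics)
```

In the toy example, the second segment contributes one substitution, the third one insertion, and the fourth one deletion, so the error rate can exceed the fraction of misclassified segments when insertions are frequent.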
Download and Code
Citing
If you use these resources, please cite the following article:
@article{neri_Access_2022,
  title = {{Sound Event Detection for Human Safety and Security in Noisy Environments}},
  author = {Michael Neri and Federica Battisti and Alessandro Neri and Marco Carli},
  year = {2022},
  journal = {IEEE Access},
  doi = {10.1109/ACCESS.2022.3231681}
}