Computing Reviews
Environmental sound recognition using short-time feature aggregation
Roma G., Herrera P., Nogueira W. Journal of Intelligent Information Systems 51(3): 457-475, 2018. Type: Article
Date Reviewed: Jan 31 2019

Enabling the automatic human-level (or better) detection and classification of audio events and sound environments would be a clear plus for artificial intelligence (AI)-based applications such as robotics and social signal processing. Typical machine learning approaches to such analysis problems first extract descriptive features from raw data and only then perform semantic analysis; audio-specific feature proposals abound, from frame-based mel-frequency cepstral coefficients (MFCCs) to recurrence quantification analysis (RQA) data.

This paper provides experimental evidence that accuracy gains can be expected from both aggregating short-time features and separating the event detection and classification tasks. First, a new framework for the automatic frequency-domain-based recognition of environmental sounds and a new single-channel noise reduction algorithm are introduced and used in four experiments. Experiment 1 focuses on RQA and suggests that the RQA+MFCC combination performs better than existing related approaches for scene classification on the D-CASE2013, “in-house,” and Rouen datasets. Experiment 2 reaches similar conclusions regarding aggregation for event classification. Experiment 3 addresses segmentation issues, where the goal is to detect events independently of their class. Finally, experiment 4 looks at joint detection and classification; here, aggregating some features (RQA) helped, as did noise reduction, while aggregating others (derivative statistics) helped much less.
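
To make the notion of short-time feature aggregation concrete, here is a minimal sketch, not the authors' implementation. It assumes frame-level feature vectors (for example, MFCCs) have already been computed, and it summarizes them as per-dimension means and variances, plus the same statistics over frame-to-frame differences (a simple stand-in for the "derivative statistics" mentioned above). The function name `aggregate_frames` and the choice of statistics are illustrative assumptions; RQA features are not covered.

```python
from statistics import fmean, pvariance


def aggregate_frames(frames):
    """Summarize per-frame feature vectors as one fixed-length vector.

    frames: list of equal-length lists of floats, one per short-time frame.
    Returns, concatenated: per-dimension means, per-dimension variances,
    then the means and variances of the frame-to-frame differences.
    (Hypothetical helper for illustration only.)
    """
    if len(frames) < 2:
        raise ValueError("need at least two frames")
    # Transpose so that each tuple holds one feature dimension over time.
    dims = list(zip(*frames))
    # First-order differences along time, per dimension.
    deltas = [[b - a for a, b in zip(d, d[1:])] for d in dims]
    summary = []
    summary += [fmean(d) for d in dims]
    summary += [pvariance(d) for d in dims]
    summary += [fmean(d) for d in deltas]
    summary += [pvariance(d) for d in deltas]
    return summary


# Three frames of a two-dimensional feature, yielding an 8-value summary.
example = aggregate_frames([[0.0, 1.0], [2.0, 1.0], [4.0, 1.0]])
```

The resulting fixed-length vector could then be fed to any standard classifier, whatever the duration of the original recording.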

Overall, this rather technical paper provides some experimental motivation for additional research focusing on independent segmentation, detection, and classification of environmental sounds, in particular using the promising approach of feature aggregation. It will be of interest to researchers and advanced graduate students well versed in audio semantic analysis techniques.

Reviewer:  P. Jouvelot Review #: CR146408 (1905-0184)
  Editor Recommended
 
 
Sound And Music Computing (H.5.5 )
 
 
Feature Evaluation And Selection (I.5.2 ... )
 

Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®