Computing Reviews

A novel framework for efficient automated singer identification in large music databases
Shen J., Shepherd J., Cui B., Tan K. ACM Transactions on Information Systems 27(3): 1-31, 2009. Type: Article
Date Reviewed: 06/01/10

It is very difficult to automate the recognition of music and musicians by content rather than bibliographical detail. Recent work to identify a singer uses voice data, often abstracted from a more complicated musical source, as the most representative feature, with some form of statistical modeling and machine learning of the singer’s characteristics, so that further examples can be tested and categorized.

Shen et al. regard other information, such as beat and timbre, as well as what they call the genre of the piece (the accompanying instrumental music), as additional evidence that is as important as vocal data. In this paper, the authors present a new method, hybrid singer identification (HSI), which they claim is more robust than previous techniques. HSI is a multifaceted method in which the four features noted above are abstracted from a piece of music, and a statistical profile for a particular singer is constructed for each feature from sample performances. These profiles are then used to classify further songs.
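
To make the idea of per-feature profiling concrete, the following is a minimal sketch and not the authors' implementation. It assumes, purely for illustration, that each of the four features can be reduced to a single number, that a profile is a Gaussian (mean and variance) per feature, and that a song is assigned to the singer whose profiles give the highest summed log-likelihood. The function names, the feature values, and the combination rule are all hypothetical; the paper's actual statistical models are not described in this review.

```python
import math
from statistics import mean, pvariance

# The four facets named in the review; scalar values are an illustrative assumption.
FEATURES = ("vocal", "beat", "timbre", "genre")


def build_profile(samples):
    """Build one (mean, variance) pair per feature from a singer's sample songs.

    `samples` is a list of dicts mapping feature name -> float (hypothetical data).
    """
    profile = {}
    for feature in FEATURES:
        values = [song[feature] for song in samples]
        # Floor the variance so a constant feature does not cause division by zero.
        profile[feature] = (mean(values), max(pvariance(values), 1e-6))
    return profile


def log_likelihood(song, profile):
    """Sum the Gaussian log-densities of the song's features under one profile."""
    total = 0.0
    for feature in FEATURES:
        mu, var = profile[feature]
        total += -0.5 * (math.log(2 * math.pi * var) + (song[feature] - mu) ** 2 / var)
    return total


def identify(song, profiles):
    """Return the singer whose profile best explains the song."""
    return max(profiles, key=lambda singer: log_likelihood(song, profiles[singer]))


# Toy training data: two made-up singers with two sample songs each.
training = {
    "singer_a": [
        {"vocal": 0.9, "beat": 120.0, "timbre": 0.30, "genre": 1.0},
        {"vocal": 1.1, "beat": 124.0, "timbre": 0.35, "genre": 1.2},
    ],
    "singer_b": [
        {"vocal": 2.0, "beat": 80.0, "timbre": 0.80, "genre": 3.0},
        {"vocal": 2.2, "beat": 84.0, "timbre": 0.75, "genre": 3.1},
    ],
}
profiles = {name: build_profile(songs) for name, songs in training.items()}

query = {"vocal": 1.0, "beat": 121.0, "timbre": 0.32, "genre": 1.1}
print(identify(query, profiles))
```

Summing the log-likelihoods treats the four features as independent and equally weighted, which is only one of many possible ways to combine them.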

Several assumptions are made in the creation of each profile, namely, that singers tend to perform with the same backing band and that the type of instrumentation does not change from recording to recording. These assumptions are not realistic, since session musicians are used extensively in recordings of major artists, who may also change style, and hence instrumentation, across a range of musical genres. Problems also occur with the first stage of profiling: using datasets from complete albums biases the “learning” toward the style of an album, not a singer, although HSI attempts to overcome this bias.

The authors provide a useful overview of a range of associated research that is often conducted on small datasets, introduce us to a benchmarking system for such research set up in 2005, and proceed to demonstrate that HSI performs better than comparative methods over a number of factors, such as robustness and scalability. Their experimental work uses a large dataset of commercial popular singers of the late 20th century; it would be interesting to see how HSI fares when applied to different types of music and to what extent it can apply to other forms of multimedia.

Reviewer:  Rosa Michaelson Review #: CR138055 (1010-1049)
