Music mood classification is a fascinating area for music researchers, as listeners have begun to organize music libraries by mood rather than by traditional classifiers such as artist or genre. Because this area is relatively new in the context of Indian music compared with Western music, Patra et al. propose an interesting mood taxonomy for the music mood classification of Hindi and Western songs.
The proposed taxonomy has five classes:
- (1) Class_Ex: Excited, Astonished, Aroused;
- (2) Class_Ha: Happy, Delighted, Pleased;
- (3) Class_Ca: Calm, Relaxed, Satisfied;
- (4) Class_Sa: Sad, Gloomy, Depressed; and
- (5) Class_An: Angry, Alarmed, Tensed.
The rationale for this fivefold classification is "the significant invariability among the audio features of the subclasses with respect to its corresponding mood class. For example, a happy and a delighted song have high valence, whereas an aroused and an excited song have high arousal."
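The quoted rationale, phrased in terms of valence and arousal, suggests picturing the five classes as regions of Russell's two-dimensional valence–arousal plane. A minimal sketch of that reading follows; the numeric coordinates and the `nearest_class` helper are illustrative assumptions of this review, not values or code from the paper.

```python
import math

# Illustrative (valence, arousal) anchors for the five classes; the
# coordinates are assumptions chosen to match the quoted rationale,
# not figures reported by Patra et al.
MOOD_PLANE = {
    "Class_Ex": (0.3, 0.9),    # Excited, Astonished, Aroused: high arousal
    "Class_Ha": (0.8, 0.4),    # Happy, Delighted, Pleased: high valence
    "Class_Ca": (0.6, -0.6),   # Calm, Relaxed, Satisfied: positive valence, low arousal
    "Class_Sa": (-0.8, -0.4),  # Sad, Gloomy, Depressed: low valence
    "Class_An": (-0.6, 0.8),   # Angry, Alarmed, Tensed: low valence, high arousal
}

def nearest_class(valence: float, arousal: float) -> str:
    """Assign a (valence, arousal) point to its closest mood class."""
    return min(
        MOOD_PLANE,
        key=lambda c: math.dist(MOOD_PLANE[c], (valence, arousal)),
    )
```

On this picture, a happy and a delighted song land near the same high-valence anchor, exactly the kind of within-class feature invariability the authors describe.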
The next step is to annotate the audio and lyrics of Hindi and Western songs using the proposed mood taxonomy. However, for some Hindi songs, the mood conveyed by the lyrics contradicts the mood perceived when listening to the audio. The authors therefore adopt a correlation-based feature selection technique to identify the most informative audio and lyric features, and implement feed-forward neural networks (FFNNs) to develop the mood classification systems.
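The feature selection step can be sketched as follows. This is a minimal, generic illustration of correlation-based selection, not the authors' implementation: it ranks features by absolute Pearson correlation with the mood label and greedily skips features that are strongly inter-correlated with ones already kept (the redundancy threshold of 0.9 is an assumption). The selected features would then feed an FFNN classifier.

```python
import numpy as np

def correlation_feature_select(X, y, n_keep, redundancy_thresh=0.9):
    """Rank features by |Pearson correlation| with the label y and keep
    the top n_keep, greedily skipping any feature that is highly
    correlated with a feature already selected (redundancy filter)."""
    n_features = X.shape[1]
    relevance = np.array(
        [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(n_features)]
    )
    kept = []
    for j in np.argsort(-relevance):  # most label-relevant first
        if len(kept) == n_keep:
            break
        redundant = any(
            abs(np.corrcoef(X[:, j], X[:, k])[0, 1]) > redundancy_thresh
            for k in kept
        )
        if not redundant:
            kept.append(int(j))
    return sorted(kept)

# Synthetic example: feature 0 tracks the label, feature 1 is a near-copy
# of feature 0 (redundant), feature 2 is independent noise.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200).astype(float)
X = np.column_stack([
    y + 0.1 * rng.normal(size=200),
    y + 0.1 * rng.normal(size=200) + 0.01 * rng.normal(size=200),
    rng.normal(size=200),
])
selected = correlation_feature_select(X, y, n_keep=2)
```

Only one of the two redundant label-tracking features survives, together with the independent feature, which is the behavior that makes this family of methods useful when audio and lyric features overlap in what they measure.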
The authors successfully develop “several mood classification systems ... for [both] Hindi and Western songs” based on audio and lyric features as well as their combination. The FFNNs “for Hindi and Western songs obtained the maximum F-measures of 0.751 and 0.835, respectively.”
The paper is interesting, has some useful references, and will definitely draw interest from music researchers and students, music enthusiasts, musicians, and musicologists.
My personal view is that songs are a composite art in which it is not simply the lyrics or the tune that is crucial, but their interaction; the left and right hemispheres of the brain can and do perform such interactive processing of speech and music. A strength of the paper thus lies in considering that interaction (the "combination" of audio and lyric features) in the study.