Computing Reviews

Speech enhancement: theory and practice (2nd ed.)
Loizou P., CRC Press, Inc., Boca Raton, FL, 2013. 711 pp. Type: Book
Date Reviewed: 07/25/13

The first edition of this book established itself as the best reference for single-channel speech enhancement [1]. Remarkably, this new edition is even better, and may be the most authoritative work to date on modern single-channel techniques for speech enhancement. As with the previous edition, most algorithms are provided as MATLAB implementations, so that readers can assess the tradeoffs and benefits of the various approaches for themselves.

The book consists of an introductory chapter and four parts. The introduction clearly states what noise actually is in the speech enhancement context, what the classes of described algorithms are, and what can be expected from the latter.

Part 1 has three chapters: a brief introduction to discrete-time signal processing, emphasizing some techniques that are most useful in speech analysis; a short tutorial on speech production and perception; and a detailed account of how human listeners try to compensate for the noise in noisy listening conditions.

Part 2, which forms the core of the book, presents several modern speech enhancement algorithms. The algorithms are carefully classified, and the five chapters follow this classification. Spectral-subtractive algorithms are the first class discussed. More than half a dozen algorithms are described in detail, along with an expert evaluation of their performance and shortcomings. These algorithms are also provided in MATLAB format on the accompanying DVD. The next class of algorithms is based on Wiener filtering, and the exposition follows the same pattern in both the variety of algorithms presented and the level of detail. Statistical-model-based speech enhancement algorithms are presented in chapter 7. The next chapter is devoted to subspace enhancement algorithms, which are among the best in terms of both noise reduction and the intelligibility of the enhanced signal. The two essential subclasses are presented: singular value decomposition (SVD)-based enhancers and eigenvalue decomposition (EVD)-based enhancers. The latter are presented in slightly more detail, perhaps because the related Karhunen-Loève transform is more widely known. The last chapter in this part treats the very important topic of noise estimation. Several approaches are described, including minimum statistics, time-recursive averaging, and histogram techniques.
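For readers unfamiliar with the first class of algorithms, the basic idea can be sketched in a few lines. The Python fragment below is only an illustration written for this review, not code from the book or its DVD: it assumes per-bin power spectra are already available as plain lists, and the function names, the smoothing constant `alpha`, and the spectral floor `floor` are all hypothetical choices. It shows a simple time-recursive average for the noise power estimate and a power spectral subtraction gain with a floor.

```python
import math


def update_noise_estimate(noise_psd, frame_psd, alpha=0.9):
    """Time-recursive averaging of the per-bin noise power estimate.

    Assumption: the caller only invokes this on frames judged to be
    noise-only; alpha is an illustrative smoothing constant.
    """
    return [alpha * n + (1.0 - alpha) * p for n, p in zip(noise_psd, frame_psd)]


def spectral_subtraction_gain(noisy_psd, noise_psd, floor=0.01):
    """Per-bin gain for power spectral subtraction with a spectral floor.

    The estimated clean power is max(noisy - noise, floor * noisy), and the
    gain applied to the noisy magnitude is sqrt(clean / noisy).
    """
    gains = []
    for y, n in zip(noisy_psd, noise_psd):
        if y <= 0.0:
            # Empty bin: fall back to the floor gain to avoid dividing by zero.
            gains.append(math.sqrt(floor))
            continue
        clean = max(y - n, floor * y)
        gains.append(math.sqrt(clean / y))
    return gains


if __name__ == "__main__":
    noise = update_noise_estimate([1.0, 1.0, 1.0], [1.2, 0.8, 1.0])
    print(spectral_subtraction_gain([4.0, 1.0, 0.5], noise))
```

The floor keeps bins where the noise estimate exceeds the noisy power from being zeroed out. Fixed floors of this kind are a common way to limit the "musical noise" artifacts associated with spectral subtraction.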

Part 3 comprises three chapters. Its main topic is evaluating the performance of enhancement algorithms, and it details modern procedures for assessing quality and intelligibility. The last chapter of this part is devoted to comparisons of the described speech enhancers.

The last part of the book attempts to answer why most, if not all, current single-channel speech enhancers remove noise but do not improve intelligibility, and in some cases even degrade it.

In conclusion, this is a unique book, combining both thorough theoretical developments and practical implementations. I highly recommend it to those interested in speech enhancement, as well as applied signal processing.


1) Loizou, P. C. Speech enhancement: theory and practice (1st ed.). CRC Press, Boca Raton, FL, 2011.

Reviewer:  Vladimir Botchev Review #: CR141395 (1310-0884)
