Computing Reviews
Speech enhancement : theory and practice (2nd ed.)
Loizou P., CRC Press, Inc., Boca Raton, FL, 2013. 711 pp. Type: Book (978-1-466504-21-9)
Date Reviewed: Jul 25 2013

The first edition of this book established itself as the best reference for single-channel speech enhancement [1]. Remarkably, this new edition is even better, and may be the most authoritative work to date on modern single-channel techniques for speech enhancement. As with the previous edition, most algorithms are provided in a MATLAB implementation, so readers can evaluate the tradeoffs and benefits of the various approaches for themselves.

The book consists of an introductory chapter and four parts. The introduction clearly states what noise actually is in the speech enhancement context, what the classes of described algorithms are, and what can be expected from the latter.

Part 1 has three chapters: a brief introduction to discrete-time signal processing, emphasizing the techniques most useful in speech analysis; a short tutorial on speech production and perception; and a detailed account of how human listeners compensate for noise in adverse listening conditions.

Part 2, which forms the core of the book, presents several modern speech enhancement algorithms. The algorithms are carefully classified, and the five chapters follow this classification. Spectral-subtractive algorithms are the first class discussed. More than half a dozen algorithms are described in detail, together with an expert evaluation of their performance and shortcomings. These algorithms are also provided as MATLAB code on the accompanying DVD. The next class of algorithms is based on Wiener filtering, and the exposition follows the same pattern in both the variety of algorithms presented and the level of detail. Statistical-model-based speech enhancement algorithms are presented in chapter 7. The next chapter is devoted to subspace enhancement algorithms, which are among the best in terms of both noise reduction and the intelligibility of the enhanced signal. The two essential subclasses are presented: singular value decomposition (SVD)-based enhancers and eigenvalue decomposition (EVD)-based enhancers. The latter are presented in slightly more detail, perhaps because of the more widely known Karhunen-Loève transform. The last chapter in this part treats the very important topic of noise estimation. Several approaches are described, including minimum statistics, time-recursive averaging, and histogram techniques.

Part 3 comprises three chapters. Its main topic is the performance evaluation of enhancement algorithms, and it details modern procedures for assessing speech quality and intelligibility. The last chapter of this part compares the speech enhancers described in the book.

The last part of the book attempts to answer why most, if not all, current single-channel speech enhancers remove noise but do not improve intelligibility, and in some cases even degrade it.

In conclusion, this is a unique book that combines thorough theoretical development with practical implementations. I highly recommend it to those interested in speech enhancement or in applied signal processing more generally.

Reviewer:  Vladimir Botchev Review #: CR141395 (1310-0884)
[1] Loizou, P. C. Speech enhancement: theory and practice (1st ed.). CRC Press, Boca Raton, FL, 2011.
Reviewer Selected
Speech Recognition And Synthesis (I.2.7 ... )
Signal Processing (I.5.4 ... )
Signal Processing Systems (C.3 ... )
 