Computing Reviews
Speaker clustering for speech recognition using vocal tract parameters
Naito M., Deng L., Sagisaka Y. Speech Communication 36(3): 305-315, 2002. Type: Article
Date Reviewed: May 7 2003

Identifying a person by means of a spoken phrase is one method of biometric identification. This is most commonly done by voice print analysis, which relies on acoustic characteristics.

Naito, Deng, and Sagisaka propose correlating the properties of the spoken text to the physical dimensions of the vocal tract. The vocal tract is divided into two parts: the oral section, which has 18 parameters, and the pharyngeal section, which has seven parameters. A mapping process between the characteristics of selected phonemes determines the values of the parameters.

This research builds on previous work by others involving speakers of European languages. The authors applied these methods to 148 Japanese speakers. Twenty of the speakers provided two smaller experimental sets, while the remaining 128 were used to set up the base clustering data. Clustering of speakers is an important step in the process, since placing an unknown speaker into a cluster reduces the computational search needed for a more thorough identification.
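
To illustrate why cluster placement reduces the search, here is a minimal sketch of a two-stage lookup. It is not the authors' method: the review does not say which clustering algorithm or distance measure the paper uses, so the Euclidean distance, the nearest-centroid rule, and every name below (make_synthetic_speakers, build_clusters, identify) are assumptions made for illustration, and the data is synthetic. Only the parameter counts (18 oral, 7 pharyngeal) and the 128 base speakers come from the review.

```python
import math
import random

# Parameter counts as stated in the review; everything else is hypothetical.
N_ORAL, N_PHARYNGEAL = 18, 7
DIM = N_ORAL + N_PHARYNGEAL


def distance(a, b):
    """Euclidean distance between two parameter vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def make_synthetic_speakers(n_speakers=128, n_clusters=4, seed=0):
    """Generate fake vocal tract vectors; speaker i belongs to cluster i % n_clusters."""
    rng = random.Random(seed)
    speakers, assignments = {}, {}
    for sid in range(n_speakers):
        label = sid % n_clusters
        # Clusters are well separated (offset 10.0) with small per-speaker noise.
        speakers[sid] = [10.0 * label + rng.uniform(-1.0, 1.0) for _ in range(DIM)]
        assignments[sid] = label
    return speakers, assignments


def build_clusters(speakers, assignments):
    """Group speaker ids by cluster label and compute one centroid per cluster."""
    members = {}
    for sid, label in assignments.items():
        members.setdefault(label, []).append(sid)
    centroids = {
        label: centroid([speakers[sid] for sid in sids])
        for label, sids in members.items()
    }
    return members, centroids


def identify(unknown, speakers, members, centroids):
    """Stage 1: pick the nearest centroid. Stage 2: search only that cluster."""
    label = min(centroids, key=lambda c: distance(unknown, centroids[c]))
    candidates = members[label]
    best = min(candidates, key=lambda sid: distance(unknown, speakers[sid]))
    return label, best, len(candidates)
```

Under these assumptions, a query is compared against one centroid per cluster and then against the members of a single cluster, rather than against all 128 base speakers. A possible usage:

```python
speakers, assignments = make_synthetic_speakers()
members, centroids = build_clusters(speakers, assignments)
unknown = [x + 0.01 for x in speakers[37]]  # slightly perturbed copy of speaker 37
label, best, searched = identify(unknown, speakers, members, centroids)
```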

The authors report success, with statistical confirmation. They also provide data to show that the use of physical characteristics is computationally efficient. This method shows promise.

Reviewer:  Anthony J. Duben Review #: CR127579 (0308-0809)
 
Speech Recognition And Synthesis (I.2.7 ...)
Clustering (I.5.3)
