Computing Reviews, the leading online review service for computing literature.


Speaker clustering for speech recognition using vocal tract parameters
Naito M., Deng L., Sagisaka Y. Speech Communication 36(3): 305-315, 2002. Type: Article

Date Reviewed: May 7 2003

Identifying a person from a spoken phrase is one method of biometric identification. This is most commonly done by voice print analysis, which relies on acoustic characteristics. Naito, Deng, and Sagisaka propose correlating the properties of the spoken text to the physical dimensions of the vocal tract. The vocal tract is divided into two parts: the oral section, which has 18 parameters, and the pharyngeal section, which has seven parameters. A mapping process between the characteristics of selected phonemes determines the values of the parameters. This research builds on previous work by others involving speakers of European languages; the authors applied these methods to 148 Japanese speakers. Of these, 128 were used to set up the base clustering data, while the remaining 20 provided two smaller experimental sets. Clustering of speakers is an important step in the process, since placing an unknown speaker into a cluster reduces the computational search needed for a more thorough identification. The authors report success, with statistical confirmation. They also provide data to show that the use of physical characteristics is computationally efficient. This method shows promise.
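
To illustrate why assigning an unknown speaker to a cluster narrows the later search, here is a minimal sketch. It is not the authors' procedure: plain k-means stands in for whatever clustering the paper uses, the speaker vectors are random synthetic data, and the cluster count `k=4` and all function names are assumptions made for the example. Only the 18 + 7 parameter split and the 128 base speakers come from the review.

```python
import random

N_ORAL, N_PHARYNGEAL = 18, 7       # parameter counts given in the review
DIM = N_ORAL + N_PHARYNGEAL        # one 25-dimensional vector per speaker


def sq_dist(a, b):
    """Squared Euclidean distance between two parameter vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vec, centroids):
    """Index of the centroid closest to vec."""
    return min(range(len(centroids)), key=lambda i: sq_dist(vec, centroids[i]))


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain k-means (an assumed stand-in, not the paper's method)."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[nearest(v, centroids)].append(v)
        for i, group in enumerate(groups):
            if group:  # an empty group keeps its previous centroid
                centroids[i] = [sum(col) / len(group) for col in zip(*group)]
    return centroids


def make_speakers(n, seed):
    """Synthetic speaker vectors; real values would come from the phoneme mapping."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(n)]


base = make_speakers(128, seed=1)          # 128 base speakers, as in the study
centroids = kmeans(base, k=4)              # k=4 is a hypothetical choice
labels = [nearest(v, centroids) for v in base]

unknown = make_speakers(1, seed=2)[0]      # one unknown speaker
cluster = nearest(unknown, centroids)

# Only base speakers in the same cluster need the more thorough comparison.
candidates = [i for i, lab in enumerate(labels) if lab == cluster]
print(f"thorough search over {len(candidates)} of {len(base)} base speakers")
```

The saving comes from the last step: the detailed comparison runs over one cluster's members rather than over every base speaker.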

Reviewer: Anthony J. Duben	Review #: CR127579 (0308-0809)

Speech Recognition And Synthesis (I.2.7 ... )

Clustering (I.5.3)


