Computing Reviews
Today's Issue Hot Topics Search Browse Recommended My Account Log In
Review Help
Search
How good is the factor concept in estimating the average number of character comparisons per item in string sorting?
Mustafa S., Masoud F. Journal of Systems and Software57 (3):221-226,2001.Type:Article
Date Reviewed: Oct 29 2002

For decades, sorting has been a popular topic. Sorting strings, however, has attracted less attention than sorting numbers. In particular, there have not been many metrics proposed to assist in the analysis of string sorting. Previous efforts mainly provided simplistic, average-case formulas. This paper reports the results of an empirical study to determine the average number of character comparisons during string sorting, as well as the required run times, assuming that strings (in English and Arabic) are sorted using the quicksort algorithm.

The paper uses the concept of critical factor (CF), which is defined as the position number where two strings disagree in a character-by-character comparison. The paper also introduces the notion of average CF, which is defined over a set of sorted strings, and the notion of run-time average CF.

Data strings were extracted from bibliographic databases: the data was not simple words, but rather sentences. Using the above metrics, the authors performed an extensive experiment to investigate differences between the English and Arabic languages. Interestingly, the average CF ranged from 1 to 7 for about 50 percent of English cases and about 60 percent of Arabic cases, whereas 24 percent of the Arabic strings and 16 percent of the English strings disagreed in the first, second, or third letter. Several similar remarks are extracted by interpreting the results. The paper is an interesting first step towards the empirical analysis of string sorting.

Reviewer:  Yannis Manolopoulos Review #: CR126577 (0301-0069)
Bookmark and Share
 
Sorting And Searching (F.2.2 ... )
 
Would you recommend this review?
yes
no
Other reviews under "Sorting And Searching": Date
Binary search trees of almost optimal height
Anderson A., Icking C., Klein R., Ottmann T. (ed) Acta Informatica 28(2): 165-178, 1990. Type: Article
May 1 1992
More nearly optimal algorithms for unbounded searching, part II
Reingold E., Shen X. SIAM Journal on Computing 20(1): 184-208, 1991. Type: Article
Apr 1 1992
A simple linear-time algorithm for in situ merging
Mannila H., Ukkonen E. Information Processing Letters 18(4): 203-208, 1984. Type: Article
Feb 1 1985
more...

E-Mail This Printer-Friendly
Send Your Comments
Contact Us
Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®
Terms of Use
| Privacy Policy