Computing Reviews
Conjugation-based compression for Hebrew texts
Wiseman Y., Gefner I. ACM Transactions on Asian Language Information Processing 6(1): 4-es, 2007. Type: Article
Date Reviewed: Jun 27 2007

This paper presents a compression technique designed for the Hebrew language. It uses the well-known Burrows-Wheeler algorithm, preceded by a preprocessing step that exploits the fact that Hebrew words are derived from roots of two, three, or four letters, with morphology characterized by infixing additional letters.

These patterns, with a few exceptions such as the special forms of some final letters, are applied in a first step, and roots are extracted where possible. The Burrows-Wheeler algorithm is then used to compress both resulting files. Hebrew vowels appear in text as diacritical marks, but these are normally omitted, so words are written without them. The main computational obstacle to this method is choosing the set of patterns to use. The paper employs a greedy method, but does not provide details.
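The two-step scheme described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the two files are a stream of roots and a stream of pattern indices, it uses a small hypothetical pattern set over transliterated letters ('*' marks a root-letter slot), and it lets Python's standard bz2 module, which is based on the Burrows-Wheeler transform, stand in for the compressor. The greedy pattern selection mentioned in the review is not modeled.

```python
import bz2

# Hypothetical patterns for illustration only (not taken from the paper).
# '*' marks a slot for a root letter; any other character is an added letter.
PATTERNS = ["h**y*", "**y*h", "**w*", "m***", "***"]

def split_word(word, patterns=PATTERNS):
    """Return (root, pattern_index) for the first matching pattern.

    Words that match no pattern are kept whole with index -1.
    """
    for idx, pat in enumerate(patterns):
        if len(pat) != len(word):
            continue
        if all(p == "*" or p == c for p, c in zip(pat, word)):
            root = "".join(c for p, c in zip(pat, word) if p == "*")
            return root, idx
    return word, -1

def join_word(root, idx, patterns=PATTERNS):
    """Rebuild a word from its root and pattern index."""
    if idx == -1:
        return root
    letters = iter(root)
    return "".join(next(letters) if p == "*" else p for p in patterns[idx])

def compress(words):
    """Split words into a root stream and a pattern stream, then compress each."""
    pairs = [split_word(w) for w in words]
    roots = " ".join(r for r, _ in pairs).encode("utf-8")
    pats = " ".join(str(i) for _, i in pairs).encode("ascii")
    return bz2.compress(roots), bz2.compress(pats)

def decompress(roots_blob, pats_blob):
    """Invert compress(); words are assumed non-empty and free of spaces."""
    pats = bz2.decompress(pats_blob).decode("ascii")
    if not pats:
        return []
    roots = bz2.decompress(roots_blob).decode("utf-8").split(" ")
    return [join_word(r, int(i)) for r, i in zip(roots, pats.split(" "))]
```

In this sketch, words sharing a root contribute the same string to the root stream, which is the kind of repetition a Burrows-Wheeler compressor exploits. Whether the split improves on compressing the raw text depends on the pattern set and the corpus, and the sketch makes no claim about that.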

The paper’s results are interesting and suggest that the morphological features of a language can yield material improvements in compression. The paper also offers background on the Hebrew language, and is part of a continuing research project on Hebrew text compression.

Reviewer: Bruce Litow
Review #: CR134476 (0806-0600)
 
Coding And Information Theory (E.4)
Text Processing (I.5.4 ...)
 
