Computing Reviews, the leading online review service for computing literature.

Farsi and Arabic document images lossy compression based on the mixed raster content model
Grailu H., Lotfizad M., Sadoghi-Yazdi H. International Journal on Document Analysis and Recognition 12(4): 227-248, 2009. Type: Article

Date Reviewed: Mar 18 2010

This paper, on the compression of document images that contain text, proposes a new lossy compression method based on the mixed raster content (MRC) model. According to the authors, the method achieves better compression performance than comparable methods on documents with Farsi and Arabic text. Sections 1 and 2 describe the MRC model and the existing packages that use it, and discuss the drawbacks of those packages when they are applied to document images containing Farsi and Arabic text. Section 3 begins with a block diagram of the method. The input document is segmented into background, foreground, and mask layers. Segmentation consists of three components: binarization, refinement, and boundary smoothing. Each is described in detail, with threshold formulas. The mask layer is compressed by encoding library prototypes extracted from chain code signals of the input, a step that exploits the properties of Farsi and Arabic text. The section ends with a description of how the foreground and background layers are compressed. Section 4 reports experimental results on Farsi and Arabic documents. Performance is evaluated at various stages of the compression, and the results are compared with those of the DjVu compression method. Section 5 presents conclusions. An introductory background in image compression may be enough to understand the paper.
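For readers unfamiliar with the MRC model, the three-layer idea summarized above can be illustrated with a minimal sketch. It is not the authors' method: the paper's threshold formulas, refinement, and boundary smoothing are not reproduced in this review, so a fixed global threshold stands in for binarization, and the function names (`binarize`, `split_layers`, `recompose`) and the mean-value fill for unused pixels are assumptions made only for illustration.

```python
# Illustrative MRC-style decomposition on a tiny grayscale image.
# Assumptions (not from the paper): a fixed global threshold and
# mean-value filling of the pixels each layer does not own.

def binarize(gray, threshold=128):
    """Return a binary mask: 1 where the pixel is dark (text), else 0."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

def split_layers(gray, mask):
    """Split an image into foreground and background layers using the mask."""
    fg = [px for row, mrow in zip(gray, mask) for px, m in zip(row, mrow) if m]
    bg = [px for row, mrow in zip(gray, mask) for px, m in zip(row, mrow) if not m]
    # Pixels a layer does not own are filled with that layer's mean so the
    # layer stays smooth (and therefore cheap to compress separately).
    fg_fill = sum(fg) // len(fg) if fg else 0
    bg_fill = sum(bg) // len(bg) if bg else 255
    foreground = [[px if m else fg_fill for px, m in zip(row, mrow)]
                  for row, mrow in zip(gray, mask)]
    background = [[bg_fill if m else px for px, m in zip(row, mrow)]
                  for row, mrow in zip(gray, mask)]
    return foreground, background

def recompose(foreground, background, mask):
    """Rebuild the image: the mask selects foreground or background per pixel."""
    return [[f if m else b for f, b, m in zip(frow, brow, mrow)]
            for frow, brow, mrow in zip(foreground, background, mask)]

if __name__ == "__main__":
    gray = [[250, 20, 245],
            [30, 240, 25]]
    mask = binarize(gray)
    foreground, background = split_layers(gray, mask)
    print(mask)
    print(recompose(foreground, background, mask) == gray)
```

In this sketch the recomposition is exact because no layer has been compressed yet. In an MRC codec each layer would be passed to its own coder (a bilevel coder for the mask, a continuous-tone coder for the other two), which is where the loss is introduced.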

Reviewer: Maulik A. Dave	Review #: CR137824 (1008-0844)

Document Analysis (I.7.5 ... )

Image (H.2.0 ... )

Compression (Coding) (I.4.2)
