Computing Reviews
Exploring image binarization techniques
Chaki N., Shaikh S., Saeed K., Springer Publishing Company, Incorporated, New York, NY, 2014. 82 pp. Type: Book (978-8-132219-06-4)
Date Reviewed: Mar 4 2015

An important aspect of an efficient workplace is accurate access to the information contained in older documents. Typically, these documents have been stored for a number of years and are either no longer available in an electronic format or are not searchable. Fortunately, there is an established technology--optical character recognition (OCR)--whose explicit goal is to accurately retrieve each character (letter, number, punctuation mark) from a scanned document and then group sequences of these characters based on delimiters such as spaces or blank lines. However, OCR software does not perform well on degraded images. This book mainly addresses that need: the authors study a number of binarization techniques, more commonly called thresholding algorithms.

It should be stated up front, as the authors themselves point out at the conclusion of chapter 1 and more fully in chapter 5, that the algorithms developed in the book apply to other domains as well, such as hand and gait gestures; fingerprint, iris, and face recognition; and medical scans. This review is written from the OCR perspective for a number of reasons: first, as a practical matter, people need improved OCR technology more regularly than the other binarization applications; second, the experiments reported in the text produce document images that are quite impressive; third, OCR technology is more readily understood than what is needed for medical and physical applications; and fourth, binarization assumes a grayscale image, which is most typically found in scans of text documents.

Before going into details about the technology, it is important for the reader to understand that the problem addressed by the book is, as its title suggests, image binarization: separating the image pixels into two groups, the background and the foreground. This typically assumes a bimodal histogram of color intensity levels. In the context of document scanning, the background is the portion of the image that contains no information (characters), and the foreground is the portion whose pixels make up the characters. Any OCR system requires other, equally important steps, such as image acquisition, deskewing of rotated images, and the correlation of image component features with a character database. These other tasks are not the focus of this book.
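To make the two-group idea concrete, here is a minimal sketch of binarization with a single global threshold. It is only an illustration and is not taken from the book: the `binarize` function, the sample grid, and the fixed threshold of 128 are assumptions, as is the convention that dark pixels (text) are the foreground.

```python
def binarize(gray, threshold):
    """Label each pixel: 1 = foreground (dark ink), 0 = background (light paper)."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in gray]


# A tiny hypothetical 3x4 grayscale scan (0 = black, 255 = white).
scan = [
    [250, 245, 30, 240],
    [248, 25, 20, 235],
    [252, 247, 35, 244],
]

# The threshold 128 is an arbitrary midpoint chosen for illustration.
mask = binarize(scan, threshold=128)
for row in mask:
    print(row)
```

A fixed threshold like this works only when the two groups are well separated; choosing the threshold automatically is what the algorithms surveyed in the book are about.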

Applications can only be as successful as the quality of the image (scan) allows. If even a slight amount of discoloration is present in a document scan, OCR software can merge multiple characters into a single component that is not discernible. The authors of this scientific text survey several thresholding algorithms (chapter 2) and develop a hybrid method based on Otsu thresholding (chapter 3). Their method is an automatic thresholding procedure based on an iterative selection method.

The Otsu algorithm chooses the two groups of pixels that have the least variance of color levels within each group and labels these groups as the background and foreground pixels. The authors improve this method by incorporating an iterative partitioning procedure. If a region of the image is better described by more than two classes of color, the upgraded algorithm keeps subdividing the region until exactly two classes of color remain in each subregion, at which point the Otsu method can readily be applied. The algorithm then recombines these subregions based on statistical measures, so that the result is one entire image containing exactly two classes of pixels (chapter 3).
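The basic Otsu step can be sketched as follows. This is a textbook-style illustration of plain Otsu thresholding only, not the authors' hybrid method with iterative partitioning; the function name `otsu_threshold`, the 256-level assumption, and the sample pixel values are all hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance.

    Pixels with intensity < t form one class and the rest form the other.
    Maximizing between-class variance is equivalent to minimizing the
    weighted within-class variance.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    total_sum = sum(i * h for i, h in enumerate(hist))

    best_t, best_score = 0, -1.0
    count_low, sum_low = 0, 0
    for t in range(1, levels):
        # Move intensity level t - 1 into the low class.
        count_low += hist[t - 1]
        sum_low += (t - 1) * hist[t - 1]
        count_high = total - count_low
        if count_low == 0 or count_high == 0:
            continue  # one class is empty, so there is no valid split here
        mean_low = sum_low / count_low
        mean_high = (total_sum - sum_low) / count_high
        # Proportional to the between-class variance (scaled by total**2).
        score = count_low * count_high * (mean_low - mean_high) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t


# Hypothetical flattened grayscale values: four dark and four light pixels.
pixels = [20, 25, 30, 35, 235, 240, 245, 250]
t = otsu_threshold(pixels)
dark = [p for p in pixels if p < t]
light = [p for p in pixels if p >= t]
```

On a clearly bimodal sample like this one, the chosen threshold falls in the gap between the dark and light groups.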

In chapter 4, the authors provide a framework for testing the various algorithms on degraded images and measuring their relative performance. Their tables of experimental data and the thresholded images shown are quite impressive. Their hybrid approach is a strong competitor and should be seriously considered by the scientific community. Seven major thresholding algorithms are considered in the book (section 4.2), and each categorizes every pixel as either background or foreground. Section 4.3 selects an appropriate tuning parameter (called k) so that the thresholding algorithms can handle degraded images more properly. The authors then suggest a majority voting system (section 4.5) to further improve their results under even more conditions: all seven approaches are applied independently to the original image, and, for each pixel, a vote is taken as to which group it belongs to. The final demarcation is based on the majority of votes from the seven methods.
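The per-pixel voting idea can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' implementation: the `majority_vote` function is hypothetical, and three small made-up masks stand in for the outputs of the seven thresholding methods.

```python
def majority_vote(masks):
    """Combine an odd number of equally sized binary masks pixel by pixel.

    A pixel is labeled foreground (1) when more than half of the masks
    label it foreground; otherwise it is labeled background (0).
    """
    needed = len(masks) // 2 + 1
    rows, cols = len(masks[0]), len(masks[0][0])
    return [
        [1 if sum(m[r][c] for m in masks) >= needed else 0 for c in range(cols)]
        for r in range(rows)
    ]


# Three hypothetical 2x2 masks, each imagined as the output of one method.
masks = [
    [[1, 0], [1, 1]],
    [[1, 0], [0, 1]],
    [[0, 1], [0, 1]],
]
combined = majority_vote(masks)
```

Using an odd number of voters, as with the seven methods in the book, means that a tie can never occur.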

While the authors admit that the degree to which their approach can handle noisy or degraded images requires further study (chapter 6, which covers conclusions and further work), the results of their experiments as presented in this book suggest that researchers and practitioners in the imaging field should consider their methodology a serious approach to the binary segmentation of images.

Reviewers: Michael Goldberg, R. Goldberg
Review #: CR143222 (1506-0463)
Image Representation (I.4.10)
Applications (I.4.9)
 