Computing Reviews
Extracting person names from diverse and noisy OCR text
Packer T., Lutes J., Stewart A., Embley D., Ringger E., Seppi K., Jensen L. AND 2010 (Proceedings of the 4th Workshop on Analytics for Noisy Unstructured Text Data, Toronto, ON, Canada, Oct 26, 2010), 19-26, 2010. Type: Proceedings
Date Reviewed: Mar 31 2011

The authors of this paper provide a satisfying read about named entity recognition (NER) in noisy optical character recognition (OCR) texts. They deliver on their promise of providing answers to many questions that researchers in this area might have.

Packer et al. draw many interesting conclusions about performing the difficult task of extracting names from noisy scanned documents: “Word order errors can play a bigger role in poor extraction performance than character recognition errors”; “The knowledge-based approaches performed better than the machine learning (ML) approaches”; and “Combining basic extraction methods can produce higher quality NER.”
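To make the third conclusion concrete, the following is a minimal sketch of one way basic extractors could be combined: a token-level majority vote. It is not the combination scheme used by Packer et al., which the review does not describe. The three extractors, the name list, and the title list are all hypothetical and chosen only for illustration.

```python
# Hypothetical name list, for illustration only (not from the paper).
GIVEN_NAMES = {"john", "mary", "samuel"}

# Hypothetical honorifics that often precede a person name.
TITLES = {"mr.", "mrs.", "dr."}


def dictionary_extractor(tokens):
    """Flag tokens that appear in a small list of known given names."""
    return [t.lower().strip(".,") in GIVEN_NAMES for t in tokens]


def capitalization_extractor(tokens):
    """Flag tokens written with an initial capital followed by lowercase."""
    return [t[:1].isupper() and t[1:].islower() for t in tokens]


def title_context_extractor(tokens):
    """Flag a token when the token before it is an honorific."""
    flags = [False] * len(tokens)
    for i in range(1, len(tokens)):
        if tokens[i - 1].lower() in TITLES:
            flags[i] = True
    return flags


def majority_vote(tokens, extractors):
    """Mark a token as a name when more than half of the extractors agree."""
    votes = [extractor(tokens) for extractor in extractors]
    threshold = len(extractors) / 2
    return [sum(column) > threshold for column in zip(*votes)]


EXTRACTORS = [dictionary_extractor, capitalization_extractor, title_context_extractor]
```

Calling `majority_vote(["Dr.", "Samuel", "visited", "Toronto"], EXTRACTORS)` flags only "Samuel": the capitalization rule alone would also flag "Dr." and "Toronto", but a single vote falls below the threshold.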

Regarding the conclusion about machine learning approaches, ML lovers need not despair. The authors point out two ways to overcome the deficiencies of these approaches: either apply a more realistic noise model of OCR errors to the Conference on Computational Natural Language Learning (CoNLL) training data, or use semi-supervised ML techniques to take advantage of the large number of unlabeled documents.
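The first of these remedies can be illustrated with a minimal sketch that injects synthetic OCR-style character errors into clean, labeled training tokens. It is not the authors' noise model: the confusion table, the function names, and the default error rate are all assumptions made for illustration, and a realistic model would be estimated from actual OCR output.

```python
import random

# Hypothetical confusion table of OCR-style character substitutions.
# These pairs are illustrative assumptions, not taken from the paper.
CONFUSIONS = {
    "l": ["1", "I"],
    "O": ["0"],
    "m": ["rn"],
    "e": ["c"],
    "S": ["5"],
}


def add_ocr_noise(tokens, rate, rng):
    """Return a copy of tokens with confusable characters randomly replaced."""
    noisy = []
    for token in tokens:
        chars = []
        for ch in token:
            if ch in CONFUSIONS and rng.random() < rate:
                chars.append(rng.choice(CONFUSIONS[ch]))
            else:
                chars.append(ch)
        noisy.append("".join(chars))
    return noisy


def noisy_training_example(tokens, labels, rate=0.1, seed=0):
    """Corrupt the tokens of one labeled sentence; the labels are unchanged."""
    rng = random.Random(seed)
    return add_ocr_noise(tokens, rate, rng), list(labels)
```

Because the noise is applied per character and the token count never changes, the original labels stay aligned with the corrupted tokens. This sketch covers only character errors; it does not model the word order errors that the paper identifies as a major cause of poor extraction.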

Reviewer:  João Luís G. Rosa Review #: CR138942 (1110-1086)
Language Parsing And Understanding (I.2.7 ... )
 
 
Content Analysis And Indexing (H.3.1 )
 
Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®