Computing Reviews
Intelligent document retrieval : exploiting markup structure
Kruschwitz U., Springer-Verlag New York, Inc., Secaucus, NJ, 2005. 197 pp. Type: Book (9781402037672)
Date Reviewed: Jul 10 2006

The main idea of this book, based on the author’s PhD thesis, is to use markup information as a series of cues to the significance of words and concepts in a text, thus enhancing the indexing of that text. The technique is developed for collections of texts with a specific focus, such as a Web site or a collection of documents maintained in an organization.

The first part of the process is the automatic construction of domain models. Concepts can be extracted automatically and straightforwardly, based on cues such as terms appearing in multiple markup contexts (or even multiple fonts) in a document. These concepts are then organized into a model for the domain from which the texts are drawn. Specifically, they are built into hierarchies.
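The markup cue described above can be illustrated with a minimal sketch. It assumes HTML input, and the set of tags treated as markup contexts (`CONTEXT_TAGS`), the threshold `min_contexts`, and the names `ContextCollector` and `extract_concepts` are hypothetical choices made for illustration; the review does not specify which contexts or thresholds the book uses. A term counts as a concept candidate when it appears in at least two distinct markup contexts.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Assumption: these tags stand in for "markup contexts"; the book may use others.
CONTEXT_TAGS = {"title", "h1", "h2", "h3", "b", "strong", "em", "a"}


class ContextCollector(HTMLParser):
    """Records, for each term, the set of markup contexts it appears in."""

    def __init__(self):
        super().__init__()
        self.open_tags = []                # context tags currently open
        self.contexts = defaultdict(set)   # term -> set of tag names

    def handle_starttag(self, tag, attrs):
        if tag in CONTEXT_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Close the innermost open occurrence of this tag, if any.
        for i in range(len(self.open_tags) - 1, -1, -1):
            if self.open_tags[i] == tag:
                del self.open_tags[i]
                break

    def handle_data(self, data):
        if not self.open_tags:
            return  # plain body text carries no markup cue in this sketch
        for word in data.lower().split():
            term = word.strip(".,;:!?()\"'")
            if term:
                self.contexts[term].update(self.open_tags)


def extract_concepts(html, min_contexts=2):
    """Return terms seen in at least `min_contexts` distinct markup contexts."""
    collector = ContextCollector()
    collector.feed(html)
    collector.close()
    return sorted(
        term for term, tags in collector.contexts.items()
        if len(tags) >= min_contexts
    )
```

A real system would also need stop-word filtering and multi-word terms, which this sketch leaves out.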

The second part of the process uses this model to refine user searches interactively. The hierarchies shape a dialogue with the user conducting a search: when a search is completed, the dialogue manager can offer options such as narrowing the search to more detailed concepts found in the collection of texts, or broadening it to a more general and inclusive context. The dialogue manager idea is general, so the software can be structured as a user interface to an existing search engine (such as Google) or used as the entry point for a custom engine. Two applications are described in this part. The first involves indexing Web pages and is given in two variations: a university Web site and a BBC news Web site. The second involves searching a directory of classified advertisements.
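The narrowing and broadening options offered by the dialogue manager can be sketched as lookups in a concept hierarchy. This is only an illustration under stated assumptions: the toy `HIERARCHY`, its entries, and the function names `build_parent_index` and `refinement_options` are hypothetical and do not come from the book.

```python
# Assumption: a toy domain model mapping each concept to its more specific concepts.
HIERARCHY = {
    "courses": ["undergraduate courses", "postgraduate courses"],
    "undergraduate courses": ["computing courses"],
}


def build_parent_index(hierarchy):
    """Invert the hierarchy so each concept can find its broader concepts."""
    parents = {}
    for parent, children in hierarchy.items():
        for child in children:
            parents.setdefault(child, []).append(parent)
    return parents


def refinement_options(query, hierarchy):
    """Return the narrower and broader concepts a dialogue could offer."""
    key = query.strip().lower()
    parents = build_parent_index(hierarchy)
    return {
        "narrower": list(hierarchy.get(key, [])),
        "broader": list(parents.get(key, [])),
    }
```

Because the function only proposes query terms, the same logic could sit in front of an existing search engine or a custom one, as the review notes for the dialogue manager.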

The presented approach is attractive because it can be adapted to different contexts in a straightforward manner, and is simple both to explain and to implement.

Reviewer: D. T. Barnard
Review #: CR133050
Information Search And Retrieval (H.3.3)
Document Management (I.7.1 ...)
Heuristic Methods (I.2.8 ...)
Content Analysis And Indexing (H.3.1)
Document and Text Editing (I.7.1)
Document Preparation (I.7.2)

Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®