Computing Reviews
Web data management
Madria S., Bhowmick S., Ng W., Springer-Verlag New York, Inc., Secaucus, NJ, 2003. 480 pp. Type: Book (9780387001753)
Date Reviewed: Apr 9 2004

This timely book focuses on the idea of designing and developing a Web warehouse. This system gathers information from heterogeneous sources (the many Web sites available on the Internet), serves as a shared information repository, and provides other services, such as personalization, summarization, and data mining.

The book consists of 12 chapters, which present their material in a well-organized and logical order. Chapter 1 serves as an introduction and discusses the characteristics of Web data. The authors substantiate the need for Web warehousing tools and techniques by discussing the limitations of heterogeneous data available on the Web, and by proposing a conceptual architecture for the Web warehouse. This chapter also offers a preview of the relevant research issues.

Chapter 2 presents a high-level overview of existing techniques in the Web data management area, and highlights their similarities with, and differences from, the authors’ work. In particular, it focuses on modeling and querying the Web, Web site construction, and restructuring. The Web warehouse object model is presented in chapter 3, together with a discussion of the many issues encountered while modeling the warehouse data. Chapters 4 and 5 discuss the related issues of imposing constraints on Web metadata and Web content, and of representing these constraints formally. These chapters also present an important feature of the authors’ approach: representing inter-document relationships based on the user’s partial knowledge of the hyperlink structure.

Chapter 6 describes a novel mechanism for querying Web data through a number of examples. Features of the querying mechanism include the ability to query metadata, content, and hyperlink structure, as well as the ability to control the query execution process. Chapter 7 presents a novel method of constructing and formalizing Web schemas, which represent a collection of documents relevant to a particular user. Chapter 8 focuses on algebraic operations that can be applied to manipulate a Web warehouse. This foundation is extended in chapter 9, which presents a set of data visualization operators. Chapter 10 discusses the issues of detecting and representing the changes that occur over time in Web documents and their hyperlink structure. The concept of a Web bag is introduced in chapter 11. This concept is used extensively by the authors to aid in knowledge discovery on the Web. Finally, chapter 12 concludes the book with a summary and an outline of possible directions for future research.

This book offers a very thorough treatment of Web data warehousing, an area that definitely needs to be explored and researched further. Readers will find the material to be very detailed, in many cases requiring an extensive background in formal database theory. In a handful of cases, the authors should have gone slightly further in their research. In particular, in the discussion of the changes that occur in Web documents over time, the authors should have employed temporal logic techniques, which are widely used in temporal databases. Overall, however, this work is a welcome addition to the line of titles in the area of Web information retrieval, Web search, and Web mining.

Reviewer: Stan Kurkovsky
Review #: CR129432 (0410-1141)
  Featured Reviewer  
Data Sharing (H.3.5 ... )
Large Text Archives (H.3.6 ... )
Standards (H.3.7 ... )
World Wide Web (WWW) (H.3.4 ... )
Digital Libraries (H.3.7 )
Library Automation (H.3.6 )
Other reviews under "Data Sharing": Date
Issues in online database searching
Tenopir C., Libraries Unlimited, Inc., Englewood, CO, 1989. Type: Book (9780872877092)
Aug 1 1990
Sharing scientific data
Sterling T., Weinkam J. Communications of the ACM 33(9): 112-119, 1990. Type: Article
Mar 1 1991
Data caching issues in an information retrieval system
Alonso R., Barbara D., Garcia-Molina H. ACM Transactions on Database Systems 15(3): 359-384, 1990. Type: Article
Mar 1 1991

Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®