Computing Reviews
Mining interesting knowledge from weblogs: a survey
Facca F., Lanzi P. Data & Knowledge Engineering 53(3): 225-241, 2005. Type: Article
Date Reviewed: Nov 7 2005

This paper surveys the field of Web usage mining, which is a sub-area of Web mining, which, in turn, is a sub-area of data mining. Web usage mining is the part of Web mining that deals with the extraction of knowledge from server log files. Such Web logs, or weblogs, are mostly textual logs, collected when users access Web servers, and are stored in one of several commonly used formats. (Note that weblogs in this context are not blogs, as that term has come to be known recently.)
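To make the notion of a textual server log concrete, here is a minimal sketch of parsing one entry. It is an illustration only and not taken from the paper: it assumes the widely used Common Log Format, and the names `CLF_PATTERN` and `parse_clf_line`, as well as the sample line, are hypothetical.

```python
import re
from datetime import datetime

# Assumed layout (Common Log Format):
#   host ident authuser [timestamp] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_clf_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # Convert the textual fields into typed values.
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    # A "-" in the size field means no body was returned.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


# Hypothetical sample entry, used only for illustration.
sample = (
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /apache_pb.gif HTTP/1.0" 200 2326'
)
entry = parse_clf_line(sample)
```

Real servers often log additional fields (referrer, user agent), so a pattern like this would need to be extended for other formats.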

The paper opens with an introduction, followed by sections on data sources, data preprocessing, knowledge discovery techniques, applications, software support, moving from techniques to applications, privacy issues, and future trends, and closes with a brief summary. There are also 112 references. There are no figures, but there is one helpful table that provides references to carefully selected papers, with representative applications, techniques, and data sources.

Data sources for Web usage mining are Web servers, proxy servers, and Web clients. Preprocessing includes data cleaning, identifying and reconstructing users’ sessions, retrieving information about page content and structure, and data formatting. Knowledge discovery techniques for research in Web usage mining, as opposed to the statistical analysis typical of commercial applications, focus on association rules, sequential patterns, and clustering. Since the general goal of Web usage mining is to gather useful information about Web users’ navigation patterns, the results produced by mining Web logs can be used to personalize the delivery of Web content, improve user navigation by means of prefetching and caching, improve Web design, and improve customer satisfaction in e-commerce. Software support has evolved over the last several years: e-commerce Web usage mining has become part of integrated customer relationship management (CRM) solutions, simple Web log analyzers are available for general usage, and an open source tool, the Web utilization miner (WUM), serves the research community. The privacy issue, in general, is still being considered by the Web usage mining community. Future trends appear to be tied to the emergence and proliferation of the semantic Web concept.
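The session reconstruction step mentioned above can be sketched roughly as follows. This is an illustrative heuristic and not the method of the surveyed paper: it assumes that users are identified by IP address, that timestamps are given in seconds, and that a gap of more than 30 minutes starts a new session. The names `sessionize` and `SESSION_TIMEOUT` and the sample requests are hypothetical.

```python
from collections import defaultdict

# Assumed inactivity threshold in seconds (30 minutes); not from the paper.
SESSION_TIMEOUT = 30 * 60


def sessionize(requests, timeout=SESSION_TIMEOUT):
    """Group (user, timestamp, page) tuples into per-user sessions.

    A new session starts whenever the gap between two consecutive
    requests of the same user exceeds `timeout` seconds.
    """
    by_user = defaultdict(list)
    for user, ts, page in requests:
        by_user[user].append((ts, page))

    sessions = []
    for user, hits in by_user.items():
        hits.sort()  # order each user's requests by timestamp
        current = [hits[0][1]]
        last_ts = hits[0][0]
        for ts, page in hits[1:]:
            if ts - last_ts > timeout:
                # Gap too long: close the current session, start a new one.
                sessions.append((user, current))
                current = []
            current.append(page)
            last_ts = ts
        sessions.append((user, current))
    return sessions


# Hypothetical cleaned log entries: (user identified by IP, seconds, page).
requests = [
    ("10.0.0.1", 0, "/index.html"),
    ("10.0.0.1", 60, "/products.html"),
    ("10.0.0.2", 90, "/index.html"),
    ("10.0.0.1", 4000, "/index.html"),
]
sessions = sessionize(requests)
```

The resulting page sequences are the kind of input that association rule, sequential pattern, or clustering techniques would then operate on. Identifying users by IP address alone is fragile in practice, for example when many clients share a proxy.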

The paper serves as a good survey of Web usage mining. It is recommended for anyone wanting to understand the essentials of this rapidly emerging field.

Reviewer:  M. G. Murphy Review #: CR131998 (0606-0631)
User Profiles And Alert Services (H.3.4 ... )
Data Mining (H.2.8 ... )
World Wide Web (WWW) (H.3.4 ... )
Database Applications (H.2.8 )

Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®