Computing Reviews, the leading online review service for computing literature.

Mining interesting knowledge from weblogs: a survey
Facca F., Lanzi P. Data & Knowledge Engineering 53(3): 225-241, 2005. Type: Article

Date Reviewed: Nov 7 2005

This paper surveys the field of Web usage mining, a sub-area of Web mining, which is in turn a sub-area of data mining. Web usage mining is the part of Web mining that deals with extracting knowledge from server log files. Such Web logs, or weblogs, are mostly textual logs collected when users access Web servers, and are stored in one of several commonly used formats. (Note that weblogs in this context are not blogs, the sense that term has recently acquired.)

After an introduction, the paper's sections cover data sources, data preprocessing, knowledge discovery techniques, applications, software support, moving from techniques to applications, privacy issues, and future trends, followed by a brief summary. There are also 112 references. There are no figures, but there is one helpful table that provides references to carefully selected papers, with representative applications, techniques, and data sources.

Data sources for Web usage mining are Web servers, proxy servers, and Web clients. Preprocessing includes data cleaning, identifying and reconstructing users' sessions, retrieving information about page content and structure, and data formatting. Knowledge discovery techniques in Web usage mining research, as opposed to the statistical analysis typical of commercial applications, focus on association rules, sequential patterns, and clustering. Since the general goal of Web usage mining is to gather useful information about Web users' navigation patterns, the results produced by mining Web logs can be used to personalize the delivery of Web content, improve user navigation by means of prefetching and caching, improve Web design, and improve customer satisfaction in e-commerce.
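To make the preprocessing steps named above (data cleaning and session reconstruction) more concrete, here is a minimal sketch. It is not taken from the surveyed paper. It assumes the logs are in the Common Log Format, that the client host stands in for the user, and that 30 minutes of inactivity ends a session. The names `parse`, `sessions`, `SKIP_SUFFIXES`, and `TIMEOUT` are hypothetical.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed input: Common Log Format
#   host ident authuser [date] "request" status bytes
CLF = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)
# Assumption: requests for embedded resources are noise, not page views.
SKIP_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".js", ".ico")
# Assumption: 30 minutes of inactivity closes a session.
TIMEOUT = timedelta(minutes=30)


def parse(line):
    """Return (host, timestamp, path, status), or None if the line is malformed."""
    m = CLF.match(line)
    if m is None:
        return None
    when = datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    return m.group("host"), when, m.group("path"), int(m.group("status"))


def sessions(lines):
    """Clean the log, then split each host's page views into sessions."""
    by_host = defaultdict(list)
    for line in lines:
        rec = parse(line)
        if rec is None:
            continue  # data cleaning: drop unparseable lines
        host, when, path, status = rec
        if status != 200 or path.lower().endswith(SKIP_SUFFIXES):
            continue  # data cleaning: drop failed requests and embedded resources
        by_host[host].append((when, path))

    result = []
    for host, hits in by_host.items():
        hits.sort()  # order each host's page views by time
        current = [hits[0][1]]
        for (prev, _), (now, path) in zip(hits, hits[1:]):
            if now - prev > TIMEOUT:
                result.append((host, current))  # gap too long: close the session
                current = []
            current.append(path)
        result.append((host, current))
    return result
```

Each returned pair is a host and the ordered list of pages in one session. Lists of this kind are the sort of input on which association rule, sequential pattern, or clustering techniques would then operate. Real deployments often need stronger user identification than the host address alone, for example when many users share a proxy.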
Software support has evolved over the last several years: e-commerce Web usage mining has become part of integrated customer relationship management (CRM) solutions, simple Web log analyzers are available for general use, and an open source tool, the Web utilization miner (WUM), serves the research community. The privacy issue, in general, is still being considered by the Web usage mining community. Future trends appear to be tied to the emergence and proliferation of the semantic Web concept.

The paper serves as a good survey of Web usage mining. It is recommended for anyone wanting to understand the essentials of this rapidly emerging field.

Reviewer: M. G. Murphy	Review #: CR131998 (0606-0631)

User Profiles And Alert Services (H.3.4 ... )

Data Mining (H.2.8 ... )

World Wide Web (WWW) (H.3.4 ... )

Database Applications (H.2.8 )

Other reviews under "User Profiles And Alert Services":	Date

An adaptive system for the personalized access to news Ardissono L., Console L., Torre I. AI Communications 14(3): 129-147, 2001. Type: Article	May 27 2003

Adaptive Web search based on user profile constructed without any effort from users Sugiyama K., Hatano K., Yoshikawa M. World Wide Web (Proceedings of the 13th conference on World Wide Web, New York, NY, USA, May 17-20, 2004) 675-684, 2004. Type: Proceedings	Jun 15 2004

Recommender systems research: a connection-centric survey Perugini S., Gonçalves M., Fox E. Journal of Intelligent Information Systems 23(2): 107-143, 2004. Type: Article	Nov 2 2005
