Computing Reviews, the leading online review service for computing literature.

Focused crawling for the hidden web
Liakos P., Ntoulas A., Labrinidis A., Delis A. World Wide Web 19(4): 605-631, 2016. Type: Article

Date Reviewed: Oct 11 2016

The hidden web, or deep web, is the part of the web whose content can be accessed only by submitting queries to websites and that therefore cannot be indexed by traditional search engines. Good examples are e-commerce and question-answering websites that dynamically generate pages in response to submitted queries. Web content that is publicly accessible through ordinary links, and is thus indexable, is referred to as the surface web. The deep web is estimated to be much larger than the surface web.

This paper addresses focused, or topic-sensitive, crawling of the hidden web: for example, retrieving only the political movies from a website about movies. The authors introduce an intuitive algorithm and comparatively evaluate it on four websites under different policies. They also compare their approach with previous work, mostly in terms of underlying principles.

I found the paper and the problem interesting. However, the presentation needs improvement. The paper contains forward references that make it harder to read, as well as some typographical errors. (The one in the second sentence of the second paragraph of section 1 is unfortunate because it appears so early and is so obvious.) The figures are not of high quality. For example, figure 1 is too big and does not define the symbols it uses. Figure 4 uses the traditional flowchart decision box to indicate a process, which is misleading.

Reviewer: F. Can	Review #: CR144828 (1701-0065)

World Wide Web (WWW) (H.3.4 ... )
