Computing Reviews
Semantic-based merging of RSS items
Taddesse F., Tekli J., Chbeir R., Viviani M., Yetongnon K. World Wide Web 13(1-2): 169-207, 2010. Type: Article
Date Reviewed: Oct 29 2010

As an alternative to visiting Web pages, really simple syndication (RSS) feeds offer a number of advantages: they come in an Extensible Markup Language (XML)-based format, they include content-based meta-information, and they can be aggregated and presented in a variety of ways. In particular, users who are interested in obtaining a quick overview of events in a particular domain can do this by combining related RSS feeds from various sources and displaying the headings of individual items (and some content, if desired) on one page.

One problem with such RSS feeds is semantic overlap: news items from the same source may be offered through multiple RSS feeds, leading to duplicate entries, and items covering the same event may overlap significantly in content even though they are presented differently. While outright duplication is fairly straightforward to resolve, the semantic overlap of similar items is a more challenging problem. The authors of this paper propose an approach to merge similar items based on their relatedness.

For human readers, it is often sufficient to glance at the heading and maybe the first few sentences of two or more news items to determine if they are related; for computers, this is a challenge. Computers treat RSS entries as structured entities consisting of strings that are disassociated from the meaning that humans attribute to the words those strings spell out; determining the topic and content of an item requires additional measures. RSS items often have tags (keywords) that describe their content. An informal or formal structure (such as a taxonomy, folksonomy, or ontology) can be used to describe the relations between tags. It is assumed that RSS entries with a large overlap in their tag sets are related, and the additional structure allows an expansion or refinement of the relatedness between items.
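The tag-overlap idea can be illustrated with a minimal sketch. This is not the authors' relatedness measure: it assumes Jaccard overlap as the similarity, and the `BROADER` mapping is a hypothetical one-level taxonomy invented here to show how additional structure can expand the comparison.

```python
def normalize(tags):
    """Lowercase and trim tags so that 'Soccer' and 'soccer ' compare equal."""
    return {t.strip().lower() for t in tags}


def tag_overlap(tags_a, tags_b):
    """Jaccard overlap of two tag collections, a value in [0, 1]."""
    a, b = normalize(tags_a), normalize(tags_b)
    if not a and not b:
        return 0.0  # two untagged items give no evidence of relatedness
    return len(a & b) / len(a | b)


# Hypothetical taxonomy: each tag maps to one broader concept.
BROADER = {"soccer": "sports", "tennis": "sports", "elections": "politics"}


def expand(tags, broader=BROADER):
    """Add the broader concept of every tag that the taxonomy knows about."""
    base = normalize(tags)
    return base | {broader[t] for t in base if t in broader}


item_a = ["Soccer", "World Cup"]
item_b = ["tennis", "Wimbledon"]

plain = tag_overlap(item_a, item_b)                      # no shared tags
expanded = tag_overlap(expand(item_a), expand(item_b))   # both reach "sports"
print(plain, expanded)
```

In this sketch the two items share no literal tag, yet they become weakly related once both are expanded to the broader concept, which is the kind of refinement that a taxonomy or ontology makes possible.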

Tags are usually created by the item’s author(s); by themselves, they are not a sufficiently reliable basis to determine the relatedness between items. Thus, Taddesse et al. expand the calculation of relatedness to the actual contents of the items. The goal is to group similar items together and to possibly merge some of their constituent components into one overall item, to be presented to the reader (the provenance of the individual pieces is still traceable).
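Grouping by content and merging with traceable provenance might be sketched as follows. The paper's actual relatedness measure and merging rules are not reproduced here; this sketch assumes bag-of-words cosine similarity, a greedy grouping pass, and an arbitrary threshold of 0.5. The item fields (`title`, `description`, `source`) and the sample items are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Count lowercase alphanumeric tokens in the text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def group_items(items, threshold=0.5):
    """Greedily put each item in the first group whose seed item is similar enough."""
    groups = []
    for item in items:
        vec = bag_of_words(item["title"] + " " + item["description"])
        for group in groups:
            if cosine(vec, group["vector"]) >= threshold:
                group["members"].append(item)
                break
        else:
            groups.append({"vector": vec, "members": [item]})
    return groups


def merge_group(group):
    """Merge a group into one item while keeping the source of every piece."""
    members = group["members"]
    return {
        "title": members[0]["title"],
        "descriptions": [m["description"] for m in members],
        "sources": [m["source"] for m in members],
    }


items = [
    {"title": "Volcano erupts in Iceland",
     "description": "Ash cloud disrupts flights across Europe",
     "source": "feed-a"},
    {"title": "Iceland volcano erupts",
     "description": "Flights across Europe disrupted by ash cloud",
     "source": "feed-b"},
    {"title": "Central bank raises interest rates",
     "description": "Markets react to the decision",
     "source": "feed-a"},
]

merged = [merge_group(g) for g in group_items(items)]
for m in merged:
    print(m["title"], m["sources"])
```

Comparing each item only against the first member of a group keeps the sketch short; a more careful grouping would compare against all members or a group centroid.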

Taddesse et al.’s experiments on two sets of RSS items show that the core method works reasonably well. Due to the overall difficulty of the relatedness and merging problems, they’re working on a full merging language that includes user preferences and the expansion of their methods to other media types.

Reviewer: Franz Kurfess
Review #: CR138538 (1105-0540)
Semantic Networks (I.2.4 ... )
Semantic Web (H.3.4 ... )
User-Centered Design (H.5.2 ... )
Other reviews under "Semantic Networks": Date
The METANET: a means for the specification of semantic networks as abstract data types
Dilger W., Womann W. International Journal of Man-Machine Studies 21(6): 463-492, 1984. Type: Article
Nov 1 1985
Processing of semantic nets on dataflow architectures
Bic L. Artificial Intelligence 27(2): 219-227, 1985. Type: Article
May 1 1987
Semantic networks
Mac Randal D., Research Studies Press Ltd., Taunton, UK, 1988. Type: Book (9780471917854)
Jun 1 1989