Computing Reviews, the leading online review service for computing literature.

Social microblogging cube
Hannachi L., Benblidia N., Bentayeb F., Boussaid O. DOLAP 2013 (Proceedings of the 16th International Workshop on Data Warehousing and OLAP, San Francisco, CA, Oct 28, 2013) 19-26, 2013. Type: Proceedings

Date Reviewed: Mar 5 2014

This paper proposes the microblogging cube, a new social network analysis (SNA) model for analyzing microblog users and locations along semantic, geographic, and temporal axes. Unlike standard online analytical processing (OLAP) technology, which enforces rigid summarization across all dimensional hierarchies, the new model applies different measures depending on the level of microblog user and domain aggregation. Given the explosive growth of microblogging data, this model will help people interactively and semantically view and analyze distributed microblogging data at different granularities. Researchers in business intelligence, SNA, ontology, and data mining will want to study this work.

The proposed model is divided into three parts: the microblogging word cube, the microblogging domain-word cube, and the microblogging domain cube. The microblogging word cube conducts word-centered analysis. Its analysis schema includes six types: microblog word, location, user, time, community, and word. Term frequency-inverse document frequency (TF-IDF) is a numeric measure of how important a word is to a document in a collection, and the normalized Google distance (NGD) is a semantic similarity measure for a given set of keywords. First, TF-IDF is used to determine the top-K most representative words for a location or user. Then, NGD is used to calculate the semantic distance among the selected words. Based on the word distance, the user distance is determined by weighting each distance measure using user-specified factors. Finally, users are aggregated into communities according to the user distance. In this part, since each word represents a very specific unit, analysts cannot automatically derive overall knowledge of a user or location.

Thus, the microblogging domain-word cube extends the Open Directory Project (ODP) taxonomy with a new type (Domain_Word) that defines the domain distribution characterizing each user or location. As a result, SNA is based on both words and domains. The microblogging domain cube conducts domain-centered analysis to provide a comprehensive, global view of the microblog data. In this part, the domain type replaces the word type of the microblogging word cube, and the user communities are extracted based on the domain relations.

Experiments tested the proposed model on a dataset of approximately 14 million tweets and 1,000 relevant users. The reported results reflect the analysis of five experimental users and demonstrate that the proposed approach can perform impressive SNA. However, the paper would have been more complete if the authors had provided the experimental results for the remaining 995 users.
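
For readers unfamiliar with the two measures used in the word cube, the first steps of the pipeline (TF-IDF top-K selection, NGD between the selected words, and a weighted combination of distances) can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation. It assumes that each user or location is treated as one "document" of tokens, that NGD counts come from a local collection of posts rather than from a search engine, and that the user distance is a normalized weighted sum. All function names, and the "word" and "geo" component names, are hypothetical.

```python
import math
from collections import Counter
from itertools import product


def tfidf_top_k(docs, k):
    """Return the k highest-scoring TF-IDF words for each owner.

    docs maps an owner (a user or a location) to a list of tokens.
    """
    n_docs = len(docs)
    # Document frequency: number of owners whose tokens contain the word.
    df = Counter(w for tokens in docs.values() for w in set(tokens))
    top = {}
    for owner, tokens in docs.items():
        counts = Counter(tokens)
        scores = {
            w: (c / len(tokens)) * math.log(n_docs / df[w])
            for w, c in counts.items()
        }
        # Descending score, ties broken alphabetically for determinism.
        top[owner] = sorted(scores, key=lambda w: (-scores[w], w))[:k]
    return top


def ngd(x, y, posts):
    """Normalized Google distance, with counts taken from `posts`.

    posts is a list of token sets; it stands in for the search index
    (an assumption of this sketch).
    """
    n = len(posts)
    fx = sum(1 for p in posts if x in p)
    fy = sum(1 for p in posts if y in p)
    fxy = sum(1 for p in posts if x in p and y in p)
    if fxy == 0:
        return math.inf  # the two words never co-occur
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    denom = math.log(n) - min(lx, ly)
    if denom == 0:
        return 0.0  # both words occur in every post
    return (max(lx, ly) - lxy) / denom


def word_set_distance(words_a, words_b, posts):
    """Mean NGD over all cross pairs of two word lists (assumed aggregation)."""
    pairs = list(product(words_a, words_b))
    if not pairs:
        return math.inf
    return sum(ngd(x, y, posts) for x, y in pairs) / len(pairs)


def weighted_distance(components, weights):
    """Combine per-dimension distances using analyst-chosen weights."""
    total = sum(weights[name] for name in components)
    return sum(weights[name] * d for name, d in components.items()) / total
```

A grouping step (for example, any standard clustering over the resulting pairwise user distances) would then assign users to communities. The review does not say which method the paper uses, so none is shown here.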

Reviewer: Yingjie Li	Review #: CR142065 (1406-0454)

Data Mining (H.2.8 ... )

Social Networking (H.3.4 ... )

Web 2.0 (H.3.4 ... )
