Computing Reviews
Data in the wild: some reflections
Ang C., Bobrowicz A., Schiano D., Nardi B. interactions 20(2): 39-43, 2013. Type: Article
Date Reviewed: Nov 21 2013

“Data in the wild” refers to raw data that can be gathered from social networking and microblogging sites, as well as other sites featuring user-generated content. In this brief paper, the authors discuss the problems associated with gathering data in the wild for research purposes.

The authors note that research based on data in the wild deviates from the usual design of a research project. This type of data is not constructed or designed with research questions in mind, and it is produced by means over which the investigator has little control. Basic facts that a survey would simply ask for, such as the gender or educational background of participants, are difficult to ascertain from data in the wild.

Scientific research generally involves the development of a hypothesis, the selection of a means of testing it, and the analysis of the gathered data. In the social sciences, this is typically done with surveys developed specifically for the research purpose. With data in the wild, researchers instead mine existing data for patterns and then work backward to develop hypotheses about what they see.

There are also potential ethical issues when conducting research on data in the wild. Subjects cannot be informed about the type of research that is performed on their data, and informed consent about their participation is impossible.

Finally, the authors note legal problems associated with gathering social data for research purposes. Some of the authors tried to study self-disclosure and privacy by sampling screenshots of live webcams on a publicly available social networking site. They quickly realized they could inadvertently collect material that was illegal or disturbing (such as pornography, including child pornography) and abandoned the project.

The authors call for multidisciplinary research involving law, computer science, social science, and humanities to address the concerns discussed in this paper. They note a need to develop guidelines for conducting research with data in the wild.

Reviewer: W. E. Mihalo
Review #: CR141754 (1401-0091)
Data Mining (H.2.8 ... )
Ethics (K.4.1 ... )
Privacy (K.4.1 ... )