Computing Reviews
Hate speech detection with comment embeddings
Djuric N., Zhou J., Morris R., Grbovic M., Radosavljevic V., Bhamidipati N. WWW 2015 Companion (Proc. of the 24th International World Wide Web Conference, Florence, Italy, May 18-22, 2015), 29-30, 2015. Type: Proceedings
Date Reviewed: Jun 11 2015

Hate speech in online forums is a form of offensive language that targets specific groups with the aim of dishonoring them. It is sometimes treated as synonymous with misinformation, smears, and social pollution. Unmonitored online communities and unrestricted access to the Internet are contributing to the proliferation of hate speech in online comments.

The authors propose a two-step method for detecting hate speech in online comments. In the first step, paragraph2vec, trained with a continuous bag-of-words (CBOW) neural language model, jointly learns low-dimensional vector representations (embeddings) of comments and words; hierarchical softmax reduces the time complexity and enables efficient training. In the second step, the learned comment embeddings are used to train a binary classifier, logistic regression, that distinguishes hate speech from clean comments.
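To make the efficiency argument concrete, the following is a minimal sketch of how hierarchical softmax assigns a probability to a word: the word is a leaf of a binary tree, and its probability is the product of the binary decisions along the root-to-leaf path, so each evaluation costs on the order of log2(V) sigmoid computations instead of V. The complete-tree layout, the random vectors, and the names `word_probability`, `node_vecs`, and `context_vec` are illustrative assumptions, not the authors' implementation.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def word_probability(word_index, context_vec, node_vecs, depth):
    """P(word | context) as a product of left/right decisions on a tree path.

    Assumption for this sketch: words are the leaves of a complete binary
    tree of the given depth; inner nodes are numbered heap-style
    (root = 1, children of node n are 2n and 2n + 1).
    """
    prob = 1.0
    node = 1
    for level in range(depth - 1, -1, -1):
        go_right = (word_index >> level) & 1  # next bit of the leaf index
        score = sum(a * b for a, b in zip(node_vecs[node], context_vec))
        p_right = sigmoid(score)
        prob *= p_right if go_right else 1.0 - p_right
        node = 2 * node + go_right
    return prob


random.seed(0)
DEPTH = 3  # toy vocabulary of 2**3 = 8 words
DIM = 4    # toy embedding size (the reviewed paper uses 200)

# One vector per inner node (nodes 1..7), plus a hypothetical context vector.
node_vecs = {
    n: [random.uniform(-1, 1) for _ in range(DIM)]
    for n in range(1, 2 ** DEPTH)
}
context_vec = [random.uniform(-1, 1) for _ in range(DIM)]

probs = [
    word_probability(w, context_vec, node_vecs, DEPTH)
    for w in range(2 ** DEPTH)
]
total = sum(probs)  # the leaf probabilities form a valid distribution
```

Each word here needs only three sigmoid evaluations (the tree depth), and the eight leaf probabilities still sum to one, which is what allows training to avoid normalizing over the full vocabulary.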

The authors collected 56,280 hate speech comments and 895,456 clean comments generated by 209,776 anonymous users of the Yahoo Finance website over six months. They claim that this dataset, with a vocabulary size of 304,427, is the largest hate speech dataset available in the literature. The neural language model is trained with an embedding dimensionality of 200 and a context window of five words, for five iterations. The authors compare the proposed method with two BOW baselines, term frequency (tf) and term frequency-inverse document frequency (tf-idf), and use the area under the curve (AUC) to validate their results.
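Because the two classes are heavily imbalanced, a ranking measure such as AUC is a natural choice for the comparison. As a minimal sketch, AUC can be read as the probability that a randomly chosen hate speech comment receives a higher score than a randomly chosen clean one. The function name `auc` and the toy labels and scores below are made up for illustration; they are not results from the paper.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the positive
    example is scored higher; ties count as half. This is quadratic in the
    number of examples, which is fine for a sketch but not for a large corpus.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data: 1 = hate speech, 0 = clean (hypothetical classifier scores).
labels = [1, 1, 0, 0, 0, 0]
scores_a = [0.9, 0.4, 0.5, 0.3, 0.2, 0.1]  # one clean comment outranks a positive
scores_b = [0.9, 0.8, 0.5, 0.3, 0.2, 0.1]  # every positive outranks every negative

auc_a = auc(labels, scores_a)
auc_b = auc(labels, scores_b)
```

In this toy case the second scorer ranks every positive above every negative and therefore obtains the higher AUC, regardless of how many clean comments there are.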

The authors report that the proposed method requires less training time and less memory than the other methods. They further argue that it offers a solution to the hate speech detection problem while also reducing the high dimensionality and sparsity of representations of online comments.
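The dimensionality and sparsity issue can be illustrated with a small sketch of the two BOW baselines. Each comment becomes a vector with one entry per vocabulary word, and most entries are zero; with a vocabulary of 304,427 words the effect is far more pronounced than in this toy case. The three comments, the plain `log(N / df)` weighting, and all names below are assumptions chosen for illustration.

```python
import math
from collections import Counter

# Hypothetical toy comments; whitespace tokenization is a simplification.
comments = [
    "the stock will rise tomorrow",
    "sell the stock now",
    "great earnings report today",
]
tokenized = [c.split() for c in comments]
vocab = sorted({w for doc in tokenized for w in doc})
n_docs = len(tokenized)

# Document frequency: in how many comments each word occurs.
doc_freq = Counter(w for doc in tokenized for w in set(doc))


def tf_vector(doc):
    """BOW (tf): raw count of every vocabulary word in the comment."""
    counts = Counter(doc)
    return [counts[w] for w in vocab]


def tfidf_vector(doc):
    """BOW (tf-idf): counts down-weighted for words common across comments."""
    counts = Counter(doc)
    return [counts[w] * math.log(n_docs / doc_freq[w]) for w in vocab]


tf_vectors = [tf_vector(d) for d in tokenized]
tfidf_vectors = [tfidf_vector(d) for d in tokenized]

# Fraction of zero entries across all tf vectors.
zero_fraction = sum(v == 0 for vec in tf_vectors for v in vec) / (
    n_docs * len(vocab)
)
```

Even with three short comments, more than half of the entries are zero, whereas a learned embedding would represent each comment with a fixed number of dense values (200 in the reviewed paper).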

Reviewer:  Lalit Saxena Review #: CR143514 (1509-0814)
Language Models (I.2.7 ... )
Text Processing (I.5.4 ... )
 