Computing Reviews
Applying Authorship Analysis to Extremist-Group Web Forum Messages
Abbasi A., Chen H. (ed) IEEE Intelligent Systems & Their Applications 20(5): 67-75, 2005. Type: Article
Date Reviewed: Feb 27 2006

The question of whether an author leaves an unconscious but statistically discernible “signature” on his or her writing was first visited by Wake at Oxford in 1911. Wake was an eminent classicist, but he was not a statistician, and his sentence-length statistics did not prove useful. In the 1960s, a Church of England minister and New Testament scholar, A.Q. Morton, who was a statistician, developed a statistical authorship test for Greek, and used it successfully on the Pauline Epistles, the Gospel of Luke, and the Acts of the Apostles. He and others later used it on Homer’s Iliad, also with notable success. The test was very simple, but useful for Greek text: he counted the number of times kai was used in each sentence. Kai is a coordinating conjunction in Greek 95 percent of the time (it is an adverb the other five percent), and performs the combined roles of all the coordinating conjunctions in English (and, or, but, and so on). Alvar Ellegard developed a much more sophisticated statistical method [1] for his doctoral dissertation at Uppsala, and used it to prove that Sir Philip Francis, a British civil servant, had written the scathing Junius Letters to the London Public Advertiser criticizing King George III and his war against the American colonies. Junius Brutus killed Julius Caesar, but George III would certainly have hanged this Junius, Philip Francis, for sedition had he known that Francis was the author of the letters.
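The per-sentence counting idea behind the kai test can be illustrated with a minimal Python sketch. This is not Morton's actual procedure: the sentence splitter, the transliterated marker "kai", the function names, and the toy input (made-up transliterated tokens, not a quotation from any Greek text) are all assumptions chosen for illustration.

```python
import re
from collections import Counter


def marker_counts(text, marker="kai"):
    """Count how often `marker` occurs as a whole word in each sentence.

    Sentences are split naively on ., ;, ! and ? (an assumption; a real
    study would need a splitter suited to the language and edition).
    """
    sentences = [s for s in re.split(r"[.;!?]+", text) if s.strip()]
    return [re.findall(r"\w+", s.lower()).count(marker) for s in sentences]


def count_distribution(counts):
    """Return the share of sentences containing 0, 1, 2, ... markers."""
    total = len(counts)
    freq = Counter(counts)
    return {k: freq[k] / total for k in sorted(freq)}


# Toy transliterated input, invented for this example only.
sample = "logos kai ergon kai nomos. anthropos kai theos. bios brachys."
counts = marker_counts(sample)
print(counts)
print(count_distribution(counts))
```

Comparing such distributions between a text of known authorship and a disputed one would still require a proper statistical test, which this sketch leaves out.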

This fascinating paper takes the unconscious authorship signature problem into new theoretical (but also very practical) realms. The paper presents new methods that go beyond Greek and English literary texts to the analysis of extremist multi-language polemics on Internet Web sites. This extension of the technology opens up new vistas. For example, Internet Web sites are a very new literary genre, and the Arabic language, with its 5,000 roots or stems, is very highly inflected. Arabic has 15 verbal conjugations, compared to Hebrew with only eight, and Indo-European languages with even fewer. The liaison issues in Arabic, which is written only cursively and has initial, medial, and final forms for many letters, along with infixes and consonant stacking, add to the morphological, grammatical, and syntactical interface of the language. The authors find that this craggy linguistic interface, while complex, does add some statistical handholds and toeholds. Their methods show significant discriminating power in the application of authorship identification techniques to both English and Arabic messages. Ku Klux Klan (KKK) polemics were used as a sort of English language control in the development of the methods.

This well-presented, well-written paper illustrates an important and very current application of computer-based statistical methods for authorship identification. It is so good, and so relevant to our times, that I am surprised it wasn’t classified by the US National Security Agency (NSA).

Reviewer:  P. C. Patton Review #: CR132494 (0611-1170)
[1] Ellegard, A. Who was Junius? Almqvist and Wiksell, Stockholm, 1962.
Text Analysis (I.2.7 ... )
 
 
Data Mining (H.2.8 ... )
 