While large language models (LLMs) such as ChatGPT can help writers edit their work, they may also keep students from learning to generate ideas and write creatively.
This article surveys the current state of algorithms for detecting LLM-generated content. Because both LLMs and detection algorithms are evolving rapidly, the authors present the concepts with references and avoid mathematical detail, catering to both nontechnical readers and machine learning researchers. Even so, the field's dynamic nature has already rendered many of the ideas outdated.
The article categorizes detection algorithms into black-box and white-box types. White-box algorithms require access to the LLM's development environment, while black-box algorithms do not. With new LLMs being released so frequently, white-box algorithms are primarily of academic interest rather than practical use. For black-box detection, the surveyed features include Zipf's law statistics, linguistic patterns, sentiment analysis, and the characteristically low perplexity of LLM-generated text. The article also covers classifiers built on support vector machines (SVMs), random forests, and deep neural networks.
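The perplexity feature mentioned above can be illustrated with a minimal sketch. This is not a method from the article itself: real detectors score text with a large language model's token probabilities, whereas the toy version below uses a Laplace-smoothed word-bigram model (the function name and sample texts are hypothetical). The idea is the same: text that the model finds predictable receives low perplexity, which detectors treat as a signal of machine generation.

```python
import math
from collections import Counter

def bigram_perplexity(train_text: str, test_text: str) -> float:
    """Perplexity of test_text under a Laplace-smoothed word-bigram
    model fit on train_text. Illustrative only: production detectors
    score text with a large LM, not a bigram model."""
    train = train_text.lower().split()
    test = test_text.lower().split()
    unigrams = Counter(train)
    bigrams = Counter(zip(train, train[1:]))
    vocab = len(unigrams) + 1  # +1 slot for unseen words

    log_prob = 0.0
    for prev, word in zip(test, test[1:]):
        # Add-one (Laplace) smoothing so unseen bigrams get p > 0
        p = (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab)
        log_prob += math.log(p)

    n = max(len(test) - 1, 1)  # number of bigrams scored
    return math.exp(-log_prob / n)  # lower = more predictable
```

Text that closely matches the model's training distribution (analogous to LLM output scored by its own model family) yields a lower perplexity than out-of-distribution text, which is the gap these black-box detectors exploit.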
OpenAI [1], the Massachusetts Institute of Technology (MIT) [2], Syracuse University [3], and other institutions have concluded that it is impossible to detect AI-generated text without false positives, prompting them to explore alternative methods for evaluating students. Consequently, readers may not find practical solutions in this article, though it offers valuable academic insights. However, Hu et al.’s recent paper [4] shows promise (and references GLTR, an older related tool).
In conclusion, this well-written article provides an excellent introduction to this emerging field.