What if Wikis were freed from their monolithic task of knowledge presentation, and instead were part of information systems (ISs) that could answer questions, summarize articles, and ultimately become “self-aware” of their own content? The authors seek to demonstrate just that, and to provide, at the very least, a plausible road map to such an outcome using state-of-the-art natural language processing (NLP) techniques, along with open-source products such as MediaWiki and the General Architecture for Text Engineering (GATE).
The focus of this research is not to discover new NLP techniques, but to augment current ones so as to improve and automate content retrieval, analysis, and text generation. The result is an IS that markedly improves the quality of index generation (based on a semantic metalanguage and metatags), content development, summarization, and question answering. The architecture of such an IS may comprise multiple tiers: a (Web) client interacts with a presentation/interaction layer, which in turn triggers the services run by the NLP system; a further tier holds the knowledge base that supports these processes.
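The tiered architecture described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: all class and method names are hypothetical, and the summarization and question-answering routines are naive placeholders standing in for real NLP services (such as those GATE would provide), not the authors' implementation.

```python
class KnowledgeBase:
    """Bottom tier: stores Wiki articles keyed by title."""
    def __init__(self):
        self.articles = {}

    def add(self, title, text):
        self.articles[title] = text

    def get(self, title):
        return self.articles.get(title, "")


class NLPService:
    """Middle tier: naive stand-ins for summarization and QA."""
    def __init__(self, kb):
        self.kb = kb

    def summarize(self, title):
        # Placeholder: return the article's first sentence as a "summary".
        text = self.kb.get(title)
        return text.split(".")[0] + "." if text else ""

    def answer(self, question):
        # Placeholder: return the title of the article that shares
        # the most words with the question.
        q_words = set(question.lower().split())
        best, best_score = "", 0
        for title, text in self.kb.articles.items():
            score = len(q_words & set(text.lower().split()))
            if score > best_score:
                best, best_score = title, score
        return best


class PresentationLayer:
    """Top tier: routes (Web) client requests to the NLP services."""
    def __init__(self, service):
        self.service = service

    def handle(self, request, payload):
        if request == "summarize":
            return self.service.summarize(payload)
        if request == "ask":
            return self.service.answer(payload)
        raise ValueError("unknown request: " + request)


kb = KnowledgeBase()
kb.add("GATE", "GATE is an architecture for text engineering. It is open source.")
ui = PresentationLayer(NLPService(kb))
print(ui.handle("summarize", "GATE"))  # first-sentence "summary"
print(ui.handle("ask", "what architecture supports text engineering"))
```

In a deployed system each tier would be a separate process or service (e.g., a MediaWiki front end, GATE pipelines behind a service API, and a database-backed knowledge base), but the layering and direction of calls would be the same.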
Using Wikis as part of NLP is exciting because it demonstrates knowledge-based understanding techniques, in particular the potential value of understanding techniques based on sublanguages (after all, each Wiki is created to reflect knowledge in a particular subject area). More importantly, Wikis as part of NLP might open a way to connect the monoliths of knowledge, ultimately tying them all together into one giant semantic Web.