Wikipedia is a vast store of relatively unstructured information, generally accessed through keywords. Wikipedia infoboxes allow the representation of relatively structured information across similar topics. A topical organization such as an ontology can provide an additional access point for this information. In this paper, Wu and Weld describe the Kylin Ontology Generator (KOG) system for automatically restructuring Wikipedia’s manually constructed infobox ontology.
The KOG system accepts a set of infobox templates and then cleans and restructures them. Cleaning includes recognizing duplicate schemata, assigning meaningful names, and inferring attribute types. Restructuring uses a WordNet mapping to create an ISA hierarchy that is then further refined using support vector machines and Markov logic networks. The preliminary evaluation is promising but of limited value, since it is apparently based on the authors’ own judgment of the appropriateness of the choices.
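To make the cleaning step more concrete, here is a minimal Python sketch of two of its tasks: flagging likely duplicate schemata and inferring an attribute's type. It is not KOG's method, which the paper describes in its own terms. The Jaccard overlap test, the majority vote over values, the threshold, and all function and template names are assumptions chosen purely for illustration.

```python
from collections import Counter
import re


def jaccard(a, b):
    """Overlap between two attribute-name collections, from 0.0 to 1.0."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def find_duplicate_schemata(templates, threshold=0.75):
    """Return pairs of template names whose attribute sets overlap heavily.

    `templates` maps a (hypothetical) template name to its attribute names.
    The threshold is an arbitrary illustrative value.
    """
    names = sorted(templates)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if jaccard(templates[first], templates[second]) >= threshold:
                pairs.append((first, second))
    return pairs


def guess_value_type(value):
    """Classify one raw attribute value with simple regular expressions."""
    text = value.strip()
    if re.fullmatch(r"-?\d+(\.\d+)?", text):
        return "number"
    if re.fullmatch(r"\d{4}-\d{2}-\d{2}", text):
        return "date"
    return "string"


def infer_attribute_type(values):
    """Pick the most common guessed type across all observed values."""
    if not values:
        return "string"  # nothing observed, so fall back to the loosest type
    counts = Counter(guess_value_type(v) for v in values)
    return counts.most_common(1)[0][0]


# Hypothetical templates, invented for this example.
templates = {
    "city": ["name", "population", "area", "mayor"],
    "us_city": ["name", "population", "area", "mayor", "state"],
    "film": ["title", "director", "runtime"],
}
duplicates = find_duplicate_schemata(templates)
population_type = infer_attribute_type(["1200", "3400.5", "unknown"])
```

In this toy data, the two city templates share four of five attribute names and are flagged as candidates for merging, while a single noisy value does not change the inferred type of the population attribute. A real system would need far more robust evidence than name overlap and pattern matching.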
The paper is well written and should be of interest to researchers and developers who are working with Wikipedia and other search environments. KOG is an important step in the construction of Wikipedia ontologies. However, such work needs to be accompanied by user studies that evaluate the resulting ontologies and, more importantly, the utility of ontologies for Wikipedia use in general.