The citation analysis of research articles plays an important role in helping researchers identify the body of research done in related areas. Traditional citation analysis operates at the level of authors and whole articles; it cannot reveal further information, such as where and how the references are cited within an article.
Callahan, Hockema, and Eysenbach’s model and methods move this analysis a giant step forward: contextual co-citation analysis examines co-citations that appear together in an article, a section, or a paragraph within the article. It enables researchers to investigate, at a finer granularity, how co-citations appear and in what context.
The proposed model views the structure of a document as a tree: the whole article is the root, the successive levels of children are sections of the article or paragraphs of the sections, and the citations appear as leaf nodes. The model then measures the degree of closeness of two citations at the leaf level by the depth of their lowest common ancestor: the deeper the shared ancestor, the more closely the two citations are related in context.
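The tree-based closeness measure can be illustrated with a minimal sketch. This is not the authors' implementation; it assumes, for illustration only, that each citation is located by a path of indices from the article root (section, then paragraph), and that closeness is read as the depth of the deepest ancestor the two paths share. The function name `common_ancestor_depth` and the path encoding are hypothetical.

```python
from typing import Sequence


def common_ancestor_depth(path_a: Sequence[int], path_b: Sequence[int]) -> int:
    """Return the depth of the deepest ancestor shared by two leaf citations.

    Each path lists child indices from the article root downward, e.g.
    (2, 1) means "section 2, paragraph 1" (an assumed encoding).
    Depth 0 means the citations share only the article itself.
    """
    depth = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break  # the paths diverge here, so no deeper shared ancestor exists
        depth += 1
    return depth


# Two citations in the same paragraph of the same section.
same_paragraph = common_ancestor_depth((2, 1), (2, 1))
# Same section, different paragraphs.
same_section = common_ancestor_depth((2, 1), (2, 3))
# Different sections: only the article root is shared.
same_article = common_ancestor_depth((2, 1), (4, 0))
```

Under these assumptions, a larger value means the two citations sit closer together in the document structure: paragraph-level co-citation scores higher than section-level co-citation, which in turn scores higher than article-level co-citation.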
The authors use biomedical research papers from the WebCite database (http://www.webcitation.org/) to test their model. The collection contains about 36,000 digital object identifiers (DOIs), and their full articles are in Extensible Markup Language (XML) format. The authors then apply their model in order to answer questions--for example, “What resources have been contextually co-cited with the Web resource http://regulondb.ccg.unam.mx/?”
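A question of this kind can be sketched as a simple counting query. The example below is an assumption-laden illustration, not the authors' system: it supposes each article has already been reduced to a list of (section label, cited resource) pairs, and it counts the resources that share a section with the target. The function name `cocited_with`, the data layout, and the `example.org` URLs are all hypothetical.

```python
from collections import Counter, defaultdict
from typing import Iterable, List, Tuple

Citation = Tuple[str, str]  # (section label, cited resource), an assumed layout


def cocited_with(target: str, articles: Iterable[List[Citation]]) -> Counter:
    """Count resources cited in the same section as `target`, across articles."""
    counts: Counter = Counter()
    for citations in articles:
        # Group the resources cited in this article by the section citing them.
        by_section = defaultdict(set)
        for section, resource in citations:
            by_section[section].add(resource)
        # Every other resource in a section that cites the target is co-cited.
        for resources in by_section.values():
            if target in resources:
                counts.update(resources - {target})
    return counts


TARGET = "http://regulondb.ccg.unam.mx/"
# Hypothetical toy data; the real collection is far larger.
articles = [
    [("intro", TARGET), ("intro", "http://example.org/b"),
     ("methods", "http://example.org/c")],
    [("results", TARGET), ("results", "http://example.org/c"),
     ("results", "http://example.org/b")],
]
result = cocited_with(TARGET, articles)
```

In this toy data, `http://example.org/b` shares a section with the target in both articles, while `http://example.org/c` does so only in the second, so the counter ranks the former higher.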
Although the data presented in the paper is in the area of biomedical science and engineering, the concept and the framework can be applied to many other areas. Furthermore, while the data presented is limited to section-level analysis, one can certainly carry out a similar analysis at finer levels of granularity, such as subsections, paragraphs, and even sentences. This model will help researchers find more useful information about references cited in a group of papers.