The task of clustering similar or related documents is important to information retrieval systems, such as search engines. This can be done by building a graph in which the given set of documents forms the nodes and the edges represent the similarity between pairs of documents.
The authors present a graph-building algorithm that closely follows the self-assembly behavior observed when ants build living structures by connecting their bodies together. According to the tabulated results, the proposed algorithm outperforms standard graph-building methods, such as relative neighborhood graphs (RNG), while finding more similarity, that is, creating more links between documents.
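For readers unfamiliar with the baseline, the following is a minimal sketch of the textbook RNG rule: two points are linked only if no third point is closer to both of them than they are to each other. It is not taken from the paper. The function name and the use of 2D points with Euclidean distance are assumptions made for illustration; for documents, one would presumably substitute a distance derived from the similarity measure.

```python
import math
from itertools import combinations


def relative_neighborhood_graph(points):
    """Return the RNG edges of `points` as a set of index pairs (i, j), i < j.

    Brute-force O(n^3) version of the standard definition; illustrative only.
    """
    edges = set()
    for i, j in combinations(range(len(points)), 2):
        d_ij = math.dist(points[i], points[j])
        # The pair is blocked if some third point k is strictly closer
        # to both i and j than i and j are to each other.
        blocked = any(
            max(math.dist(points[i], points[k]),
                math.dist(points[j], points[k])) < d_ij
            for k in range(len(points))
            if k not in (i, j)
        )
        if not blocked:
            edges.add((i, j))
    return edges


# Hypothetical sample data: the third point sits between the first two.
SAMPLE_POINTS = [(0.0, 0.0), (2.0, 0.0), (1.0, 0.5)]
```

Because a single intervening point can block an edge, an RNG tends to be sparse, which is consistent with the review's remark that the proposed algorithm creates more links.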
The key principle of the proposed algorithm is that the graph is built incrementally. When a new document is added, it follows the path of maximum similarity through the existing graph and is then connected to all neighboring nodes (documents) whose similarity to it is higher than a given similarity threshold.
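The incremental principle can be illustrated with a short sketch. This is not the authors' algorithm, whose details the paper omits. It assumes documents are represented as sets of tokens, uses Jaccard similarity as a stand-in measure, always starts the walk at the first document inserted, and considers only the node where the walk stops plus that node's neighbors as link candidates. All function names and the sample data are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two token sets (assumed measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def add_document(graph, docs, new_id, tokens, threshold):
    """Insert one document into `graph` (dict: node -> set of neighbors)."""
    existing = list(graph)
    docs[new_id] = tokens
    graph[new_id] = set()
    if not existing:
        return
    # Greedy walk: start at the first document ever inserted (assumption)
    # and keep moving to the neighbor most similar to the new document.
    current = existing[0]
    while True:
        best = max(graph[current],
                   key=lambda n: jaccard(tokens, docs[n]),
                   default=None)
        if best is None or jaccard(tokens, docs[best]) <= jaccard(tokens, docs[current]):
            break
        current = best
    # Link to the stopping node and its neighbors when similarity
    # exceeds the threshold.
    for node in {current} | graph[current]:
        if jaccard(tokens, docs[node]) > threshold:
            graph[new_id].add(node)
            graph[node].add(new_id)


def build_graph(documents, threshold):
    """Build the graph by inserting documents one at a time, in order."""
    graph, docs = {}, {}
    for doc_id, tokens in documents.items():
        add_document(graph, docs, doc_id, tokens, threshold)
    return graph


# Hypothetical sample corpus.
SAMPLE_DOCS = {
    "a": {"ant", "colony", "graph"},
    "b": {"ant", "colony", "search"},
    "c": {"search", "engine", "index"},
    "d": {"search", "engine", "ranking"},
}
```

In this sketch, a document whose similarity to every candidate falls below the threshold stays isolated, and later documents cannot reach it from the fixed starting node. A real implementation would need some policy for that case, which the review does not describe.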
As this is a short poster-session paper, details of the algorithm and evaluation are omitted. It may be worthwhile to investigate other related publications by the authors, as the reported high performance of this algorithm makes it a promising substitute for current clustering algorithms.