When I read the title of this paper, I expected an interesting contribution to the important problem of measuring the similarity between RNA molecules. Regrettably, this paper presents a questionable solution to a doubtful problem. After giving some background information, I will explain why this is the case.
The RNA secondary structure is the structure induced by the ordered nucleotide base pairs and the unpaired bases in an RNA molecule. The RNA primary structure is the sequence of nucleotides in an RNA molecule. Because different sequences can produce similar secondary structures, RNA secondary structures are used for comparing RNA molecules. RNA secondary structures can be represented as ordered labeled trees, where “ordered” means that sibling nodes are ordered, so that the tree is an accurate representation of the RNA structure.
There are multiple ways to define such trees. Several edit operations, such as node relabeling, node insertion, and node deletion, are defined on these trees, and a cost is associated with these operations. Comparing two RNA structures then reduces to finding the edit distance between the trees representing the structures, that is, to finding the sequence of edit operations with minimum cost that map one tree to the other.
The authors claim that these classical edit operations are inadequate, because they can lead to tree mappings that do not accurately reflect the similarity between structures. In fact, the authors’ claim is based on using inappropriate tree representations for the RNA structures, and on using an incomplete definition of a map between trees (their map does not specify preserving structural properties such as sibling and ancestor order). Based on this claim, the authors propose new edit operations, called edge fusion and node fusion, define a new edit distance, and describe an algorithm to determine this distance. Unfortunately, the proposed algorithm has exponential time and space complexity, thus it is inefficient for nontrivial RNA sequences.
In addition to its questionable scientific merit, this paper is marred by its hard-to-read figures and clumsy notation. An example of poor notation is in Section 4.3: “a path is list of of pairs (e or u, v).” The editors of the journal should have reviewed this paper more carefully before publication.