In an Extensible Markup Language (XML) application, it may be necessary to simultaneously locate a k-tuple of nodes in an input (XML) tree that satisfies some given relationship. The difficulty here is that it is necessary to maintain context information about the initial part of any potential match while searching for the complete match.
This paper describes an approach, based on the use of efficient push down tree automata, that gives an effective algorithm for finding k-ary matches. A language, fxgrep, is introduced that is a practical query language for unary and binary queries. The algorithm for finding binary matches uses two passes through the document tree. The first, which is a top-down left-to-right pass, labels the nodes of the tree according to the states of the automaton at the node during the run. A second pass, also top-down, but this time right-to-left, can collect the matches because the labels provide the needed context information.
For binary queries, the algorithm is, at worst, of order n squared, where n is the number of nodes in the tree, but this happens only when the number of matching nodes is itself of order n, so that, in general, the order will be closer to being linear in n. Although the authors have implemented the algorithm, and the language fxgrep, they do not present any experimental results. It would be interesting to have a repository of problems on which one could test k-ary matching algorithms.