Computing Reviews

Dual syntax for XML languages
Brabrand C., Møller A., Schwartzbach M. Information Systems 33(4-5): 385-406, 2008. Type: Article
Date Reviewed: 11/24/08

Extensible Markup Language (XML) has become the lingua franca for many applications requiring platform-independent data exchange, as it has the advantage of being both machine-parsable and human-readable. However, it has been criticized as rather verbose for human writing (a web search for the terms “XML” and “verbose” turns up many such arguments), and many tools have arisen to support the construction and maintenance of XML texts and their specification documents, such as document type definitions (DTDs).

Accordingly, there have been numerous attempts to ease this maintenance task by defining alternative syntactic representations of both the XML data and its specification. On the specification side, there are many representations, such as DTDs, XML Schema, and RELAX NG schemas. On the XML data side, there are several alternative representations (XML Schema, TREX, RELAX, and RELAX NG), but to date, only ad hoc translators exist to convert between these representations.

This paper defines a formal approach to this problem: it offers XSugar, a formal grammar that defines the syntax of both XML and non-XML representations of a given XML sublanguage (as defined by a DTD or schema), as well as a system for parsing, verifying, and translating between representations. The grammar is easily constructed, and defines both the nonterminal and terminal components. Terminals can specify lexical fragments in the language through regular expressions, which allows the idempotent translation of canonical representations from XML to non-XML representations and back again (and vice versa).
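The dual-syntax idea can be illustrated with a minimal Python sketch. It is not XSugar's actual notation or implementation: the student record format, the terminal patterns (NAME, EMAIL, ID), and the function names text_to_xml and xml_to_text are all assumptions made up for illustration. The sketch shows one rule with a non-XML side and an XML side, terminals given as regular expressions, and a round trip that settles on a canonical form.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical terminals, each specified by a regular expression.
NAME = r"[A-Za-z]+(?: [A-Za-z]+)*"
EMAIL = r"[A-Za-z0-9._]+@[A-Za-z0-9.]+"
ID = r"[0-9]{8}-[0-9]{4}"

# Non-XML side of the single rule: "Name (email) id", with optional whitespace.
LINE = re.compile(rf"\s*({NAME})\s*\(\s*({EMAIL})\s*\)\s*({ID})\s*")


def text_to_xml(line: str) -> str:
    """Parse the non-XML syntax and emit the XML side of the same rule."""
    match = LINE.fullmatch(line)
    if match is None:
        raise ValueError(f"not a valid student line: {line!r}")
    name, email, sid = match.groups()
    # The terminal patterns exclude markup characters, so no escaping is needed.
    return (
        f'<student sid="{sid}">'
        f"<name>{name}</name><email>{email}</email>"
        f"</student>"
    )


def xml_to_text(xml: str) -> str:
    """Parse the XML syntax and emit the canonical non-XML form."""
    root = ET.fromstring(xml)
    if root.tag != "student":
        raise ValueError("expected a <student> element")
    fields = [
        (NAME, root.findtext("name")),
        (EMAIL, root.findtext("email")),
        (ID, root.get("sid")),
    ]
    # Check every field against its terminal, as the text-side parser does.
    for pattern, value in fields:
        if value is None or re.fullmatch(pattern, value) is None:
            raise ValueError(f"field does not match its terminal: {value!r}")
    name, email, sid = (value for _, value in fields)
    return f"{name} ({email}) {sid}"


# Irregular whitespace is normalized on the first round trip.
canonical = xml_to_text(text_to_xml("  John Doe  ( jd@example.org ) 19701234-1234 "))
# A second round trip leaves the canonical form unchanged.
stable = xml_to_text(text_to_xml(canonical))
```

In this sketch the two directions are written by hand; the point of the approach described in the review is that both would be derived from a single grammar, so the two syntaxes cannot drift apart.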

The paper uses a simple example to motivate the design and show how it applies in practice. It also describes how XSugar can be used to validate the requirement that translations of non-XML grammars do indeed generate valid XML data. An evaluation section shows how the prototype can handle the definition of a range of related XML-derived formats, such as XFlat, RELAX NG, BibTeXML, and the Wiki notation.

The paper will be relevant to researchers engaged in XML syntactic and semantic issues, as well as to the many practitioners who need to translate between XML and non-XML data representations.

Reviewer:  John Hurst Review #: CR136273 (0909-0875)
