Programming Language Support for Semi-Structured Data

XML is a standard format for the exchange of semi-structured data. A variety of XML processing languages exist for specifying the transformation of one XML document into another. I am particularly interested in typed languages such as XQuery and XDuce. Typed languages are safe in the sense that they give strong static guarantees about the well-formedness of XML documents and transformations. The challenge is to integrate the XDuce concepts of regular expression types, regular expression pattern matching and semantic subtyping into high-level languages such as Haskell and ML.

In [30], I co-developed a systematic method for transforming values between differently structured but semantically equivalent representations. In subsequent work, I co-designed a type-preserving compilation scheme from XDuce to ML [19]. These results led to the development of the XHaskell language, a variant of Haskell with regular expression types, regular expression pattern matching and semantic subtyping in the style of XDuce. Noteworthy features include the mixing of Haskell data types and parametric polymorphism with regular expression types, and a type-preserving translation scheme from XHaskell to Haskell.
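
To give a rough feel for these ideas, here is a minimal Haskell sketch. It is only an illustration under my own assumptions, not the actual scheme of [19], [30] or XHaskell: sequences are encoded as pairs, choice as Either, and repetition as lists, and every name (Author, Editor, upcast, toLastFirst and so on) is hypothetical. A semantic subtyping step such as Author* <= (Author | Editor)* then shows up as an explicit coercion function, and two differently structured types that denote the same set of sequences are related by a conversion between their representations.

```haskell
-- Assumed encoding (illustrative only, not taken from the cited papers):
--   sequence   (r1, r2)  ~>  Haskell pair
--   choice     (r1 | r2) ~>  Either
--   repetition r*        ~>  list

-- Hypothetical element types standing in for XML elements.
newtype Author = Author String deriving (Eq, Show)
newtype Editor = Editor String deriving (Eq, Show)

-- Regular expression type  Author*
type Authors = [Author]

-- Regular expression type  (Author | Editor)*
type People = [Either Author Editor]

-- Every Author* sequence is also an (Author | Editor)* sequence.
-- In the target language this subtyping step becomes a coercion.
upcast :: Authors -> People
upcast = map Left

-- (Author, Author*) and (Author*, Author) both denote non-empty
-- sequences of authors, but their representations differ.
type HeadFirst = (Author, [Author])
type LastFirst = ([Author], Author)

-- Convert between the two representations without changing the
-- underlying sequence of authors.
toLastFirst :: HeadFirst -> LastFirst
toLastFirst (a, [])     = ([], a)
toLastFirst (a, b : bs) =
  let (xs, l) = toLastFirst (b, bs)
  in  (a : xs, l)

-- Flatten each representation to the sequence it denotes.
flattenHF :: HeadFirst -> [Author]
flattenHF (a, rest) = a : rest

flattenLF :: LastFirst -> [Author]
flattenLF (initial, l) = initial ++ [l]
```

In this sketch, flattenLF (toLastFirst v) and flattenHF v yield the same list for any v, which is the sense in which the conversion preserves meaning while changing structure.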



Martin Sulzmann 2006-07-19