A tag algebra for document markup
Lars G Johnsen
University of Bergen, Norway
This paper takes as its point of departure an overview of the overlap problem and of proposed solutions to it. We then look at some analogies between bracketed markup notations and the rules for well-formedness and structuring of simple parenthetical expressions. We propose a method for building lattices from marked-up documents with and without overlap, and for generating, from these lattices, document models in the form of trees for XML documents and in the form of GODDAGs for documents with overlap. It turns out that one and the same method can be used for generating both kinds of models, and we argue that lattices can also be used to implement well-formedness constraints for both kinds of documents. Finally, we discuss and compare some of the algebraic features of the document models and the relations between them.