TEI Feature Structures as a Representation Format for Multiple Annotation and Generic XML
Documents
Andreas Witt
Institute for the German Language (IDS), Mannheim
Abstract
Feature structures are mathematical entities (rooted labeled directed acyclic graphs) that
can be represented as graph displays, attribute value matrices or as XML adhering to the
constraints of a specialized TEI tag set. We demonstrate that this latter ISO-standardized
format can be used as an integrative storage and exchange format for sets of multiple annotation
XML documents. This domain of application is rooted in the multiple annotation approach,
which offers a possible solution for XML-compliant markup in scenarios with
conflicting annotation hierarchies. A more radical proposal is to use the format as a
meta-representation format for generic XML documents. For both scenarios, our strategy for
the corresponding feature structure representations is grounded in the XDM (XQuery 1.0 and XPath 2.0
Data Model). The ubiquitous hierarchical and sequential relationships within XML documents are
represented by specific features that take ordered list values. The mapping to the TEI feature
structure format has been implemented in the form of an XSLT 2.0 stylesheet. It can be
characterized as exploiting aspects of both the push and the pull processing paradigms as
appropriate. An indexing mechanism is provided for the multiple annotation documents
scenario, so that implicit links between identical primary data are made explicit in the
result format. In comparison to alternative representations, the TEI-based format does well in
many respects, since it is both integrative and well-formed XML. However, the result documents
tend to grow very large, depending on the size of the input documents and their markup
structure. This may also be considered a downside of the proposed use for generic XML
documents. On the positive side, it may become possible to connect to methods and
applications that have been developed for feature structure representations in the fields of
(computational) linguistics and knowledge representation.
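
As a rough illustration of the mapping idea described above, the following Python sketch converts a generic XML element into a TEI-style feature structure in which child nodes are stored as an ordered list value. It is not the XSLT 2.0 stylesheet mentioned in the abstract. The feature names `name`, `attributes` and `children` and the `type` values are hypothetical choices made for this example, and namespaces, comments and processing instructions are ignored. Only the TEI element names `fs`, `f`, `string` and `vColl` (with `org="list"`) are taken from the TEI feature structure tag set.

```python
import xml.etree.ElementTree as ET


def to_fs(elem):
    """Map an XML element to a TEI-style <fs> element.

    Feature names ("name", "attributes", "children") are hypothetical,
    chosen only for this sketch.
    """
    fs = ET.Element("fs", {"type": "element"})

    # The element name becomes a string-valued feature.
    name = ET.SubElement(fs, "f", {"name": "name"})
    ET.SubElement(name, "string").text = elem.tag

    # Attributes become one feature each inside a nested feature structure.
    if elem.attrib:
        attrs = ET.SubElement(fs, "f", {"name": "attributes"})
        attr_fs = ET.SubElement(attrs, "fs", {"type": "attributes"})
        for key, value in elem.attrib.items():
            feature = ET.SubElement(attr_fs, "f", {"name": key})
            ET.SubElement(feature, "string").text = value

    # Child nodes (text and elements) are collected in document order.
    items = []
    if elem.text and elem.text.strip():
        text = ET.Element("string")
        text.text = elem.text
        items.append(text)
    for child in elem:
        items.append(to_fs(child))  # recurse into child elements
        if child.tail and child.tail.strip():
            tail = ET.Element("string")
            tail.text = child.tail
            items.append(tail)

    # The ordered list value preserves sequence and hierarchy.
    if items:
        children = ET.SubElement(fs, "f", {"name": "children"})
        coll = ET.SubElement(children, "vColl", {"org": "list"})
        coll.extend(items)
    return fs


if __name__ == "__main__":
    source = ET.fromstring('<s id="s1">Hello <w>world</w>!</s>')
    print(ET.tostring(to_fs(source), encoding="unicode"))
```

Even for this one-line input, the result contains several times as many elements as the source, which gives an impression of why the result documents can become large.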
Balisage: The Markup Conference 2009
August 11 - 14, 2009
The materials listed below were provided by the speaker as supplements to a
presentation at Balisage. These materials may include the slides or visuals used in the
presentation; supplementary material, such as code samples or a demonstration application;
and/or the paper underlying the presentation (if it has not been provided in XML). These
materials have been zipped for easy download and are identified by a brief description of
the contents. The materials themselves are untouched, that is, they
have not been tested or edited by Balisage: The Markup Conference or by Mulberry
Technologies, Inc. As such, they are included on this website AS IS,
i.e., as provided by the speaker, with no warranties, express or otherwise, made by Balisage
or Mulberry.
Slides and Materials
Author's keywords for this paper: Overlapping Structures; Multiple Hierarchies; Multiple Annotation; TEI; Text Encoding Initiative; Feature Structures