Alink, W., Bhoedjang, R., de
Vries, A. P., and Boncz, P. A. Efficient XQuery Support for Stand-Off
Annotation. In: Proceedings of the 3rd International Workshop on XQuery
Implementation, Experience and Perspectives, in cooperation with ACM SIGMOD, Chicago, USA,
2006
Alink, W., Jijkoun, V., Ahn,
D., and de Rijke, M. Representing and Querying Multi-dimensional Markup
for Question Answering. In: Proceedings of the 5th EACL Workshop on NLP and XML
(NLPXML-2006): Multi-Dimensional Markup in Natural Language Processing, Trento,
2006
Bird, S. and Liberman, M.
Annotation graphs as a framework for multidimensional linguistic data
analysis. In: Proceedings of the Workshop "Towards Standards and Tools for
Discourse Tagging", pages 1–10. Association for Computational Linguistics, 1999
Bird, S. and Liberman, M.
A formal framework for linguistic annotation. Speech
Communication, 33(1–2): pages 23–60, 2001. Doi: 10.1016/S0167-6393(00)00068-6
Bird, S., Chen, Y., Davidson, S.,
Lee, H. and Zheng, Y. Designing and Evaluating an XPath Dialect for
Linguistic Queries. In: Proceedings of the 22nd International Conference on Data
Engineering (ICDE), Atlanta, USA, 2006. Doi: 10.1109/ICDE.2006.48
Burnard, L., and Bauman,
S. (eds.). TEI P5: Guidelines for Electronic Text Encoding and
Interchange. Published for the TEI Consortium by Humanities Computing Unit,
University of Oxford, Oxford, Providence, Charlottesville, Nancy, 2008
Carletta, J., Kilgour, J.,
O’Donnell, T. J., Evert, S. and Voormann, H. The NITE Object Model
Library for Handling Structured Linguistic Annotation on Multimodal Data Sets.
In: Proceedings of the EACL Workshop on Language Technology and the Semantic Web (3rd Workshop
on NLP and XML (NLPXML-2003)), Budapest, Hungary, 2003
Carletta, J., Evert, S.,
Heid, U. and Kilgour, J. The NITE XML Toolkit: data model and query
language. In: Language Resources and Evaluation, Springer, Dordrecht, 2005, 39,
pages 313-334. Doi: 10.1007/s10579-006-9001-9
Cowan, J., Tennison J., and Piez,
W. LMNL update. In: Proceedings of Extreme Markup Languages,
Montréal, Québec, 2006
DeRose, S. J. Markup Overlap: A Review and a Horse. In: Proceedings of Extreme Markup
Languages, Montréal, Québec, 2004
Dipper, S. XML-based stand-off representation and exploitation of multi-level linguistic
annotation. In: Proceedings of Berliner XML Tage 2005 (BXML 2005), pages 39–50,
Berlin, Germany, 2005
Dipper, S., Götze, M., Küssner,
U. and Stede, M. Representing and Querying Standoff XML. In:
Rehm, G., Witt, A. and Lemnitzer, L. (eds.), Datenstrukturen für linguistische Ressourcen und
ihre Anwendungen. Data Structures for Linguistic Resources and Applications. Proceedings of
the Biennial GLDV Conference 2007, pages 337–346, Tübingen, 2007. Gunter Narr
Verlag
Durusau, P. and
O'Donnell, M. B. Concurrent Markup for XML Documents. In:
Proceedings of the XML Europe conference 2002.
Durusau, P. and
O'Donnell, M. B. Tabling the Overlap Discussion. In:
Proceedings of Extreme Markup Languages, Montréal, Québec, 2004
Goecke, D., Lüngen, H.,
Metzing, D., Stührenberg, M. and Witt, A. Different Views on Markup.
Distinguishing levels and layers. In: Linguistic modeling of information and
Markup Languages. Contributions to language technology. Springer, 2009. To
appear
Gleim, R., Waltinger, U., Ernst,
A., Mehler, A., Esch, D., and Feith, T. The eHumanities Desktop
– An Online System for Corpus Management and Analysis in Support of Computing in
the Humanities. In: Proceedings of the Demonstrations Session of the 12th
Conference of the European Chapter of the Association for Computational Linguistics EACL 2009,
30 March – 3 April, Athens, 2009
Huitfeldt,
C. and Sperberg-McQueen, C. M. TexMECS: An experimental markup
meta-language for complex documents. Markup Languages and Complex Documents
(MLCD) Project, February 2001
Iacob, I. E. and Dekhtyar,
A. Processing XML documents with overlapping hierarchies. In:
JCDL '05: Proceedings of the 5th ACM/IEEE-CS joint conference on Digital libraries, ACM Press,
2005, pages 409-409. Doi: 10.1145/1065385.1065513
Iacob, I. E. and
Dekhtyar, A. Towards a Query Language for Multihierarchical XML:
Revisiting XPath. In: Proceedings of the 8th International Workshop on the Web
& Databases (WebDB 2005), 2005, pages 49-54
Ide, N. and Romary, L. Towards International Standards for Language Resources. In: Dybkjaer,
L., Hemsen, H., and Minker, W., (eds.), Evaluation of Text and Speech Systems, pages 263-284.
Springer
Ide, N. and Suderman, K.
GrAF: A Graph-based Format for Linguistic Annotations. In:
Proceedings of the Linguistic Annotation Workshop, pages 1-8, Prague, Czech Republic.
Association for Computational Linguistics, 2007
ISO/IEC 19757-2:2003. Information technology – Document Schema Definition Language (DSDL) –
Part 2: Regular-grammar-based validation – RELAX NG (ISO/IEC 19757-2).
International Standard, International Organization for Standardization, Geneva,
2003
ISO/IEC 19757-3:2006.
Information technology – Document Schema Definition Language (DSDL)
– Part 3: Rule-based validation – Schematron. International standard,
International Organization for Standardization, Geneva, 2006
Jagadish, H. V.,
Lakshmanan, L. V. S., Scannapieco, M., Srivastava, D. and Wiwatwattana, N. Colorful XML: One hierarchy isn’t enough. In: Proceedings of ACM
SIGMOD International Conference on Management of Data (SIGMOD 2004), pages 251–262, Paris,
June 13-18 2004. ACM Press New York, NY, USA. Doi: 10.1145/1007568.1007598
Kay, M. XSL
Transformations (XSLT) Version 2.0. World Wide Web Consortium. 2007. –
W3C Recommendation
Kay, M. XSLT 2.0 and
XPath 2.0 Programmer’s Reference. Wiley Publishing, Indianapolis, 4th edition,
2008
Marinelli, P., Vitali,
F., and Zacchiroli, S. Towards the unification of formats for
overlapping markup. In: New Review of Hypermedia and Multimedia, 14(1): pages
57-94, 2008. Doi: 10.1080/13614560802316145
Marcoux, Y. Graph characterization of overlap-only texmecs and other overlapping markup
formalisms. In: Proceedings of Extreme Markup Languages, Montréal, Québec,
2008
Marcoux, Y. Variants of GODDAGs and suitable first-layer semantics. Presentation given at the
GODDAG workshop, Amsterdam, 1-5 December 2008
Pianta, E. and
Bentivogli, L. Annotating Discontinuous Structures in XML: the
Multiword Case. In: Proceedings of LREC 2004 Workshop on “XML-based richly
annotated corpora”, pages 30–37, Lisbon, Portugal, 2004
Poesio, M., Diewald, N.,
Stührenberg, M., Chamberlain, J., Jettka, D., Goecke, D. and Kruschwitz, U. Markup Infrastructure for the Anaphoric Bank, Part I: Supporting Web
Collaboration. In: Mehler, A., Kühnberger, K.-U., Lobin, H., Lüngen, H., Storrer,
A. and Witt, A. (eds.), Modelling, Learning and Processing of Text Technological Data
Structures, Dordrecht: Springer, Berlin, New York. To appear
Schonefeld, O. XCONCUR and XCONCUR-CL: A constraint-based approach for the validation of
concurrent markup. In: Rehm, G., Witt, A., Lemnitzer, L. (eds.), Datenstrukturen
für linguistische Ressourcen und ihre Anwendungen. Data Structures for Linguistic Resources
and Applications. Proceedings of the Biennial GLDV Conference 2007, Tübingen, Germany, 2007.
Gunter Narr Verlag
Sperberg-McQueen, C. M., Huitfeldt, C. and Renear, A. Meaning and
Interpretation of markup. Markup Languages – Theory & Practice,
2, pages 215-234, 2000. Doi: 10.1162/109966200750363599
Sperberg-McQueen, C. M., Dubin, D., Huitfeldt, C. and Renear, A. Drawing inferences on the basis of markup. In: Proceedings of Extreme Markup
Languages, 2002
Sperberg-McQueen, C. M. and Huitfeldt, C. GODDAG: A Data Structure for
Overlapping Hierarchies. In: King, P. and Munson, E. V. (eds.), Proceedings of
the 5th International Workshop on the Principles of Digital Document Processing (PODDP 2000),
volume 2023 of Lecture Notes in Computer Science, pages 139–160. Springer, 2004
Sperberg-McQueen,
C. M. Rabbit/Duck grammars: a validation method for overlapping
structures. In: Proceedings of Extreme Markup Languages, Montréal, Québec,
2006
Sperberg-McQueen,
C. M. Representation of overlapping structures. In:
Proceedings of Extreme Markup Languages, Montréal, Québec, 2007
Sperberg-McQueen, C. M. and Huitfeldt, C. Markup Discontinued:
Discontinuity in TexMecs, Goddag structures, and rabbit/duck grammars. Presented at Balisage: The Markup Conference 2008, Montréal, Canada, August 12 - 15, 2008. In: Proceedings of Balisage: The Markup Conference 2008. Balisage Series on Markup Technologies, vol. 1 (2008). Doi: 10.4242/BalisageVol1.Sperberg-McQueen01
Sperberg-McQueen, C. M. and Huitfeldt, C. GODDAG. Presented at the Goddag workshop, Amsterdam, 1-5 December 2008
Stührenberg, M.,
Goecke, D., Diewald, N., Cramer, I. and Mehler, A. Web-based annotation
of anaphoric relations and lexical chains. In: Proceedings of the Linguistic
Annotation Workshop (LAW), pages 140–147, Prague. Association for Computational Linguistics,
2007
Stührenberg, M.
and Goecke, D. SGF – An integrated model for multiple
annotations and its application in a linguistic domain. Presented at Balisage: The Markup Conference 2008, Montréal, Canada, August 12 - 15, 2008. In: Proceedings of Balisage: The Markup Conference 2008. Balisage Series on Markup Technologies, vol. 1 (2008). Doi: 10.4242/BalisageVol1.Stuehrenberg01
Tennison, J. Layered Markup and Annotation Language (LMNL). In: Proceedings of Extreme Markup
Languages, Montréal, Québec, 2002
Tennison, J. Creole: Validating Overlapping Markup. In: Proceedings of XTech 2007: The
Ubiquitous Web Conference, 2007
Thompson, H. S. and
McKelvie, D. Hyperlink semantics for standoff markup of read-only
documents. In: Proceedings of SGML Europe ’97: The next decade – Pushing the
Envelope, pages 227–229, Barcelona, 1997
Waltinger, U., Mehler, A.,
and Stührenberg, M. An Integrated Model of Lexical Chaining:
Application, Resources and its Format. In: Proceedings of the 9th Conference on
Natural Language Processing (KONVENS 2008), 2008
Walsh, N., Milowski, A., and Thompson, H.
S. XProc: An XML Pipeline Language. World Wide Web Consortium, 2009. –
W3C Candidate Recommendation 28 May 2009
Witt, A. Meaning
and interpretation of concurrent markup. In: Proceedings of ALLC-ACH2002, Joint
Conference of the ALLC and ACH, 2002
Witt, A. Multiple
hierarchies: New Aspects of an Old Solution. In: Proceedings of Extreme Markup
Languages, 2004
Witt, A., Rehm, G., Hinrichs, E.,
Lehmberg, T. and Stegmann, J. SusTEInability of Linguistic Resources through Feature
Structures. In: Literary and Linguistic Computing, 24(3): pages 363-372, 2009. Doi: 10.1093/llc/fqp024
Witt, A., Stührenberg, M.,
Goecke, D. and Metzing, D. Integrated Linguistic Annotation Models and
their Application in the Domain of Antecedent Detection. In: Mehler, A.,
Kühnberger, K.-U., Lobin, H., Lüngen, H., Storrer, A. and Witt, A. (eds.), Modelling, Learning
and Processing of Text Technological Data Structures, Dordrecht: Springer, Berlin, New York.
To appear
A toolkit for multi-dimensional markup
The development of SGF to XStandoff
Maik Stührenberg
Daniel Jettka
Abstract
In this paper we describe the extended standoff approach defined by XStandoff (the
successor of the Sekimo Generic Format, SGF), together with the accompanying collection of
XSLT stylesheets. SGF has undergone further development since its first presentation (cf.
Stührenberg and Goecke, 2008), resulting in the new development version called
XStandoff; the changes it introduces are addressed in this paper. In addition, the
existing transformation scripts that help generate SGF and XStandoff instances have been
refined, and new stylesheets have been added for the deletion of single XStandoff
annotations and for conversion into inline representations.
Balisage: The Markup Conference 2009
August 11 - 14, 2009
Author's keywords for this paper: Concurrent Markup; Overlapping Markup; SGF; XStandoff