Balisage Paper: Refining the Taxonomy of XML Schema Languages. A new Approach for Categorizing XML Schema Languages in Terms of Processing Complexity

Balisage: The Markup Conference 2010
August 3 - 6, 2010

The materials listed below were provided by the speaker as supplements to a presentation at Balisage. These materials may include the slides or visuals used in the presentation; supplementary material, such as code samples or a demonstration application; and/or the paper accompanying the presentation (if it has not been provided in XML). These materials have been zipped for easy download and are identified by a brief description of the contents. The materials themselves are untouched, that is, they have not been tested or edited by Balisage: The Markup Conference or by Mulberry Technologies, Inc. As such, they are included on this website AS IS, i.e., as provided by the speaker, with no warranties, express or otherwise, made by Balisage or Mulberry.

Slides and Materials

Abiteboul, S., P. Buneman, and D. Suciu (2000). Data on the Web: From Relations to Semistructured Data and XML. Morgan Kaufmann Publishers, San Francisco, California.

Ansari, M. S., Zahid, N., and K.-G. Doh. A Comparative Analysis of XML Schema Languages. In Slezak, D., Kim, T., Zhang, Y., Ma, J., and K. Chung, eds., Database Theory and Application. International Conference, DTA 2009, Held as Part of the Future Generation Information Technology Conference, FGIT 2009, Jeju Island, Korea, December 10-12, 2009. Proceedings, volume 64, pages 41–48. Springer, Berlin, Heidelberg, 2009. doi:https://doi.org/10.1007/978-3-642-10583-8_6.

Balmin, A., Papakonstantinou, Y., and V. Vianu (2004). Incremental validation of XML documents. ACM Transactions on Database Systems (TODS), 29(4):710–751. doi:https://doi.org/10.1145/1042046.1042050.

Bauman, S. (2008). Freedom to Constrain: where does attribute constraint come from, mommy? In Proceedings of Balisage: The Markup Conference 2008. Balisage Series on Markup Technologies, vol. 1 (2008). doi:https://doi.org/10.4242/BalisageVol1.Bauman01.

Bex, G. J., Gelade, W., Martens, W. and F. Neven (2009). Simplifying XML Schema: Effortless Handling of Nondeterministic Regular Expressions. In SIGMOD ’09: Proceedings of the 35th SIGMOD international conference on Management of data, pages 731–744, New York, NY, USA, ACM. doi:https://doi.org/10.1145/1559845.1559922.

Brüggemann-Klein, A., and D. Wood (1992). Deterministic Regular Languages. In Finkel, A. and M. Jantzen, eds., STACS 92. 9th Annual Symposium on Theoretical Aspects of Computer Science Cachan, France, February 13–15, 1992 Proceedings, volume 577 of Lecture Notes in Computer Science, pages 173–184. Springer, Berlin, Heidelberg. doi:https://doi.org/10.1007/3-540-55210-3_182.

Brüggemann-Klein, A. (1993). Formal Models in Document Processing. Habilitation, Albert-Ludwig-Universität zu Freiburg i. Br.

Brüggemann-Klein, A., and D. Wood (1997). One-unambiguous regular languages. Information and computation, 142:182–206. doi:https://doi.org/10.1006/inco.1997.2695.

Brüggemann-Klein, A., and D. Wood (2002). The parsing of extended context-free grammars. HKUST Theoretical Computer Science Center Research Report HKUST-TCSC-2002-08, The Hong Kong University of Science and Technology Library.

Brüggemann-Klein, A., and D. Wood (2004). Balanced context-free grammars, hedge grammars and pushdown caterpillar automata. In Proceedings of Extreme Markup Languages, Montréal, Québec.

Buck, L., Goldfarb, C. F., and P. Prescod (2000). Datatypes for DTDs (DT4DTD) 1.0. W3C Note 13 January 2000, World Wide Web Consortium.

Carey, B. M. (2009). Meet CAM: A new XML validation technology. Take semantic and structural validation to the next level. IBM developerWorks, IBM Corporation. http://www.ibm.com/developerworks/xml/library/x-cam/?S_TACT=105AGX54&S_CMP=C0924&ca=dnw-1036&ca=dth-x&open&cm_mmc=6015-_-n-_-vrm_newsletter-_-10731_131528&cmibm_em=dm:0:13962324.

Chomsky, N. (1955). Logical Syntax and Semantics: Their Linguistic Relevance. Language, 31(1):36–45, 1955. doi:https://doi.org/10.2307/410891.

Chomsky, N. (1956). Three Models for the Description of Language. IRE Transactions on Information Theory, 2:113–124, 1956. doi:https://doi.org/10.1109/TIT.1956.1056813.

Clark, J. (2001). TREX – Tree Regular Expressions for XML Language Specification. Technical report, Thai Open Source Software Center Ltd.

Clark, J., J. Cowan, and M. Murata (2003). RELAX NG Compact Syntax Tutorial. Working Draft 26 March 2003, OASIS – Organization for the Advancement of Structured Information Standards. http://relaxng.org/compact-tutorial-20030326.html.

Comon, H., M. Dauchet, R. Gilleron, C. Löding, F. Jacquemard, D. Lugiez, S. Tison, and M. Tommasi (2007). Tree Automata Techniques and Applications. Release November, 18th 2008. http://www.grappa.univ-lille3.fr/tata.

Costello, R. L., and R. A. Simmons (2008). Tutorials on Schematron: Two Types of XML Schema Language. http://www.xfront.com/schematron/Two-types-of-XML-Schema-Language.html.

Møller, A. (2005). Document Structure Description 2.0. Technical report, BRICS (Basic Research in Computer Science, Aarhus University), 2005. http://www.brics.dk/DSD/dsd2.html.

Fiorello, D., Gessa, N., Marinelli, P., and F. Vitali. DTD++ 2.0: Adding support for co-constraints. In Proceedings of Extreme Markup Languages, Montréal, Québec.

Gelade, W., Martens, W., and F. Neven (2009). Optimizing Schema Languages for XML: Numerical Constraints and Interleaving. SIAM Journal on Computing, 38(5):2021–2043. doi:https://doi.org/10.1137/070697367.

Goldfarb, C. F. (1978). DCF GML User’s Guide (IBM SH20-9160). IBM, 1978.

Goldfarb, C. F. (1991). The SGML Handbook. Oxford University Press, Oxford.

Gécseg, F., and M. Steinby (1997). Tree languages. In Handbook of Formal Languages, volume 3, pages 1-68. Springer, New York.

Hopcroft, J., R. Motwani, and J. Ullman (2000). Introduction to Automata Theory, Languages, and Computation. 2nd edition. Addison Wesley Longman, Amsterdam.

Jelliffe, R. (2009). Is Schematron a rules language? Online: http://broadcast.oreilly.com/2009/01/is-schematron-a-rules-language.html.

Kilpeläinen, P., and R. Tuhkanen (2007). One-unambiguity of regular expressions with numeric occurrence indicators. Information and Computation, 205(6):890–916. doi:https://doi.org/10.1016/j.ic.2006.12.003.

Klarlund, N., T. Schwentick, and D. Suciu (2003). XML: Model, Schemas, Types, Logics and Queries. In Chomicki, J., R. van der Meyden, and G. Saake, eds., Logics for Emerging Applications of Databases, pages 1-41. Springer, Berlin, Heidelberg.

Kracht, M. (to appear). Modal Logic Foundations of Markup Structures in Annotation Systems. In Mehler, A., Kühnberger, K.-U., Lobin, H., Lüngen, H., Storrer, A., and A. Witt, eds., Modeling, Learning and Processing of Text Technological Data Structures, Studies in Computational Intelligence. Springer, Dordrecht.

Lee, D. and W. Chu. Comparative Analysis of Six XML Schema Languages. ACM SIGMOD Record, 29(3):76–87, September 2000. doi:https://doi.org/10.1145/362084.362140.

Maler, E., and J. E. Andaloussi (1995). Developing SGML DTDs: From Text to Model to Markup. Prentice Hall, Upper Saddle River, New Jersey.

Mani, M. (2001). Keeping chess alive: Do we need 1-unambiguous content models? In Proceedings of Extreme Markup Languages, Montréal, Québec.

Mani, M., and D. Lee (2002). XML to Relational Conversion using Theory of Regular Tree Grammars. In Proceedings of the 28th VLDB Conference, Hong Kong, China.

Marcoux, Y. (2008). Graph characterization of overlap-only TexMECS and other overlapping markup formalisms. In Proceedings of Balisage: The Markup Conference 2008. Balisage Series on Markup Technologies, vol. 1. Montréal, Québec. doi:https://doi.org/10.4242/BalisageVol1.Marcoux01.

Martens, W., Neven, F., and T. Schwentick (2005). Which XML Schemas Admit 1-Pass Preorder Typing? In Eiter, T., and L. Libkin, eds., Database Theory – ICDT 2005, volume 3363 of Lecture Notes in Computer Science, pages 68–82. Springer, Berlin, Heidelberg, 2005. doi:https://doi.org/10.1007/978-3-540-30570-5_5.

Martens, W., Neven, F., Schwentick, T., and G. Bex (2006). Expressiveness and Complexity of XML Schema. ACM Transactions on Database Systems (TODS), 31(3):770–813. doi:https://doi.org/10.1145/1166074.1166076.

Martens, W., Neven, F. and T. Schwentick (2007). Simple off the shelf abstractions for XML schema. SIGMOD Rec., 36(3):15–22. doi:https://doi.org/10.1145/1324185.1324188.

Martens, W., Neven, F. and T. Schwentick (2009). Complexity of Decision Problems for XML Schemas and Chain Regular Expressions. SIAM Journal on Computing, 39(4):1486–1530. doi:https://doi.org/10.1137/080743457.

Møller, A., and M. Schwartzbach (2006). An Introduction to XML and Web Technologies, chapter Schema Languages, pages 92–187. Addison-Wesley, Harlow, England.

Murata, M., D. Lee, and M. Mani (2001). Taxonomy of XML Schema Languages using Formal Language Theory. In Proceedings of Extreme Markup Languages, Montréal, Québec.

Murata, M., D. Lee, M. Mani, and K. Kawaguchi (2005). Taxonomy of XML Schema Languages Using Formal Language Theory. ACM Transactions on Internet Technology, 5(4):660–704. doi:https://doi.org/10.1145/1111627.1111631.

Nentwich, C. (2005). CLiX – A Validation Rule Language for XML. Presented by Anthony Finkelstein at W3C Workshop on Rule Languages for Interoperability, 27-28 April 2005, Washington D.C. http://www.w3.org/2004/12/rules-ws/paper/24/.

ISO/IEC 19757-4:2006. Information technology — Document Schema Definition Languages (DSDL) — Part 4: Namespace-based Validation Dispatching Language (NVDL), International Standard, International Organization for Standardization, Geneva.

Ogden, W. (1968). A Helpful Result for Proving Inherent Ambiguity. In Mathematical Systems Theory, 2(3):191–194. doi:https://doi.org/10.1007/BF01694004.

Pawson, D. (2007). ISO Schematron tutorial. http://www.dpawson.co.uk/schematron/.

Papakonstantinou, Y., and V. Vianu (2000). DTD inference for views of XML data. In PODS ’00: Proceedings of the nineteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, pages 35–46, New York, NY, USA, ACM. doi:https://doi.org/10.1145/335168.335173.

Piez, W. (2001). Beyond the “descriptive vs. procedural” distinction. In Markup Languages – Theory & Practice, 3(2):141–172. doi:https://doi.org/10.1162/109966201317356380.

ISO/IEC TR 22250-1:2002. Information technology – Document description and processing languages – Regular Language Description for XML – part 1: RELAX Core. International Standard, International Organization for Standardization, Geneva.

ISO/IEC 19757-2:2008. Information technology – Document Schema Definition Language (DSDL) – Part 2: Regular-grammar-based validation – RELAX NG (ISO/IEC 19757-2). International Standard, International Organization for Standardization, Geneva.

ISO/IEC 19757-2:2008. Information technology – Document Schema Definition Language (DSDL) – Part 2: Regular-grammar-based validation – RELAX NG (ISO/IEC 19757-2). Second Edition. International Standard, International Organization for Standardization, Geneva.

Rizzi, R. (2001). Complexity of context-free grammars with exceptions and the inadequacy of grammars as models for XML and SGML. Markup Languages – Theory & Practice, 3(1):107–116. doi:https://doi.org/10.1162/109966201753537222.

Rogers, J. (2003). Syntactic Structures as Multi-dimensional Trees. In Research on Language and Computation, 1(3-4):265–305. doi:https://doi.org/10.1023/A:1024695608419.

Sasaki, F. (2010). How to avoid suffering from markup: A project report about the virtue of hiding xml. In XML Prague 2010 Conference Proceedings, pages 105–123, Prague, Czech Republic, March 13–14 2010. Institute for Theoretical Computer Science.

ISO/IEC 19757-3:2006 Information technology — Document Schema Definition Languages (DSDL) — Part 3: Rule-based validation — Schematron. International Standard, International Organization for Standardization, Geneva.

ISO 8879:1986. Information Processing — Text and Office Information Systems — Standard Generalized Markup Language. International Standard, International Organization for Standardization, Geneva.

Sperberg-McQueen, C. M. (2003). Logic grammars and XML Schema. In Proceedings of Extreme Markup Languages, Montréal, Québec.

Sperberg-McQueen, C. M. and C. Huitfeldt (2004). GODDAG: A Data Structure for Overlapping Hierarchies. In King, P. and E. V. Munson, eds. Proceedings of the 5th International Workshop on the Principles of Digital Document Processing (PODDP 2000), volume 2023 of Lecture Notes in Computer Science, pages 139–160. Springer, 2004.

Stührenberg, M. and D. Jettka (2009). A toolkit for multi-dimensional markup: The development of SGF to XStandoff. In Proceedings of Balisage: The Markup Conference 2009. Balisage Series on Markup Technologies, vol. 3 (2009). Montréal, Québec. doi:https://doi.org/10.4242/BalisageVol3.Stuhrenberg01.

Vitali, F., Amorosi, N., and N. Gessa. Datatype- and namespace-aware DTDs: A minimal extension. In Proceedings of Extreme Markup Languages, Montréal, Québec.

van der Vlist, E. (2001). Comparing XML Schema Languages, 12 December 2001. http://www.xml.com/pub/a/2001/12/12/schemacompare.html.

van der Vlist, E. (2003). RELAX NG. O’Reilly, Sebastopol.

Extensible Markup Language (XML) 1.0. W3C Recommendation, World Wide Web Consortium, 10 February 1998. http://www.w3.org/TR/1998/REC-xml-19980210.

Extensible Markup Language (XML) 1.0 (Fifth Edition). W3C Recommendation, World Wide Web Consortium, 26 November 2008. http://www.w3.org/TR/2008/REC-xml-20081126/.

Namespaces in XML 1.0 (Third Edition). W3C Recommendation, World Wide Web Consortium, 8 December 2009. http://www.w3.org/TR/2009/REC-xml-names-20091208/.

XML Schema Part 0: Primer Second Edition. W3C Recommendation, World Wide Web Consortium, 28 October 2004. http://www.w3.org/TR/2004/REC-xmlschema-0-20041028/.

XML Schema Part 1: Structures Second Edition. W3C Recommendation, World Wide Web Consortium, 28 October 2004. http://www.w3.org/TR/2004/REC-xmlschema-1-20041028/.

XML Schema Part 2: Datatypes Second Edition. W3C Recommendation, World Wide Web Consortium, 28 October 2004. http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/.

W3C XML Schema Definition Language (XSD) 1.1 Part 1: Structures. W3C Working Draft, World Wide Web Consortium, 3 December 2009. http://www.w3.org/TR/2009/WD-xmlschema11-1-20091203/.

W3C XML Schema Definition Language (XSD) 1.1 Part 2: Datatypes. W3C Working Draft, World Wide Web Consortium, 3 December 2009. http://www.w3.org/TR/2009/WD-xmlschema11-2-20091203/.

XSL Transformations (XSLT) Version 2.0. W3C Recommendation, World Wide Web Consortium, 23 January 2007. http://www.w3.org/TR/2007/REC-xslt20-20070123/.

XQuery 1.0: An XML Query Language. W3C Recommendation, World Wide Web Consortium, 23 January 2007. http://www.w3.org/TR/2007/REC-xquery-20070123/.

Author's keywords for this paper:
XML; Formal Language Theory; Schema Languages