How to cite this paper

van der Vlist, Eric. “XML instances to validate XML schemas.” Presented at International Symposium on Quality Assurance and Quality Control in XML, Montréal, Canada, August 6, 2012. In Proceedings of the International Symposium on Quality Assurance and Quality Control in XML. Balisage Series on Markup Technologies, vol. 9 (2012). https://doi.org/10.4242/BalisageVol9.Vlist02.

International Symposium on Quality Assurance and Quality Control in XML
August 6, 2012

Balisage Paper: XML instances to validate XML schemas

Eric van der Vlist

Dyomedea

Eric is an independent consultant and trainer. His domains of expertise include Web development and XML technologies.

He is the creator and main editor of XMLfr.org, the main site dedicated to XML technologies in French, the author of the O'Reilly animal books XML Schema and RELAX NG, and a member of the ISO DSDL (http://dsdl.org) working group focused on XML schema languages.

He is based in Paris and you can reach him by mail (vdv@dyomedea.com) or meet him at one of the many conferences where he presents his projects.

Published under the Creative Commons "cc by" license

Abstract

Ever modified an XML schema? Ever broken something while fixing a bug or adding a new feature? As with any piece of engineering, the more complex a schema is, the harder it is to maintain. In other domains, unit tests dramatically reduce the number of regressions and thus provide a kind of safety net for maintainers. Can we learn from these techniques and adapt them to XML schema languages? In this workshop session, we develop a schema using unit test techniques, to illustrate their benefits in this domain.

Table of Contents

Step 1: Getting started
Step 2: Adding a schema
Step 3: Adding list title elements
Step 4: Adding to-do item elements
Step 5: Modularizing the schema
Want to try it?

The workshop is run as an exchange between a customer (played by Tommie Usdin) and a schema expert (played by Eric van der Vlist).

The customer, who needed a schema for her to-do list XML application, is puzzled by the "test first programming" technique imposed by the schema expert.

At the end of the day (or workshop), will she be converted to this well-known agile or extreme programming technique adapted to the development of XML schemas?

Step 1: Getting started

Hi Eric, can you help me to write a schema?

— Customer

Hi Tommie, yes, sure, what will the schema be about?

— Expert

I need a vocabulary for my to-do lists, with to-do item...

— Customer

OK, you've told me enough, let's get started.

— Expert (interrupting his customer)

Get started? But I haven't told you anything about it!

— Customer

Right, but it's never too soon to write tests when you do test first programming!

— Expert

Note

In test first programming (also called test driven development), developers create test cases (usually unit test cases) before implementing a function. The test suite is run, code is written based on the results of these tests, and both the test suite and the code are updated until all the tests pass.

Test suite (suite.xml):

<tf:suite xmlns:tf="http://xmlschemata.org/test-first/" xmlns:todo="http://balisage.net/todo#" title="Basic tests">
    <tf:case title="Root element" expected="valid" id="root">
        <todo:list/>
    </tf:case>
</tf:suite>

Note

The vocabulary used to define these test cases was inspired by the SUT (XML Schema Unit Test) project. It is a simple vocabulary (composed of only three different elements) that packs several XML instances together with their expected validation results. It uses conventions that you'll discover during the course of this workshop.
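As a rough illustration of how little there is to this vocabulary, the following Python sketch reads the suite shown above with the standard library and lists each test case with its expected outcome. The function name `list_cases` is hypothetical and is not part of the framework.

```python
import xml.etree.ElementTree as ET

# Namespace of the test-first vocabulary, as declared in the suite.
TF = "http://xmlschemata.org/test-first/"

SUITE = """<tf:suite xmlns:tf="http://xmlschemata.org/test-first/"
    xmlns:todo="http://balisage.net/todo#" title="Basic tests">
    <tf:case title="Root element" expected="valid" id="root">
        <todo:list/>
    </tf:case>
</tf:suite>"""


def list_cases(suite_xml):
    """Return (id, title, expected) for every tf:case, in document order."""
    root = ET.fromstring(suite_xml)
    return [
        (case.get("id"), case.get("title"), case.get("expected"))
        for case in root.iter(f"{{{TF}}}case")
    ]


for case_id, title, expected in list_cases(SUITE):
    print(f"{case_id}: {title!r} should be {expected}")
```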

Figure 1: Test results

Note

The test suite is run using a simple Orbeon Forms application. The rendering relies on the Orbeon Forms XForms implementation, while the test suite itself is run by an Orbeon Forms XPL pipeline.

Step 2: Adding a schema

You see, you can already write to-do lists!

— Expert

Hold on, we don't have any schema!

— Customer

That's true, but you don't have to write a schema to write XML documents.

— Expert

I know, but you're here to write a schema! Furthermore, right now we accept anything. I don't want to have XML documents with anything as a root element!

— Customer

That's a good reason to write a schema, but before that we need to add a test in our suite first.

— Expert

Test suite (suite.xml):

<?xml version="1.0" encoding="UTF-8"?>
<tf:suite xmlns:tf="http://xmlschemata.org/test-first/" xmlns:todo="http://balisage.net/todo#" title="Basic tests">
    <tf:case title="TODO list root element" expected="valid" id="root">
        <todo:list/>
    </tf:case>
    <tf:case title="Other root element" expected="error" id="other-root">
        <todo:title>A title</todo:title>
    </tf:case>
</tf:suite>

Now that we've updated the test suite, we run it again.

— Expert

Figure 2: Test results
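Why the new case fails at this stage can be sketched in a few lines of Python. With no schema attached, nothing rejects an instance, which is modeled here by a hypothetical `accept_everything` validator. The `report` function and the hard-coded case list are likewise illustrative assumptions and not the framework's code.

```python
# Each entry: (case id, expected outcome declared in the suite).
CASES = [("root", "valid"), ("other-root", "error")]


def accept_everything(case_id):
    """Stand-in for 'no schema attached': every instance is reported valid."""
    return "valid"


def report(cases, validate):
    """Map each case id to 'pass' when the actual outcome matches the expected one."""
    return {
        case_id: "pass" if validate(case_id) == expected else "fail"
        for case_id, expected in cases
    }


print(report(CASES, accept_everything))
```

A failing test here is the useful signal: it shows that the suite now expresses a requirement that nothing enforces yet.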

This result was expected, and we can now proceed to create a schema and attach it to the test suite.

— Expert

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    elementFormDefault="qualified"
    targetNamespace="http://balisage.net/todo#"
    xmlns="http://balisage.net/todo#">
    
    <xs:element
        name="list"/>
</xs:schema>
todo.xsd

<?xml version="1.0" encoding="UTF-8"?>
<tf:suite
    xmlns:tf="http://xmlschemata.org/test-first/"
    xmlns:todo="http://balisage.net/todo#"
    title="Basic tests">
    <tf:validation
        href="todo.xsd"
        type="xsd"/>
    <tf:case
        title="TODO list root element"
        expected="valid"
        id="root">
        <todo:list/>
    </tf:case>
    <tf:case
        title="Other root element"
        expected="error"
        id="other-root">
        <todo:title>A title</todo:title>
    </tf:case>
</tf:suite>
suite.xml

It's time to test again what we've done.

— Expert

Figure 3: Test results

Step 3: Adding list title elements

I am happy to see some progress, at last, but I don't want to accept any content in the to-do list element. Can you add list title elements?

— Customer

Sure, back to the test suite...

— Expert

Test suite (suite.xml):

<?xml version="1.0" encoding="UTF-8"?>
<tf:suite
    xmlns:tf="http://xmlschemata.org/test-first/"
    xmlns:todo="http://balisage.net/todo#"
    title="Basic tests">
    <tf:validation
        href="todo.xsd"
        type="xsd"/>
    <tf:case
        title="TODO list root element"
        expected="valid"
        id="root">
        <todo:list/>
    </tf:case>
    <tf:case
        title="TODO list with a title"
        expected="valid"
        id="list-title">
        <todo:list>
            <todo:title/>
        </todo:list>
    </tf:case>
    <tf:case
        title="Other root element"
        expected="error"
        id="other-root">
        <todo:title>A title</todo:title>
    </tf:case>
</tf:suite>

Now that we've updated the test suite, we run it again.

— Expert

Figure 4: Test results

You see? We do already support list title elements!

— Expert

Sure, but I don't want to accept any content in my to-do list. And the title element should be mandatory. And it should not be empty but have at least one character!

— Customer

Back to the test suite, then...

— Expert

Test suite (suite.xml):

<?xml version="1.0" encoding="UTF-8"?>
<tf:suite
    xmlns:tf="http://xmlschemata.org/test-first/"
    xmlns:todo="http://balisage.net/todo#"
    title="Basic tests">
    <tf:validation
        href="todo.xsd"
        type="xsd"/>
    <todo:list>
        <tf:case
            title="Empty list element"
            expected="error"
            id="root-empty"/> 
        <todo:title>
            <tf:case title="empty title" expected="error" id="empty-title"/>
            <tf:case title="non empty title" expected="valid" id="non-empty-title">A title</tf:case>
        </todo:title>
        <tf:case
            title="Unexpected element"
            expected="error"
            id="unexpected">
            <todo:foo/>
        </tf:case>
    </todo:list>
    <tf:case
        title="Other root element"
        expected="error"
        id="other-root">
        <todo:title>A title</todo:title>
    </tf:case>
</tf:suite>

Note

This is the first example with non-top-level tf:case elements. To understand how this works, we must look in more detail at the algorithm used by the framework to split a test suite into instances. The algorithm consists of two steps:

  • Loop over each tf:case element.

  • For each one, suppress the tf:case elements and the top-level elements that are not ancestors of the current tf:case element.

This description may look complex, but the result is a rather intuitive way to define sub-trees that are common to several test cases.
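The two steps can be sketched in Python with the standard library. This is one reading of the description, not the framework's actual split-tests.xsl: it assumes that the content of the current tf:case is kept in place of the tf:case element, that every other tf:case is dropped together with its content, and that whitespace-only text is just indentation. The names `split`, `_splice` and `_add_text` are hypothetical, and the suite is a shortened version of the one above.

```python
import xml.etree.ElementTree as ET

CASE = "{http://xmlschemata.org/test-first/}case"
TODO = "{http://balisage.net/todo#}"

SUITE = """<tf:suite xmlns:tf="http://xmlschemata.org/test-first/"
    xmlns:todo="http://balisage.net/todo#" title="Basic tests">
    <tf:validation href="todo.xsd" type="xsd"/>
    <todo:list>
        <todo:title>
            <tf:case title="empty title" expected="error" id="empty-title"/>
            <tf:case title="non empty title" expected="valid"
                id="non-empty-title">A title</tf:case>
        </todo:title>
        <tf:case title="Unexpected element" expected="error" id="unexpected">
            <todo:foo/>
        </tf:case>
    </todo:list>
    <tf:case title="Other root element" expected="error" id="other-root">
        <todo:title>A title</todo:title>
    </tf:case>
</tf:suite>"""


def _add_text(parent, text):
    """Append character data to parent, after its last child if it has one."""
    if not text or not text.strip():
        return  # assumption: whitespace-only text is indentation
    if len(parent):
        parent[-1].tail = (parent[-1].tail or "") + text
    else:
        parent.text = (parent.text or "") + text


def _splice(source, target, current, top_level=False):
    """Copy the content of source into target for the current test case."""
    _add_text(target, source.text)
    for child in source:
        if child.tag == CASE:
            if child is current:
                # Keep the content of the current case in place of the case.
                _splice(child, target, current)
            # Any other tf:case is dropped together with its content.
        elif not top_level or any(e is current for e in child.iter(CASE)):
            # Below the top level everything is kept; at the top level only
            # the element that contains the current case survives.
            copy = ET.SubElement(target, child.tag, child.attrib)
            _splice(child, copy, current)
        _add_text(target, child.tail)


def split(suite):
    """Yield (case id, instance root element) for every tf:case of the suite."""
    for current in suite.iter(CASE):
        holder = ET.Element("holder")  # temporary parent for the instance
        _splice(suite, holder, current, top_level=True)
        yield current.get("id"), holder[0]


instances = dict(split(ET.fromstring(SUITE)))
for case_id, instance in instances.items():
    print(case_id, ET.tostring(instance, encoding="unicode"))
```

Under these assumptions the two title cases share the same todo:list and todo:title wrapper and differ only in the text placed inside the title, which is the kind of common sub-tree the note refers to.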

Now that we've updated the test suite, we run it again.

— Expert

Figure 5: Test results

Looks good, now we can update the schema.

— Expert

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    elementFormDefault="qualified"
    targetNamespace="http://balisage.net/todo#"
    xmlns="http://balisage.net/todo#">
    
    <xs:element
        name="list">
        <xs:complexType>
            <xs:sequence>
                <xs:element
                    name="title">
                    <xs:simpleType>
                        <xs:restriction
                            base="xs:token">
                            <xs:minLength
                                value="1"/>
                        </xs:restriction>
                    </xs:simpleType>
                </xs:element>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>
todo.xsd
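The choice of xs:token as the base type matters for the "at least one character" requirement: xs:token collapses whitespace before the minLength facet is checked, so a title made only of spaces or line breaks counts as empty. The following Python sketch mimics that behaviour; the function names `collapse` and `is_valid_title` are hypothetical and the code is only an approximation of what a schema validator does.

```python
import re


def collapse(value):
    """Apply whitespace collapsing: runs of XML whitespace become one space, ends are trimmed."""
    return re.sub(r"[ \t\r\n]+", " ", value).strip(" ")


def is_valid_title(value):
    """Mimic xs:token restricted by minLength value="1"."""
    return len(collapse(value)) >= 1


for candidate in ["A title", "", "  \n  ", "  A   title\n"]:
    print(repr(candidate), is_valid_title(candidate))
```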

And run the test suite again.

— Expert

Figure 6: Test results

Step 4: Adding to-do item elements

Good. Now I want to add to-do items. And lists should have at least one of them, by the way.

— Customer

Sure, back to the test suite...

— Expert

Test suite (suite.xml):

<?xml version="1.0" encoding="UTF-8"?>
<tf:suite
    xmlns:tf="http://xmlschemata.org/test-first/"
    xmlns:todo="http://balisage.net/todo#"
    title="Basic tests">
    <tf:validation
        href="todo.xsd"
        type="xsd"/>
    <tf:case
        title="Empty list element"
        expected="error"
        id="root-empty">
        <todo:list/>
    </tf:case>
    <todo:list>
        <!-- Testing title elements -->
        <todo:title>
            <tf:case
                title="empty title"
                expected="error"
                id="empty-title"/>
            <tf:case
                title="non empty title"
                expected="valid"
                id="non-empty-title">A title</tf:case>
        </todo:title>
        <todo:item>
            <todo:title>A title</todo:title>
        </todo:item>
        <tf:case
            title="Unexpected element"
            expected="error"
            id="unexpected">
            <todo:foo/>
        </tf:case>
    </todo:list>
    <todo:list>
        <!-- Testing todo items -->
        <todo:title>Testing todo items</todo:title>
        <tf:case
            title="No todo items"
            expected="error"
            id="no-items"/>
        <todo:item>
            <tf:case
                title="empty item"
                expected="error"
                id="empty-item"/>
            <tf:case
                title="item with a title"
                expected="valid"
                id="item-title">
                <todo:title>A title</todo:title>
            </tf:case>
        </todo:item>
    </todo:list>
    <tf:case
        title="Other root element"
        expected="error"
        id="other-root">
        <todo:title>A title</todo:title>
    </tf:case>
</tf:suite>

Let's see what we get before any update to the schema.

— Expert

Figure 7: Test results

It's time to update the schema and fix what needs to be fixed.

— Expert

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    elementFormDefault="qualified"
    targetNamespace="http://balisage.net/todo#"
    xmlns="http://balisage.net/todo#">
    
    <xs:element
        name="list">
        <xs:complexType>
            <xs:sequence>
                <xs:element
                    name="title">
                    <xs:simpleType>
                        <xs:restriction
                            base="xs:token">
                            <xs:minLength
                                value="1"/>
                        </xs:restriction>
                    </xs:simpleType>
                </xs:element>
                <xs:element
                    maxOccurs="unbounded"
                    name="item">
                    <xs:complexType>
                        <xs:sequence>
                            <xs:element
                                name="title">
                                <xs:simpleType>
                                    <xs:restriction
                                        base="xs:token">
                                        <xs:minLength
                                            value="1"/>
                                    </xs:restriction>
                                </xs:simpleType>
                            </xs:element>
                        </xs:sequence>
                    </xs:complexType>
                </xs:element>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>
todo.xsd

And now we can check if we get it right.

— Expert

Figure 8: Test results

Step 5: Modularizing the schema

Eric, you should be ashamed, it's a pure Russian doll schema, not modular at all! Why not define the title and item elements globally?

— Customer

Sure, we can do that! If we just change the structure of the schema, we don't need to update the test suite and can work directly on the schema.

— Expert

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    elementFormDefault="qualified"
    targetNamespace="http://balisage.net/todo#"
    xmlns="http://balisage.net/todo#">
    
    <xs:element
        name="list">
        <xs:complexType>
            <xs:sequence>
                <xs:element
                    ref="title"/>
                <xs:element
                    maxOccurs="unbounded"
                    ref="item"/>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
    <xs:element
        name="title">
        <xs:simpleType>
            <xs:restriction
                base="xs:token">
                <xs:minLength
                    value="1"/>
            </xs:restriction>
        </xs:simpleType>
    </xs:element>
    <xs:element
        name="item">
        <xs:complexType>
            <xs:sequence>
                <xs:element
                    ref="title"/>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>
todo.xsd

But of course, each time we update the schema we must check that we haven't introduced any bugs.

— Expert

Figure 9: Test results

Wow, what's happening now?

— Customer

Now that our elements are global in the schema, we accept a valid title as a root element. Is that what you want?

— Expert
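In W3C XML Schema, any global element declaration can serve as the root of a valid document. The Python sketch below lists the global declarations of two abbreviated schema skeletons (simplified stand-ins for the Russian doll and modular versions above, with the type definitions left out) to show how the set of possible roots grows. The function name `possible_roots` is hypothetical.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Abbreviated skeleton of the Russian doll schema: title and item are local.
RUSSIAN_DOLL = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://balisage.net/todo#">
    <xs:element name="list">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="title" type="xs:token"/>
                <xs:element name="item" maxOccurs="unbounded"/>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>"""

# Abbreviated skeleton of the modular schema: title and item are global.
MODULAR = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://balisage.net/todo#"
    xmlns="http://balisage.net/todo#">
    <xs:element name="list">
        <xs:complexType>
            <xs:sequence>
                <xs:element ref="title"/>
                <xs:element ref="item" maxOccurs="unbounded"/>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
    <xs:element name="title" type="xs:token"/>
    <xs:element name="item"/>
</xs:schema>"""


def possible_roots(schema_xml):
    """Return the names of the global element declarations of a schema."""
    schema = ET.fromstring(schema_xml)
    # Only direct children of xs:schema are global declarations.
    return [decl.get("name") for decl in schema.findall(f"{XS}element")]


print("Russian doll:", possible_roots(RUSSIAN_DOLL))
print("Modular:", possible_roots(MODULAR))
```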

No way, a title is not a valid list!

— Customer

Then we have a number of options... We can go back to local elements, or we can add a Schematron schema to check this constraint.

— Expert

Schematron is fine, we'll probably find many other constraints to check in my to-do lists anyway...

— Customer

OK. We still don't have to update the test suite since we've not changed our requirement. Let's write this Schematron schema then.

— Expert

<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
    <ns uri="http://balisage.net/todo#" prefix="todo"/>
    <pattern>
        <rule context="/*">
            <assert test="self::todo:list">The root element should be a todo:list</assert>
        </rule>
    </pattern>
</schema>
todo.sch
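For readers less familiar with Schematron, the single assertion above amounts to checking the expanded name of the document element. A minimal Python equivalent, written only to illustrate the rule (the function name `root_is_todo_list` is hypothetical), could look like this:

```python
import xml.etree.ElementTree as ET

# Expanded name of todo:list, in ElementTree's {namespace}local notation.
TODO_LIST = "{http://balisage.net/todo#}list"


def root_is_todo_list(document_xml):
    """Equivalent of the assertion self::todo:list evaluated on /*."""
    return ET.fromstring(document_xml).tag == TODO_LIST


print(root_is_todo_list('<list xmlns="http://balisage.net/todo#"/>'))
print(root_is_todo_list('<title xmlns="http://balisage.net/todo#">A title</title>'))
```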

Saying that we don't have to update the test suite wasn't totally accurate, because the schemas are referenced in the test suite.

— Expert

Test suite (suite.xml):

<?xml version="1.0" encoding="UTF-8"?>
<tf:suite
    xmlns:tf="http://xmlschemata.org/test-first/"
    xmlns:todo="http://balisage.net/todo#"
    title="Basic tests">
    <tf:validation
        href="todo.sch"
        type="sch"/>
    <tf:validation
        href="todo.xsd"
        type="xsd"/>
    <tf:case
        title="Empty list element"
        expected="error"
        id="root-empty">
        <todo:list/>
    </tf:case>
    <todo:list>
        <todo:title>
            <tf:case
                title="empty title"
                expected="error"
                id="empty-title"/>
            <tf:case
                title="non empty title"
                expected="valid"
                id="non-empty-title">A title</tf:case>
        </todo:title>
        <todo:item>
            <todo:title>A title</todo:title>
        </todo:item>
        <tf:case
            title="Unexpected element"
            expected="error"
            id="unexpected">
            <todo:foo/>
        </tf:case>
    </todo:list>
    <todo:list>
        <todo:title>Testing todo items</todo:title>
        <tf:case
            title="No todo items"
            expected="error"
            id="no-items"/>
        <todo:item>
            <tf:case
                title="empty item"
                expected="error"
                id="empty-item"/>
            <tf:case
                title="item with a title"
                expected="valid"
                id="item-title">
                <todo:title>A title</todo:title>
            </tf:case>
        </todo:item>
    </todo:list>
    <tf:case
        title="Other root element"
        expected="error"
        id="other-root">
        <todo:title>A title</todo:title>
    </tf:case>
</tf:suite>

Time to see if we've fixed our issue!

— Expert

Figure 10: Test results

Great, we've made it, thanks!

— Customer

Want to try it?

The application used to run the test suite and display its result is available at http://svn.xmlschemata.org/repository/downloads/tefisc/.

If you just want to understand how the test suite is split into XML instances, you can have a look at http://svn.xmlschemata.org/repository/downloads/tefisc/orbeon-resources/apps/tefisc/ .

In this directory:

  • split-tests.xsl is the XSLT transformation that splits a test suite into top-level element test cases. This transformation does not depend on Orbeon Forms and can be run manually against a test suite.

  • run-test.xpl is the XPL pipeline that runs a test case.

  • list-suites.xpl is the XPL pipeline that gives the list of available test cases.

  • view.xhtml is the XForms application that displays the results.

To install this application:

  • Install Orbeon Forms

  • Copy the orbeon-resources/ directory under /WEB-INF/resources/apps/ in your orbeon webapp directory

  • Or, alternatively, copy the tefisc/ directory wherever you want, edit web.xml.sav to replace <param-value>/home/vdv/projects/tefisc/orbeon-resources</param-value> with the location of this directory on your filesystem, replace /WEB-INF/web.xml with this file, and restart your application server.