Miłowski, R. Alexander, and Norman Walsh. “How to survive the coming namespace winter.” Presented at Balisage: The Markup Conference 2014, Washington, DC, August 5 - 8, 2014. In Proceedings of Balisage: The Markup Conference 2014. Balisage Series on Markup Technologies, vol. 13 (2014). https://doi.org/10.4242/BalisageVol13.Milowski01.
Balisage Paper: How to survive the coming namespace winter
Is XML condemned to be an orphaned syntax with a dimly lit future within the Web
browser? What can information providers with rich sources of XML do, other than
down-translate to HTML? The evolving Web Components environment may provide a solution!
With some simple translations, stylesheets and scripts, it will be possible to wrap
custom XML in a minimum amount of HTML and serve it over the Web. The browsers will
never know they’re being tricked into delivering XML.
It was a late night, again, at XML Prague, and Norm Walsh,
John Snelson, Charles Greer, and I were walking along attempting
to find dinner. We had been discussing the Web Components
session that had occurred earlier in the day. We expressed our
dismay and depression that we couldn't just have XML. Then it
occurred to us, like a light being turned on (or being
whacked on the back of the head with a ruler), Web Components
are just markup and pretty close to XML. All we needed to do was
use a hyphen rather than a colon, and all was well. It is a
compromise and likely the best we will get anytime soon. We get
to put our own pointy brackets into the browser and give it
semantics—accept it and move on.
— Alex Miłowski recounting XML Prague 2014
Forward from Failure
A publisher that has a large amount of information in XML documents has little
recourse in today's world but to transform this information into HTML for delivery on the
Web or within EPUB ebooks. The ability for the common Web browser to load and process
information, with similar processing semantics to HTML, isn't available; links will not be
identified, styles and local transformations are fraught with problems, media will not be
loaded or rendered, and scripts will not execute to provide extensible behaviors.
At the 2009 Balisage Conference, in XML in the Browser: the Next Decade
[balisage-2009], Miłowski enumerated the issues with delivering XML to
the browser and many, if not all, of those issues remain unsolved in 2014. The various
browser vendors have since all but abandoned processing XML except as a legacy format. In
many ways, it only remains as a serialization format for HTML5 [html5]
and as a mechanism for receiving data within a Web application.
It was argued that there are intrinsic and non-intrinsic formats for the Web. In terms
of markup languages, HTML, SVG, and MathML were identified as the triad of intrinsic
languages. This assessment is somewhat validated by the integration of SVG and MathML into
the HTML5 specification.
This leaves generic XML as an orphaned syntax with a dimly lit future within the Web
browser. If the writings on the walls of various mailing lists are any indication, there is
a strong desire for reduction or complete removal of the native XML processing that remains
within the browser. While current applications and backlash have prevented such removal,
the days of XML in the browser feel numbered.
Meanwhile, XML has served a purpose for many information publishers. Tag sets, both
custom and standardized, have been developed to encode enormous amounts of data. Within
enterprises, processing pipelines that produce, validate, manipulate, and otherwise process
this data have had their benefits. It has become very normal to
transform these documents into the appropriate HTML markup for delivery to whatever
consumer is on the other end of that HTTP connection.
Yet, as Web developers and browser vendors seem to be moving away from custom markup,
they seem to realize they are missing something. Making the Open Web
Platform extensible means that behaviors that need to accompany information
need to be packaged as reusable components. That is, information needs to have markup that
identifies it as a specific kind of information whose scripts, templates, and styling are
identifiable and loadable over the Web.
Hyphens to the Rescue
Once the desire for extensible markup, outside of the direct control of either the
or browser vendors, was recognized, the concept of custom elements was introduced and
eventually formalized [custom-elements]. For HTML parsing purposes, the
essential distinction is that a custom element's name contains a hyphen, not a colon. This
allows custom element names to be distinguished from those within HTML itself; the
notable exceptions are the handful of element names in SVG and MathML that contain hyphens.
In common usage, custom elements of the same origin share a common
prefix followed by a hyphen (see Figure 1). That
prefix currently has no registration or association with any URI. As such, it is unlike
namespace prefixes which must be declared before being used.
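The naming rule can be sketched in a few lines of JavaScript. This is only an illustration under stated assumptions: the function names `looksLikeCustomElementName` and `prefixOf` are hypothetical, and the regular expression is a simplified, ASCII-only approximation of the name production in the Custom Elements specification. The reserved list holds the hyphenated SVG and MathML names that the specification excludes.

```javascript
// Hyphenated names that belong to SVG or MathML and therefore cannot be
// used as custom element names.
const RESERVED_NAMES = new Set([
  "annotation-xml",
  "color-profile",
  "font-face",
  "font-face-src",
  "font-face-uri",
  "font-face-format",
  "font-face-name",
  "missing-glyph",
]);

// Simplified check (an assumption for illustration): the name starts with a
// lowercase ASCII letter, contains at least one hyphen, and is not reserved.
function looksLikeCustomElementName(name) {
  return /^[a-z][a-z0-9._]*-[a-z0-9._-]*$/.test(name) && !RESERVED_NAMES.has(name);
}

// Hypothetical helper: the conventional shared prefix is whatever precedes
// the first hyphen. Nothing ties it to a URI, unlike a namespace prefix.
function prefixOf(name) {
  return looksLikeCustomElementName(name) ? name.slice(0, name.indexOf("-")) : null;
}
```

With these definitions, `db-para` is accepted and yields the prefix `db`, while `db:para`, `p`, and `font-face` are all rejected.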
The use of custom elements goes beyond just syntax as it also provides an API for
registering behaviors with the browser for the markup. During parsing, the DOM construction
process assigns certain classes to recognized markup (e.g.
HTMLParagraphElement is used for the p element). When an
unrecognized element is encountered (i.e. a custom element), it is initially constructed
as a generic element. A script can register with the document a prototype that defines a new behavior or
assigns an existing HTML behavior to a custom element. For example, the
db-para element could simply be registered as an HTML paragraph as shown in Figure 2. The DOM object for the element is subsequently replaced with a
new instance of the appropriate type and the behaviors of that element are now available.
In simple cases, an element registered as a custom element with one of the available
HTML prototypes inherits some of the custom behaviors. In testing, it is unlikely that
default styling will automatically be applied (e.g. using
HTMLPreElement.prototype doesn't guarantee pre element
styling). Yet, in some cases, styling does occur and so the behavior is inconsistent and
seems to be implementation defined. One can imagine that a consistent, reliable behavior
is the goal and this will sort itself out with time.
Moreover, registration can go far beyond such simple associations of name to pre-defined
prototypes. A script can register a custom prototype to provide specific behaviors. The
prototype provided must supply a function via its createdCallback property that
will perform any additional initialization of the element. Other similar mechanisms are
available for maintaining the element throughout its life cycle.
For example, the prototype in Figure 3 uses its callback to apply a
syntax highlighter (highlight.js
[highlightjs]) to the contents of the element. Once the element is
re-created within the DOM with this prototype, the callback function executes with the
value of this assigned to the element. In this particular example, this means
the db-programlisting element is constructed with the prototype and the
callback adds the syntax highlighting.
Often, the structured information of an element doesn't directly match the desired
rendering. The use of HTML Templates (part of the HTML5 specification) provides the
ability to package and use structured layouts for the display of custom elements. A
template is a portion of markup that is wrapped by a template element that can
be used to construct new content programmatically. One main use for templating is to avoid
manual construction of elements by either parsing or direct DOM method calls.
For example, in Figure 4, the template for a figure is listed.
The content element specifies where contained content should be placed. In
this example, the select attribute is used to specify which child elements
should be used. The result of this example is reordering the children of
db-figure so that the title is last.
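The reordering effect of the select attribute can be modelled without a browser. The following is a minimal sketch, not the browser's implementation: `distribute` is a hypothetical function, children are plain objects with a `name` property, and each insertion point is reduced to an element name (or `null` for a content element with no select attribute). It follows the rule that each child goes to the first insertion point, in template order, that matches it. The insertion points used in the example are an assumption about what a figure template might contain.

```javascript
// Distribute children to insertion points in template order. Each child is
// placed at most once, at the first insertion point whose selector matches.
function distribute(children, insertionPoints) {
  const pool = [...children]; // children not yet claimed by any insertion point
  return insertionPoints.map((select) => {
    const taken = [];
    for (let i = 0; i < pool.length; ) {
      if (select === null || pool[i].name === select) {
        taken.push(pool.splice(i, 1)[0]); // claim the child, keep index in place
      } else {
        i += 1;
      }
    }
    return taken;
  });
}

// A db-figure whose title comes first in the source document.
const figureChildren = [{ name: "db-title" }, { name: "db-mediaobject" }];

// Assumed template order: media first, title second.
const rendered = distribute(figureChildren, ["db-mediaobject", "db-title"]).flat();
console.log(rendered.map((n) => n.name).join(", ")); // db-mediaobject, db-title
```

The source order of the children is untouched; only the rendered order changes, which is the point of pairing a template with the Shadow DOM.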
The registered prototype must use the template and the Shadow DOM
[shadowdom] to affect the rendering of the element. The Shadow DOM
provides the ability to create a rendering based on elements not shown to the user. If
the user inspects the displayed element (or its source), they will only see the custom
element. Inside the browser, a "shadow element" is used to structure and render the
information where the shadow element is only accessible via scripting or styling embedded
within the template.
An example of using a template for the db-figure element is shown in Figure 5. The callback constructs a Shadow DOM for the current
element and appends content. The content is structured via the template shown in Figure 4. The consequence is the current sub-tree for
db-figure is rendered using the newly constructed Shadow DOM.
Finally, we can package our script, templates, and any styling via HTML Imports
[html-imports]. The imported document is simply another HTML document
whose scripts, styles, and templates become available to the current document. The import
is invoked by a simple link element with rel attribute value of
import in the importing document (see Figure 6).
The imported document packages the Web Component by linking to the necessary scripts and
stylesheets while containing any templates that are used by those scripts. The example in
Figure 7 shows the structure used to package the previous examples.
The scripts and stylesheets for the highlighter are included using the same mechanism
already known to Web developers.
As a nuance, the script registering the custom elements and the templates are in
collusion within this imported document. At the very start of the example in Figure 5, the expression
document.currentScript.ownerDocument is used to obtain the correct document
for retrieving the templates. If the component is packaged differently, retrieving the
template might be more difficult or impossible.
In summary, Web Components relies on four essential features:
Custom Elements: a specification that is in Last Call and may enter CR in 2014.
Shadow DOM: a specification that is a working draft.
HTML Imports: a specification that is a working draft.
HTML Templates: part of the HTML5 specification.
As the features of Web Components coalesce and become part of the commonly deployed
browser, there is little anyone can do to prevent their use. An author can simply use the
Web Component of their choice, custom or shared, and the browser can do little more than
execute the associated semantics within the bounds of the Open Web Platform. That allows
anyone to develop custom markup to encapsulate their information in much the same way we
hoped for with XML.
There are two notable differences between now (2014) and 1998:
The browser, as a component of the Open Web Platform, is much more stable,
technologically advanced, and well understood.
Web Components utilize the Open Web Platform to package semantics in a much more
extensive way that is compatible with how browsers actually work.
An unscientific look at the current opinions of the use of Web Components indicates they
may become hugely popular. While only time will actually determine the outcome, the Shadow
DOM and HTML Templates are very useful. Accessing them within Custom Elements provides
needed encapsulation to Web applications and so their intended use in that context makes a
lot of sense.
Yet, we don't have to use Web Components to package semantics for custom markup that is
limited to specialized uses. That is, with relative ease, we can transliterate whole
documents into custom elements, wrap them with a few lines of HTML markup, and the browser
will load and process the custom elements as specified. Is this abuse, a practice that
isn't recommended, or should a thousand custom elements bloom?
Let's open Pandora's box and see whether what is inside is truly evil. We will take
DocBook, a known vocabulary for documents (books, articles, etc.), and turn the markup into
a set of Web Components. We will demonstrate how easy the transliteration is to perform and
show a few interesting results.
The DocBook Web Component
Turning any arbitrary XML document into an HTML document as a Web Component requires
three essential steps:
Prefix every element with a constant prefix and hyphen that can be associated with
the element's namespace.
Develop stylesheets, templates, and scripts that encapsulate the desired semantics.
Wrap the document in the minimum amount of HTML bootstrapping necessary to deliver
the Web Component to the browser.
For example, in the specific case of DocBook, we would do the following:
Transform the document by changing every DocBook element name to a name with a
db- prefix with no namespace. Also, copy any MathML
or SVG to the output and pay specific attention to the serialization (HTML without
namespace or XHTML with a namespace).
Implement Web Components for common constructions like xref,
mediaobject/imageobject/imagedata, link, etc. and develop CSS stylesheets for the
rest. Package this component as a single document (see Figure 7).
Wrap the document in the minimum markup (see Figure 6).
In addition, we'd like to retain some aspect of identity of the namespace from the
original XML. To do so, we will add an RDFa [rdfa]
typeof attribute on the root element whose value is the namespace URI. This
will allow a consuming application to identify the custom element by type rather than by a
fixed prefix. Hence, on the root custom element for DocBook (e.g. db-article),
a typeof attribute will contain the value http://docbook.org/ns/docbook.
This process was implemented using the simple XProc [xproc] pipeline
shown in Figure 8 where the transformed document is inserted in
the wrapper (see Figure 9) as a replacement for the content
element. The transformation is simply a set of renaming rules with the two main rules
shown in Figure 10.
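The renaming itself is simple enough to sketch outside of XProc and XSLT. The following JavaScript is an illustration only, not the pipeline described above: it works on a hypothetical in-memory tree of `{ ns, name, attrs, children }` objects rather than on real XML, and the names `transliterate`, `PREFIXES`, and `PASS_THROUGH` are assumptions made for the example. It applies the same two ideas: prefix every DocBook element name with db- and drop the namespace, and record the namespace URI in a typeof attribute on the root.

```javascript
const DOCBOOK_NS = "http://docbook.org/ns/docbook";

// Namespace URI -> custom element prefix. Only DocBook is mapped here.
const PREFIXES = new Map([[DOCBOOK_NS, "db"]]);

// Vocabularies the browser handles natively are copied through unchanged.
const PASS_THROUGH = new Set([
  "http://www.w3.org/2000/svg",
  "http://www.w3.org/1998/Math/MathML",
]);

// Return a renamed copy of the tree; the input is not modified.
function transliterate(node, isRoot = true) {
  if (typeof node === "string") return node; // text content is kept as-is
  if (PASS_THROUGH.has(node.ns)) return node; // SVG and MathML subtrees
  const prefix = PREFIXES.get(node.ns);
  const out = {
    name: prefix ? `${prefix}-${node.name}` : node.name,
    attrs: { ...node.attrs },
    children: (node.children || []).map((child) => transliterate(child, false)),
  };
  // Retain the identity of the original namespace on the root element.
  if (isRoot && prefix) out.attrs.typeof = node.ns;
  return out;
}
```

Given an article element in the DocBook namespace, this returns a db-article element whose typeof attribute carries the namespace URI, with every DocBook descendant renamed in the same way. Serialization concerns, such as HTML versus XHTML output for embedded SVG and MathML, are deliberately left out of the sketch.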
In terms of what these custom elements might provide to a user, some behaviors for
DocBook that require scripting are:
Links (e.g. link or xref).
Auto-numbering of sections, figures, etc.
Display of media objects (e.g. imageobject/imagedata).
Generated text for cross references (e.g. turn xref into "Figure 2.1 ...").
Auto-generation of a table of contents and other navigation.
Syntax highlighting in programlistings and other code.
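Auto-numbering and generated cross-reference text are largely independent of the browser APIs, so they can be sketched as plain functions. This is a minimal sketch under stated assumptions, not the implementation that was tested: `numberSections` and `xrefText` are hypothetical names, the tree is a simplified structure of `{ name, id, children }` objects, and only db-section elements that are direct children of the root or of another section are numbered. Figures could be handled the same way with a separate counter.

```javascript
// Label each db-section with its dotted position ("1", "1.1", "2", ...).
// Returns a Map from element id to label.
function numberSections(root) {
  const labels = new Map();
  function visit(node, path) {
    let count = 0;
    for (const child of node.children || []) {
      if (child.name !== "db-section") continue; // only direct section children
      count += 1;
      const childPath = [...path, count];
      if (child.id) labels.set(child.id, childPath.join("."));
      visit(child, childPath);
    }
  }
  visit(root, []);
  return labels;
}

// Generated text for a cross reference to the element with the given id.
function xrefText(labels, linkend) {
  return labels.has(linkend) ? `Section ${labels.get(linkend)}` : "???";
}
```

In a component, the labels would be computed once the document has been constructed and recomputed whenever sections are added or removed, which is part of the life cycle handling discussed below.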
These features were implemented and tested in Chrome (the only browser currently implementing Web
and a 67 line HTML document with none of these resources having been compressed or
otherwise optimized. The implementation also includes highlight.js via the
HTML import and programmatically adds MathJax [mathjax] for rendering mathematics.
At present, there are some notable issues implementing a set of Web Components and
MathJax was not able to be included via the import. The method it uses to
determine the base URI cannot find the script reference in the imported document.
MathJax isn't HTML import aware at this point in time. As such, the
scripts and stylesheets that MathJax adds aren't hidden in the imported document but,
instead, are programmatically added to the importing document.
Implementing links was harder than expected. Just associating the prototype
HTMLAnchorElement with the element does not induce some minimal
linking behavior. Further, using a template that wraps the content with an HTML
anchor in the Shadow DOM is more complicated as there is no way to automatically copy
attributes (e.g. the URI in the href attribute) and some default
behaviors (e.g. a mouse pointer) aren't automatic. Further, clicking had no effect
and a custom event handler had to be added.
The division between the stylesheet within each template and the overall
stylesheet is a bit tricky.
There is a lot more to be done to handle the full life cycle of the elements. That
is, if other scripts manipulate the custom elements in
situ, the components (e.g. the auto-generated navigation) may need
to update themselves.
Web Components can also be used within other browsers by using the Polymer Platform polyfills of the
Web Components specifications for the Firefox, Safari, and IE browsers. Unfortunately, at
this time (July 2014), this library fails to work with the DocBook example:
Firefox crashes almost immediately. This seems to have something to do with the
generation of the table of contents navigation.
The Evolving Web
Web Components is a promising technology for delivering packaged semantics for general
markup. It succeeds in many places where previous attempts with XML in the browser
failed. That it is somewhat of a reality today is all the more exciting.
Yet, the mechanisms by which a browser or resource consumer can recognize the use of a
particular set of custom elements are fraught with problems. The inability to identify the
prefix used in constructing the element names, associate that prefix with some URI, and
protect content from collisions with other custom elements is going to be an immediately
painful experience. Authors and publishers will want to mix content from different sources
outside of their control and custom elements will make that increasingly harder.
XML has a partial solution for identifying and uniquely naming elements to avoid
collisions. Yet, that solution allows arbitrary complexity without sufficient gains in
functionality and was rejected by many in the various Web developer communities. Yet, one
can't help but feel like a colon was swapped for a hyphen and we lost something in the process.
In the end, Web Components lets us deliver XML documents, transliterated, and packaged
with their semantics. The mechanisms of the Shadow DOM and scripting allow the markup
for rendering to have an interactive and integrated mechanism for live manipulation within
the browser. HTML imports and templates enable packaging of these semantics into reusable components.
Even though Web Components, HTML5, and scripting aren't necessarily how we all may have
imagined XML on the Web in 1998, their combination is sufficient to accomplish real work
with markup within the Open Web Platform. The Web has evolved and XML may be evolving
with it. It is a reality that we affectionately call the Prague
He put on his skis, straightened himself up, and remained standing there for some
time; as he pulled on his mittens he took one glance homeward. He could just make out
the house in the dim distance. Then the whiteness all around it thickened—rose up
cloud—seemed to be piling in. ... Perhaps it wasn't so dangerous, after all. The wind
had been steady all day, had held in the same quarter, and would probably keep on
Oh, well—here goes!
On one of the hillsides stood an old haystack which a settler had left there when he
found out that the coarse bottom hay wasn't much good for fodder. One day during the
spring after Hans Olsa had died, a troop of young boys were ranging the prairies, in
search of some yearling cattle that had gone astray. They came upon the haystack, and
stood transfixed. On the west side of the stack sat a man, with his back to the
mouldering hay. This was in the middle of a warm day in May, yet the man had two pairs
of skis along with him; one pair lay beside him on the ground, the other was tied to his
back. He had a heavy stocking cap pulled well down over his forehead, and large mittens
on his hands; in each hand he clutched a staff ... To the boys, it looked as though the
man were sitting there resting while he waited for better skiing ... His face was ashen
and drawn. His eyes were set toward the west.
— Giants in the Earth: A Saga of the Prairie, O. E. Rölvaag (1924)