Balisage Paper: An Adventure with Client-Side XSLT to an Architecture for Building Bridges with JavaScript
Will Thompson leads software development on core technologies for O’Connor’s Online,
the web-based legal research platform that complements the company’s expanding library
of mostly home-grown (and mostly Texas-based) legal books. Will works on a wide array of
software, from back-end editorial toolchains to customer-facing search engines.
This paper describes the development process we undertook to extend the capabilities of
an XML-based authoring and publishing system. Originally designed to deliver content to
print and the web, the system was transformed into one that delivers fully interactive web-based
wizards whose steps are generated automatically based on logic encoded into the source
documents. To meet our requirements for the application, we rejected conventional
single-technology approaches in favor of a combination of XSLT and JavaScript that draws on the
strengths of each.
Despite being underutilized as a client-side technology, XSLT is still a valuable tool
in the development of modern web applications. Its expressive nature, continuing support in
browsers, and ability to integrate with a modern virtual DOM-based user interface
allowed us to build a complex legal forms application more simply and more productively
than more conventional approaches would have allowed. Our application demonstrates opportunities
for client-side XSLT that have potential beyond legal forms, and for an architecture with
implications beyond XSLT.
Two decades ago, XSLT was a rising client-side star. XSLT, Java applets, and an array of
other client-side technologies held great promise to fulfill dreams for the Web that HTML
alone could not. Yet given the broad enthusiasm, not one of these technologies prevailed; none even survived in the mainstream.
It would not be surprising for a single winner to emerge from a field of competing
technologies, but it should be at least somewhat surprising that everything else simply
faded away. The client seemed headed for a polyglot development environment, but a monoculture evolved instead,
and JavaScript was what remained.
Like Java, XSLT has lived on as a server-side technology, albeit in more specialized
environments. Unlike Java, however, client-side XSLT is not forbidden, just forgotten.
Remarkably, browser-native XSLT implementations have survived decades of rapid technological
change, leaving traces of a future unrealized.
XSLT in the client is of course not entirely forgotten. The vital necessity of XML
motivates an ecosystem of XML-centric tools, some of which extend into the browser.
Certainly something must have been lost in siloing these innovative communities. We
look back two decades and imagine a different future, one where multiple languages and
technologies survived, stimulating virtuous cycles of interaction. Two decades in tech is an
especially long time, but are we really that far from that possibility now, and were we ever?
Throughout the long and somewhat winding road to implementing a complex new feature, we found
justification, both practical and philosophical, not only for client-side XSLT itself but for the
architecture that grew up around it, and we demonstrate, generally, the benefits of pursuing bridges across these divides.
Improving an application
One of our main product categories as a legal publisher is template-style forms for
attorneys, which have undergone only a limited evolution in their path to electronic delivery. We
have traditionally delivered forms to customers as printed books of forms and subordinate
forms for the customer to fill in and manually assemble into a complete document. More
recently, we launched a subscription-based online service to deliver all of our content over
the web. A departure from the book-oriented concept, the online service joins all of our
content together, so users can search across the complete corpus, browse content organized
by topic or practice area, annotate, and download forms.
Despite the improved convenience of finding and using digitized forms, the fundamental
format of the forms themselves remains print-oriented. The forms include several variations
for each section, from which the user selects the most relevant. They are modular, with
written instructions describing conditions under which to include other relevant subforms.
It is up to the user to find and fill in blank fields, many of which repeat throughout a
form, select the appropriate sections, and assemble all of the necessary subforms into a
master document. As users generate collections of practice-specific forms, they are also
responsible for maintaining their personal collection of forms for future reference.
Unlike our other content types, with forms we aim to provide the user not just
relevant information, but a path to a completed document. Our goal in improving our
forms application was to bring the user further down that path, shifting the overhead of
managing field data and conditional form structures into the software. We wanted to deliver a
streamlined, interactive experience.
Of the many systems already built for assembling documents interactively, we observed
two common concepts. The first is a document-centric user interface, in which interactive
elements are inserted or overlaid to help the user identify what to complete in a document
view. The second, known as the wizard (or assistant) pattern, is an abstraction from the
document that transforms it into an ordered sequence of steps.
Simpler documents are easily presented in WYSIWYG views, similar to our print documents,
replacing empty fields in the print document with interactive fields on screen. Entering
information into one field automatically updates corresponding fields throughout the document.
This approach is simple and elegant; however, it fails to scale as the rules for
generating a document become more complex. Specifically, as the amount of conditionally
included content increases, changes to variables with dependencies may have broad, cascading
effects across a document. The simplicity of the document-centric user interface begins to
break down as it becomes harder to visually convey these cascading effects or their
underlying dependencies. The document is interactive, but unstable and cluttered.
Changes may occur anywhere in the document (above or below), and without visual cues or
directives to compensate for the volatility, reasoning from one change to the next becomes
more nebulous, manifesting as cognitive overhead for the user.
Using a wizard, the larger task is simplified by automating the work of breaking down a
complex decision tree into linear, ordered steps. Changes resulting in cascading effects are
simple to handle and in most cases are completely hidden from the user. Because the
interface only presents a single step at a time, inputs resulting in drastic structural
changes to the document are not automatically thrust upon the user.
If changes suddenly reveal new incomplete content, the new content is processed into
steps, which can be appended to the active sequence without disturbing the user. The wizard
is thus able to maintain linear user-interface complexity even as the underlying problem grows non-linearly.
Providing this interface for documents, however, subdivides the problem of generating a
user interface into two problems. Before the graphical part of the interface can be generated
from steps, it must first be possible to automatically generate steps from an incomplete
document. The complexity problem that manifests in the user interface in the WYSIWYG approach is
hidden, but not eliminated. A software component is needed to transform incomplete documents into wizard steps.
Faced with the trade-off between building this new component and dealing with the
inadequacies of a document-centric UI, there was very little to consider. Many of our forms
are dozens of pages long and highly variable, and it was clear that only the wizard
interface would be able to scale to meet our needs.
Authoring system integration: We had already heavily customized Arbortext Editor to
author our complete library of products, each of which is marked up using an in-house
schema. Our authors would be responsible for encoding the data required by the form
automation and for defining relationships between the data and the structure of the forms, so it was
crucial that the authoring for automation integrated well into our existing authoring
and publishing workflow. The solution must use our existing authoring tools and schema.
Online product integration: The solution must integrate with our existing online
content platform and with any new features for managing user data. The customer
experience of interacting with forms should be streamlined, but none of the existing
functionality should be removed. The full text of the forms must still be viewable and
searchable online. The fully assembled, filled-in forms must be downloadable in editable
word processor formats. Additionally, data entered by the user in other applications
must be available to use in forms.
UI/UX consistency: The user interface and user experience should fall in line with
the conventions set by our existing applications. It should be fast and responsive, and it
should look and feel modern to customers.
Sensible: It must be feasible for our small team to implement and maintain the
entire system. The responsibilities of the software would be broad and complex,
including extensions to authoring tools, the automation engine backend, a sophisticated
user interface, and integration with our other applications. Its scale was daunting.
A third-party solution
The scope of work to build the system we envisioned was immense, so we looked to
third-party products that might reduce that scope to something more suitable for our small
team. Our research led us to a product that provided an end-to-end solution, from document authoring
application to web server. It was modular and customizable enough to supply the components
that we did not already have, and its documents were XML, so it seemed feasible to integrate
its automation and UI components with our own authoring tools.
The authoring environment became the first focus of our integration effort, and we began
with the goal of substituting their editor, designed specifically to work in their system,
with ours, using an in-house composite schema that overlays automation logic over our existing markup.
The composite schema became the canonical format for these documents. Where in our
previous system print and authoring shared the same schema, now print would become another
downstream target, like our online application, treating the canonical source as an intermediate format and relying on pipelines to
transform documents into application-specific target schemas. We
also built an entirely new pipeline for the automation workflow, executed in XProc, to
generate the documents needed by the third-party automation engine.
The integration added extra steps and rough edges to the authoring and publishing
workflow, but it was successful to the extent that we were able to work using our own tools
and automate a wizard using the third-party application. With the authoring back-end up and
running, we began encoding documents and testing automation, and our software development
efforts shifted focus to integrating the wizard into our vision for the customer-facing product.
The automation engine was well equipped to accommodate extension and customization through
APIs and extensibility hooks, and we intended to augment the wizard with new features and
integrate it into an array of tools that would make up the complete document automation and
data management system. This meant writing a number of software extensions to manage
connections between the wizard system and our tools and database. We advanced quickly to a working
prototype; however, as we transitioned from application plumbing to refining details, a
serious flaw in our plan became clear. Out of the box, the wizard’s features came
close to what we needed, but we found ourselves pushing the limits of the software’s
extensibility interfaces past their intended use case.
As the integration advanced, it accumulated layers around the wizard engine, which
remained mostly opaque, as our insight was limited to verbose and unfamiliar log files.
Our team had grown accustomed to an agile process of quick iterations, but as we moved from one
iteration to the next, each step became more of a slog. We found ourselves, in effect, trying to use a
fully-featured application as if it were a software library. The result was that ordinary
refactoring and design updates snowballed in difficulty as the connective components accumulated,
causing debugging to become unwieldy, and productivity dwindled.
This led to a significant shift in our expectations. The benefit of using the third-party
engine became uncertain, and it was conceivable that we could build our own engine
in the time
it might take to complete the integration. We made a hard decision to shift course
and set the
integration project aside to develop our own engine.
We had sunk many hours into the integration process, but discarding the automation engine did
not send us back to the drawing board. The Arbortext customizations, composite schema, and
pipelines would all be useful scaffolding for the development of our own system, and the
experience led us to shift our priorities towards minimalist architecture and faster iteration.
While we were happy to unburden ourselves of the responsibility for both working around
and integrating a third party engine into our existing system, in its place was the
task of green-fielding an entirely new replacement. However, it was an opportunity to revisit
many aspects of the previous system with which we were dissatisfied.
Despite the fact that we were pushing the previous system beyond its limits, the process
left us squeamish about the prospect of building yet another ungainly system. We therefore
felt it was important to prioritize certain high-level design goals to safeguard against that outcome.
In an effort to reduce dependencies and maximize flexibility as we developed the system, we
chose to emphasize loose coupling across major components and to build the automation engine
specifically with a focus on efficiency and portability. Practically speaking, we wanted to
avoid redoing work, so it seemed prudent to protect the parts of the system with the greatest
exposure to changes.
As we thought about designs for a wizard engine, its overall task had a state-machine feel:
form documents would contain references to missing data (some inline, some the result of
satisfying a condition), and given new data, a document would be transformed to either contain
new references to data that needed to be requested, or it would be complete.
Considering that our documents were already XML and that generating wizard steps from a
document would frequently require walking document trees, the problem seemed like a natural fit
for XSLT. The obvious alternative was JavaScript, which is optimized to run in web browsers but, through Node.js and other similar implementations,
has grown into a very capable cross-platform server environment; its reach
has expanded to the point that it would be hard to imagine a more portable language.
However, this component would be the bedrock of our overall application, so it was
especially important that the language we chose align well with the task we needed to
accomplish, and we shared the opinion that “... in the areas where XSLT is strong, [JavaScript]
is at its weakest. Simple document transformation tasks … are painfully tortuous.”
[Delpratt and Kay 2013a]. In fact, not only is XSLT widely supported in server environments, but
XSLT 1.0 processors are available in every part of our stack, with Saxon available if
XSLT 2.0 or 3.0 is needed. We therefore sought to develop our new engine using XSLT, to
capitalize on its idiomatic approach to XML processing and because it satisfied our portability goals.
The three environments in our technology stack, MarkLogic Server, Microsoft .NET, and the
web browser, all include XSLT processors. Because the application is highly
interactive, we preferred a client-side solution to reduce architectural complexity and
network chattiness. Running the automation engine in the browser would reduce the number of
round-trip requests to the server, increasing the responsiveness of the application while
reducing server load. We did not yet know whether that would be feasible, either using
browser-native or third-party XSLT processors, but we thought the advantages made it worth pursuing.
Next, this application would be one of several that make up our online platform, and we
had to consider how it would fit into our development and maintenance process. It is a
universal priority that our product UIs look and feel modern to customers, so despite our
reliance on XML technologies to deliver most of the functionality in our products, the user
interface needed a level of polish that we think is commensurate with the quality of our editorial content and
formatting. Although a small but useful ecosystem of web frameworks and libraries for building
interactive UIs with XML exists, including Saxon-JS, XForms, and more, we predicted that, in
light of our existing software stack, the conveniences gained from developing a new UI in
native XML technologies would probably be overwhelmed by the burden of managing an additional
and more esoteric library. So we added another constraint: the XSLT implementation must be
able to integrate with a mainstream JavaScript UI framework.
Finally, one can only evaluate the suitability of a design to the extent that it is
possible to know the scope of its requirements, and throughout the development of the
application we discovered new requirements and unforeseen complications. What initially seemed
like a straightforward and intuitive document-oriented design gradually increased in
complexity. Far into development, this led us to question the suitability of our design and
our choice to use XSLT to implement it. We will demonstrate how we eventually determined that
the best way to overcome our complexity barrier was by rethinking our approach to the problem
rather than by changing the technology.
Our ideal solution would run XSLT in the browser, but we did not know whether the browser
processors would be able to
meet our requirements. To our surprise, we discovered that developing for browser-native
XSLT 1.0 processors was not only feasible but much less difficult than expected, and it came with
significant performance benefits.
We first researched the state of browser XSLT processors and found mostly old or
conflicting information, but there were only three third-party XSLT implementations to
consider: frameless.io, Saxon-CE, and Saxon-JS. We could not confirm that the frameless.io
processor was still actively being maintained, so it was ruled out. We were concerned that
Saxon-CE’s large library size would result in slow loading times [Delpratt and Kay 2013b] or that it might not perform adequately for our stylesheets, and its replacement, Saxon-JS, had not yet had its first major release. Given
our imperfect options, we theorized that native XSLT 1.0 processors should be simple to
adopt, considering they were already native to every component in our stack, fast, and
functionally adequate to build a prototype. Though we were not convinced it would be
possible to fully implement our application using only browser-native XSLT
implementations, we decided that because little work would need to be redone if we changed course,
it was worth exploring, keeping Saxon-JS in mind as a fallback if we discovered browsers’
implementations were too broken, or if the limitations of XSLT 1.0 made it too difficult to continue.
XSLT 1.0 support
When we started the project in December 2016 we were not even sure if all desktop
browsers still included fully working native XSLT processors. We found results of compatibility
testing as recent as 2015 [Reschke 2015], and our own testing verified that the results
had not changed in current releases. Since the level of support had been stable for
a decade (despite an effort to remove XSLT support from Google’s Blink browser engine in
2013 [Barth 2013]), we felt cautiously confident in the stability of
browser-native XSLT processors going forward.
Desktop browsers would not be the end of the road for this project, however. While we
view mobile phones as an inadequate form factor for an application like ours, we think
tablets have the potential to be as productive as desktop computers. On Apple’s iOS and
Google’s Android platforms, mobile and tablet devices share essentially the same browser,
and it was important that we build on technology supported by both of those platforms to
avoid a dead end. We confirmed through our own testing that mobile versions of Chrome,
Firefox, and Safari have indeed reached the same standard of XSLT support as their desktop counterparts.
Having established XSLT 1.0 processor support in every major desktop and mobile
browser, we back-ported our prototype to XSLT 1.0 and set out to test the level of
completeness for each. None of the missing features in the browser processors were significant
enough to scuttle our experiment, and the same proved true of XSLT 1.0 itself. Though it
lacks the expressiveness and the convenience of later XSLT versions, we produced a
functional implementation of our design with minimal frustration.
Adjusting to limitations
Two specific limitations of the browser XSLT implementations necessitated adjustments
to module boundaries and responsibilities within our design. Ostensibly due to vendor
concerns about security, browser XSLT environments have been heavily restricted in their
ability to (1) make requests for external resources and (2) receive external input.
The first limitation stemmed from incomplete support for document() in
Apple’s WebKit and Google’s Blink XSLT processors. The Chrome security model does not
allow its XSLT processor to access any local or external resources, but we required our application to request external data and to submit user
progress to our server. We moved responsibility for executing these requests out of XSLT and
into the surrounding JavaScript, which had the added benefit of easier request handling for JSON data.
The second limitation, due to another years-old WebKit bug (pre-dating Blink and therefore still present in both Chrome and Safari),
prevented us from supplying external parameters to the XSLT processor in any type other
than a string. Without support in XSLT 1.0 for sequences or advanced string-processing
functions to deserialize string input, submitting new information through parameters was
impractical. Instead, we enabled the application to submit new data by storing inputs directly in the source XML document,
further extending the message-passing system.
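As a rough sketch of this approach (the markup below is hypothetical, not our production schema), the JavaScript layer merges each new answer into the source document before invoking the processor, and the stylesheet reads it with an ordinary path expression instead of an external parameter:

<!-- source document after the JavaScript layer injects user input (hypothetical markup) -->
<form-session>
  <user-data>
    <answer ref="grantor-name">Jane Smith</answer>
  </user-data>
  <form>
    <field ref="grantor-name"/>
  </form>
</form-session>

<!-- inside the stylesheet: read the injected value rather than an xsl:param -->
<xsl:template match="field">
  <xsl:value-of select="/form-session/user-data/answer[@ref = current()/@ref]"/>
</xsl:template>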
Other, smaller limitations were resolved by preprocessing our stylesheets. To work
around a broken xsl:import/xsl:include implementation in Safari, we created a new
stylesheet to merge the main stylesheet and its included modules into a single
file. Though Microsoft browsers support node-set(), their
implementation is in the msxsl (urn:schemas-microsoft-com:xslt) namespace instead of the
otherwise universal ext (http://exslt.org/common) namespace, so when the engine stylesheet
was requested by a Microsoft browser, we replaced it with the msxsl namespace, again using
a stylesheet to perform the transformation.
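To illustrate the first of these preprocessing steps, here is a simplified sketch of a flattening stylesheet (it assumes the included modules have no conflicting top-level declarations and ignores xsl:import precedence, which a production version would need to handle):

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- copy the main stylesheet unchanged... -->
  <xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
  </xsl:template>
  <!-- ...but replace each xsl:include with the included module's top-level declarations -->
  <xsl:template match="xsl:include">
    <xsl:apply-templates select="document(@href)/xsl:stylesheet/*"/>
  </xsl:template>
</xsl:stylesheet>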
Unfortunately, browsers are not friendly environments for debugging. Lack of support
for the only standard debugging feature in XSLT, xsl:message, was universal across the
browsers we tested. Microsoft Edge was the only browser that output any error logging to
the console at all. All other browsers failed silently.
Though the lack of certain XSLT 2.0 features was inconvenient, the losses were mostly
ergonomic, and we found reasonable alternatives. The most conspicuously absent features
relevant to the needs of our application were string-join(),
tokenize(), case-insensitive comparisons, range expressions, regular
expressions, and xsl:for-each-group. Examples of our XSLT 1.0 alternatives to
these features are listed below:
<!-- returns a set of N <n/> elements to replace [1 to N] -->
<xsl:template name="range">
  <xsl:param name="n"/>
  <xsl:param name="counter" select="1"/>
  <xsl:if test="$counter &lt;= $n">
    <n/>
    <xsl:call-template name="range">
      <xsl:with-param name="n" select="$n"/>
      <xsl:with-param name="counter" select="$counter + 1"/>
    </xsl:call-template>
  </xsl:if>
</xsl:template>
For regular expressions: process the text server-side and replace it with more template-friendly
XML. Some simpler regexes can be approximated using contains(),
substring-before(), and substring-after(), but we do not
advise that approach generally.
For xsl:for-each-group: double Muenchian grouping with a second, compound key.
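For illustration, a minimal sketch of Muenchian grouping with a compound key, using hypothetical element names (our production version layered a second grouping pass on top of this):

<xsl:key name="by-section-topic" match="question" use="concat(@section, '|', @topic)"/>

<xsl:template match="questions">
  <!-- visit only the first question of each (section, topic) group -->
  <xsl:for-each select="question[generate-id() =
      generate-id(key('by-section-topic', concat(@section, '|', @topic))[1])]">
    <group section="{@section}" topic="{@topic}">
      <xsl:copy-of select="key('by-section-topic', concat(@section, '|', @topic))"/>
    </group>
  </xsl:for-each>
</xsl:template>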
In our initial design, we considered the ordering of questions displayed by the wizard
to be a grouping problem, prompting a serious discussion about switching to an XSLT 2.0
implementation for its superior grouping support. The double Muenchian grouping technique
with compound keys, noted above, had become increasingly elaborate and difficult to
maintain, as we needed to order questions on several different levels.
This led us to step back and question assumptions we made about the problem we were
solving generally, and we determined that we needed to revisit our data model. As we
explain in more detail later in this paper, instead of focusing solely on the form
documents, which were mostly pertinent as input and output to the interview process, we
changed the model to center around the interview process itself. In our new model, we
could more easily manage relationships between data, which allowed us to maintain questions
in the correct order throughout processing and, incidentally, avoid the need for grouping
entirely. Working in XSLT 1.0’s more constrained environment forced us to acknowledge and
address flaws in our initial design that the added conveniences of XSLT 2.0 might have
allowed us to work around, and we arrived at a more robust solution sooner than if we had
pressed on with band-aid fixes.
The workarounds needed to achieve anything useful in multiple environments are real
impediments, but our judgment is that, relative to the work required to set up nearly any
modern programming environment, the scale is small. With awareness of these issues
upfront, getting up and running should take a matter of hours, not days. One
disappointment throughout this experiment, however, was the consistency with which
vendors have apparently de-prioritized fixing XSLT bugs. Most of the bugs have lain
dormant in issue trackers for years, many including the tag “WontFix.” However, that is
not universally true. From the time we started this project to the time we began drafting
charts for this paper, Microsoft fixed the Edge bug preventing the use of xsl:import. Progress is possible.
The results of our performance testing strongly affirmed our decision to stick with
browser-native XSLT processors, to a remarkable degree. We had not expected desktop
browser processors to perform comparably to Saxon-EE, but they did and even reliably
outperformed Saxon-EE for one of our test cases. We tested Saxon-JS with SEFs generated
using Saxon 9.8 and found that it was at best an order of magnitude slower than the browser-native processors.
Note: Saxon-JS Update
Before publication, we shared our tests and results with Saxonica. After investigating our
workload, they attributed the disparity in XSLT evaluation times primarily to repeated copying of
DOM subtrees and assured us that they are working to address this in future versions of Saxon-JS.
Even mobile browsers’ native XSLT processors returned results that,
while slower than desktop equivalents, were acceptable for our application. Using
client-side XSLT 1.0 provided not only a more minimal architecture and lower server
round-trip latencies but comparable, and sometimes much faster, processing times than any
other solution we considered.
Note: Testing environments
"macOS 10.13 notebook" hardware: MacBook Pro (Retina, 13-inch, Early 2015), 16GB
RAM. "Windows 10 desktop" hardware: Dell OptiPlex 7040, 3.4 GHz Intel Core i7 (4-core),
16 GB RAM. Saxon-EE tests were run using Java 1.8.0. Test documents: Test 1: 980 kB
& complex XML, Test 2: 588 kB & simple XML, Test 3: 700 kB & complex XML,
Test 4: 189 kB & simple XML.
We had not expected our experiment using browser-native XSLT processors to be
ultimately successful. It would have been easy to assume that the combination of XSLT
1.0's limited feature set and an accumulation of browser processor bugs was enough to
disqualify the browser-native XSLT environment for any serious project, and that
perception is certainly real. But the breadth and complexity of our application and our
ability to achieve better performance with a more minimal architecture is strong evidence to
counter that perception.
The remaining challenge was to integrate the engine with a modern UI without adding unnecessary
complexity, either in the management of the UI or in the development of the automation engine. What counts as
modern in the context of JS technologies is a contentious subject and one with a moving
target. We attempt to define it as simply as possible, based on the observation that a
pattern has emerged in web development. Until recently, front-end web development was
dominated by monolithic two-way data binding frameworks like AngularJS or EmberJS
[Allen 2018]. Now, those large frameworks are giving way to preferences for a looser
mix-and-match approach to assembling tailor-made frameworks centering around a virtual
DOM-based view library like React or Vue.
The virtual DOM is an abstraction of the browser DOM, and its key innovation is its
ability to very quickly compare an updated virtual DOM to the current one, then calculate the
optimal steps to update the browser DOM based on the computed difference. This unburdens the
front-end developer in at least two important ways. First, it completely abstracts away
the problem of managing explicit browser DOM updates, a revolution for front-end developers
and its primary appeal. Second, and most relevant to our application, it enables us to
build UIs with functional architectures.
Virtual DOM libraries are intentionally narrowly focused architecturally on only the
view component, whereas the previous frameworks provided a full architecture like
MVC or MVVM. Several popular functional reference architectures have emerged to fill the
void, available as minimalist frameworks that wrap a virtual DOM library and manage
application state. They all share a common theme: strict unidirectional data flow, a
significant departure from the bidirectional data flow of the previous generation of
frameworks. The first such framework from the creators of ReactJS, Flux, demonstrates an
architecture that separates concerns using this new idea.
Because the virtual DOM engine will automatically apply surgical updates to the
browser DOM based only on what has changed in the view, developers are free to reason
about a view as if the page is being completely re-rendered every time. For some,
this design may seem familiar. If you go back to the early Web, pre-Web 2.0, this is
essentially how page interaction worked. A form was submitted, sending a payload of
parameters describing the action to the server, and the server built a complete webpage
and returned it to the user. Unidirectional flow was enforced automatically by the
limitations of the technology. Now, most of this process occurs in the client instead of
on the server, and it is designed to update the view at 60 fps, but the architecture is
similar because the fundamental assumptions we can make about rendering a view are the same.
What’s old is new again.
The Flux pattern, using a virtual DOM-based view, provides a functional pipeline for
transforming application state into a rendered page, and that pipeline can just as naturally
include a functional pipeline for processing XML data with XSLT. The architectural
equivalence gives us a bridge to unify these two-year-old and two-decade-old technologies.
Under the Flux pattern, Stores are responsible for the application state and logic,
performing updates and transformations in response to Action inputs from the view or from
external actions. Our XML documents represent the state of the document automation
process, a subset of the application state, and the XSLT engine is our logic for
transitioning from one state to the next, so Stores were the most appropriate injection
point for the engine.
We established a pattern for working with XSLT within a Store. The application
initializes by requesting a payload from the server that contains the document to be
automated and a set of initial user data. On the initial page load, the engine XSLT
transforms the document, using the user-supplied data, into a new state. Then the view
XSLT transforms the updated document into an XHTML element, which is appended to the
virtual DOM. Subsequent actions are handled by the framework, and upon re-entering the
store, changes to user data captured by the UI are marshaled into the user-data XML
element, and the XSLT pipeline is run again.
This pattern processes application state held as XML data in a JS-based framework without much ceremony,
and we use it to transform generic XML, which is hard to use directly in the JS environment, into
XHTML, which is usable directly in the view.
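As a rough sketch (with hypothetical markup, greatly simplified from our production stylesheets), the rendering step is just another set of templates, here turning a pending question into an XHTML form control that the store hands to the virtual DOM:

<!-- view stylesheet fragment: render a pending question as XHTML (hypothetical markup) -->
<xsl:template match="question[@state = 'pending']">
  <div class="wizard-step">
    <label for="{@id}"><xsl:value-of select="prompt"/></label>
    <input type="text" id="{@id}" name="{@id}" value="{answer}"/>
  </div>
</xsl:template>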
In a traditional implementation of the Flux pattern, however, no components of a
view would be rendered in a store, so one could argue that our design leaks abstractions
between the store and the view. Our justifications for this are practical. This
application will be responsible for rendering large preview documents, and XSLT is
simply much better suited for performing that work. Abstractions are imperfect, and
sometimes allowing a leak is the only sensible way to avoid overcomplicating a design
or accepting bad performance.
Before coming to this conclusion, we considered another pattern to give complete
control over rendering to the virtual DOM for tighter integration into the traditional
virtual DOM architecture, but we rejected it because it added complexity with only
marginal benefit. However, the trade-off may be appropriate or necessary when composable
component hierarchies using combinations of XSLT-generated and virtual DOM-native
components are needed—not possible with XHTML-rendered components—or where it is
important to take advantage of virtual DOM-rendering performance optimizations.
The goal of this pattern is to render the view using only virtual DOM-native
components, so the XHTML-rendering step is replaced by a step that transforms
view-relevant XML data into JSON. Each virtual DOM library has a slightly different approach to rendering,
but they have in common supplying JSON properties as input to a functional
transformation that renders a component. Compared to an XSLT transformation, properties
are analogous to an input document, and the component (in some virtual DOM
implementations, literally a function) is analogous to the XSLT. The JSON output can
either be used wholesale to render the component, or it can be further reorganized
to maintain a greater separation between data models and view models.
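A rough sketch of that replacement step, again with hypothetical markup (a real version needs proper JSON string escaping): a text-output transformation serializes the view-relevant XML into a properties object, which the JavaScript layer parses and passes to the component:

<xsl:output method="text"/>
<!-- emit a JSON properties object for each question (hypothetical markup, no escaping) -->
<xsl:template match="question">
  { "id": "<xsl:value-of select="@id"/>",
    "prompt": "<xsl:value-of select="prompt"/>",
    "answer": "<xsl:value-of select="answer"/>" }
</xsl:template>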
A polyglot web
Modern front-end frameworks have converged on patterns resembling early web architecture, and we
found this useful as a bridge to another artifact of the early web, client-side XSLT. Using this
architecture, we were able to bridge the two environments without breaking boundaries in ways that
would have complicated overall application development and maintenance. But we see implications for
its use beyond XSLT. With the recent adoption of the WebAssembly standard by major browsers, we are hopeful for a future where language symbiosis resembling server-side polyglot environments, like the Oracle JVM and the Microsoft CLR, can flourish in the browser.
Many of the difficulties we encountered working with native client-side XSLT were not the result
of limitations of the language or the environment, but of a more fundamental design
problem. We assumed that processing documents in their canonical form would be the simplest
and most idiomatic solution, and successfully working under that assumption in the early stages
of prototyping led us to establish a design centered on the wizard’s final output, the
completed form, rather than the information needed to process the wizard itself. As the
application and its requirements expanded, those assumptions failed, and it eventually
became clear we needed a different model. After overhauling the design to reflect that
insight, we were finally confident that we had arrived at the appropriate idiom for
our document transformation problem, affirming our choice to implement the engine in XSLT.
The canonical documents were isomorphic to our initial conception of document processing:
walk the document tree, applying known information from other documents, and stop when new
information is needed, just as a human would. We were satisfied using them to manage the
state of the interview through our first stages of requirements. But these documents did not
explicitly model relationships between data (the form, questions, and their answers lived in
separate self-contained data structures), and those relationships had to be reestablished on
every iteration of the processor. As a direct consequence of this separated design, we
employed increasingly complicated strategies to avoid repeating significant amounts of
processing at every step. But the engine ultimately wasn’t transforming an incomplete form
into a completed form; it was transforming an incomplete form into an interview, and we
needed sensible and distinct data models for each.
We understood from the outset what problems could arise from applying a document-centric
model to a UI, but it wasn’t until much later in development that we understood we had
fundamentally the same problem inside our engine. As we needed to support higher variability
within a form document, it became harder to step from one final form state to another. The
bulk of the time spent processing each answer went to handling cascading effects, and this
approach scaled poorly with the complexity of the forms. We needed to shift the work
upfront, transforming our data into a structure that would allow quick handling of new
information and keeping all related data in one place.
Our document processor is supplied three types of input: a form document, questions, and
answers. The engine assumes the form document includes all conditional content and the
questions needed to satisfy them. As the interview progresses, answers are added to the set
of documents, and the form document is reevaluated to discover new references to fields and
conditionals. The central objective of the form processor is to identify and resolve these references.
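A heavily simplified, hypothetical illustration of the three inputs (not our production schema):

<!-- form document: content plus references to fields and conditions -->
<form>
  <para>Grantor: <field ref="grantor-name"/></para>
  <conditional test="has-cosigner">
    <para>Co-signer: <field ref="cosigner-name"/></para>
  </conditional>
</form>

<!-- questions: how to request each referenced value -->
<question ref="grantor-name" prompt="Who is the grantor?"/>
<question ref="has-cosigner" prompt="Is there a co-signer?"/>

<!-- answers: accumulated as the interview progresses -->
<answers>
  <answer ref="grantor-name">Jane Smith</answer>
</answers>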
In the original design of our application, these inputs doubled as the data model, but
they were a poor reflection of that central objective. As the processor walked the form
document tree, it needed to re-evaluate conditions and identify form references to determine
what content was included.
The dependency relationships between the references, that is, which references controlled the
visibility or evaluation of which other references, were not explicit. As a direct
consequence, the information needed to respond to each reference’s discovery or resolution
required tracing recursively through three different data structures, a level of
indirection that was painful to manage at run time.
As we expanded our prototype to meet the demands of more sophisticated documents, the
application became unwieldy, and implementing new features became disproportionately
difficult. The design had become hard to understand and sluggish, and we still had more
features to implement. Significant changes required carefully working around several
elaborate features, such as the grouping problem described above. The need to trace
cascading changes after new answers were provided to previously answered questions proved
to be the breaking point. That change would have required building a new dependency-tracking layer on top of an
implementation already suffering from complex layers of indirection. Suddenly we were
experiencing déjà vu. What had begun as a liberating and productive process, without the
baggage of a third-party system, had declined into a slow and frustrating affair. Our design,
again, needed to change. It was clear that the dependency-tracking information we
needed for tracing changes was so fundamental to the objective of the processor that the
data and processing model should be redesigned, instead of expanding the run-time patchwork
of indirection built on top of our current one. The data model should target an idiom
centered around the information-gathering process, not the form document.
To redesign the data model around references, we inverted the documents in a
post-authoring step. The new data structure formed a sort of stack, with elements to
be processed sequentially to trace their dependencies, inserting newly discovered dependencies
after their references in the stack.
Figure 18 illustrates the issue that finally motivated the redesign of our data model:
handling the cascading effects of changed answers. Working with our original data model, if a user
changes their answer to A to value y, there is no way to determine that B and C are no
longer required without reprocessing the entire form. By focusing only on the relevant portion
of the document, the problem becomes simpler and avoids repeating a great deal of work. The
ordering requirement on the reference stack ensures that preceding siblings of reference A
will not be affected by A’s changed answer. Only the references between A and the next
unconditional reference need to be reevaluated.
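A hypothetical sketch of the inverted structure for this scenario, in which conditional reference A controls B and C while D is unconditional:

<!-- reference stack (hypothetical markup): dependencies follow the references that control them -->
<reference-stack>
  <reference id="A" type="conditional">
    <controls idref="B"/>
    <controls idref="C"/>
  </reference>
  <reference id="B" type="field" depends-on="A"/>
  <reference id="C" type="field" depends-on="A"/>
  <reference id="D" type="field"/>
</reference-stack>

Because dependencies are stacked immediately after the references that control them, a changed answer to A invalidates at most the entries between A and the next unconditional reference (D); everything before A is untouched.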
One of our early stated principles was to maintain simplicity in our design for the
document engine, but we had misjudged the implications of avoiding a transformation from the
canonical document format into an engine-specific document. This was quite ironic,
considering that document transformation is the most powerful aspect of the language we had
chosen to work in. Compounding the irony, our missing insight about data modeling was
something that had been obvious to us about the UI all along: the structure of the form
document was only incidental to the problem we needed to solve. Focusing on the progression
of information, rather than its consequences to the final document, eliminated the
complexity that led us to question our choice to write the engine in XSLT, and
post-redesign, we had an even stronger case for its use.
Over the last twenty years, the Web has grown increasingly separated from its early roots,
but it is now converging on architectures that are fundamentally capable of bridging the gap
between languages. Out of a desire to test that theory, and having a suitable use case for it,
emerged a great opportunity to put browser XSLT engines through their paces, and we found them
nearly as capable as ever!
We believe there are strong incentives for combining technologies, each with their best foot
forward, wherever possible. The emerging WebAssembly standard is being adopted quickly by browsers and
promises to bring dozens of new languages and ecosystems into the fray. Years of virtuous growth
may have been lost by separating the XML and Web communities, but it seems possible that the
divide can still be bridged.
The prospect of a future unconstrained by browser technology also underscores the importance of
a lucid understanding of the problems we are trying to solve. We are taught to empathize with
the end users of technology who blame themselves for its flaws. But maybe as developers we
should be more hesitant to adopt that perspective. The problem might not be the tool; it might
be how we are using it.
Several convenient methods exist to generate JSON from XML. Badgerfish is a popular
convention, and implementations for both XSLT and JavaScript are available, giving you the
choice of which side of the fence the conversion is best suited to. Similarly, the JsonML
format is designed for round-tripping between XML and JSON, with implementations in JavaScript and
other languages. The conversion can also be done natively in XSLT 3.0, without the
need for any libraries [Kay 2016].
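For instance, XSLT 3.0 can serialize the standard XPath 3.1 map/array vocabulary with xml-to-json(); a minimal sketch:

<xsl:variable name="payload" as="element()">
  <map xmlns="http://www.w3.org/2005/xpath-functions">
    <string key="title">Deed of Trust</string>
    <boolean key="complete">false</boolean>
  </map>
</xsl:variable>
<!-- yields: {"title":"Deed of Trust","complete":false} -->
<xsl:value-of select="xml-to-json($payload)"/>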
Denicola, Domenic. “Non-Extensible Markup Language.” Presented at Symposium on
HTML5 and XML: Mending Fences, Washington, DC, August 4, 2014. In Proceedings of the Symposium
on HTML5 and XML: Mending Fences. Balisage Series on Markup Technologies, vol. 14 (2014). doi:https://doi.org/10.4242/BalisageVol14.Denicola01.
Delpratt, O'Neil, and Kay,
Michael. “Interactive XSLT in the browser.” Presented at Balisage: The Markup Conference 2013,
Montréal, Canada, August 6 - 9, 2013. In Proceedings of Balisage: The Markup Conference 2013.
Balisage Series on Markup Technologies, vol. 10 (2013). doi:https://doi.org/10.4242/BalisageVol10.Delpratt01.