Graphical user interfaces in the X stack

Zahra Al-Awadai

Technical University of Munich (TUM)

Anne Brüggemann-Klein

Technical University of Munich (TUM)

Christina Grubmüller

Technical University of Munich (TUM)

Philipp Ulrich

Technical University of Munich (TUM)

Copyright ©2019 by the authors. Used with permission.


Preliminary Proceedings


Balisage: The Markup Conference 2019
July 30 - August 2, 2019


As we have claimed before [B16,ABCES17], current XML technologies provide a full stack of modeling languages, implementation languages, and tools for web applications that is stable, platform independent, and based on open standards. A particular strong point of what we call the X stack is that data are encoded with XML end-to-end and that XML technologies can be used wherever XML data need to be processed.

We are interested in the X stack for web applications for three reasons. First, its practices and techniques support development processes that are driven by models, particularly by domain models [E04]. This is relevant in the context of a research agenda of generating instances of serious games and learning analytics. Second, the complete X stack serves as a vehicle to teach XML technology to students through the backdoor of engaging web applications such as games. The austerity requirement we impose on students in a lab course on XML technology, namely to implement a game on the web with XML technology alone, no JavaScript or frameworks allowed, forces students to become proficient with SVG, XQuery, XSLT and other XML technologies. It also reinforces their knowledge of software engineering principles and teaches basic web application architecture that is not clouded by some specific and potentially short-lived framework. Third, the practices we suggest with the X stack pave the way for XML experts to develop XML-based applications on the web.

Graphical user interfaces (GUIs) are crucial components of applications with which users interact directly. They present information about the state of the domain entities in the application and provide methods of interaction to manipulate them. In this paper, we investigate a spectrum of GUI technologies in web applications and how they fit into the X stack. We explore a number of technologies that independently address some aspect of a GUI in the X stack.

Throughout the paper, we provide code to illustrate the techniques that we introduce. We have also applied our principles and strategies in a case study, the game Guess the Number (GN), which is documented in a separate document that is available on request. This case study is intentionally kept simple, so that we can focus on principles without being distracted by more complex XML processing. We have student projects from the XML Technology Lab at TUM for Tic-Tac-Toe, Scissor-Paper-Rock, Blackjack, Memory, Mancala and the early GameX [SKB14] that follow the same principles as they evolved and that are technically more complex. The case studies demonstrate how end-user developers who are conversant with XML technologies can create their own web applications.

This paper is organized as follows:

After this introduction, section “Requirements for GUI technologies” briefly describes the responsibilities of a GUI component before homing in on our two main requirements for GUI technologies in the X stack that fit into a model-driven approach. The two requirements are that the GUI and the application core must communicate through XML-encoded declarative data and that the GUI component itself must be defined in a declarative manner.

We then investigate a number of GUI technologies that contribute to the two main requirements.

First, in section “XHTML with XForms and SVG”, we discuss XForms in the context of HTML and SVG. XForms is the prototypical GUI technology for our first requirement. We also investigate to what extent it supports the second requirement and how it compares to HTML forms.

One topic that has always been and still is present when discussing GUI technologies is a component-based approach that allows for composition and reuse of GUI components. The Web Components specification [WC19] brings the component idea to HTML by enabling web developers to define their own reusable custom HTML elements as components that are marked up like regular HTML elements and that encapsulate custom behaviour and style. In section “Web Components”, we present custom elements and their potential with respect to our main requirements. This prepares a later discussion on a custom element that we have developed called WebSocket Element that enables GUIs to handle WebSocket communication in the X stack for multi-client systems.

Modern web applications, particularly games, are often multi-client systems. Multi-client applications require communication patterns that let clients talk to each other or that allow a server to push messages to clients without prior requests, as supported by the HTTP extension protocol WebSocket. Multi-client applications require both servers and clients, read GUIs in our context, to support such a protocol.

In section “Server support for multi-client applications”, we briefly present work that supports the WebSocket protocol and that recently was taken up and extended by the team at BaseX. The main contribution of this paper, in section “WebSocket Element”, is to present the client side of the WebSocket equation. This is a declarative custom element according to the Web Components specification [WC19] named WebSocket Element that can be used in an HTML-based GUI just like a built-in HTML element. The WebSocket Element encapsulates code to initiate a WebSocket connection and to handle incoming declarative XML data that control the GUI. In the definition of a GUI, the WebSocket Element in its HTML surface form just declares the parameters for a WebSocket connection and how to handle XML data that arrive through the connection. Handling the incoming data can mean just to render HTML or SVG data or it may involve applying an XSL transformation and rendering the result. The WebSocket Element demonstrates how our two requirements for GUI technologies can be supported in the X stack for a multi-client web application.

GUI components are commonly considered to be event-driven systems whose functionality is triggered by events, mostly from user interactions. The classical tool to model the behaviour of an event-driven system is statecharts. Statecharts have arrived on the XML scene through the encoding language SCXML and a number of SCXML processors. In section “Statecharts and SCXML” we present the integration of the Apache Commons SCXML Interpreter, which is implemented in Java, into XQuery modules that implement event-driven applications that are run in BaseX. We expect to adapt that work so that we can integrate JavaScript SCXML interpreters into GUIs. That would contribute to our second requirement of having a declarative definition of GUI components in the X stack.

We conclude the paper with a number of discussion points and some final remarks.

Requirements for GUI technologies

The general task of a GUI is to represent application data or domain entities, to provide ways for the user to interact with this information and to save, on user request, changes that were made in the course of the interaction. In the web context, we consider GUI components that run in a web browser and whose communication with the application core is mediated by a web server via HTTP or via HTTP extensions such as WebSocket.

Our reference architecture for web applications in the Model-View-Controller (MVC) architectural style [ABCES17] has a one-to-one relationship between the Model and Controller components. The Model provides an API that is only used by the Controller. There are no outside influences on the Model. State changes in the Model are only triggered through API use by the Controller. The Controller has its own API that is used by any number of View components. The View components, which run in web browsers, and the Controller, which runs as an XQuery module in BaseX, communicate via HTTP through a mediating web server: On receiving an HTTP request from a View, the web server triggers a RestXQ-annotated method in the Controller module and sends the return value of that method call back to the view as an HTTP response.

For the purposes of this paper, a GUI is a View component in the MVC architectural style, and the Controller and Model together form the application core.

Figure 1: Web application architecture


Our first and central requirement is that the GUI and the application core exchange declarative data that are encoded in XML.

In the life cycle of interactions between the GUI and the application core, initially the GUI receives from the application core over HTTP declarative XML-encoded information about the data that it needs to present and about the interactions it needs to offer. While the user interacts with the GUI, the GUI may signal user events to the application core and receive declarative data again. In a multi-client system, it may also receive push data from the core application without prior request. Eventually, the user closes the GUI with a final interaction that the GUI may also signal to the application core.

The central requirement leaves many options open: The communication between GUI and core application can be synchronous or asynchronous. It may follow the request-response cycle of HTTP or allow for push data from the application core over HTTP extensions such as WebSocket. The GUI may integrate the information it receives into its current display like in a single-page web application or it may build a completely new display. The GUI may be a simple form-based interface that collects user input or it may have elaborate displays and interaction methods and perform its own computations on user request in Rich Internet Application (RIA) fashion.

In this paper, we focus on GUIs that follow the WIMP (Window, Icon, Menu, Pointer) paradigm with discrete actions. For a discussion on WIMP and Post-WIMP styles see a paper by van Dam [D97].

The GUI as a system is responsible for constructing a visual representation of the data it receives, to render that representation onto a canvas within the boundaries of a viewport, and to handle general as well as application-specific user interactions such as resizing the viewport or scrolling as well as form field entries or button clicks. Finally, it needs to handle the communication to the application core. As to computations within the GUI, they range from input validation, formatting and interaction support to arbitrary computations that are part of the application.

As our second requirement, we only consider GUIs that can be defined by configuration or that can be programmed in a declarative manner. Towards that requirement, component technologies for GUIs are particularly promising.

To illustrate, let us look at a simple case study that we have used before, the game Guess the Number (GN). The game GN has two types of actors: Player and Game. Upon start of a game, Game thinks of a secret number (secret) between 1 and some upper bound (range). Player guesses repeatedly what the secret number is and receives feedback from Game whether the guess is high, low or correct. There is a limit to the number of guesses allowed (maxGuesses) that depends on range. Player wins if they guess the secret number correctly within the maximal number of guesses allowed; Game wins otherwise. There are no ties.
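The rules of GN are small enough to sketch in a few lines of JavaScript. The following is only an illustration of the rules as stated above, not the code of our case study: the function names are hypothetical, and the formula for maxGuesses (the worst-case number of steps of a binary search) is an assumption, since the text only says that maxGuesses depends on range.

```javascript
// Sketch of the GN rules. All names are hypothetical.

// ASSUMPTION: the limit is the worst-case number of binary-search steps.
// The text only states that maxGuesses depends on range.
function maxGuessesFor(range) {
  return Math.ceil(Math.log2(range + 1));
}

// Game gives feedback on a single guess.
function evaluateGuess(secret, guess) {
  if (guess > secret) return "high";
  if (guess < secret) return "low";
  return "correct";
}

// Replays a list of guesses and names the winner. Guesses beyond the
// limit are ignored; a list without a correct guess counts as a loss.
function play(secret, range, guesses) {
  const allowed = guesses.slice(0, maxGuessesFor(range));
  for (const guess of allowed) {
    if (evaluateGuess(secret, guess) === "correct") return "Player";
  }
  return "Game"; // there are no ties
}
```

Under the assumed formula, a Player who halves the remaining interval with every guess can always win.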

We model the information that the GN GUI receives from the GN application core in the following UML class diagram. Specific information instances are translated into a canonical XML encoding. The debugging section of the GUI screenshot in Figure 4 demonstrates the XML encoding.

Figure 2: Class diagram for data presented to GN GUI


The type attribute with one of the values welcomeScreen, firstGuessScreen, furtherGuessScreen, resultScreen and goodByeScreen informs the GN GUI component implicitly which screen to display, which interactions to offer and which requests to send back to the GN application core on user request. This domain model for the GN GUI is defined in the following table.

Figure 3: Domain model for GN GUI

Table I

Screen type        | Information              | Interaction        | Request
welcomeScreen      |                          | fill in range      | newGame[range]
firstGuessScreen   | id                       | fill in next guess | guess[id,guess]
                   | guessesSoFar (static, 0) | submit             |
furtherGuessScreen | id                       | fill in next guess | guess[id,guess]
                   | guessesSoFar             | submit             |
                   | evaluation last guess    |                    |
resultScreen       | id                       | play again         | playAgain
                   | guessesSoFar             | quit               | quit
                   | evaluation game          |                    |
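The domain model in the table can be read as a lookup structure that tells a GUI which requests are legal on which screen. The sketch below encodes the table in JavaScript purely for illustration. The screen types, request names and parameters come from the table; the object layout and the function buildRequest are assumptions, and the real GN GUI receives and sends XML rather than JavaScript objects.

```javascript
// Requests offered per screen type, as listed in the table.
const requestsByScreen = {
  welcomeScreen: ["newGame"],
  firstGuessScreen: ["guess"],
  furtherGuessScreen: ["guess"],
  resultScreen: ["playAgain", "quit"],
};

// Parameters of each request, as listed in the table.
const requestParameters = {
  newGame: ["range"],
  guess: ["id", "guess"],
  playAgain: [],
  quit: [],
};

// Hypothetical helper: checks a request against the domain model and
// returns a plain description of it.
function buildRequest(screenType, name, values = {}) {
  const offered = requestsByScreen[screenType];
  if (!offered) throw new Error(`unknown screen type: ${screenType}`);
  if (!offered.includes(name)) {
    throw new Error(`${name} is not offered on ${screenType}`);
  }
  const params = {};
  for (const key of requestParameters[name]) {
    if (!(key in values)) throw new Error(`missing parameter: ${key}`);
    params[key] = values[key];
  }
  return { name, params };
}
```

For example, buildRequest("firstGuessScreen", "guess", { id: "g1", guess: 5 }) describes the request guess[id,guess], whereas asking for guess on welcomeScreen is rejected.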

In the simplest case, the GN GUI just offers input fields for entering data and buttons to indicate user choice. We can also imagine that a richer GN GUI keeps track of the history of user guesses and advises the user on a guessing strategy.

In the next few sections, we discuss a number of GUI technologies that support our main requirements in a number of ways.

XHTML with XForms and SVG

Let us start out by reporting on a group of well-known technologies and discuss how they stack up against our requirements.

XForms is the classical GUI technology that supports XML-encoded data representation and exchange as demanded by our first requirement. An XForms GUI is controlled by XML data in instances within an XML-encoded model component. GUI widgets bind to XML elements within the instances as defined by XPath expressions. They can be used to read and write element data.

An XForms model expresses type constraints for instance data and defines derived values via XPath. It also defines activities that are triggered by user or system events. Most importantly, it configures submissions that are triggered by submit widgets in the GUI. The configuration declares which parts of the instances to submit to which service using which HTTP method; it also defines what to do with the data that are returned after the submission and how to handle errors. A typical XForms submission transfers part of XML-encoded instance data in the body of an HTTP request and replaces instance data in AJAX fashion with the XML-encoded response data. Hence, XForms satisfies our first requirement.

XForms also supports a declarative definition of the graphical and dynamic side of a GUI. XForms widgets are declarative XML-encoded components that are bound to elements in instances via XPath, as we have mentioned. The XForms processor handles the data exchange between widgets and instance data, resolving any dependencies. It also performs input validation as defined through XPath expressions and XML Schema data types. XForms widgets have a uniform and declarative system for hints and labels. They each have a clearly defined presentation-independent functionality, such as accepting a typed input value, selecting a menu item or triggering an event. XForms widgets approximately cover the range of HTML form elements.

The widgets are hosted by an XHTML document and can therefore be placed and styled with HTML and CSS. Since HTML5, the HTML host document may include SVG code that can also position and style XForms widgets as foreign objects.

We have implemented a GUI for GN as a single XHTML page with embedded XForms components.

The XForms model holds in its main instance the current screen type and the information for the current screen as specified in Figure 2. In two separate instances, it holds the information that needs to be edited in the screen and transferred to the server as detailed in Figure 3. There is one separate instance to fill in the range and another one to fill in the next guess. The latter copies the id of the current game from the main instance since that needs to be transmitted back to the Controller component, which is stateless and handles any number of games concurrently. The copying accommodates the fact that an XForms submission can only submit data from a single instance.

The XForms model also defines all submit actions that GN requires. A submit action triggers a GET or a POST HTTP request for static or dynamic requests, respectively. A POST request submits the appropriate instance in the body of the request. Each response replaces the main instance with the HTTP response data.

In effect, the GN application core sends XML elements to the GN GUI that specify the type of screen and the information that the GUI is supposed to display next. Again, see Figure 2 and Figure 3 for clarification. The specific information that is to be displayed for each screen type is specified in the domain model for the GN GUI component in Figure 3.

The body of the XHTML page holds a section for each screen type with XForms widgets that interact with the XForms model. Information about the current state is displayed in a table using XForms output widgets; user input is accepted through XForms input widgets and buttons that trigger XForms submissions. Only the screen type area that matches the main instance's current screen type is visible. The XForms model has a helper instance with a CSS attribute "display: none" that is dynamically read into each section that is inactive.

A less tabular and more graphical GUI for GN uses XForms widgets linked to the same XForms model and includes them into an SVG graphic. The widgets are included into the SVG code as HTML-encoded foreign objects that can be styled through CSS and positioned and transformed through SVG. In this variant of the GUI, there are no direct representations of the conceptual screens. Instead, the widgets themselves know when to present themselves depending on the information in the XForms model.

Below, we include a screenshot of the two versions of the GN GUI side by side.

Figure 4: Two variants of component View


Let us summarize our experience with XForms.

First, XForms fully satisfies our first requirement that the GUI and the application core exchange declarative data in XML format.

Second, XForms satisfies, up to a point, the second requirement that the GUI itself can be defined declaratively.

  • Due to the set of widgets that XForms offers, XForms GUIs are restricted to form-based interfaces.

  • The XForms widgets that a GUI uses are defined in a declarative way, which includes binding to the XML-encoded instance data. Their functionality is supported by the XForms processor.

  • The positioning and styling of XForms widgets can be done in a flexible and declarative way using HTML, CSS and SVG.

  • There is an annoying limitation for data access within the XForms GUI: Instance data, when presented in the GUI, are always wrapped into XForms widgets. They are not directly part of the HTML. That means that, for example, the arched text in the graphical GN GUI is literal text in the SVG code that is displayed on some condition in the XForms instance data. It cannot, as far as we know, be taken directly from XForms instance data, so that it can be typeset along the arch by SVG.

  • Finally, tool support for XForms is adequate but not ideal. HTML browsers do not support XForms natively. We are using XSLTForms, which depends on XSLT 1 support in browsers. It is reliable and supports most if not all features of XForms. It does not appear to be in active development, and browser support even for XSLT 1 is not guaranteed. A newer option is the XForms processor that is written in SaxonJS and that only depends on the ever-present JavaScript support in browsers.

We briefly contrast use of XForms with use of HTML forms in the context of HTML and SVG. On the surface, XForms and HTML forms appear similar since they have similar sets of widgets. Moreover, HTML forms as part of HTML have great browser support. HTML pages can accept XML data in an HTTP response and can display them, styled by CSS, for example in a frame. The crux, however, is that HTML forms need to use JavaScript to bind to these data for editing or for submission. Hence, the pure combination of HTML, HTML forms and SVG completely fails our first requirement for GUI technologies and falls short of the second one in central aspects.

Naturally, there are JavaScript frameworks that fill that gap. Please note that we exclude them for reasons discussed in section “Requirements for GUI technologies”.

Web Components

Compositional and reusable components are a promising idea in GUI development that has found its way into HTML. It is common practice to use arbitrary XML elements in an HTML context. Current browsers classify such elements as HTMLUnknownElement, insert them into the DOM and format them as inline elements like span elements. They even apply CSS styles to them. The Web Components specification [WC19] builds on that practice by classifying custom elements as "proper" HTMLElement objects and by extending the behaviour of such elements, thus turning them into real components.

The Web Components specification enables developers to define custom elements with custom attributes that are used just like built-in elements in an HTML document and that are treated just like built-in elements by HTML browsers. They are seamlessly integrated into the DOM and available to JavaScript code through the DOM HTML API. They can be styled with CSS, observed by event listeners, go in and out of focus according to keyboard events etc. The real innovation is that custom elements can have their own custom behaviour that is defined by JavaScript. They are also capable of encapsulating their own data and style through a shadow DOM. A custom list element for a todo list, for example, can offer ways to tick items or to collapse and expand sublists.

On the simplest level, a custom element can just expand the HTML vocabulary, as demonstrated in Figure 5. The thus-defined custom element todo-list behaves just like a span element but has semantic meaning built into its name. Behaviour and style are added by extending the class of the custom element with lifecycle functions.

Figure 5: Defining a Custom Element

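As a rough textual counterpart to the figure, a custom element todo-list could be defined along the following lines. This is a minimal sketch and not necessarily the code shown in the figure: the class name TodoList and the role attribute set in the lifecycle callback are assumptions. The fallback base class is only there so that the file also loads outside a browser.

```javascript
// Outside a browser there is no HTMLElement; fall back to an empty base
// class so that the sketch can still be loaded and inspected.
const BaseElement =
  typeof HTMLElement !== "undefined" ? HTMLElement : class {};

// Hypothetical class name. Without any methods, the element would merely
// extend the HTML vocabulary and behave like a span.
class TodoList extends BaseElement {
  // Lifecycle callback: called by the browser whenever the element is
  // inserted into a document. Custom behaviour and style start here.
  connectedCallback() {
    this.setAttribute("role", "list"); // illustrative behaviour only
  }
}

// Register the element name. Custom element names must contain a hyphen.
if (typeof customElements !== "undefined" && !customElements.get("todo-list")) {
  customElements.define("todo-list", TodoList);
}
```

Once the script is loaded, an HTML page can use `<todo-list>…</todo-list>` like any built-in element.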

Obviously, in the context of GUIs new widgets can be defined as custom elements. It is even conceivable to define a system of widgets that are bound to XML data, reinventing XForms.

Ulrich in his Bachelor Thesis [U18] has defined a custom element ws-stream for HTML GUIs that establishes a WebSocket connection to a server and then handles XML data that are pushed over that connection, thus supporting our first requirement for GUI technologies. We summarize this work in section “WebSocket Element”.

As a component approach, custom elements within the Web Components context certainly support reuse of components. Custom elements can also be composed from lower-level components. They are lacking a system of parameter passing and communication, though, that is the hallmark of composition support in React.

Server support for multi-client applications

Modern web applications, particularly games, are often multi-client systems. Multi-client applications require communication patterns that let clients talk to each other or that allow a server to push messages to clients without prior requests. The HTTP extension protocol WebSocket supports these patterns. It has been identified as the best current technology for these purposes with respect to support and functionality [C17].

Multi-client applications require both servers and clients to support the new protocols. In this section, we address WebSocket support for BaseX, the server system that we use in our projects.

In previous work [ABCES17] we have outlined how to integrate server-push into the X stack, based on a modified form of BaseX that was first presented by Conrads [C17]. The concepts and implementations were later refactored and better integrated into the BaseX server by Ulrich [U18] who also proposed a client-side solution as a counterpart to the server. As thesis work at University of Konstanz, Finckh [F18] collaborated with the BaseX team to integrate a WebSocket implementation into the BaseX production system.

Today, BaseX natively supports WebSocket with RestXQ-like annotations to react to different WebSocket events (onConnect, onMessage) on the server side. The BaseX documentation [B19] describes the usage and application of the new annotations and the new WebSocket XQuery module used to send messages or set WebSocket parameters.

The WebSocket protocol is low-level with little ex-ante support for commonly required features. Hence, it has been extended to STOMP, which supports channels and explicitly defines message formats. STOMP support is part of a BaseX development version that has not been officially released yet.
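To give an impression of what "explicitly defined message formats" means, the sketch below serializes and parses STOMP frames: a command line, header lines of the form name:value, an empty line, the body, and a terminating NUL character. It is a simplified illustration, not the BaseX implementation: the function names are hypothetical, and header escaping and the content-length header are deliberately left out.

```javascript
const NUL = "\u0000";

// Serializes a frame: command, headers, blank line, body, NUL terminator.
function buildFrame(command, headers = {}, body = "") {
  const headerLines = Object.entries(headers).map(([k, v]) => `${k}:${v}`);
  return [command, ...headerLines, "", body].join("\n") + NUL;
}

// Parses a single frame. Simplification: no header escaping, and the body
// is assumed not to contain NUL characters.
function parseFrame(text) {
  const end = text.indexOf(NUL);
  const raw = end >= 0 ? text.slice(0, end) : text;
  const split = raw.indexOf("\n\n");
  const head = split >= 0 ? raw.slice(0, split) : raw;
  const body = split >= 0 ? raw.slice(split + 2) : "";
  const [command, ...lines] = head.split("\n");
  const headers = {};
  for (const line of lines) {
    const i = line.indexOf(":");
    if (i > 0) headers[line.slice(0, i)] = line.slice(i + 1);
  }
  return { command, headers, body };
}
```

A client would, for instance, subscribe to a channel with buildFrame("SUBSCRIBE", { id: "sub-0", destination: "/gn/game" }), where the destination path is a made-up example.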

In section “WebSocket Element”, we discuss a new HTML component for GUIs called WebSocket Element that handles the client side. The WebSocket Element can interface with any server component that supports STOMP over WebSocket. Our demo applications use the BaseX development version that supports STOMP over WebSocket.

WebSocket Element


A GUI that participates in a multi-client web application needs two capabilities: first, a method to initiate a connection with a server through WebSocket; second, a way to receive and process data from the server through this connection.

The HTML5 WebSocket API provides these capabilities through JavaScript code. Ulrich [U18] encapsulates these tasks in a new, purely declarative HTML component that he calls WebSocket Element. A WebSocket Element in an HTML page looks just like a built-in HTML element that is configured through attributes. WebSocket Element is defined, however, as a custom element in the Web Components framework that was introduced in section “Web Components” and, hence, has interesting behaviour.

The abstract idea of a custom HTML element that wraps the client side goes back to the Master Thesis of Conrads [C17]. Custom elements as defined in the Web Components specification turn out to be the perfect fit to implement this concept. The implementation defines the functionality of the WebSocket Element with JavaScript and uses the WebSocket protocol to allow synchronous bidirectional communication. In fact, our implementation uses the STOMP protocol with its predefined message formats and channel concepts. After initiating the bidirectional connection with a server on load of the HTML page, the WebSocket Element can then receive declarative data in the form of XML to which it can apply its own XSL transformation or it can receive and render SVG or HTML data that were generated by the server. Hence, the WebSocket Element contributes to our two main requirements for GUI technologies in a multi-client scenario.

Basic WebSocket Element

In its most basic form the WebSocket Element looks like the following:

Figure 6: Basic WebSocket Element (HTML code)

Behind the scenes, the WebSocket Element is a custom element as explained in section “Web Components”. The attributes configure the functionality of the custom element and are used by its JavaScript implementation. It has an id like other HTML elements, which allows us to identify and style it with CSS or to dynamically add and remove WebSocket Elements to and from the DOM through JavaScript. Furthermore, the WebSocket Element needs to know to which location the WebSocket connection should be established. The attribute url lets us specify the server address. To separate different applications and to create channels within a web application, the subscription attribute can be set to a path to which the WebSocket Element automatically subscribes after the connection is established. It then listens to WebSocket messages on the subscribed paths or channels and inserts the data it receives into its own content. Since the page does not have to be reloaded and the content is streamed continuously, the application has the look and feel of single-page applications.

The usage of the WebSocket Element is as simple as importing the necessary JavaScript files and defining the element somewhere on the HTML page. As soon as the page is loaded by the browser the WebSocket Element connects to the WebSocket server, handles the subscription process with the server and initiates the element based on the given configuration.
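The behaviour described in this subsection (connect, subscribe, insert incoming data into the element's content) can be approximated as follows. This is a sketch under stated assumptions, not the implementation from [U18]: the function connectStream, the subscription id sub-0 and the example URL and path are hypothetical; only the element name ws-stream and the attributes id, url and subscription are taken from the text. The socket constructor is passed in as a parameter so that the logic can be tried without a server.

```javascript
// Markup that such an element might accept (attribute values are made up):
//   <ws-stream id="board" url="ws://localhost:8984/ws" subscription="/gn/game"></ws-stream>

// Minimal STOMP frame serializer (header escaping omitted).
function frame(command, headers, body = "") {
  const lines = Object.entries(headers).map(([k, v]) => `${k}:${v}`);
  return [command, ...lines, "", body].join("\n") + "\u0000";
}

// Hypothetical core of the element: config mirrors the attributes, target
// is the element whose content is replaced by incoming data.
function connectStream(config, target, SocketImpl = globalThis.WebSocket) {
  const socket = new SocketImpl(config.url);
  socket.onopen = () => {
    const host = new URL(config.url).host;
    socket.send(frame("CONNECT", { "accept-version": "1.2", host }));
  };
  socket.onmessage = (event) => {
    const text = String(event.data);
    if (text.startsWith("CONNECTED")) {
      // Subscribe automatically once the STOMP session is established.
      socket.send(frame("SUBSCRIBE", { id: "sub-0", destination: config.subscription }));
    } else if (text.startsWith("MESSAGE")) {
      const start = text.indexOf("\n\n") + 2;
      const end = text.indexOf("\u0000");
      // Insert the pushed HTML or SVG into the element's own content.
      target.innerHTML = text.slice(start, end < 0 ? undefined : end);
    }
  };
  return socket;
}
```

In a custom element, a call like connectStream({ url, subscription }, this) would typically sit in the connectedCallback lifecycle function, with both values read from the element's attributes.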

Imperative and declarative approach

Let us contrast the simple declarative use of the WebSocket Element with the imperative approach that we have used previously [C17] with one of our modified versions of the BaseX server to implement server-push with channels. The JavaScript code in Figure 7 demonstrates that we had to instantiate a custom endpoint object and parameterize it with callback functions that map to WebSocket events onMessage, onClose and onOpen. The endpoint object, when started, using the callback function configured for onOpen, would call the WebSocket API of the browser behind the scenes to open a connection to the modified BaseX server. The callback function would need to create a JSON object used as a subscribe frame to tell the server which channel to subscribe to.

Figure 7: JavaScript to connect to the server

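For readers without access to the figure, the following sketch conveys the flavour of the imperative approach. It is not the code from [C17]: it uses the browser's plain WebSocket API instead of the custom endpoint object, and the function name openChannel as well as the shape of the JSON subscribe frame are assumptions.

```javascript
// Hypothetical helper: opens a connection and wires the three callbacks
// by hand. SocketImpl defaults to the browser's WebSocket constructor.
function openChannel(url, channel, handlers, SocketImpl = globalThis.WebSocket) {
  const socket = new SocketImpl(url);

  socket.onopen = () => {
    // ASSUMPTION: the subscribe frame is a small JSON object like this one.
    socket.send(JSON.stringify({ type: "subscribe", channel }));
    if (handlers.onOpen) handlers.onOpen();
  };

  socket.onmessage = (event) => {
    // The page author decides what to do with the pushed data.
    handlers.onMessage(event.data);
  };

  socket.onclose = (event) => {
    if (handlers.onClose) handlers.onClose(event.code);
  };

  return socket;
}
```

Every additional connection on a page needs another such call with its own set of callbacks.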

The most obvious difference to our declarative WebSocket Element approach is the higher complexity in terms of code length. While this code may not be hard to understand for more experienced developers, it does present an entry barrier to web development for domain experts and people without any significant coding experience who could use their domain expertise to build web applications with XML [B16]. With the WebSocket Element, no knowledge of JavaScript callback functions, variables or loops is needed, and the necessary security checks to ensure correctly formed attributes are done in the background. The imperative solution becomes even more complicated when multiple WebSocket connections to different destinations must be opened, as this requires duplicate and more complex code. The WebSocket Element, in contrast, can be declared and configured multiple times on the same page like any other HTML element. Like other HTML elements, it conveys meaning in the tag itself and encapsulates its functionality, which could be extended easily in the future by adding more attributes. The modularity of the declarative approach makes it flexible yet easy to use, as many attributes are optional.

Advanced functionality

Our basic WebSocket Element contains only the mandatory attributes to establish a WebSocket connection. There are many additional parameters which can be used to extend the functionality as needed. These include settings about automatic reconnection, ping frequency, client side XSLT support and initial data to load. In the example above the data received from the server will not be altered, only inserted into the page for rendering as content for the WebSocket Element. This is suitable for HTML with CSS, SVG or other pre-generated formats supported by the browsers. The following code shows a WebSocket Element which receives raw XML data from the server which is then transformed by the browser given an XSL stylesheet and parameters. Additionally the geturl attribute specifies a relative URL from which the first state of the element should be fetched. One important goal was to make the element easy to use but extendable for advanced tasks.

Figure 8: Advanced parameters



The figure below demonstrates the interactions between the different components in a multi-client web application using the WebSocket Element and a WebSocket enabled server in an MVC architectural style. In a multi-client application many different views and WebSocket Element instances can exist. For the sake of simplicity, the figure shows the communication from only one client's perspective. From the GUI perspective, WebSocket Elements are part of the View and incoming messages will update the display according to how the View is realized. In the figure, the WebSocket Element and the View are shown as separate components to illustrate the interaction.

Figure 9: Multi-client web application architecture

png image ../../../vol23/graphics/Bruggemann-Klein01/Bruggemann-Klein01-008.png

This architecture extends our reference architecture as shown in Figure 1. A View communicates with the web server through HTTP but does not use the HTTP response. Instead, the web server triggers a method in the Controller through RestXQ. Through the WebSocket module, the Controller defines a response that is to be sent to all subscribing WebSocket Element instances. The web server sends this response, and each WebSocket Element inserts it into the page, optionally transforming it first, thereby updating the View. The bidirectional channel between the WebSocket Element and the web server is also used for ping and pong messages that ensure the liveness of the connection, and for handling the subscription to channels.
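The server-push flow described above can be illustrated with a minimal in-memory sketch of channel-based publish and subscribe. The class ChannelHub, its subscribe method and the channel paths are hypothetical names chosen for this illustration; the sketch does not use the BaseX WebSocket module or the STOMP protocol, it only mimics the idea that a Controller sends data to a channel and every subscribed view receives it.

```javascript
// Hypothetical in-memory stand-in for the server-side channel handling.
// It is NOT the BaseX WebSocket module; it only illustrates the data flow.
class ChannelHub {
  constructor() {
    // channel path -> set of callbacks, one per subscribed view
    this.channels = new Map();
  }

  // A view subscribes to a channel by registering a callback.
  subscribe(path, onMessage) {
    if (!this.channels.has(path)) {
      this.channels.set(path, new Set());
    }
    this.channels.get(path).add(onMessage);
  }

  // The Controller pushes data to every subscriber of one channel.
  // Returns the number of subscribers that received the message.
  sendChannel(data, path) {
    const subscribers = this.channels.get(path) || new Set();
    for (const deliver of subscribers) {
      deliver(data);
    }
    return subscribers.size;
  }
}

// Two views, each subscribed to its own (hypothetical) channel.
const hub = new ChannelHub();
const viewA = [];
const viewB = [];
hub.subscribe("/game/a", (message) => viewA.push(message));
hub.subscribe("/game/b", (message) => viewB.push(message));

// A server-side push reaches only the subscribers of the addressed channel.
const delivered = hub.sendChannel("<p>state 1</p>", "/game/a");
```

In this sketch the push is not preceded by a request from the receiving view, which is the property that distinguishes server push from plain HTTP request and response.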

Example application: Tic-Tac-Toe

We have implemented a two-player demo application for the game Tic-Tac-Toe. The demo uses BaseX on the server side. In this case, the application core transforms a declarative state description into SVG and sends it to the individual clients. A version that passes the state description to the WebSocket Element in each of the clients and performs the XSL transformation into SVG on the client side is equally possible.

Some important methods and concepts of the WebSocket Element are highlighted in this section to show its practical use in a multi-client web application. The Tic-Tac-Toe game is implemented on the BaseX server (including STOMP support). It uses the following XML technologies:

  • XQuery as the server-side programming language.

  • XSLT to generate HTML and SVG to render in the browser.

  • XML as data format to describe a Tic-Tac-Toe game.

Furthermore, it implements the following functionality:

  • Playable by two players in a distributed way.

  • Only one instance of the game, so only one game server to play on.

We will look at two methods which are important for a multi-client application and show how the WebSocket Element is created and the communication channels established.

Before players can play a game together they first need to join the game.

Figure 10: The join method

png image ../../../vol23/graphics/Bruggemann-Klein01/Bruggemann-Klein01-009.png

As seen in Figure 10, the join method uses RestXQ and awaits a POST request on the specified path. Its main purpose is to build an HTML page for the calling client and to create a WebSocket Element with the specified playerID. The method uses BaseX's request module to get the hostname and port components of the incoming HTTP request. It then constructs the WebSocket path to which the WebSocket Element will later connect, the URL from which the first state should be retrieved (getURL) and the subscription attribute for the WebSocket Element. Because the request module is used, no parameters such as "localhost" or the port have to be hard-coded, and the game can be played in different network configurations, not only on localhost.

The join method then proceeds to create the HTML page, which includes all necessary dependencies (jQuery and STOMP) and the JavaScript file for the WebSocket Element itself. Inside the body of the HTML page, our newly defined WebSocket Element is declared and configured using parameters. After the method returns the HTML to the client, the browser starts parsing the page and connects via WebSocket to the URL we specified in our join method. Furthermore, the subscription attribute is evaluated to join, for example, the channel "/ttt/X". This client will then be reachable through the channel "/ttt/X", while another client can join the channel "/ttt/O". Finally, the WebSocket Element issues a GET request to our getURL, which in the case above is our draw method. This triggers the server to push the current state to all clients. At this point, a WebSocket connection with the BaseX server has been established by calling our join method.
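To make the address construction in the join method more concrete, the following JavaScript sketch derives the three values from the hostname and port of an incoming request. The function name buildElementConfig, the property name url and the paths "/ws/ttt" and "/ttt/draw" are assumptions made for this illustration and are not taken from the actual XQuery implementation shown in Figure 10; only the channel pattern "/ttt/X" follows the text.

```javascript
// Hypothetical helper: derive the values that configure a WebSocket Element
// from the hostname and port of the incoming HTTP request, so that nothing
// such as "localhost" has to be hard-coded.
function buildElementConfig(hostname, port, playerId) {
  const authority = `${hostname}:${port}`;
  return {
    // WebSocket endpoint the element connects to (path is an assumption)
    url: `ws://${authority}/ws/ttt`,
    // URL from which the first state is fetched (path is an assumption)
    geturl: `http://${authority}/ttt/draw`,
    // channel the element subscribes to, for example "/ttt/X"
    subscription: `/ttt/${playerId}`,
  };
}

const configX = buildElementConfig("example.org", 8984, "X");
const configO = buildElementConfig("example.org", 8984, "O");
```

Because all three values are computed from the request, the same code would serve clients that reach the server under different host names or ports.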

Now that clients are connected to our BaseX server through WebSocket Elements, we can send them messages to inform them of the state of the game. This is accomplished using a generic draw method, shown in Figure 11. The main purpose of the method is to send the current game state through WebSocket to all connected clients.

Figure 11: The draw method

png image ../../../vol23/graphics/Bruggemann-Klein01/Bruggemann-Klein01-010.png

The drawGame method does not change the state of our game and is annotated with RestXQ's GET, awaiting requests to the specified path. The game is described by an XML model, which is transformed into HTML and SVG by the stylesheet "ttt_game.xsl". The method gets the stylesheet and the wsIDs of all currently connected clients. The getIDs() method uses BaseX's ids() method from the WebSocket module. In its return clause, the method iterates through all connected WebSocket clients, gets their respective IDs and generates for each of them a visual presentation of the game using the aforementioned stylesheet. In a last step, drawGame uses the send method, which sends the transformedGame to the clients using the sendChannel($data, $path) method introduced with BaseX's STOMP server.
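The structure of this method, one rendering per connected client followed by a send to that client's channel, can be sketched in JavaScript as follows. This is an illustration only: renderBoard is a hypothetical stand-in for the XSLT transformation with "ttt_game.xsl", the board representation as an array of nine cells is an assumption, and the mapping from a client ID to the channel "/ttt/" plus that ID is assumed for simplicity.

```javascript
// Hypothetical stand-in for the XSLT step: turn a board model into SVG markup.
// board is an array of 9 cells (row by row) holding "X", "O" or "".
function renderBoard(board, playerId) {
  const marks = board
    .map((mark, index) => {
      if (mark === "") return "";
      const x = (index % 3) * 100 + 50;           // centre of the column
      const y = Math.floor(index / 3) * 100 + 50; // centre of the row
      return `<text x="${x}" y="${y}">${mark}</text>`;
    })
    .join("");
  return `<svg xmlns="http://www.w3.org/2000/svg" data-player="${playerId}">${marks}</svg>`;
}

// Render the unchanged game state once per connected client and hand each
// result to a send function that takes (data, path).
function drawGame(board, clientIds, send) {
  for (const id of clientIds) {
    send(renderBoard(board, id), `/ttt/${id}`);
  }
}

// Collect the outgoing messages instead of sending them over a network.
const outbox = [];
drawGame(
  ["X", "", "", "", "O", "", "", "", ""],
  ["X", "O"],
  (data, path) => outbox.push({ path, data })
);
```

Note that drawGame only reads the board; as in the text, drawing does not change the state of the game.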

Inside the browser on the client side, the WebSocket Element receives the HTML and SVG sent by the drawGame method and updates its view accordingly, by merely inserting the markup into its own content. The clients can now issue an action in the game, which ultimately triggers the drawGame method to propagate the change of state to all connected clients. Figure 12 shows the user interfaces of two clients playing against each other.

Figure 12: User interface of the Tic-Tac-Toe game

png image ../../../vol23/graphics/Bruggemann-Klein01/Bruggemann-Klein01-011.png

The two functions illustrate some important aspects of working with the WebSocket Element and BaseX. They show that building a multi-client application within the X stack does not result in long, complicated code; only a few additional methods are needed to handle the WebSocket connection. Moreover, many functions used by a locally playable Tic-Tac-Toe can be reused without modification in a multi-client application.

Final remarks

WebSocket is a widely supported technology which all major browsers implement. The custom element (V1) specification is fully supported by Chrome and Firefox, whereas Safari and Opera only implement the special case on which the WebSocket Element is built [CIU19]. Microsoft Edge does not implement custom elements yet, but support for this often requested feature is marked as "in development" [M19].

The WebSocket Element works in conjunction with HTML, CSS and SVG on the client side. We are currently investigating how the WebSocket Element can be integrated with other GUI technologies such as XForms and SaxonJS.

The WebSocket Element simplifies the development of multi-client web applications in the X stack, as shown by the proof-of-concept Tic-Tac-Toe application, which is implemented using BaseX's STOMP WebSocket implementation on the server side and the WebSocket Element on the client side.

Statecharts and SCXML

GUI components are commonly considered to be event-driven systems whose functionality is triggered by events, mostly from user interactions. Typically, due to the nature of those systems, there are constraints on the legal sequences of events, and the specific activity that is triggered may depend only on a specific pattern in the history of previous events. The classical tool to model such abstract “behaviour” of an event-driven system is the statechart. Statecharts were first introduced by Harel, as documented in a book [HP98] he co-authored with Politi. Later, in the object-oriented variant of state diagrams, they became part of UML2; see [SSHK15] for a textbook introduction and [H99] for an extensive discussion of the use of statecharts in software engineering and how they help to cut the complexity of models and of model-driven implementations.

Most recently, with SCXML (State Chart XML) [B15], an XML encoding language for statecharts has been standardized, bringing statecharts into the realm of XML technologies. SCXML fully supports the semantics of statecharts as defined by Harel and, furthermore, specifies additional elements, for example for communication with external systems or for the execution of specific activities. A number of research papers discuss the use of SCXML, among them the Bachelor Thesis of Roxendal [R10], an invited expert to the W3C committee that defined SCXML.

The introduction of SCXML has created a need for, and led to the rise of, systems that are able to interpret or run an SCXML-encoded statechart that models the behaviour of a system. Such systems call system activities when changing state as defined by the statechart and thereby make SCXML executable. These SCXML processors directly execute models of application behaviour, interfacing with application activities. Grubmüller [G18] discusses a number of such SCXML processors, which differ mainly in their implementation languages and in the extent to which they support the standardized semantics of SCXML.

In a web application that is modeled in the MVC architectural style, the Model component is another event-driven system that may have dynamically instantiated sub-components that are again event-driven. In a game, for example, we might have a single lounge in which players assemble for games and a sub-component for each game that is currently active. Events are API calls to the Model in the form of function calls or HTTP requests that are typically issued by the Controller component. In the X stack, the Model component is implemented as an XQuery module that runs, for example, in the BaseX database system [B16]. That module needs to be able to instantiate SCXML processors for SCXML-encoded statecharts at runtime, to forward events to these SCXML processors and eventually to delete the SCXML processors. The module also supports an API to handle activities that are triggered by the statecharts.

In a previous study [ABCES17], we have investigated the possibility of using an SCXML processor that is implemented in XQuery [S15,E17] to support the implementation of a Model component in a web application. Whereas it is attractive to use an SCXML processor that is implemented in XQuery in the X stack, the limitations in functionality of current XQuery implementations have led us to a different approach, namely to integrate the Apache Commons SCXML Interpreter [A17] into the X stack [G18]. Apache Commons SCXML Interpreter is a stable, functionally complete and well documented SCXML interpreter that is implemented in Java as a project of the Apache Software Foundation.

How can the Apache Commons SCXML Interpreter, which is implemented in Java, be connected to a Model component that is implemented as an XQuery module and runs in the BaseX database system? Grubmüller [G18] provides a solution that is based on Java bindings as offered by BaseX to make Java classes available to XQuery modules. The solution is illustrated in the figure below.

Figure 13: Architecture with SCXML interpreter

png image ../../../vol23/graphics/Bruggemann-Klein01/Bruggemann-Klein01-012.png

To this end, a new XQuery module handles the communication between the XQuery module of the Model component and the SCXML interpreters, which are Java objects. The particulars of Java bindings in BaseX require a second communication module, written in Java, that manages the different SCXML interpreters that are active at any time, such that every instance of the SCXML interpreter maintains its own state, for example, the state of one specific game instance. For every interpreter, the current state and the events that are possible in this state are additionally saved in XML format, so that this information can be retrieved while keeping Java binding calls to a minimum.
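The idea of managing several interpreter instances, each with its own state, while keeping a cheaply readable record of the current state and the events possible in it, can be sketched as follows. This JavaScript sketch is a simplification made for illustration: the class InterpreterRegistry, the flat transition table and the state and event names are hypothetical, and neither the Apache Commons SCXML API nor BaseX's Java bindings are used.

```javascript
// Hypothetical flat statechart: state -> event -> next state.
const ticTacToeChart = {
  initial: "WAITING",
  states: {
    WAITING: { join: "PLAYING" },
    PLAYING: { move: "PLAYING", finish: "GAME_OVER" },
    GAME_OVER: {},
  },
};

// Manages one state machine instance per game, so every game keeps its own state.
class InterpreterRegistry {
  constructor(chart) {
    this.chart = chart;
    this.instances = new Map(); // game id -> current state name
  }

  // Instantiate a new interpreter for a game.
  create(gameId) {
    this.instances.set(gameId, this.chart.initial);
    return this.snapshot(gameId);
  }

  // Forward an event to the interpreter of one game.
  fire(gameId, event) {
    const current = this.instances.get(gameId);
    if (current === undefined) throw new Error(`unknown game: ${gameId}`);
    const allowed = this.chart.states[current];
    if (!Object.keys(allowed).includes(event)) {
      throw new Error(`event "${event}" is not allowed in state ${current}`);
    }
    this.instances.set(gameId, allowed[event]);
    return this.snapshot(gameId);
  }

  // Current state plus the events possible in it; this plays the role of the
  // saved record that can be read without calling into the interpreter.
  snapshot(gameId) {
    const state = this.instances.get(gameId);
    return { gameId, state, events: Object.keys(this.chart.states[state]) };
  }

  // Delete the interpreter of a finished game.
  remove(gameId) {
    return this.instances.delete(gameId);
  }
}

const registry = new InterpreterRegistry(ticTacToeChart);
registry.create("game-1");
registry.create("game-2");
registry.fire("game-1", "join"); // only game-1 advances
const snapshot1 = registry.snapshot("game-1");
const snapshot2 = registry.snapshot("game-2");
```

The three operations create, fire and remove correspond to the requirements named earlier for the Model module: instantiating processors at runtime, forwarding events to them and eventually deleting them.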

Two case studies with different levels of complexity in terms of system behaviour, Tic-Tac-Toe and Blackjack, demonstrate the validity of the approach. In both cases, the declarative modeling of the behaviour of the Model component as statecharts, the representation of the behaviour models as SCXML documents and their interpretation with the Apache Commons SCXML interpreter facilitate a model-driven approach which also highlights the added value of statecharts. First of all, using statecharts for describing and modeling the behaviour of a system fully captures how the system should behave under certain conditions and events in a standardized manner. This creates a clear picture for everyone dealing with the system where no room for misunderstanding is left. This can be demonstrated when modeling the behaviour of the game Blackjack [G18] which is shown in the figure below.

Figure 14: Statechart for the game Blackjack

png image ../../../vol23/graphics/Bruggemann-Klein01/Bruggemann-Klein01-013.png

As the game Blackjack has non-trivial behaviour (there are several actions players can take under certain conditions), statecharts are an appropriate modeling tool, as they provide a rich set of features such as higher-order states that allow the creation of a logical hierarchy of states. One example is the pair of states “GAME_RUNNING” and “GAME_OVER”, which are on the same level of abstraction, while “GAME_RUNNING” consists of several lower-order states that describe the game while it is being played. Another benefit of this approach is that the main functions needed to implement the system are effectively predefined within the statechart in the form of events that connect the states. This improves the understanding of the system and allows for a more structured way to implement it.
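The benefit of higher-order states can be illustrated with a small JavaScript sketch in which an event that is not handled by a lower-order state is looked up in its parent state. Only the state names GAME_RUNNING and GAME_OVER come from the statechart discussed here; the sub-states PLAYER_TURN and DEALER_TURN and all event names are hypothetical, and the lookup rule is a strong simplification of the SCXML semantics.

```javascript
// Hypothetical hierarchical chart. GAME_RUNNING is a higher-order state with
// two assumed sub-states; GAME_OVER sits on the same level as GAME_RUNNING.
const blackjackChart = {
  GAME_RUNNING: { parent: null, initial: "PLAYER_TURN", on: { abort: "GAME_OVER" } },
  PLAYER_TURN: { parent: "GAME_RUNNING", on: { hit: "PLAYER_TURN", stand: "DEALER_TURN" } },
  DEALER_TURN: { parent: "GAME_RUNNING", on: { settle: "GAME_OVER" } },
  GAME_OVER: { parent: null, on: {} },
};

// Entering a higher-order state means descending to its initial sub-state.
function enter(chart, name) {
  let state = name;
  while (chart[state].initial) {
    state = chart[state].initial;
  }
  return state;
}

// Look for a handler in the current state first, then in its ancestors.
// An event that no state on this path handles leaves the state unchanged.
function transition(chart, current, event) {
  for (let s = current; s !== null; s = chart[s].parent) {
    const handlers = chart[s].on;
    if (Object.prototype.hasOwnProperty.call(handlers, event)) {
      return enter(chart, handlers[event]);
    }
  }
  return current;
}

const start = enter(blackjackChart, "GAME_RUNNING");
const afterStand = transition(blackjackChart, start, "stand");
const afterAbort = transition(blackjackChart, afterStand, "abort");
```

The abort transition is declared once on GAME_RUNNING and applies in every sub-state, which is the kind of reduction in model size that a hierarchy of states is meant to provide.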

SCXML supports all the semantics introduced by statecharts; thus, the encoding into SCXML is straightforward. Furthermore, the Apache Commons SCXML implementation allows Java functions to be called within state transitions from an SCXML document [A17]. This is important, as every interpreter instance has to be able to call the corresponding application functions, which in our case are located in an XQuery module. This is realized by sending HTTP requests from a custom Java function to the function in the XQuery module, which is exposed through RestXQ annotations [G18]. This approach also allows any number of HTTP requests to be sent within one state transition, meaning that several XQuery functions can be called independently. As a result, the SCXML interpreter fully controls how the system behaves and which functions are called when a certain event occurs and states are changed.
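How a single transition can trigger several independent application calls may be sketched as follows. In this JavaScript illustration, the request callback stands in for the custom Java function that issues HTTP requests; the transition table, the function takeTransition and the endpoint paths are hypothetical and do not reproduce the SCXML documents or the RestXQ paths of the actual implementation.

```javascript
// Hypothetical transition table: each transition names its target state and
// the application endpoints to call, in order, when it is taken.
const transitionsWithActions = {
  PLAYING: {
    move: { target: "PLAYING", calls: ["/ttt/applyMove", "/ttt/draw"] },
    finish: { target: "GAME_OVER", calls: ["/ttt/draw"] },
  },
  GAME_OVER: {},
};

// Take a transition and invoke one request per listed endpoint.
// `request` stands in for the function that sends an HTTP request.
function takeTransition(table, state, event, request) {
  const row = table[state];
  if (!Object.prototype.hasOwnProperty.call(row, event)) {
    return state; // event not allowed here: no calls, no state change
  }
  const chosen = row[event];
  for (const path of chosen.calls) {
    request(path);
  }
  return chosen.target;
}

// Record the requested paths instead of sending real HTTP requests.
const requested = [];
const nextState = takeTransition(
  transitionsWithActions,
  "PLAYING",
  "move",
  (path) => requested.push(path)
);
```

The application functions themselves contain no control logic in this arrangement; which of them run, and in which order, is decided entirely by the transition table.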

This model-driven approach achieves a strict separation between the behaviour and the implementation of the system logic, as the behavioural component is completely expressed within the statechart and its SCXML encoding. Through this separation, the implementation of the system logic becomes much clearer and more compact, as the behaviour is controlled separately. This effect was observed even with a comparatively simple system like Tic-Tac-Toe and becomes more pronounced and more useful for more complex systems.

We are currently investigating how to transfer the work that was done for Model components that run in an XML database system on the server to GUI components that run in a web browser. The goal, again, is a declarative approach that generates code from models. We intend to model the behaviour of GUI components with statecharts, encode the statecharts with SCXML and have them executed by SCXML interpreters. Since there are a number of promising SCXML processors written in JavaScript [G18], the language of web browsers, we expect the integration of SCXML processors into the client side of the X stack to be straightforward.

Conclusion and future work

In this paper, we have extended our coherent and coordinated set of practices for developing XML-powered web applications from models by taking a closer look at technologies for graphical user interfaces. The practices draw on previous work and have been and are being vetted with case studies. As always, proven principles and practices from software engineering have been a source of inspiration.

We have presented a number of GUI technologies that are useful for our purposes. They have different strengths and weaknesses, so we still need to come up with a framework to mix and match these technologies.

  • We have looked at GUI technologies in browsers in the context of the trusted MVC architectural style, which allows the user interface to be decoupled from the other components of an application. We have defined requirements for GUI technologies and we have investigated and evaluated a number of specific technologies.

  • A number of GUI technologies are still under consideration, most notably SaxonJS and React. We are also interested in ways to deal with time in GUIs and to enrich GUIs with post-WIMP interaction methods and computational capabilities.

  • Previously, we have shown how RestXQ annotations of XQuery functions enable us to rely on pure HTTP communication between clients and servers (no frameworks!). We have extended this declarative approach to communication over the STOMP and WebSocket protocols for server push, introducing a newly developed HTML component called WebSocket Element [U18] that initiates communication with a server and handles the incoming XML data. This is a declarative way to integrate AJAX-like calls into a web page, with the added feature of allowing for incoming data from servers that are not preceded one-to-one by requests from the client. The BaseX server is now enabled for server-push communication through RestXQ-like annotations and a new WebSocket module. This work was inspired by previous theses at TUM [C17,U18].

  • The complexity of event-based systems such as model or user interface components can be reduced by introducing the concept of behaviour that is modeled by statecharts. We have demonstrated how SCXML-encoded statecharts can be instantiated dynamically and executed with XML technology in a model component by interfacing to the fully functional SCXML processor of the Apache Commons SCXML project [G18]. Further work will look into JavaScript SCXML processors that can be integrated into a GUI component in a browser.

All solutions are based on W3C or industry standards and use freely available software components.

The main motivation for this work is to be able to generate serious games as web applications from models. Systematic analysis of user interactions with these games is used to determine variants of games in an adaptive fashion, to improve usability and to further learning.

Another motivation for this line of work has been to support XML experts as end-user programmers.

In teaching a lab course on XML technology, we have continued to find that no-frills web applications, reduced to essentials and not requiring any frameworks, are a useful and appreciated pedagogical vehicle for teaching computer science students. Students who use only X stack technologies for web application projects develop practical skills in SVG, XQuery and XSLT. Our austerity requirement proves to be an effective pedagogical tool for raising the level of conceptual knowledge, appreciation and practical competence in the area of XML technologies. This is the not-so-hidden agenda in the lab. We consider this outcome more valuable than instructing students in another short-lived web application framework.


[A17] The Apache Software Foundation. Apache Commons SCXML. [online]. [cited 5 December 2017].

[ABCES17] Zahra Al-Awadai; Anne Brüggemann-Klein; Michael Conrads; Andreas Eichner; Marouane Sayih. XML Applications on the Web: Implementation Strategies for the Model Component in a Model-View-Controller Architectural Style. In Proceedings of Balisage: The Markup Conference 2017. Balisage Series on Markup Technologies, vol. 19 (2017). [online]. [cited 2 April 2019]. doi:

[B15] Jim Barnett (Editor-in-Chief). State Chart XML (SCXML): State Machine Notation for Control Abstraction. W3C Recommendation 1 September 2015. [online]. [cited 11 April 2016].

[B16] Anne Brüggemann-Klein. The XML Expert's Path to Web Applications: Lessons Learned from Document and from Software Engineering. In Proceedings of XML In, Web Out: International Symposium on sub rosa XML. Balisage Series on Markup Technologies, vol. 18 (2016). [online]. [cited 22 July 2017]. doi:

[B19] BaseX Team. WebSockets documentation. [online]. [cited 9 April 2019].

[BD09] Bernd Brügge; Allen Dutoit. Object-Oriented Software Engineering Using UML, Patterns, and Java. Prentice Hall, 2009.

[BRHS12] Anne Brüggemann-Klein; Jose Tomas Robles Hahn; Marouane Sayih. Leveraging XML Technology for Web Applications. In Proceedings of Balisage: The Markup Conference 2012. Balisage Series on Markup Technologies, vol. 8 (2012). [online]. [cited 22 July 2017]. doi: Updated version available on request from

[BSW00] Jan Bosch; Clemens Szyperski; Wolfgang Weck. Component-Oriented Programming. In European Conference on Software and Data Technologies. Springer, 2000.

[C17] Michael Conrads. Multi-Client Web Applications with XML Technologies. Master Thesis Technical University of Munich, 2017.

[CIU19] Alexis Deveria. Can I use Custom Elements? [online]. [cited 9 April 2019].

[D97] Andries van Dam. Post-WIMP User Interfaces. In Communications of the ACM Vol. 40 No. 2, 1997. [online]. [cited 5 July 2019]. doi:

[E04] Eric Evans. Domain-Driven Design: Tackling Complexity in the Heart of Software. Addison-Wesley, 2004.

[E13] Jens Erat. Fine Granular Locking in XML Databases. Bachelor Thesis University of Konstanz, 2013.

[E17] Andreas Eichner. SCXML in Web-Based Applications. Master Thesis Technical University of Munich, 2017.

[F00] Roy Thomas Fielding. Architectural Styles and the Design of Network-based Software Architectures. PhD Thesis University of California, Irvine 2000.

[F02] Martin Fowler. Patterns of Enterprise Application Architecture. Addison-Wesley, 2002.

[F18] Johannes Finckh. Erweiterung der Client-Kommunikation in BaseX um die Funktionalität von WebSockets. Master Thesis University of Konstanz, 2018.

[G10] Florent Georges. HTTP Client Module. [online]. [cited 3 April 2017].

[G17] Christian Grün and Team BaseX. The XML Database. [online]. [cited 28 March 2017].

[G18] Christina Grubmüller. Statecharts in XML-Based Web Applications. Bachelor Thesis Technical University of Munich, 2018.

[H99] Ian Horrocks. Constructing the User Interface with Statecharts. Addison-Wesley, 1999.

[HP98] David Harel; Michal Politi. Modeling Reactive Systems with Statecharts: The STATEMATE Approach. McGraw-Hill, 1998. [online] [cited 19 2016]

[HW12] Tom Hughes-Croucher; Mike Wilson. Node: Up and Running: Scalable Server-Side Code with JavaScript. O'Reilly, 2012.

[M19] Microsoft. Edge Platform Status Custom Elements. [online]. [cited 9 April 2019].

[R10] Johan Roxendal. Managing Web Based Dialog Systems Using StateChart XML. Bachelor Thesis University of Gothenburg 2010.

[R11] Jonathan Robie et al. (Editors). XQuery Update Facility 1.0. W3C Recommendation 17 March 2011. [online]. [cited 22 July 2017].

[R14] Jonathan Robie et al. (Editors). XQuery 3.0: An XML Query Language. [online]. [cited 22 July 2017].

[S15] Christoph Schütz. An SCXML Interpreter in XQuery. [online]. [cited 7 April 2017].

[SKB14] Marouane Sayih; Martin Kuhn; Anne Brüggemann-Klein. GameX — Event-Based Programming with XML Technology. In Proceedings of Balisage: The Markup Conference 2014. Balisage Series on Markup Technologies, vol. 13 (2014). [online]. [cited 20 April 2016]. doi:

[SSHK15] Martina Seidl; Marion Scholz; Christian Huemer; Gerti Kappel. UML@Classroom. An Introduction to Object-Oriented Modeling. Springer-Verlag, 2015.

[U18] Philipp Ulrich. Model-Driven Development of Multi-Client Web Applications with XML Technology. Bachelor Thesis Technical University of Munich, 2018.

[WC19] Webcomponents Team. Webcomponents. [online]. [cited 5 April 2019].

[WSM13] Vanessa Wang; Frank Salim; Peter Moskovits. The Definitive Guide to HTML5 WebSocket. Apress, 2013.

Author's keywords for this paper: Web Application; XML Technology; X Stack; Graphical User Interface (GUI); XForms; Web Components; Custom Element; Multi-client Web Application; WebSocket; STOMP; Server Push; WebSocket Element; Statecharts; SCXML; Document Engineering; Software Engineering; Declarative Technologies; Model-Driven Architecture; Domain-Driven Design; End-User Computing