How to cite this paper

Gotti, Fabrizio, Kevin Heffner and Guy Lapalme. “XSDGuide - Automated Generation of Web Interfaces from XML Schemas: A Case Study for Suspicious Activity Reporting.” Presented at Balisage: The Markup Conference 2015, Washington, DC, August 11 - 14, 2015. In Proceedings of Balisage: The Markup Conference 2015. Balisage Series on Markup Technologies, vol. 15 (2015). https://doi.org/10.4242/BalisageVol15.Gotti01.

Balisage: The Markup Conference 2015
August 11 - 14, 2015

Balisage Paper: XSDGuide – Automated Generation of Web Interfaces from XML Schemas: A Case Study for Suspicious Activity Reporting

Fabrizio Gotti

Fabrizio Gotti is a researcher at the Laboratory for Applied Research in Computational Linguistics (RALI) at the Université de Montréal.

Kevin Heffner

Kevin Heffner is President of Pegasus Research & Technologies, a Montreal-based company specialized in flight simulation and training, constructive simulations, unmanned/autonomous systems and command & control.

Guy Lapalme

Guy Lapalme is Professor of Computer Science at the Université de Montréal (Laboratory for Applied Research in Computational Linguistics), where he has been a faculty member since 1980. He is a leading expert in the computer processing of human language. He has published on many aspects of the subject including spelling correction, dictionary editing, text generation, automatic summarization, information extraction, opinion mining and machine translation tools. His career combines innovative research and outreach to the practical world through long-term collaboration with partners from both the academic and industrial worlds. Recently, he was awarded an Honorary Doctorate from the Université de Neuchâtel (Switzerland) and a Lifetime Achievement Award from the Canadian Artificial Intelligence Association.

Copyright © 2015 by the authors. Used with permission.

Abstract

This article presents XSDGuide, a software prototype aimed at facilitating the creation of user interfaces consistent with a data model expressed as a set of XML schemas. XSDGuide was developed while researching intelligent user interfaces for data entry associated with the production of Suspicious Activity Reports (SARs) conforming to NIEM-SAR, an XML-based information-dissemination framework. These SARs communicate potentially suspicious or unlawful incidents to the appropriate authorities. The XSD schemas defining a specific SAR are fed to XSDGuide, which then automatically creates user interface guides, rendered on a web page. The user can interact with this application to populate the report’s fields, validate the SAR being created and save the report as a valid XML instance. Validation is a two-step process, where a JavaScript ruleset created from the schema pre-validates the document in the browser before it is sent for full validation to the back end, which relies on a traditional full-fledged validator. Despite the prototype’s limitations, the HTML interfaces that are generated allow users to inspect and become familiar with complex schemas and also to produce validated XML instance documents for the purposes of experimentation and testing.

Table of Contents

Introduction and Context
NIEM-SAR and Suspicious Activity Reporting
Information Exchange Package Documentation and XML Schemas
The Case for an Enhanced User Interface
XSDGuide’s General Architecture
HTML Rendering and Data Entry
Application Architecture
Implementation Details
Interface Guides
General Principle
Element Nesting
Element Documentation
Number of Occurrences
Enumerations
Data Entry Widgets
xs:choice and Substitution Groups
XML Validation
Validation Carried out by the Front End, in the User’s Browser
Validation Carried out by the Back End, XSDGuide’s Java Engine
Saving the Suspicious Activity Report
Schema Management
Current Limits and Perspectives
XSD Rules to Implement
Additional Features
SAR Loading
Validation Feedback
Other Schemas
Conclusion

Introduction and Context

Suspicious activity reporting refers to the process by which members of the law enforcement and public safety communities as well as members of the general population communicate potentially suspicious or unlawful incidents to the appropriate authorities. This reporting has been identified as one part of a broader Information Sharing Environment (ISE) as defined by [1]. The ISE establishes a framework to support reporting, tracking, processing, storage and retrieval of terrorism-related suspicious activity reports (SARs). The ISE initiative builds upon the foundational work by the US Departments of Justice and Homeland Security that have collaborated to create the National Information Exchange Model (NIEM), which has received approval from the governments of the US [2] and of Canada [5].

SAR is one of a set of messages supported by the NIEM. In particular, the NIEM has developed a specific model for suspicious activity reporting, the NIEM-SAR [3]. Preliminary NIEM-SAR prototypes have shown great promise for information sharing across a broad range of activities, but several areas requiring improvement were noted in December 2011[1]. In particular, faster response times are needed to get information into the system, to process the information and to make it available to users. It is noteworthy that SAR capabilities are already operational in some local jurisdictions in the US. However, these systems lack the ability to process information automatically and therefore require significant manual intervention in data centers. The current work proposes the use of adaptive user interfaces as a potential means of reducing the workload related to producing and processing SAR data.

The main functionality of the XSDGuide prototype presented in this paper is to assist the user in the creation of valid suspicious activity reports (SARs) compliant with the NIEM-SAR framework. In so doing, it allows the user to become familiar with the business rules in a manner that is more efficient than browsing the XSD documents.

The design and implementation presented here do not make any assumptions about who the user is, although users fall into two broad categories. The first category includes users who are registered with an agency and who have known skills, proficiency and expertise. They have specific access and privileges according to the role that they play in their organization, e.g. a police officer or an airport security agent. The second category consists of users who are not registered with an agency and who may or may not have Public Safety and Security domain-specific knowledge or skills. In the general case, such a user is assumed to be a member of the general public.

In the following sections, first the NIEM-SAR framework is described, including the implementation constraints faced by a SAR authoring tool. Then the general architecture of the prototype is presented as well as the various interface guides that are created based on the input XSD schemas. Afterward, the XML validation and file saving steps are described. Finally, the last section describes XSDGuide’s limitations and suggests perspectives for future work.

NIEM-SAR and Suspicious Activity Reporting

Information Exchange Package Documentation and XML Schemas

Using the NIEM-SAR framework involves creating and using an IEPD (Information Exchange Package Documentation). An IEPD is designed to convey the information needs of a given domain, and is typically created by experts in that domain. Their task is to build a data model describing the environment in which suspicious activity reporting occurs. Dedicated software tools are used to create IEPDs, for instance Cameo Enterprise Architecture with the NIEM plugin[2].

The IEPD is a zip archive containing XSD schemas capturing the data model, as well as extensive documentation. Instances of these schemas are the actual suspicious activity reports (SARs).

Only a few NIEM-SAR IEPDs are freely available on the web. Here are some examples:

  • “Suspicious Activity Reporting (SAR) for Local and State Entities IEPD v1.1.1” from the Bureau of Justice Assistance (BJA).[3]

  • “ISE-FS-200-version-1.5 Suspicious Activity Reporting (SAR)” [4]

  • “Suspicious Activity Report” from the “Texas Department of Public Safety, Crime Records Service” [5]

They are quite complex, both because of the large number of types they define and because the SARs they propose are quite intricate. To give an idea of the elaborateness of this architecture, the IEPD “ISE-FS-200-version-1.5 Suspicious Activity Reporting (SAR)” mentioned above contains 74 XSD files defining 196 simple types and 658 complex types. IEPDs make heavy use of inheritance (with abstract classes) and substitution groups. The package is organized so as to include numerous libraries from the NIEM core types, from which these customized classes are derived or augmented.

Reconciling the need to comply with such a standard with the need for timely creation of SARs is quite delicate.

The Case for an Enhanced User Interface

The creation of XML instances meeting the constraints expressed in XML schemas is not a new problem. Existing solutions go from the very simple text editor to the dedicated IDE.

Over the last few years, the Eclipse IDE[6] has developed extensive tools to manipulate XML, including the creation of XML instances validated by schemas. XML editors like Oxygen[7] are extremely helpful in guiding the creation of XML instances (like SARs). Oxygen notably offers context-dependent autocomplete suggestions, documentation and live validation of the document. Such an editor is an excellent solution for IT specialists, but is quite difficult for the average person to use. Oxygen does offer an MS-Word-like author mode that works very well for a set of recognized schemas (such as Docbook), but reverts to tag-based editing when a document associated with an arbitrary schema is created.

XSDGuide’s General Architecture

The general principle behind the prototype XSDGuide is shown in Figure 1.

Figure 1: XSDGuide’s general architecture. The back end is XSDGuide’s Java server and the front end (HTML rendering) appears in the user’s browser.

An XML schema (XSD format) is first selected by the user who wants to create a suspicious activity report conforming to that schema[8]. As mentioned in the introduction, these schemas are rarely standalone and, for instance, in the NIEM ecosystem they usually come packaged as IEPDs. These IEPDs are zip archives that contain, among other documents, the SAR schema as well as any necessary XSD schemas imported through import or include statements. These imported documents are the required libraries on which the SAR schemas are built, and act somewhat as an SDK. As long as the imported XSD schemas can be found using the specified absolute or relative URLs, XSDGuide can readily manage such an archive.

Once the schemas are read, two additional inputs are needed: the user is prompted to specify which one of the schemas contains the root element, and what this element is within the file. XSDGuide uses the schemas provided to build three components (middle of Figure 1).

User interface guides are created for elements defined in the schemas. These guides are at the heart of the prototype. Each one holds all the information required to create a user-friendly UI element. They mostly correspond to information for a given element (XML schema base type, cardinality, etc.) but not always. For instance, an <xsd:choice> corresponds to a UI guide allowing the user to pick one of the elements proposed by the choice. Importantly, the guides maintain information about their child guides too. This hierarchical organization mirrors that of the XML schema. It is noteworthy that these guides are independent of the rendering medium. They are abstract in nature, and could be rendered in, say, a standalone application or a web page. Their HTML rendering is described in section “HTML Rendering and Data Entry”.
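To make the idea of a renderer-independent guide hierarchy more concrete, here is a minimal sketch in Java. The class name UiGuideSketch, its fields and the plain-text renderer are assumptions made for illustration and are not taken from XSDGuide's code; the base type given to UniqueId in the example is also hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for an interface guide: it only stores
// schema-derived facts and its child guides, and knows nothing about HTML.
final class UiGuideSketch {
    static final int UNBOUNDED = Integer.MAX_VALUE;

    final String elementName;
    final String baseType;   // null when the element has complex content
    final int minOccurs;
    final int maxOccurs;
    final List<UiGuideSketch> children = new ArrayList<>();

    UiGuideSketch(String elementName, String baseType, int minOccurs, int maxOccurs) {
        this.elementName = elementName;
        this.baseType = baseType;
        this.minOccurs = minOccurs;
        this.maxOccurs = maxOccurs;
    }

    UiGuideSketch add(UiGuideSketch child) {
        children.add(child);
        return this;
    }

    // One possible renderer: an indented plain-text outline. An HTML or
    // Swing renderer would walk the very same tree.
    String renderOutline() {
        StringBuilder out = new StringBuilder();
        render(out, 0);
        return out.toString();
    }

    private void render(StringBuilder out, int depth) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(elementName);
        if (baseType != null) {
            out.append(" : ").append(baseType);
        }
        String max = (maxOccurs == UNBOUNDED) ? "*" : String.valueOf(maxOccurs);
        out.append(" [").append(minOccurs).append("..").append(max).append("]\n");
        for (UiGuideSketch child : children) {
            child.render(out, depth + 1);
        }
    }

    public static void main(String[] args) {
        UiGuideSketch metadata = new UiGuideSketch("Metadata", null, 1, 1)
            .add(new UiGuideSketch("UniqueId", "xs:string", 1, 1))
            .add(new UiGuideSketch("RelatedSarList", null, 0, 1));
        System.out.print(metadata.renderOutline());
    }
}
```

Keeping the rendering code outside the stored data is what would let the same tree feed either a web page or a desktop widget toolkit.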

An XML document can be created on demand whenever the user needs the prototype to create an actual instance document (a SAR in our case), driven by the schema provided earlier. Each element of the document is tied to the specific user interface guide (see previous paragraph) that facilitated its creation. Ultimately, the user interacts with the rendered UI guide in order to inject values into the corresponding XML element.

A validator built on Java’s XML validation library is also constructed from the schema(s) provided. This is the most straightforward use of such a schema, and it allows the prototype to validate the SAR being built by the user. Validation messages are presented to the user, in order to elicit an appropriate response.
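As an illustration of this step, the sketch below compiles a schema and validates an instance document with the standard javax.xml.validation API. It is not XSDGuide's actual code: the class name, the one-element demonstration schema and the way messages are collected are assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

final class SchemaValidatorSketch {

    // Hypothetical one-element schema standing in for a real SAR schema.
    static final String DEMO_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='UniqueId' type='xs:positiveInteger'/>"
      + "</xs:schema>";

    // Compiles the schema, validates the instance and returns every
    // message reported along the way (an empty list means "valid").
    static List<String> validate(String xsd, String xml) {
        final List<String> messages = new ArrayList<>();
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            validator.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) {
                    messages.add(e.getMessage());
                }
                @Override public void error(SAXParseException e) {
                    messages.add(e.getMessage());
                }
                @Override public void fatalError(SAXParseException e) throws SAXParseException {
                    throw e; // e.g. the instance is not well-formed
                }
            });
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (Exception e) {
            messages.add(String.valueOf(e.getMessage()));
        }
        return messages;
    }

    public static void main(String[] args) {
        System.out.println(validate(DEMO_XSD, "<UniqueId>42</UniqueId>"));
        System.out.println(validate(DEMO_XSD, "<UniqueId>abc</UniqueId>"));
    }
}
```

Collecting messages through an ErrorHandler, rather than stopping at the first exception, lets several problems be reported to the user in one pass.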

HTML Rendering and Data Entry

XSDGuide strives to facilitate the creation of suspicious activity reports (SARs). To achieve this, the various UI guides created from the underlying XML schema(s) must be rendered in a user-friendly way, while at the same time enforcing the various constraints expressed in the schema.

It is important to note that our library is not tied to any specific rendering of the interface guides. Indeed, one of XSDGuide’s design principles was to create a Java library that would handle most of the processing associated with the tasks at hand. The user-visible part could then either be materialized as a standalone, Swing-like application or, as we did here, as a web application.

We opted for the latter solution, because we felt that an HTML rendering lent itself naturally to the representation of nested elements (the XML nodes). Since there were also time constraints on the coding of the application, HTML provided a way to fast-track the development of an aesthetically pleasing GUI. A web application has other advantages, including portability across all operating systems and most hardware, including smartphones. This portability is typically difficult to achieve using the usual GUI SDKs, including Swing. Moreover, the majority of users are already familiar with web applications (e.g. Gmail, Facebook, etc.).

The main drawback of this approach is the necessary separation of the implementation logic between server-side and client-side elements, as well as the additional networking component between the two.

Figure 2 shows the complete web interface created by XSDGuide from a single XML schema file (SAR-RALI.xsd, see code below for an excerpt). We created this schema for illustration purposes in this article. It is a simplified schema allowing the creation of basic SARs, while preserving the general philosophy and terminology of the SAR schemas found in IEPDs. It is worth noting that XSDGuide can fully process the latter IEPDs.

The same interface adapts itself to smartphone screens, as seen in Figure 3.

Its navigation bar features the following menus.

More rendering examples are available on our website[9].

    <!-- Excerpt of schema SAR-RALI.xsd used as the running example here. Two complex elements are listed. -->

    <xs:element name="SuspiciousActivityReport">
      <xs:annotation>
        <xs:documentation>A structure that describes a SAR Report</xs:documentation>
      </xs:annotation>
      <xs:complexType>
        <xs:sequence>
          <xs:element ref="sarrali:Metadata"/>
          <xs:element ref="sarrali:Data"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>

    <xs:element name="Metadata">
      <xs:annotation>
        <xs:documentation>A structure that describes Metadata about a related SAR</xs:documentation>
      </xs:annotation>
      <xs:complexType>
        <xs:sequence>
          <xs:element ref="sarrali:UniqueId"/>
          <xs:element ref="sarrali:Title"/>
          <xs:element ref="sarrali:SubmissionSystem"/>
          <xs:element ref="sarrali:Author"/>
          <xs:element ref="sarrali:CreationDateTime"/>
          <xs:element name="DisseminationCriteria" type="sarrali:DisseminationCriteriaType"/>
          <xs:element ref="sarrali:RelatedSarList" minOccurs="0"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>

Figure 2: Screenshot of XSDGuide’s interface on a desktop computer after generating the interface for the schema SAR-RALI.xsd (see listing above).

Figure 3: Screenshot of XSDGuide’s interface on a smartphone.

Application Architecture

Figure 1 shows the general client-server architecture implemented by XSDGuide. The back end implements most of the XML-related logic, including the creation of the XML document, the management of schemas and the exploration of their constraints. These features are made possible by Java libraries from the Apache Xerces™ Project[10]. The back end also includes a lightweight web server (implemented with Apache Jetty) responding to queries made from the front end, written in JavaScript and leveraging the popular frameworks JQuery[11] and Bootstrap[12].

Implementation Details

For this project, even if we are writing a web application, we opted to implement most of the logic within the Java backend. This means that most of the models for the XML document and the corresponding schemas are maintained there, and that the HTML rendering is carried out in the backend. This is consistent with the fact that we wanted to be able to render the UI guides within the Java library we created, instead of relying too heavily on the client-side JavaScript to carry out this task. Moreover, we wanted this rendering step to be as “close” as possible to the Java models and validators to simplify the design.

The interaction steps involved when creating a new document can further explain XSDGuide:

  1. The user visits the page and the Jetty server produces an interface essentially made out of static elements. HTML elements are laid out with Bootstrap, which simplifies the creation of the UI, and provides an elegant responsive interface (e.g. for smartphones).

  2. When the user clicks “New report”, JQuery is used to create an AJAX request to the backend to create a new element. The request specifies the name of the XSD file stored in the backend (e.g. SAR-RALI.xsd) and the root element of the XML document (e.g. http://rali.iro.umontreal.ca/sarrali:SuspiciousActivityReport).

  3. The server parses the corresponding schema, and creates a new XML document in memory. This is done using the Apache Xerces API and the javax.xml package, which both play key roles in this project. The server then returns the id of the new document, as well as the id of the newly created root element.

  4. The JavaScript client asks for the HTML rendering of the elements it wants to display (here, the root element as well as any non-optional child elements).

  5. The server replies with a snippet of HTML for each element. This snippet includes the rendering of the associated interface guide (see following section), as well as additional information on the possible child elements and attributes. The JavaScript library is responsible for parsing and positioning this code snippet within the HTML document. It also adds event listeners to the guide in order to validate the data the user enters into the newly created controls.

The backend offers a few simple services, called with AJAX from the user’s browser. These include the standard CRUD operations on an element, as well as the validation operation on the document. The validation process is described in section “XML Validation”.

When implementing the HTML form elements needed by the prototype, we briefly considered using XForms[13], which provides sophisticated form markup to gather, validate and process XML data within web pages (among other document types). However, to our knowledge, none of the popular web browsers supports XForms natively, and the W3C recommendation seems to have been eclipsed in part by HTML5 controls. The latter were also investigated and, while they provide valuable support for the validation of some constraints, they cannot implement some of the simplest XSD rules. For instance, an HTML5 input control with a type of “number” cannot constrain the number of digits after the decimal point (fractionDigits in XSD). For these reasons, and simply because we wanted to retain full control over this implementation, we used traditional web forms augmented with JavaScript controls.
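To illustrate the kind of rule that a plain HTML5 control does not cover, here is a minimal sketch of a fractionDigits check. It is written in Java purely for illustration (in the prototype, field-level rules of this kind run as JavaScript in the browser), and the class and method names are hypothetical. The sketch assumes that trailing zeros do not count toward the limit, since the facet constrains the decimal value rather than its spelling.

```java
import java.math.BigDecimal;
import java.util.regex.Pattern;

final class FractionDigitsSketch {

    // Lexical form of xs:decimal: optional sign, digits, optional fraction.
    // Exponent notation such as "1e2" is deliberately rejected.
    private static final Pattern XS_DECIMAL =
        Pattern.compile("[+-]?(\\d+(\\.\\d*)?|\\.\\d+)");

    // Returns true when the text is a decimal whose value needs at most
    // 'fractionDigits' digits after the decimal point.
    static boolean isValid(String lexical, int fractionDigits) {
        String trimmed = lexical.trim();
        if (!XS_DECIMAL.matcher(trimmed).matches()) {
            return false;
        }
        BigDecimal value = new BigDecimal(trimmed);
        // stripTrailingZeros() makes "1.500" count as one fraction digit.
        return value.stripTrailingZeros().scale() <= fractionDigits;
    }

    public static void main(String[] args) {
        System.out.println(isValid("3.14", 2));
        System.out.println(isValid("3.141", 2));
    }
}
```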

Interface Guides

This section explains how XSDGuide transforms XSD-defined constraints into usable UI guides for the end user. The rendering of these constraints is a sometimes difficult compromise between enforcement of the constraint expressed in the schema and the need for an accessible user interface.

General Principle

The general principle of the UI guides is to establish a mapping between an element or attribute type (whether it be named or anonymous) and a specific UI widget or widget group. For instance, if an element type is derived (either directly or not) from the base type xs:string, then it makes sense to render it as a text field in HTML. Furthermore, if additional constraints (e.g. a regular expression pattern) control the content of the element, it is desirable that these constraints be present when the user fills out the fields so as to provide valid information as early as possible in the SAR authoring process.

Element Nesting

Element nesting (e.g. a complex type containing sub-elements) is presented as a set of nested HTML elements, so that the user understands the compositionality of the complex elements. For instance, in Figure 2, a SuspiciousActivityReport root element is composed of an element Metadata, clearly visible as a nested box inside the element SuspiciousActivityReport. The element Metadata also contains sub-elements. The simple-typed sub-elements (e.g. UniqueId) appear as simple text fields inside Metadata, while complex ones (e.g. SubmissionSystem) appear as sub-boxes.

To allow the user to customize this nested view, a triangle icon to the right of a complex element’s name hides or shows the child elements.

Element Documentation

Element documentation included in xs:documentation schema elements is presented to the user either as text (under the element heading) or as tooltips when hovering over fields corresponding to element information. When collecting xs:documentation, we traverse the complete type hierarchy for a given element to gather as much documentation as possible, from the base class down to the current element. Additional help is provided by the prototype itself, for instance when an xs:choice is encountered, in order to help the user make sense of the choice that is presented to them (see section “xs:choice and Substitution Groups”).

Number of Occurrences

XML schemas specify the number of occurrences of attributes and elements, through different types of rules. Examples of this are xs:sequence rules, where each child element can appear from 0 to any number of times. By default, both the minimum and the maximum number of occurrences are 1. They can often be overridden using the minOccurs and maxOccurs attributes.

These limits are not all explicitly stated to the user. For instance, when a new element is created, each child element whose minimum number of occurrences is n (with n ≥ 1) is also created n times. In Figure 2, the creation of the SuspiciousActivityReport element causes the creation of one Metadata element and one Data element. The Metadata element is in turn populated recursively in the same way.
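The recursive creation of mandatory children can be sketched as follows. The Decl class is a hypothetical, heavily simplified stand-in for a schema element declaration (name, minOccurs and child declarations only), and the DOM is used merely as a convenient in-memory representation; none of these names come from XSDGuide itself.

```java
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class RequiredChildrenSketch {

    // Hypothetical simplified element declaration.
    static final class Decl {
        final String name;
        final int minOccurs;
        final List<Decl> children;

        Decl(String name, int minOccurs, Decl... children) {
            this.name = name;
            this.minOccurs = minOccurs;
            this.children = Arrays.asList(children);
        }
    }

    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Creates the element, then each child minOccurs times, recursively.
    // Optional children (minOccurs = 0) are left for the user to add.
    static Element instantiate(Document doc, Decl decl) {
        Element element = doc.createElement(decl.name);
        for (Decl child : decl.children) {
            for (int i = 0; i < child.minOccurs; i++) {
                element.appendChild(instantiate(doc, child));
            }
        }
        return element;
    }

    // Compact textual view of the element tree, for inspection only.
    static String outline(Element element) {
        StringBuilder out = new StringBuilder(element.getTagName());
        NodeList kids = element.getChildNodes();
        if (kids.getLength() > 0) {
            out.append('(');
            for (int i = 0; i < kids.getLength(); i++) {
                if (i > 0) {
                    out.append(',');
                }
                out.append(outline((Element) kids.item(i)));
            }
            out.append(')');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Decl report = new Decl("SuspiciousActivityReport", 1,
            new Decl("Metadata", 1, new Decl("UniqueId", 1), new Decl("RelatedSarList", 0)),
            new Decl("Data", 1));
        System.out.println(outline(instantiate(newDocument(), report)));
    }
}
```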

When the schema allows the user to pick the number of elements, the user can click on links like the one labeled “Add new RelatedSarList (optional)” in Figure 2. The element is then dynamically added to the current report (and possibly populated with mandatory sub-elements). If the user tries to add more elements than allowed by the schema, a warning appears.

The user can also delete unwanted elements by clicking a “Trash” icon that becomes visible when the user hovers over an element. The element is then removed dynamically from the report. The user cannot remove more elements than the schema allows.

The Java back end naturally mirrors the changes made in the web page, by creating and deleting elements in its in-memory representation of the SAR.

The same is true of attributes whose presence or absence can be customized (in this case, the occurrence count is either 0 or 1).

Enumerations

In suspicious activity reports, there are numerous places in the schema where experts in safety and security have elaborated exhaustive lists pertaining to the description of entities. For example, there are 26 possible colors of gun finishes defined by NIEM-SAR. It is critical that such enumerations be presented in a user-friendly way. The current implementation translates enumerations into simple dropdown lists. A tooltip presents the documentation for each element of the enumeration, when it is available.
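A minimal sketch of such a rendering step is shown below, assuming the enumeration values and their optional documentation have already been extracted from the schema into a map. The class name, the field name and the sample values are hypothetical (they are not actual NIEM-SAR codes), and the markup produced is only one possible choice.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class EnumerationSelectSketch {

    // Minimal escaping for text placed in HTML content or attribute values.
    static String escape(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;")
                   .replace("\"", "&quot;");
    }

    // Renders one <option> per enumeration value; the documentation, when
    // present, becomes a title attribute that browsers show as a tooltip.
    static String renderSelect(String fieldName, Map<String, String> valueToDoc) {
        StringBuilder html = new StringBuilder();
        html.append("<select name=\"").append(escape(fieldName)).append("\">");
        for (Map.Entry<String, String> entry : valueToDoc.entrySet()) {
            String value = escape(entry.getKey());
            html.append("<option value=\"").append(value).append('"');
            String doc = entry.getValue();
            if (doc != null && !doc.isEmpty()) {
                html.append(" title=\"").append(escape(doc)).append('"');
            }
            html.append('>').append(value).append("</option>");
        }
        return html.append("</select>").toString();
    }

    public static void main(String[] args) {
        Map<String, String> finishes = new LinkedHashMap<>();
        finishes.put("Blue", "Blue & black finish"); // hypothetical values
        finishes.put("Chrome", null);                 // no documentation available
        System.out.println(renderSelect("FinishColor", finishes));
    }
}
```

A LinkedHashMap is used so that the options keep the order in which the values are inserted, which would normally follow the schema.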

Data Entry Widgets

One way of minimizing the risk of entering invalid data in a SAR is to provide widgets and UI cues guiding the input of valid values in fields. These widgets can also alert the user when a value is incorrect as soon as a field loses focus.

We put a lot of effort into detecting the base type for most simple elements in order to implement these UI guides. For instance, a field based on an xs:ID type will alert the user when the id provided is not unique in the document. Figure 4 shows some UI guides for some of the primitive types referenced by an XML schema.

Figure 4: Examples of UI cues and widgets for types xs:ID (top), xs:dateTime (middle) and xs:string with regular expression pattern (bottom).

xs:choice and Substitution Groups

xs:choice rules and substitution groups are schema constraints that differ in nature but are rendered similarly in the user interface. This is an interesting instance where the potential complexity of the schemas is hidden from the user, who sees two different constraint types expressed in the same way: a simple logical disjunction (an or).

A choice model group (xs:choice) is used within a complex type to specify a set of element types from which a single element can be selected. A substitution group consists of a set of element types. When an element type associates itself with a substitution group (by specifying a substitutionGroup attribute), it is a valid substitution for the referenced element type.

Figure 5 shows the listing (top) defining the type LengthType for our schema. This type includes an xs:choice alternation. The figure shows the interactions the user can have with the control derived from this type. The user can either specify the height of an individual as a MeasurePointValue (a single value) or as a MeasureRangeValue (a range).

Figure 5: The UI guide for a type containing an xs:choice, whose listing is shown in the top frame. (a) The initial display of the control. (b) The user specifies a 6 foot height. (c) The user changes their mind, switches from MeasurePointValue to MeasureRangeValue and specifies a 6-to-7 foot height.
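The behavior of such a control can be sketched as a small state holder: exactly one alternative is active at a time, and switching alternatives discards the value entered for the previous one. The class below is a hypothetical simplification; in particular it treats every alternative as a single text value, whereas a range would really have sub-elements, and it does no XML escaping.

```java
import java.util.Arrays;
import java.util.List;

final class ChoiceGuideSketch {
    private final List<String> alternatives;
    private String selected; // name of the active alternative, or null
    private String value;    // value entered for the active alternative

    ChoiceGuideSketch(String... alternatives) {
        this.alternatives = Arrays.asList(alternatives);
    }

    // Activates one alternative; a change of alternative clears the value.
    void select(String alternative) {
        if (!alternatives.contains(alternative)) {
            throw new IllegalArgumentException("Not part of the choice: " + alternative);
        }
        if (!alternative.equals(selected)) {
            selected = alternative;
            value = null;
        }
    }

    void setValue(String newValue) {
        if (selected == null) {
            throw new IllegalStateException("Select an alternative first");
        }
        value = newValue;
    }

    // Only the active alternative contributes an element to the document.
    String toXml() {
        if (selected == null) {
            return "";
        }
        return "<" + selected + ">" + (value == null ? "" : value) + "</" + selected + ">";
    }

    public static void main(String[] args) {
        ChoiceGuideSketch height = new ChoiceGuideSketch("MeasurePointValue", "MeasureRangeValue");
        height.select("MeasurePointValue");
        height.setValue("6");
        System.out.println(height.toXml());
        height.select("MeasureRangeValue");
        System.out.println(height.toXml());
    }
}
```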

XML Validation

XML Validation ensures that the suspicious activity report being written conforms to the underlying XML schema or schemas. In XSDGuide, it is a two-step process. It first involves the logic built into the front end, then, if no errors are found, that of the back end. Figure 1 shows the two constituents.

Validation is invoked when the user clicks on the navigation bar item “Validate XML”. The user is oblivious to whether the error messages emanate from the front end or the back end. In both cases, they are shown at the top of the page.

Validation Carried out by the Front End, in the User’s Browser

When the front end is built, not only are visual elements laid out for the user to interact with, but validation rules are created in the JavaScript logic running in the user’s browser. These rules are built client-side by relying on information provided by the server indicating the base type of the field (XSD’s built-in datatypes), as well as additional restrictions.

Here are some examples of the rules implemented for elements and attributes.

  • When elements have a number of occurrences of at least one, or when attributes are marked required, then the front end will check for their presence.

  • Types deriving from xs:ID are checked to make sure they are well-formed and unique in the document.

  • Values of type xs:IDREF must reference an existing xs:ID in the document.

  • Decimal and floating-point numbers are deemed valid if they are consistent with possible minimum and maximum values.

  • Regular expressions restricting the content of text-based data are used to validate strings.

The validation feedback for an element of type xs:ID is shown in Figure 4 (top). Whenever an error is found for a specific field, it is highlighted in red and a short description of the problem is presented to the user.
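As an illustration of the id-related rules listed above, the following sketch checks that ids are well-formed and unique, and that every reference points at a known id. It is written in Java for illustration only, since in the prototype such rules run as JavaScript in the browser. The class name, the message texts and the ASCII-only name pattern are assumptions; the real NCName production accepts many more Unicode characters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

final class IdRulesSketch {

    // Simplified, ASCII-only approximation of an NCName.
    private static final Pattern ASCII_NCNAME =
        Pattern.compile("[A-Za-z_][A-Za-z0-9._-]*");

    // Returns one message per violated rule, in document order:
    // ids first (malformed or duplicate), then dangling references.
    static List<String> check(List<String> ids, List<String> idrefs) {
        List<String> errors = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (String id : ids) {
            if (!ASCII_NCNAME.matcher(id).matches()) {
                errors.add("Malformed id: " + id);
            } else if (!seen.add(id)) {
                errors.add("Duplicate id: " + id);
            }
        }
        for (String ref : idrefs) {
            if (!seen.contains(ref)) {
                errors.add("Dangling idref: " + ref);
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println(check(Arrays.asList("p1", "p2", "p1"), Arrays.asList("p2", "p9")));
    }
}
```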

Validation taking place in the front end is mainly concerned with the data entered in the different fields provided to the user. In other words, the structure of the document itself (e.g. the nesting of elements and their respective number of occurrences) is not validated client-side. Indeed, the user would be hard-pressed to find a way to circumvent those rules while creating a report, since interactions that would create such validation errors are prohibited.

Consequently, when validation is invoked by the user, the front end checks whether the data entered in each field is consistent with the rules found for it. These rules were manually crafted for most data types and elements, but still constitute a best effort. This is acceptable because the Apache Xerces validator in the back end is always run on the document once the front end has deemed it error-free (see following section).

The advantages of first running the validation on the front end are twofold. This scheme allows for a quasi-immediate response from the browser, without having to send the document over the network and wait for the validator messages. Moreover, this validation can be carried out interactively as the user is typing data, which allows quick rectifications of the data just entered, fresh in the user’s mind.

Validation Carried out by the Back End, XSDGuide’s Java Engine

As mentioned in section “XSDGuide’s General Architecture”, XSDGuide builds a full-fledged XML validator from the XML schema(s) selected by the user to create the SAR. This validator is put to good use during this second step, and any remaining validation errors are captured and sent back to the user.

At this point in the development of the prototype, this validation still leaves room for improvement. The principal problem is that, when validation fails, the error messages are not clearly tied to the offending field(s), unlike the messages produced in the step described in the previous section. See section “Current Limits and Perspectives” for more on this.

Saving the Suspicious Activity Report

At any time during the creation of the SAR, the user can save the report by clicking the appropriate navigation menu item. This triggers the download of the XML document being edited. The document root is associated with the corresponding XSD schema through an xsi:schemaLocation attribute. The URL of the schema points to the XSDGuide server, which acts as a schema server for the referenced schema and its possible XSD dependencies.

This allows the validation of the SAR using external tools, such as <oXygen>, which dereferences the schema URL and proceeds with validation.
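The following sketch shows one way of attaching an xsi:schemaLocation attribute to a document root with the standard DOM API. The namespace is the one used by the SAR-RALI.xsd example; the class name and the schema URL passed in main are hypothetical, and the prototype's own serialization code may well differ.

```java
import java.io.StringWriter;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class SchemaLocationSketch {
    static final String SAR_NS = "http://rali.iro.umontreal.ca/sarrali";
    static final String XSI_NS = XMLConstants.W3C_XML_SCHEMA_INSTANCE_NS_URI;

    // Builds an empty report whose root points at the schema URL.
    static Document newReport(String schemaUrl) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();
            Element root = doc.createElementNS(SAR_NS, "sarrali:SuspiciousActivityReport");
            // Explicit namespace declarations so that the serialized form is self-contained.
            root.setAttributeNS(XMLConstants.XMLNS_ATTRIBUTE_NS_URI, "xmlns:sarrali", SAR_NS);
            root.setAttributeNS(XMLConstants.XMLNS_ATTRIBUTE_NS_URI, "xmlns:xsi", XSI_NS);
            // schemaLocation holds pairs of "namespace URI, schema document URL".
            root.setAttributeNS(XSI_NS, "xsi:schemaLocation", SAR_NS + " " + schemaUrl);
            doc.appendChild(root);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    static String toXml(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical URL of a server exposing the schema.
        System.out.println(toXml(newReport("http://localhost:8080/schemas/SAR-RALI.xsd")));
    }
}
```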

Schema Management

In order to demonstrate the versatility of XSDGuide, we implemented a feature allowing the user to upload their own schema (or schema archive) in order to create SARs based on new XSD schemas. The user only has to click the “Add XML Schema” menu item to upload a new XSD file. It is also possible to upload an entire zip archive containing the XSD file as well as its dependencies (mainly specified through import and include statements). Uploading an entire archive is quite useful in our case, since most complete SAR schemas are saved in IEPD zip files (see section “NIEM-SAR and Suspicious Activity Reporting”).

Whether uploaded alone or alongside its dependencies, each XSD file is validated before the operation can proceed. The validation consists in compiling the XSD file using the relevant Apache Xerces functions. If validation fails for at least one file, the operation aborts and the user is shown the offending file name and validation error(s).
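The upload check described above might look like the following sketch, which unpacks an archive into a temporary directory so that relative import and include locations resolve, then tries to compile every .xsd file. This is an assumption about one reasonable design, not XSDGuide's implementation. The class and method names are hypothetical, the standard JAXP SchemaFactory stands in for direct Xerces calls, and cleanup of the temporary directory is omitted.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import javax.xml.XMLConstants;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class SchemaUploadSketch {

    /** Returns "file name -> first error" for every .xsd that fails to compile. */
    static Map<String, String> checkArchive(byte[] zipBytes) {
        Map<String, String> failures = new LinkedHashMap<>();
        try {
            Path dir = Files.createTempDirectory("xsd-upload");
            List<Path> xsdFiles = new ArrayList<>();
            try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
                ZipEntry entry;
                while ((entry = zip.getNextEntry()) != null) {
                    Path target = dir.resolve(entry.getName()).normalize();
                    if (!target.startsWith(dir)) { // reject "../" entries
                        throw new IOException("Entry escapes archive root: " + entry.getName());
                    }
                    if (entry.isDirectory()) {
                        Files.createDirectories(target);
                        continue;
                    }
                    Files.createDirectories(target.getParent());
                    Files.copy(zip, target); // copies the current entry only
                    if (entry.getName().toLowerCase().endsWith(".xsd")) {
                        xsdFiles.add(target);
                    }
                }
            }
            // Compile only after everything is unpacked, so dependencies are present.
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            for (Path xsd : xsdFiles) {
                try {
                    factory.newSchema(xsd.toFile());
                } catch (SAXException e) {
                    failures.put(dir.relativize(xsd).toString(), e.getMessage());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return failures;
    }

    /** Test helper: builds an in-memory zip from (name, content) pairs. */
    static byte[] zipOf(String... nameThenContent) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            for (int i = 0; i + 1 < nameThenContent.length; i += 2) {
                zip.putNextEntry(new ZipEntry(nameThenContent[i]));
                zip.write(nameThenContent[i + 1].getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

An empty map lets the upload proceed; a non-empty one carries the offending file names and messages to show to the user.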

Current Limits and Perspectives

In its current stage of development, XSDGuide is still a prototype, and our effort focused on making sure that most NIEM-SAR IEPD schema rules are recognized and correctly processed. However, the XSD standard taken as a whole is quite vast, and consequently there are various XSD validation rules that are yet to be implemented. Furthermore, some features are lacking from the overall application.

XSD Rules to Implement

Some XML Schema constraints are not yet implemented in XSDGuide. They were either rarely seen in the IEPDs we worked with, or posed difficult ergonomic problems. We describe some of them below and give an idea of the prevalence of these rules in the IEPD “ISE-FS-200-version-1.5 Suspicious Activity Reporting (SAR)” mentioned in Section “NIEM-SAR and Suspicious Activity Reporting”.

The subtle distinctions between the text-like types xs:string, xs:NCName, xs:NMTOKEN and xs:token have not been implemented. Only xs:token and xs:string are used in the IEPD ISE-FS-200.

Some facets for numbers and text (e.g. whiteSpace, length, totalDigits) are only partially supported.

The regular expression language used in XSD to validate text content has not been entirely ported from the schema to the user interface. This proved difficult because the W3C XML Schema standard defines its own regular expression flavor, and some patterns cannot be copied verbatim from the schema to JavaScript. For instance, the character class subtraction construct ([...-[...]]) does not exist in JavaScript. For now, only simple regular expressions are copied from XSD to JavaScript. Only one pattern is used in the entire ISE-FS-200 IEPD.
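To make the notion of a “simple” regular expression concrete, here is one possible conservative test, written for the Java back end that would generate the JavaScript ruleset. It is a sketch under our own assumptions, not the rule XSDGuide applies: the class name, the method name and the list of rejected constructs are illustrative. It rejects character class subtraction, the XSD-only escapes \i and \c, and Unicode property escapes, and it wraps accepted patterns in anchors because an XSD pattern must match the whole value.

```java
import java.util.Optional;

final class PatternPortabilitySketch {

    /**
     * Returns a JavaScript regex source equivalent to the XSD pattern,
     * or Optional.empty() when the pattern uses a construct this sketch
     * does not know how to translate.
     */
    static Optional<String> toJavaScriptSource(String xsdPattern) {
        boolean inClass = false;
        for (int i = 0; i < xsdPattern.length(); i++) {
            char c = xsdPattern.charAt(i);
            if (c == '\\') {
                if (i + 1 >= xsdPattern.length()) {
                    return Optional.empty(); // dangling backslash
                }
                char next = xsdPattern.charAt(++i);
                // \i \I \c \C are XML name classes; \p \P are Unicode categories or blocks.
                if ("iIcCpP".indexOf(next) >= 0) {
                    return Optional.empty();
                }
            } else if (c == '[') {
                if (inClass) {
                    return Optional.empty(); // nested class, i.e. subtraction [a-z-[aeiou]]
                }
                inClass = true;
            } else if (c == ']') {
                inClass = false;
            } else if (!inClass && (c == '^' || c == '$')) {
                return Optional.empty(); // literals in XSD, anchors in JavaScript
            }
        }
        // XSD patterns are implicitly anchored at both ends.
        return Optional.of("^(?:" + xsdPattern + ")$");
    }

    public static void main(String[] args) {
        System.out.println(toJavaScriptSource("[A-Z]{2}\\d{3}"));
        System.out.println(toJavaScriptSource("[a-z-[aeiou]]+"));
    }
}
```

Even accepted patterns are not guaranteed to behave identically: \d and \w, for example, are Unicode-aware in XSD but ASCII-only in JavaScript by default. The full validation on the back end remains the reference.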

For the time being, the model groups xs:sequence, xs:choice and xs:all can only occur exactly once (that is, with minOccurs and maxOccurs equal to 1). There are no xs:choice or xs:all groups in the example IEPD, and the cardinality of xs:sequence is always 1. However, the IEPD contains 56 substitution groups, so we focused our efforts on those use cases. Extending the implementation to other cardinalities should not be difficult.

Elements with mixed content (mixed="true") constitute a particularly arduous constraint when it comes to producing an appropriate UI guide. Fortunately, they rarely appear in IEPDs (they are absent from all IEPDs we studied). Nonetheless, they represent an interesting challenge.

An element with mixed content may contain text, usually interspersed with nested elements. The UI guide should make it clear that the user can type arbitrary text and insert nested elements within that text. The current interface choices implemented by XSDGuide make it difficult to provide this type of guide. We could have let the user insert tags inside the free text, but we opted to avoid tags as much as possible, since they require a level of computer literacy that cannot be expected from the average user.

Figure 6 shows an idea for a mixed content guide. A text area allows the user to enter free text, and buttons allow the creation of nested elements within the text. Whenever the user clicks on these nested elements, the complete element appears below the text area, and behaves like any other element guide.

Figure 6: Mockup idea for user interface guides for mixed content. The content text can be interspersed with nested elements of types Person, Location or Time in this case.

Additional Features

SAR Loading

For now, the most important feature lacking from the prototype is the ability to load a previously saved SAR. While it is possible to create an XML instance of a given XSD schema, the interface does not allow the user to open such an instance and edit it. There are no specific conceptual hurdles to implementing this; we simply could not complete this feature within the short timeframe allotted for this project. Obviously, such a feature is essential if our prototype is to be rolled out in a production setting.

Validation Feedback

During validation, the feedback provided by the back end is too generic and does not clearly indicate the offending fields or values to the user. Contrary to the error messages provided by the front end, these messages do not come with visual feedback such as highlighting of the fields at the origin of the problem. This is undoubtedly disconcerting to the user.

Traditional text-based XML editors like <oXygen/> do not suffer from this problem, since the validation API provided by Apache Xerces associates line and column numbers with each validation problem. The editor can then highlight the problem in the code. In our case, we cannot benefit from such clear indications, since the SAR document is not text-based: it is kept as an in-memory DOM. One solution to this is to inspect the post-schema-validation infoset (PSVI)[14] provided by the API. After validation, the PSVI includes assessment outcome information that can offer the validation status of some elements and attributes. It then becomes a matter of mapping these statuses back to the interface so that the user understands the corrections needed.
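Besides the PSVI, a lighter option, which is not the approach proposed above, is to ask the validator which DOM element it is visiting when an error is reported. The sketch below assumes that the JAXP Validator recognises the Xerces property http://apache.org/xml/properties/dom/current-element-node when validating a DOMSource; if it does not, the code falls back to a placeholder. The class name, method names and path format are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

final class DomErrorLocationSketch {

    /** Xerces property name; support by a given Validator is an assumption. */
    static final String CURRENT_ELEMENT =
        "http://apache.org/xml/properties/dom/current-element-node";

    /** Returns one "element path: message" entry per validation error. */
    static List<String> validate(String xsd, String xml) {
        final List<String> problems = new ArrayList<>();
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(xsd)));
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            final Validator validator = schema.newValidator();
            validator.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { /* ignored here */ }
                @Override public void error(SAXParseException e) {
                    problems.add(locate(validator) + ": " + e.getMessage());
                }
                @Override public void fatalError(SAXParseException e) {
                    problems.add(locate(validator) + ": " + e.getMessage());
                }
            });
            validator.validate(new DOMSource(doc));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return problems;
    }

    /** Asks the validator for the element being visited, if it can tell. */
    static String locate(Validator validator) {
        try {
            Object node = validator.getProperty(CURRENT_ELEMENT);
            return node instanceof Node ? pathOf((Node) node) : "(unknown location)";
        } catch (SAXException e) {
            return "(unknown location)"; // property not recognised or not supported
        }
    }

    /** Builds a simple /a/b/c path from element names (no positional indexes). */
    static String pathOf(Node node) {
        StringBuilder path = new StringBuilder();
        for (Node n = node; n != null && n.getNodeType() == Node.ELEMENT_NODE; n = n.getParentNode()) {
            path.insert(0, "/" + n.getNodeName());
        }
        return path.toString();
    }
}
```

If each form field keeps a reference to the DOM element it edits, the returned node (rather than the path string used here for readability) could be used to highlight the corresponding control.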

Other Schemas

In theory, XSDGuide is not tied to a specific schema. In practice, however, it has been designed to implement the constraints found in our test set. The limitations one is bound to encounter when loading new schemas in our prototype have been outlined earlier in this article. Besides the (admittedly important) fact that not all constraints are implemented, other considerations must be examined in order to tackle non-SAR schemas.

One of the most complex problems we see is that XML schemas can be used (and abused) to encode data models in ways that do not lend themselves well to the automated generation of a user interface. For instance, an XSD file may encode a data model featuring multiple inheritance through custom-made elements that only make sense to the application that created the schema. One way to encode such “proprietary” information is through the <xs:appinfo> element in XSD. For instance, a software tool could create <xs:appinfo> sub-elements like <myapp:baseclass qualifiedname="basetypename"> to achieve a data model with multiple inheritance. XSDGuide’s corresponding interface would not be able to translate this clearly, simply because these extra layers of meaning are not accessible to the schema processor.

In these cases, it is difficult to imagine how a program like XSDGuide could be useful. Additional resources would need to be provided alongside the XSD schema. Creating a generic tool in these conditions becomes arduous, if possible at all.

Conclusion

The XSDGuide prototype we have presented here was aimed at facilitating the creation of suspicious activity reports by public safety and security experts as well as by members of the public, in a timely fashion. Moreover, one of our aims was to design a tool allowing users to inspect and understand complex schemas by using familiar user interface controls.

In spite of some limitations, we feel that the prototype is sufficiently developed to clearly showcase the possibilities that intelligent user interfaces offer to achieve these goals. XSDGuide proposes a way to materialize schemas created within the NIEM-SAR framework into a concrete user interface in a web application. The latter can be used to create validated SARs but also to explore the data model defined by these schemas by parsing these constraints and rendering them in a uniform, user-friendly manner.

A formal evaluation of the prototype (probably after some improvements whose nature is outlined in Section “Current Limits and Perspectives”) should be carried out in order to objectively assess the usefulness of the software. This evaluation could measure the time needed to create the same SAR using XSDGuide versus a more traditional approach. The quality of this SAR should be evaluated as well. Ultimately, however, the approach we propose here can only be judged when it is integrated in the full information processing pipeline. This pipeline goes from the creation of the SAR, to the data centers where information is stored and cross-referenced, and back to users in the field in the form of notifications, warnings, etc.

A recurring question during the development of the software presented here is the quality of the standards used. While NIEM-SAR is undoubtedly an exceedingly well thought-out framework, the complexity that arises from such exhaustiveness can be perplexing for the authors of SARs. Moreover, some of the implementation choices in XSD are debatable. For instance, certain elements allow free text when they should probably have been enumerations, or complex types constrain the order of sub-elements when it is unnecessary. These questions arise simply because creating schema-backed XML documents is an excellent way of putting these schemas to the test.

An interesting perspective for the project is the collection of data through the creation of SARs. With a tool such as our prototype, it becomes possible to envision a SAR creation campaign soliciting the help of interested parties (e.g. law enforcement agencies). Such data could prove invaluable in the creation of additional guides during SAR creation, like autocomplete features based on previously entered values. XSDGuide would then act as a “bootstrapping” tool in the implementation of a more advanced SAR authoring tool.

References

[1] Information Sharing Environment (ISE) Functional Standard (FS) Suspicious Activity Reporting (SAR) Version 1.5

[2] Adoption and Use of the National Information Exchange Model (NIEM)

[3] NIEM User Guide, Volume 1, 2014

[4] NIEM Suspicious Activity Report Schema

[5] Communications Interoperability Strategy for Canada 2011



[1] See for instance http://www.slideshare.net/drrwebber/niem-and-future-sar.

[2] http://www.nomagic.com/products/cameo-enterprise-architecture.html

[3] https://niem.gtri.gatech.edu/niemtools/iepdt/display/container.iepd?ref=opCeOMCX_74

[4] https://niem.gtri.gatech.edu/niemtools/iepdt/display/container.iepd?ref=ntsXeIX7M6Q=

[5] https://niem.gtri.gatech.edu/niemtools/iepdt/display/container.iepd?ref=-6kRpaB0tyY

[6] https://eclipse.org/

[7] http://oxygenxml.com/

[8] It is worth noting that XSDGuide can process arbitrary XML schemas, as long as they are in XSD format.

[9] http://rali.iro.umontreal.ca/rali/?q=en/xsdguide

[10] http://xerces.apache.org/

[11] http://jquery.com/

[12] http://getbootstrap.com/

[13] http://www.w3.org/TR/xforms11/
