Holman, G. Ken. “The Universal Business Language ecosystem and the OASIS TC process.” Presented at Symposium on Markup Vocabulary Ecosystems, Washington, DC, July 30, 2018. In Proceedings of the Symposium on Markup Vocabulary Ecosystems. Balisage Series on Markup Technologies, vol. 22 (2018). https://doi.org/10.4242/BalisageVol22.Holman01.
Symposium on Markup Vocabulary Ecosystems July 30, 2018
Balisage Paper: The Universal Business Language ecosystem and the OASIS TC process
G. Ken Holman
Active in ISO standardization since 1988, Mr. G. Ken Holman has represented Canada in
a number of different roles, including International Secretariat for the ISO/IEC committee
responsible for SGML and XML, and a working group expert in the ISO/IEC eBusiness
committee. A founding participant in the OASIS Consortium in 1993, Mr. Holman has been
founding chairman for a number of technical committees including Code List Representation,
XML Conformance and XSLT Conformance, while actively participating in other technical
committees. Mr. Holman is the current chair (and XML technology lead) of the OASIS UBL
Technical Committee and editor of ISO/IEC 19845 Universal Business Language. As an
expert to the W3C he was on the committee that created XML from SGML. He is accredited by
Canada as an expert contributor to UN/CEFACT. He has been an active editor in the development
and maintenance of a number of ISO/IEC and OASIS specifications, including supporting the
publishing process of specifications for both organizations and participating in the
definition and support of the OASIS technical committee process. His additional volunteer
work includes community-oriented activities near home and humanitarian education work.
A rich worldwide ecosystem has grown around the freely-available Universal Business
Language (UBL) standard for 81 business documents such as purchase orders, invoices and
waybills, and 4600 semantic business objects expressed in those business documents. The
UBL development committee was formed in 2001 as a technical committee in OASIS (the
Organization for the Advancement of Structured Information Standards) under a strict set of
transparency rules supported by a rich set of collaborative tools with which the committee
members have created both normative and non-normative work products. For UBL, normative XSD
schemas and non-normative JSON schemas are examples of both of these kinds of work products
specified for user communities to leverage in their communication environments for
electronic business solutions.
Solutions built on a foundation of open standards are attractive to large ecosystems of
developers and end users working towards shared objectives. To be truly open, the
development and deployment of such a specification must address three critically important
issues: governance, transparency and availability. UBL was developed under the governance of
the OASIS Technical Committee Process, working with a tool set that is available to any
community wanting to create a markup vocabulary. OASIS, a membership organization, supports
the successful development of work products by its members participating in its technical
committees. And so it is important to look into the detail of these two perspectives of UBL: how
UBL is best deployed within the ecosystem due to its magnitude, and how UBL is governed and
maintained during the development process.
Illustrated by UBL, when considering where to bring together participants developing
open vocabulary ecosystems, OASIS with its TC process should be a front-runner in
consideration of a positive and productive environment for building team results.
The Organization for the Advancement of Structured Information Standards
The OASIS membership consortium was first founded as SGML Open in 1993, years before
XML syntax was even considered. At the 1998 meeting where the organization was being renamed
for fear of being tied to SGML, there was a push to use “XML” in the name. But the
organization avoided being typecast again by adopting the new name put forward by Jon Bosak,
the father of XML. Indeed, OASIS does not exist only to develop XML standards. Jon Bosak chaired the
first meetings to develop what has become the “Technical Committee Process” for the
groups of members creating work products for localized or global use. OASIS has matured into a
world-class standards development organization, backed by dedicated and talented staffers who
are supporting a myriad of committees that are creating standards being used internationally.
Vocabulary ecosystems have developed around the work products of many OASIS technical
committees using this process, including those for office documents (OASIS ODF - Open Document
Format), for technical documentation (OASIS DITA - Darwin Information Typing Architecture,
OASIS DocBook), and for business documents (OASIS UBL - Universal Business Language).
The OASIS TC Process is recognized for its quality by other standards developing
organizations in that an OASIS Standard is automatically accepted for consideration to
be put forward to, for example, ANSI in the US (the American National Standards Institute), and
ISO/IEC internationally (the Joint Technical Committee 1 of the International Organization for
Standardization and the International Electrotechnical Commission). Accordingly for ISO/IEC,
as a Publicly Available Specification (PAS) submitter, the OASIS organization and its
committee process gives a community a pathway to ISO standardization for their normative work products.
The Universal Business Language (UBL) is a good example of a work product developed under
the OASIS TC Process, successfully deployed around the world, and standardized as ISO/IEC 19845.
The Universal Business Language vocabulary ecosystem introduction - conveying business
Business documents such as purchase orders, invoices and waybills are exchanged around
the world. The Universal Business Language establishes a structured vocabulary for
procurement and transportation documents so that communities and users need not create such
structures themselves. Moreover, interoperability is promoted when many communities base their
structured business documents on the same vocabulary.
Jon Bosak, then of Sun Microsystems, established the UBL committee in 2001. Funded
exclusively by the volunteer participation of committee members, the first version
deployed publicly was 0.7 by the government finance ministry in Denmark. Unsurprisingly, the
most active members of the committee at that time were from Denmark. Version 1.0 with
eight document types was released shortly after, though not within the time frame
needed by the Danes to legislate its use in government invoicing. Based on some tough lessons
learned in version 1.0, development immediately began on version 2.0 to create a framework with
which to expand the scope and utility of the specification.
Now approved as ISO/IEC 19845:2015, UBL 2.1 is a family of 65 business documents around a
common library of business objects. The recently released UBL 2.2 was finalized with 81
business documents and a richer common library than found in UBL 2.1. To maintain
availability and relevance, by design, each minor version of UBL is strictly backwards
compatible with all previous versions in the same major version. That is, every schema-valid
instance of UBL 2.0 is a schema-valid instance of UBL 2.1, and every schema-valid instance of
UBL 2.1 is also a schema-valid instance of UBL 2.2. This ensures the UBL ecosystem can evolve
continuously and user communities can migrate organically to updated minor versions
of the UBL specification without impacting other users. Moreover, the design of UBL accommodates
different communities’ requirements through a number of tailoring techniques.
The very nature of the use of business documents such as purchase orders, invoices and
waybills implies the need for an ecosystem of product developers servicing end users
needing to conduct business using an information vocabulary. Consider a choreography of
the exchanges between a Buyer and a Seller:
And there is not just the Buyer and the Seller in a business scenario. Consider the
roles described by UN/CEFACT, the United Nations Centre for Trade Facilitation and Electronic Business,
outlining a number of possible roles engaging in the Buy-ship-Pay process in addition:
Regardless of the sector environment, business information is conveyed from a sending role
to a receiving role as a transaction within a profile of choreography. The sender has its own
business practices developed over time to meet its obligations. The receiver could have
different business practices because its obligations and its history differ from the sender’s.
Traditionally the exchange of the paper business document bridges the two environments.
Employing a digital exchange removes the challenge of printing and interpreting the
printed content, though it does not remove the challenge of starting off with correct
information. But if the information is correct, then using digital technologies can
drastically reduce the opportunities for incorrect information ending up in the receiver’s
business practices. The sender marshals their information out of their application into a
syntax that is transported to the receiver who unmarshals the information from the syntax into
their different application. The choreography doesn’t change and the business practices don’t
change, but the integrity of the information exchanged is greatly improved.
All of the aspects described above fit into the ISO/IEC Open-edi Reference Model, ISO/IEC
14662, first developed starting in 1992. While the abbreviation for “electronic data
interchange” historically is often associated with financial information, Open-edi has
been agnostic of the nature of the information being exchanged. From the introduction to
ISO/IEC 14662 one reads:
The field of application of Open-edi is the electronic processing of business
transactions among autonomous multiple organizations, authorities or individuals within and
across sectors (e.g. public/private, industrial, geographic). It includes business
transactions which involve multiple data types such as numbers, characters, images and sound.
The Open-edi Reference model is independent of specific:
information technology implementations;
business content or conventions;
parties participation in business activities.
In this depiction, the Open-edi reference model is described in the left column. The middle
column outlining the components of an Open-edi configuration is from ISO/IEC 15944-20. The
right column enumerates specifications available to address the two Open-edi aspects of
information representation: bundles of semantic content, and data in syntax.
Open-edi describes two “views” of electronic business (the rows in the diagram): the
business operational view and the functional services view. The business operational view
(BOV) describes the abstract properties of the environment, the scenarios, the roles in the
scenarios and the bundles of information conveyed between roles. The functional services view
(FSV) describes the concrete machine-processable properties of user data representation of the
information bundles, the choreographies engaged by the roles in the scenarios of the
environment and the transport of the content between the parties.
Also shown in the diagram, in particular in the rightmost column, is the bridging from the
business specification of the information objects and definitions to the machine-processable
specification of the binding of the information objects to actual syntax representations
suitable for applications to produce and ingest. The two examples of syntax-independent
information bundle description technologies cited are the UN/CEFACT Core Components Technical
Specification (CCTS) https://www.unece.org/cefact/codesfortrade/ccts_index.html and the Unified Modeling
Language (UML). The three examples of syntax technologies cited are the text-oriented XML and
JSON, and the binary-oriented ASN.1. The technology that bridges the two is the set of naming
and design rules governing the creation, from the business view of information bundles, of
the functional view of user data (the syntax).
This bridging is accomplished in a rigorous mechanical fashion, producing robust and
accurate document constraint expressions without the need for hand-crafting. For UBL, the
technical committee formalized and standardized the OASIS Business Document Naming and Design
Rules (NDR) http://docs.oasis-open.org/ubl/Business-Document-NDR/v1.1/csprd01/Business-Document-NDR-v1.1-csprd01.html
for the application of CCTS and the realization of schema artefacts from declarative descriptions of
the information. As an example of the work product of one OASIS technical committee being used
by another, these NDR are also being used by the OASIS Business Document Exchange Technical
Committee (BDXR) for work on the business document envelope and exchange header envelope specifications.
Information described by CCTS takes three forms to be expressed as a hierarchical tree of
business objects. The Aggregate Business Information Entity (ABIE) is the shape of a branch
of the information tree. The Association Business Information Entity (ASBIE) is an instance of
the branch of the information tree. The Basic Business Information Entity (BBIE) is a leaf of
the information tree. CCTS modeling is not based on syntax, thus allowing different
expressions of the information tree. The UBL TC has normatively standardized on an XML
serialization of the CCTS information tree, and has published non-normative alternative
expressions of UBL in JSON schema for JSON syntax, and in ASN.1 binary syntax.
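To make the syntax independence of such an information tree concrete, here is a minimal Python sketch. The class names, the `to_xml` and `to_json_obj` helpers and the sample content are all hypothetical illustrations, and the output does not follow the UBL naming and design rules (no namespaces, no standardized JSON shape); it only shows one tree of branches and leaves being expressed in two syntaxes.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class BBIE:
    """Basic BIE: a leaf of the information tree carrying a value."""
    name: str
    value: str


@dataclass
class ASBIE:
    """Association BIE: a named use (instance) of a branch shape."""
    name: str
    abie: "ABIE"


@dataclass
class ABIE:
    """Aggregate BIE: the shape of a branch, holding leaves and sub-branches."""
    name: str
    children: List[Union[BBIE, ASBIE]] = field(default_factory=list)


def to_xml(name: str, abie: ABIE) -> ET.Element:
    """Express a branch as nested XML elements (one possible syntax)."""
    element = ET.Element(name)
    for child in abie.children:
        if isinstance(child, BBIE):
            ET.SubElement(element, child.name).text = child.value
        else:
            element.append(to_xml(child.name, child.abie))
    return element


def to_json_obj(abie: ABIE) -> dict:
    """Express the same branch as nested dictionaries (another syntax)."""
    result = {}
    for child in abie.children:
        if isinstance(child, BBIE):
            result[child.name] = child.value
        else:
            result[child.name] = to_json_obj(child.abie)
    return result


# Hypothetical, heavily simplified content; not the real UBL Order model.
party = ABIE("Party", [BBIE("Name", "Example Buyer Ltd")])
order = ABIE("Order", [BBIE("ID", "PO-1"), ASBIE("BuyerCustomerParty", party)])

xml_text = ET.tostring(to_xml("Order", order), encoding="unicode")
json_text = json.dumps(to_json_obj(order))
```

The same `order` tree drives both serializations, which is the property that lets a syntax-neutral model be bound to more than one concrete syntax.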
With this standards-based foundation used to create the comprehensive UBL specification,
considerations must be made when deploying the work in different scenarios across the ecosystem.
Deploying the Universal Business Language vocabulary across the ecosystem
The UBL Technical Committee recognized that even when two communities are using the
same UBL structures, the business contexts of those communities will govern different values
to be used. These values might be in code lists, identifier lists, contextual value
constraints, etc. Accordingly, the only two normative components of the UBL standard are the
semantics of the standardized constraints, and the document and business structures expressed
in XSD schemas. There are no enumerations in any of the UBL schemas. Only the structures are
standardized, not the values that go into those structures. Business value constraints can
change on a daily or even hourly basis and it would not be at all desirable to require schemas
to be modified and reintegrated into production processes so rapidly.
Accordingly, the UBL committee non-normatively suggests that UBL documents run through a
two-pass validation phase before any application code acts on the content. In this diagram,
phase 1 shows the application of the structural constraint checks (both element structure and
the lexical element/attribute content structure) using XSD, and phase 2 shows the application
of value constraint checks, for example in XSLT:
The use of ISO/IEC 19757-3 Schematron is common in UBL communities for the expression of
the value constraints. To help with the generation of the Schematron expressions, the OASIS
Code List Representation Technical Committee has published the genericode XML vocabulary for
the expression of lists of coded values, and the Context/value Association (CVA) XML
vocabulary for the expression of XPath contexts to which genericode and other value
expressions are applied. Free tools are available to transform the CVA and genericode files
into Schematron, and then translate the Schematron into the XSLT for runtime use.
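The two-pass idea can be sketched in a few lines of Python using only the standard library. This is an illustration under stated assumptions, not the committee's tooling: the standard library has no XSD validator, so `phase1_structure` is a stand-in that only checks well-formedness and the document element, and the `CURRENCY_CODES` list and `VALUE_CONSTRAINTS` table are hypothetical substitutes for genericode files and CVA associations.

```python
import xml.etree.ElementTree as ET

CBC = "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"
INV = "urn:oasis:names:specification:ubl:schema:xsd:Invoice-2"
NS = {"cbc": CBC, "inv": INV}

# Hypothetical community code list; a deployment would load genericode files.
CURRENCY_CODES = {"CAD", "EUR", "USD"}

# Hypothetical context/value associations: a path and the values allowed there.
VALUE_CONSTRAINTS = [
    ("cbc:DocumentCurrencyCode", CURRENCY_CODES),
]


def phase1_structure(xml_text):
    """Stand-in for XSD validation: well-formedness and document element only."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as error:
        return None, [f"not well-formed: {error}"]
    if root.tag != f"{{{INV}}}Invoice":
        return None, [f"unexpected document element: {root.tag}"]
    return root, []


def phase2_values(root):
    """Check the values found in each constrained context against its list."""
    errors = []
    for path, allowed in VALUE_CONSTRAINTS:
        for node in root.findall(path, NS):
            if node.text not in allowed:
                errors.append(f"{path}: value {node.text!r} not in list")
    return errors


def validate(xml_text):
    """Run phase 1, and run phase 2 only on structurally acceptable input."""
    root, errors = phase1_structure(xml_text)
    if errors:
        return errors
    return phase2_values(root)


# Trimmed sample, not a schema-valid invoice; "ZZZ" is outside the sample list.
SAMPLE = (
    f'<Invoice xmlns="{INV}" xmlns:cbc="{CBC}">'
    "<cbc:ID>1</cbc:ID>"
    "<cbc:DocumentCurrencyCode>ZZZ</cbc:DocumentCurrencyCode>"
    "</Invoice>"
)
```

Because the code list lives in data rather than in the structural check, the list can change without touching phase 1, which mirrors the reason the UBL schemas carry no enumerations.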
But a common issue among new users or by communities considering UBL is almost always
raised regarding the magnitude of the published specification. Why is the UBL vocabulary so
big and how can it be used effectively?
Enabling communities to work effectively with UBL
When Jon Bosak founded the UBL committee, he was fully aware that one committee’s
definition of the information components for electronic commerce would never be able to satisfy
every business requirement globally. Nor should it try to do so, though the effort can be made
to support as many as possible. However, such a vocabulary can have particular features that
would allow the vocabulary to be a basis on which every business document information
requirement globally could be accommodated. The resulting specification for UBL accommodates
all of this, and version 2.2 of UBL includes 81 document types and 4600 distinct semantic
business objects realized as elements in those document types.
Firstly, consider the Pareto principle, also called “the 80/20 rule”. UBL is designed with
the Pareto 80/20 principle in mind: the committee believes that 80% of world business can be conducted
with only 20% of the UBL business objects. The other 80% of UBL exists in order to support
less-common but still accepted business requirements for the defined document types. This
enables yet more of world business to work with the standardized UBL business objects, though
most people won’t need them. Moreover, the design of UBL incorporates user-defined extensions
available to address in a standardized UBL document all of the remaining unaddressed business
requirements not available in UBL business objects. Finally, the common library utilized by
the UBL document types is available to be used by user-defined document types that are not
included in the UBL suite. All this should allow the UBL vocabulary to find a home
To manage these three concepts, the nomenclature used in UBL deals with extension schemas,
subset schemas, and additional schemas.
To accommodate business objects that are not found anywhere in UBL, the user community can
create extension schemas and embed content conforming to those schemas. Every UBL document
type has an extension point as a home for arbitrary content from multiple sources. The
extension point is a scaffolding of metadata describing the apex of an information tree
of arbitrary XML content. A sending application adds the extension information under the
extension point, and the receiving application looks under the extension point only for
extensions that it recognizes. All unrecognized extensions in a UBL document are ignored by
the processing application.
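The receiving side of this convention can be sketched as follows in Python. The UBL extension namespace and the `ext:UBLExtensions`, `ext:UBLExtension`, `ext:ExtensionURI` and `ext:ExtensionContent` element names are those of the published UBL 2 extension scaffolding; everything else is an assumption for illustration, including the `urn:example:...` identifiers, the choice of the extension URI as the recognition key, and the trimmed sample document, which is not a schema-valid invoice.

```python
import xml.etree.ElementTree as ET

EXT = "urn:oasis:names:specification:ubl:schema:xsd:CommonExtensionComponents-2"
NS = {"ext": EXT}

# Hypothetical identifier of the one extension this receiver understands.
KNOWN_EXTENSION_URI = "urn:example:extension:loyalty"

# Trimmed sample: only the extension point is shown.
DOCUMENT = f"""\
<Invoice xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2"
         xmlns:ext="{EXT}">
  <ext:UBLExtensions>
    <ext:UBLExtension>
      <ext:ExtensionURI>urn:example:extension:loyalty</ext:ExtensionURI>
      <ext:ExtensionContent>
        <Points xmlns="urn:example:loyalty">120</Points>
      </ext:ExtensionContent>
    </ext:UBLExtension>
    <ext:UBLExtension>
      <ext:ExtensionURI>urn:example:extension:unknown</ext:ExtensionURI>
      <ext:ExtensionContent>
        <Other xmlns="urn:example:other">ignored</Other>
      </ext:ExtensionContent>
    </ext:UBLExtension>
  </ext:UBLExtensions>
</Invoice>
"""


def recognized_extension_content(xml_text, known_uri):
    """Return the content elements of the extensions the receiver recognizes."""
    root = ET.fromstring(xml_text)
    found = []
    for extension in root.findall("ext:UBLExtensions/ext:UBLExtension", NS):
        uri = extension.findtext("ext:ExtensionURI", default="", namespaces=NS)
        if uri.strip() != known_uri:
            continue  # unrecognized extensions are skipped, not rejected
        content = extension.find("ext:ExtensionContent", NS)
        if content is not None:
            found.extend(list(content))
    return found


points = recognized_extension_content(DOCUMENT, KNOWN_EXTENSION_URI)
```

The second extension in the sample is never inspected beyond its identifier, so a sender can add content for other recipients without breaking this one.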
In UBL the extension point is the very first child of the document element in every
document type. This is important for streaming applications to be able to consume
all extension information before encountering standardized content just in case the extension
content impacts the semantics to be interpreted by the receiving application.
There are some non-UBL business concepts that have already been standardized outside of
OASIS and have had established XML schemas developed under the formal governance of other organizations. It
is not UBL’s intent to re-express those concepts using CCTS. Rather, the extension point is a
home for XML constructs from foreign vocabularies using non-UBL namespaces or no namespaces.
An example of this is digital signatures. The UBL Technical Committee has published the
scaffolding necessary to embed W3C Digital Signature structures, using the W3C namespaces,
structures and schemas, inside the extension point of any UBL document.
But for those non-UBL business concepts that have not already been standardized elsewhere,
users need to be able to augment the UBL document to include such information in their own
extension. While there is no obligation to use CCTS, doing so is consistent with the design of
UBL. Moreover, the user community may wish, then, to submit their CCTS-based designs to the
committee for consideration under UBL’s governance rules. The hierarchical tree structure of
an extension with custom information for a line item is depicted in the following diagram:
Note in that diagram how the line item identifier is copied into the extension so that
customized information in the extension can be associated with the standardized information in
the UBL business objects. Being considered for the future UBL 2.3 is making each and every
aggregate extensible by having an optional <ext:UBLExtensions> element at
every branch of the tree, not only at the document element. This contextualizes the
information at the location of the UBL structure where the standardized constructs are
augmented. This relieves the need to use other means by which to associate the extended
information at the beginning of the document with the standard information found deep in
the document. This was proposed for consideration after receiving feedback from implementers
regarding some awkwardness, though not technical deficiency, of the current approach.
To user communities considering adopting UBL, a challenge of a vocabulary with 4600
distinct semantic business objects is the determination of the base 20% that applies to their
situation, and which of the other 80% might also apply. To address this the community can
create a subset schema for their users. With a subset schema, every schema-valid instance of
the subset schema is also a schema-valid instance of the full UBL schema, but the user
dealing with the subset is not overwhelmed by the entire UBL suite. However, communities need
to remember that subset schemas should play only a limited role in a deployed solution.
In general, only a subset of a protocol is actually used in real life. So, you should be
conservative and only generate that subset. However, you should also be liberal and accept
everything that the protocol permits, even if it appears that nobody will ever use it.
Jon Postel, 1979, re: TCP/IP
This protocol-related principle can be applied conceptually to an XML vocabulary
ecosystem. The senders of UBL should be constrained by the subset of constraints, but
receivers should not be so constrained. Receivers should be accepting all of UBL because there
is no guarantee that only users of the subset will be sending them content. Through some form
of value validation (perhaps by Schematron creating XSLT, or by culling the input instance of
undesired constructs and then using the subset schema) the UBL-valid document can be checked
before the receiving application acts on the content. This is shown in this diagram:
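The culling option mentioned above can be sketched in Python as follows. This is a simplification under stated assumptions: the `SUBSET_TAGS` allow-list is a hypothetical community subset, it is context-free (a real subset would constrain elements by their position in the tree), and the trimmed sample is not a schema-valid invoice. The receiver accepts the full document and then strips what lies outside its subset before further processing.

```python
import xml.etree.ElementTree as ET

CBC = "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"
INV = "urn:oasis:names:specification:ubl:schema:xsd:Invoice-2"

# Hypothetical community subset: the only elements this receiver processes.
SUBSET_TAGS = {
    f"{{{INV}}}Invoice",
    f"{{{CBC}}}ID",
    f"{{{CBC}}}IssueDate",
}


def cull(element, allowed):
    """Remove, in place, every descendant whose tag is outside the subset.

    Returns the tags of the removed elements so the receiver can log them.
    """
    removed = []
    for child in list(element):  # iterate over a copy while mutating
        if child.tag in allowed:
            removed.extend(cull(child, allowed))
        else:
            element.remove(child)
            removed.append(child.tag)
    return removed


# Trimmed sample from a sender that uses more of UBL than the subset.
FULL = (
    f'<Invoice xmlns="{INV}" xmlns:cbc="{CBC}">'
    "<cbc:ID>42</cbc:ID>"
    "<cbc:IssueDate>2018-07-30</cbc:IssueDate>"
    "<cbc:Note>outside the subset</cbc:Note>"
    "</Invoice>"
)

root = ET.fromstring(FULL)
dropped = cull(root, SUBSET_TAGS)
```

After culling, `root` holds only subset constructs and could be handed to subset-schema validation, while `dropped` records what was set aside; whether silently discarding content is acceptable is a business decision for each community.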
Finally, user communities can create additional CCTS-based document types that share the
use of the common library of aggregate (ABIE - branch shapes), association (ASBIE - branch
instances) and basic (BBIE - leaf instances) information entities. Additional schemas
importing the UBL common library into CCTS-based non-UBL documents can also incorporate
non-UBL supplemental library constructs. These supplemental library constructs can use common
library constructs, but common library constructs cannot be modified to use the supplemental library constructs.
Of course using an abstract modeling technique, such as CCTS in the case of UBL, begs the
question of how to get the actual runtime validation artefacts expressed according to the
naming and design rules. These are the schemas and other constraint expressions that
applications will use in the generation and validation of the syntax. The UBL Technical
Committee uses free tools available on GitHub to create XSD schemas, OASIS Context/value
Association expressions, and JSON schemas. Depicted in this diagram is the CCTS model being
collaboratively modified by committee members as a Google Docs spreadsheet, downloaded as an
OASIS ODF spreadsheet, transformed into an OASIS genericode serialization that is, in turn,
transformed into the many artefacts published by the committee.
See http://goo.gl/DgMAqy for a description of the process and links to the free tools used
to create the validation artefacts.
This resulting environment effectively services a global community of users using UBL in
different ways, while still retaining a base level of commonality and conformance.
the web world. Jon Bosak has said that he wanted UBL to be “the HTML of e-commerce”:
a commonly-understood freely-available base vocabulary on which user communities can build
their specific solutions without the overhead of starting from scratch. And, also, for
end-users to leverage work products created by a cadre of global and regional experts who
followed an open and transparent process using effective development tools.
Leveraging the OASIS TC process, tools and resources to create UBL
The quality and global acceptance of the UBL committee work products are evidence of the
results of collaborating within an effective standards development process.
Important in any development of such an open specification, in order to gain the trust of
potential users in the ecosystem, are three critical aspects: governance, transparency and
availability. The rules of engagement and obligations by contributors are formalized in the
governance of the project. The openness of the development process to public scrutiny is
needed for transparency. The openness of the work product is characterized by its
availability (recognizing that even “mandatory registering for a free copy” is a barrier).
The internationally-recognized OASIS Technical Committee (TC) process at http://www.oasis-open.org/policies-guidelines/tc-process (accredited by ANSI in the
US and ISO as suitable for creating national and international standards) is an ideal
framework under which one would create and run a committee of members publishing work products
for local or global use.
Jon Bosak chaired the first OASIS committee to hone the definition of the TC process. The
objective he set was to be general enough that “if Japanese subway operators wanted to
get together to create an XML vocabulary for interchanging scheduling information, it
should be straightforward and flexible enough that they would find a home at OASIS to do
so”. (Author: I don’t think any Japanese subway operators actually did
so, but it exemplified the kind of framework OASIS was striving for.)
The process has matured and become very successful, and OASIS offers assistance to
technical committees to help promote membership in the TCs. And the legal counsel at OASIS has
ensured the important issues of copyright and intellectual property rights involved in
developments of open-use standards are appropriately accommodated by member participation
agreements and by non-member submission agreements. This gives confidence to the user
community to exploit OASIS work products without concerns of losing their investment in the
technology by claims from third parties.
Having worked out such IPR issues, the TC process and procedures protect the work products
from being blind-sided by IPR claims (provided that the TC members respect their membership
obligation and members of the public only use the Public Comment list to submit, which has
obligations built in to subscribing to the list). The OASIS process dictates that
agendas, minutes, TC mail list and documents be transparently open to the public at
OASIS puts no encumbrances on using the work products, not even “register to
use”, and puts all work products in the publicly-accessible file repository. Ownership
of the resulting specification rests with OASIS, but the specification is fully open.
The charter for a new technical committee needs to spell out the purpose of the new committee
and the expected work products. The required five member companies needed to form the committee
must be identified, and it is in the interests of stakeholders to find at least another
charter members, hopefully more. If one had only a particular geographic focus for an
economic sector, some interest in participating might be raised internationally if other
geographical areas had similar interests, thus making the new committee’s work products
The OASIS TC process for public review and creating a committee specification is extensive
and rigorous. The TC administration support of wikis, JIRA ticket management (very
for building and maintaining the specification), a document repository and a file
are all available to use by a TC at no charge. Other development tools are also available.
There is no software to install or maintain. Public visibility is mandated for all
project actions: meeting agendas and minutes, member rosters, discussions, document drafts
and final versions, committee specifications and distribution artefacts.
A technical committee can be arranged with subcommittees responsible for certain domains,
and the subcommittees make recommendations to the technical committees to include
Given that OASIS is an accredited ISO/IEC JTC 1 Publicly-Available Specification (PAS)
submitter, the option is there to make a work product an ISO standard. For example, UBL 2.1 is
now ISO/IEC 19845:2015, a recognized ISO Standard. ODF is another example of an OASIS Standard
that has become an ISO standard, initially ISO/IEC 26300:2006 and now split into many parts.
All committee work must be performed transparently. One can find the UBL 2.1 vocabulary
information model, expressed using CCTS, at https://docs.google.com/spreadsheets/d/1amzk8jn1boD2q3ze9rR14PVB6OGDyHTc2pQl92JutvE/view.
The use of Google Drive allows international members of the committee to collaboratively edit
the content simultaneously. For archive purposes, periodic snapshots of the ever-changing
document are made and stored in the OASIS repository. This ensures a transparent history of
the evolution of artefacts and prohibits the modification of the historical artefacts after
the act of publishing.
Using the UBL TC example, these are the OASIS artefacts and resources related to the
essentials of open specifications development:
governance: the rules of engagement and development
Overall, the biggest challenge faced by the committee is the availability of time from
individual members to contribute to the efforts. Membership in the committee has waned and
waxed. Non-voting members have access to all tools and have all messages pushed to them.
Voting members participate in ballots regarding committee direction and work product
development. Voting privileges are accorded to those members who are actively participating in
meetings. Voting privileges are lost when active participation drops, until they are
recovered by participating once again.
Understandably, this always is a factor of members’ management’s commitments to
volunteering their staff to an effort that may only be indirectly benefiting their
organization. As demonstrated by Denmark in the early days, their determination to make UBL
into a useful tool for their immediate objectives justified their contribution of
effort in participation in the community. The end result met their requirements while at the
same time establishing interoperability with others who also decided to base their
work on UBL.
Those considering participating in the committee can point to this success when presenting
their rationale to their own organizations.
To be fair to all, face-to-face meetings need to be held in turn at locations around the
world. This can significantly add to the costs of participation and to the time taken away
from one’s organizational obligations. The frequency of international meetings depends on the
need for productive time together as a group.
Logistically, the globally-distributed membership presents a challenge for all members to
speak together at the same time between the face-to-face meetings. This is addressed by
splitting a single weekly meeting into two teleconferences: the Pacific Call and the Atlantic
Call. The Pacific Call is attended by members in North and South America and the Pacific.
The Atlantic Call is attended by members in North and South America and in Europe. Both
are held on the same Wednesday considering UTC time, which for those in North and South
America puts the Pacific Call on their Tuesday evenings. Preliminary discussions and
decisions are tabled during the Pacific Call for subsequent discussion and change or
endorsement by the Atlantic Call. This is not perfect, as decisions can be postponed when
important issues are raised during the Atlantic Call without consideration by those on the Pacific Call.
This shifts a lot of responsibility to members to use the many tools made available by
OASIS. The tools themselves work well and are well-maintained by OASIS staff, and a lot of
effort is put into making many and varied tools available to committees who may work in
one way or in another. And the tools reinforce the transparency to the public regarding the
inner-workings of the committee and the decisions being made. But, in particular for UBL, it
is a challenge to get members to use JIRA tickets effectively to appropriately record their
observations and proposed dispositions of issues that are raised. As mentioned above, the work
products present abstract business concepts and the technical artefacts are synthesized by
software being maintained by very few. The value in UBL is in what UBL defines, not in the
artefacts that support that definition. Not all committee members are well versed in
online collaborative tools, nor have they developed the discipline to use the tools
effectively and in a timely fashion.
Open and free standards, all for only the price of membership
A common thread in all of this is the yeoman effort made by Jon Bosak to create an
effective standards development process and to use that process to create a world-class work
product collaborating with a diverse team of dedicated committee members who value their
influence on creating the specifications.
While there is zero cost to obtain or use OASIS work products, and zero cost to publicly
comment on OASIS work products, there is a justifiable cost of membership to participate
directly in the OASIS standardization TC process. All of the enumerated list of tools and
support environments comes just with the creation of a new technical committee, being
supported by OASIS TC Administration, and so justifies the cost of active participation (see
https://www.oasis-open.org/join/categories-dues for details).
And while the OASIS Universal Business Language (UBL) is very large and encompassing of
many of the world’s business document information requirements, the vocabulary design
methodology can accommodate an ecosystem’s requirements through a base definition, an
extension methodology, a subset approach, and the leveraging by additional business documents.
It has become, indeed, the HTML of e-commerce.
Together, the process and the end result illustrate the creation of and the use of a
world-class markup vocabulary ecosystem that can be mimicked when addressing one’s own
requirements for such a solution.