How to cite this paper
Turner, Matt. “Entity Services in Action with NISO STS.” Presented at Balisage: The Markup Conference 2017, Washington, DC, August 1 - 4, 2017. In Proceedings of Balisage: The Markup Conference 2017. Balisage Series on Markup Technologies, vol. 19 (2017). https://doi.org/10.4242/BalisageVol19.Turner01.
Balisage: The Markup Conference 2017
August 1 - 4, 2017
Balisage Paper: Entity Services in Action with NISO STS
CTO Media & Entertainment
Matt Turner is the CTO, Media and Entertainment at MarkLogic, where he develops
strategy and solutions for the Media, Publishing, Entertainment and Information Provider
markets and works with customers and prospects to create leading-edge information and
digital content applications with MarkLogic's Enterprise NoSQL database. Matt has worked
closely with MarkLogic customers NBC, Warner Bros., LexisNexis, McGraw-Hill Finance,
Dow Jones and more. Before joining MarkLogic, Matt was at Sony Music and PC World, pioneering
the use of XML and developing innovative publishing and asset delivery systems.
Copyright © 2017 MarkLogic Corporation
Standards impact nearly every industry and government process, and standards
organizations like BSI and ISO have been leading a change in how to provide a variety of
audiences not just with the standards documents themselves but with valuable data about
standards and the process of standardization. Now, there is a new data standard for
standards called NISO STS (National Information Standards Organization Standards Tag
Suite). Working with ISO and sample content from ISO, MarkLogic has created a demonstration
of NISO STS using MarkLogic 9's new Entity Services feature. This paper reviews the industry
impact of standardization of data and how Entity Services can help leverage these standards.
Table of Contents
- The Impact of Definition
- Data Definition
- MarkLogic Entity Services
- NISO STS Standard
- Entity Services NISO STS Demo
- Getting Started
- Putting the Model into Action
- Thank You
Across every industry, efforts to define processes and materials have resulted in
improvements in productivity and efficiency. These technical and business process definitions
are often expressed as standards that are created with leaders in the industry and
international standards organizations and adopted by the industry.
Like industry standards, data definitions could also have an industry impact, enabling
collaboration and efficiency in the creation, management and delivery of critical data.
However, the management of data definitions and the creation of data standards are
largely separate from the applications and processes that use those definitions. In practice,
these processes often separately define data models for each purpose.
This paper will cover two developments in the space of defining data standards and
models: the NISO STS standard to define a data model for standards documents and MarkLogic
Entity Services to put data models into action in the database.
The Impact of Definition
Both standards and data models benefit their constituencies by providing common
definitions that enable interoperability and efficiency between the organizations adopting
the common standards and models:
- Interoperability – the ability for multiple processes and applications to make use of the same materials and data
- Specialized roles – relying on the standard and model, groups can specialize their efforts and focus expertise and resources on optimizing their role
- Universal application – wide adoption and multiple uses creating benefits across the industry
One example of definition in action is the impact of the ISO freight container standards.
Developed in the 1960s, these standards have dramatically changed the shipping industry
and are one of the biggest factors in the economic globalization that has changed the
world in the last 60 years.
Prior to the adoption of the definition of the freight container described in the
standards, the industry moved break bulk cargo. This freight came in many different
sizes and required each step in the shipping process to be unique and custom for every
shipment. The freight container standard brought dramatic efficiencies. Time in port was
reduced from 4-5 days to overnight while increasing the capacity of ships and standardizing
and optimizing every step of the shipping process.
The result has been a dramatic change in the cost of shipping. Prior to the adoption of
the freight container standards, shipping was up to 20% of the total cost of goods. With the
adoption of the container, this is now a fraction of a percent (Ninety
Percent of Everything, Rose George, 2017).
To gain these benefits, the industry put the definition of the freight container into
action. The definition itself is the standard—a precisely worded document. The manufacturers
that make the containers use that definition to create the containers. Everyone else in the
shipping industry can then use those standard containers for whatever purposes they need.
Container ship operators or crane designers don't have to know how to make the containers;
they can rely on the containers having a specific size and fittings as defined in the standard
and specialize in making their part of the process as efficient as possible. This is
illustrated in Figure 1.
This is an example of the benefits of definition. Because the freight container is precisely
defined, it is interoperable, with containers moving easily between uses; it enables parts
of the industry to specialize; and it is universally applied, with many uses and adaptations
across the industry.
Today, data is defined in many ways, but these approaches haven't yet delivered the
impacts of the wide adoption of definition and standards. This is because the way data is
defined is not put into action in the same way that industry definitions and standards are.
Instead, data is defined in many places and with many different tools. This includes:
- Database schemas that have requirements and optimizations for specific uses of the data
- Application code that interprets the data and creates functionality to create, query and access the data
- ETL (extract, transform and load) code that processes, transforms and moves the data from one system to another
All of these processes may work with the same data, but they all, independently, define
that data in many different ways and with many different variations.
There may be an overall definition of this data—an entity model that describes the data—
but this model is usually only interpreted and referenced by each process that needs to work
with the data. There is no direct connection to the actual schemas, application code or
configurations that process and handle the data. Instead, each of these tools, and the many
more that can be part of a project, creates and updates its own independent data model specific
to the outcome of only that process.
This lack of a definition or model that is put into action means that technology teams
struggle with interoperability of code and data models, and have issues creating specialization, as
each resource working on a process has to understand every other application's definitions. As
a result, there is seldom universal adoption of data models and code across the organization.
MarkLogic Entity Services
The goal of MarkLogic Entity Services is to define functionality and processes that put
the entity model, the description of the data in its truest and most universal form, into action.
MarkLogic Entity Services is designed to:
- Describe real-world entities, properties, and relationships in a Semantic model
- Automatically derive services, transformations and configuration from the model
- Enable users to govern context and data together
- Enable users to take an iterative and evolutionary approach, using only as much of the data model as they need for each iteration to adapt to changes instead of setting the entire model in place at the start
The starting point to use Entity Services is to create an entity model. This model
describes the data in its most complete form and includes the following elements:
- Entities – the domain objects that this data is about
- Properties – characteristics of that entity
- Relationships – how entities fit together
Once the entity model is created, it is put into action with services that use the
model to update data and generate code that developers use in their projects.
These artifacts are generated by the entity services code modules, can be customized by
the developers, and then installed in the MarkLogic database to enable the functionality based
on the entity model. This is shown in Figure 2.
With this pattern, MarkLogic Entity Services can allow technology teams to gain the
benefits of definition. These benefits include:
- Interoperability – everyone can refer to the model
- Specialization – developers don't need to know how to model data – they can just use the data, just like the ship and crane operators don't need to know how to build boxes
- Universal Adoption – this data model can be used for multiple purposes in many applications
NISO STS Standard
To show MarkLogic Entity Services in action and demonstrate the value of putting the data
definition into action for technology teams, MarkLogic selected the newly proposed
standard for standards, NISO STS, as the basis of a demonstration application.
This initiative from NISO brought together leaders from the world's standards bodies to
define a data standard that would bring the many benefits of standardization to the industry.
The current draft of the standard was released in April 2017. It is based on ANSI/NISO JATS and
the ISO STS format. The community recognized the potential value of wider adoption of this
data standard and created the NISO STS initiative to define the tag set standard for the industry.
The goals of the initiative are:
Ease publication of standards
Increase interoperability of standards
Aid distribution of standards
Improve the future of standards publishing
Full information on the NISO STS standard and full documentation are available at:
Entity Services NISO STS Demo
Working with the International Organization for Standardization (ISO), one of the main sponsors of the
NISO STS initiative, MarkLogic created a demonstration of the Entity Services features using
content samples from the ISO Freight Container standards. These data samples were in the closely
related ISO STS format, as NISO STS is just in the recommendation phase.
The demonstration highlighted the major features of MarkLogic Entity Services to create a
data model and put that model into action with generated code and artifacts, as well as the
overall impact of putting the data definition into action.
The demonstration first creates the building blocks that data modelers use to generate
the universal artifacts that are later used in the actual application code.
Entity Model: a model of the key data elements in the
NISO standard that included, for the demo, URN, doc number, title, doc type, originator,
secretariat, pub date, release date, scope and normative references. The model is designed
to be used iteratively, and these data elements were all that were needed for the
demonstration. The model is created in JSON and uploaded into the database, where it is
persisted as semantic triples. The model can then be queried and linked to other entity models
to provide additional context, such as business definitions or security data, to the
applications using and governing the data. The Entity Model for this demonstration is shown
in Figure 3.
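To make the shape of such a model concrete, here is a minimal sketch of an entity model descriptor as a JSON-style structure, covering the properties listed above. The key layout loosely follows the MarkLogic 9 Entity Services model format, but the exact names, types, and structure here are illustrative assumptions, not the model used in the demo:

```python
import json

# Sketch of an entity model descriptor for a "Standard" entity.
# Property names and datatypes are illustrative assumptions.
standard_model = {
    "info": {
        "title": "StandardsDemo",
        "version": "0.0.1",
        "description": "Entity model for ISO/NISO STS standards documents",
    },
    "definitions": {
        "Standard": {
            "primaryKey": "urn",
            "properties": {
                "urn": {"datatype": "string"},
                "docNumber": {"datatype": "string"},
                "title": {"datatype": "string"},
                "docType": {"datatype": "string"},
                "originator": {"datatype": "string"},
                "secretariat": {"datatype": "string"},
                "pubDate": {"datatype": "date"},
                "releaseDate": {"datatype": "date"},
                "scope": {"datatype": "string"},
                # A standard can normatively reference many other standards.
                "normativeReferences": {
                    "datatype": "array",
                    "items": {"datatype": "string"},
                },
            },
        }
    },
}

# Serialized as JSON, this is the kind of document that would be uploaded
# into the database and persisted as semantic triples.
print(json.dumps(standard_model["info"]["title"]))
```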
Instance Converter: code to map the source data in the
STS format to the entity model. The exact paths to the elements in the source
documents are input into the converter file, enabling all the other Entity Services functions to
access data from the source documents in the structure of the entity model.
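The converter's role can be pictured with a small sketch: given an STS-style XML fragment, extract the modeled properties into an instance keyed by entity property names. The element names and paths below are illustrative assumptions, not the actual paths in the demo's converter:

```python
import xml.etree.ElementTree as ET

# Illustrative STS-like metadata fragment; element names are assumptions.
source = """
<std-doc-meta>
  <std-ref>ISO 668:2013</std-ref>
  <title-wrap><full>Series 1 freight containers</full></title-wrap>
  <originator>ISO</originator>
  <pub-date>2013-12-01</pub-date>
</std-doc-meta>
"""

def extract_instance(xml_text):
    """Map source-document paths onto entity-model property names."""
    root = ET.fromstring(xml_text)
    paths = {                       # entity property -> source path
        "docNumber": "std-ref",
        "title": "title-wrap/full",
        "originator": "originator",
        "pubDate": "pub-date",
    }
    return {prop: root.findtext(path) for prop, path in paths.items()}

instance = extract_instance(source)
print(instance["docNumber"])  # ISO 668:2013
```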
Instance Generator: to put the model into action, the
entity services use the converter and the entity model to create new documents in the
database. These documents use the envelope pattern, which captures the entity data and
metadata in an envelope around the original source document. This enables developers and
subsequent entity services functions to easily access the entity data while also making the
original source data available.
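The envelope pattern itself is simple to sketch: the harmonized entity data wraps the untouched source document. The key names below follow the common MarkLogic envelope convention (instance plus attachments), but treat the exact shape as an assumption:

```python
# Harmonized entity data as extracted by a converter (illustrative values).
instance = {"docNumber": "ISO 668:2013", "originator": "ISO"}

# Untouched source document, kept alongside the entity data.
source_xml = "<std-doc-meta><std-ref>ISO 668:2013</std-ref></std-doc-meta>"

envelope = {
    "envelope": {
        "instance": {"Standard": instance},   # typed entity data
        "attachments": [source_xml],          # original source, preserved
    }
}

# Downstream code reads the entity view without re-parsing the raw source,
# while the original document remains available when needed.
print(envelope["envelope"]["instance"]["Standard"]["docNumber"])
print(envelope["envelope"]["attachments"][0].startswith("<std-doc-meta>"))
```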
Putting the Model into Action
Using these foundational pieces, data modelers can then generate code, and developers put
that code into action based on the shared and defined data model.
Search Options: model-driven code to generate a
MarkLogic Search API options node. This includes the specification for search constraints
and facets with the typed data defined in the entity model.
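As an illustration of what such generated options look like, here is a rough sketch of a faceted range constraint per entity property. The structure only loosely echoes the MarkLogic Search API options format, and the constraint names are taken from the demo's facets; every detail below is an assumption:

```python
# Sketch of a search-options structure with one faceted range constraint
# per entity-model property. Shape and key names are illustrative.
def range_constraint(name):
    return {
        "name": name,
        "range": {
            "type": "xs:string",      # typed from the entity model in practice
            "facet": True,
            "json-property": name,
        },
    }

options = {
    "options": {
        "constraint": [
            range_constraint(p)
            for p in ("originator", "secretariat", "pubDate", "releaseDate")
        ]
    }
}

print(len(options["options"]["constraint"]))  # 4
```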
Template Based Extraction (TDE)
/ SQL Query: TDE is a new feature of MarkLogic 9 that
enables a view of data to be generated based on a template that describes the source
data in the database and the structure of the view. Once the template is loaded into the
database, data supporting the view is automatically created (in triple format) and made
available for query using MarkLogic's SQL and Optic API functionality. MarkLogic Entity
Services features automatically generate this template based on the entity model, enabling
developers to immediately use the TDE features. For the demo, this was used to run SQL queries
against the data elements defined in the entity model.
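A generated TDE template can be sketched roughly as follows. The context/rows/columns layout follows the spirit of the documented TDE JSON format, but the context path, view name, and column details here are assumptions rather than the demo's generated template:

```python
# Sketch of a TDE template projecting envelope documents into a SQL view.
# All names and paths are illustrative assumptions.
tde_template = {
    "template": {
        # Rows are extracted relative to this context in each document.
        "context": "/envelope/instance/Standard",
        "rows": [
            {
                "schemaName": "STS",
                "viewName": "Standard",
                "columns": [
                    {"name": "urn", "scalarType": "string", "val": "urn"},
                    {"name": "docNumber", "scalarType": "string", "val": "docNumber"},
                    {"name": "originator", "scalarType": "string", "val": "originator"},
                    {"name": "pubDate", "scalarType": "date", "val": "pubDate"},
                ],
            }
        ],
    }
}

# Once such a template is installed, the view would answer SQL like:
sql = "SELECT docNumber, originator FROM STS.Standard WHERE originator = 'ISO'"
print(len(tde_template["template"]["rows"][0]["columns"]))  # 4
```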
Custom Application: to demonstrate the entity model in
action, the demonstration included a custom application that used the search options to
access and search the standards with facets for originator, secretariat, pub date and
release date. The application also featured normative reference SQL queries when displaying
a standard to show the related standards and explore the additional freight container
standards that also relied on those standards. The normative references SQL query is shown in Figure 4.
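The normative-reference lookup can be pictured as a lookup over the generated view. The sketch below simulates, in plain Python over in-memory rows, what a query of roughly that form would return; the table and column names are assumptions carried over from the sketches above, and the rows are invented examples:

```python
# In-memory stand-in for an STS.Standard view (illustrative rows only).
rows = [
    {"docNumber": "ISO 668", "title": "Series 1 freight containers",
     "normativeReferences": ["ISO 830", "ISO 1496-1"]},
    {"docNumber": "ISO 830", "title": "Freight containers -- Vocabulary",
     "normativeReferences": []},
    {"docNumber": "ISO 1496-1", "title": "Specification and testing",
     "normativeReferences": ["ISO 830"]},
]

# Roughly: look up the displayed standard's normative references, then
# fetch the referenced standards' rows to show them alongside it.
def normative_references(doc_number):
    refs = next(r["normativeReferences"] for r in rows
                if r["docNumber"] == doc_number)
    return [r for r in rows if r["docNumber"] in refs]

related = normative_references("ISO 668")
print([r["docNumber"] for r in related])  # ['ISO 830', 'ISO 1496-1']
```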
In addition to these features, the Entity Services functionality includes further features that were
not explored in the demonstration.
The features of MarkLogic Entity Services define a pattern that enables the entity model
to be put into action and for technology teams to gain the benefits of definition.
The benefits of definition can have an impact on industries and processes that are able to
put definitions into action. These results can be dramatic and world-changing, as in the case
of the shipping industry and the ISO freight container standards.
Bringing these benefits to data and information technology has, to date, been challenging,
as data definitions and standards are often separated from the processes that use them.
The demonstration using NISO STS, a standard for standards, and MarkLogic Entity Services,
the pattern to put the data model into action, shows that these benefits can be within reach
for information technology. These include interoperability, as data models can be used for
multiple purposes; specialization of roles, as developers and data architects can focus on
their tasks; and universal adoption, as everyone, including external and industry organizations,
can rely on and use the data model. This is illustrated in Figure 5.
As these processes and standards are adopted, information technology teams can see the
benefits that other industries have seen by putting definitions and standards into action.
The author would like to thank the following people for their support in creating this paper
and demonstration application:
Stephane Chatelet, Director Information Technologies, ISO
Holger Apel, Software Manager, ISO
Bruce Rosenblum, CEO Inera Systems and Chairman of NISO STS working group
Charles Greer, Lead Software Engineer, MarkLogic
Justin Makeig, Product Manager, MarkLogic