Balisage 2019 Program

Monday, July 29, 2019
Symposium: Markup Vocabulary Customization

Tuesday, July 30, 2019

Tuesday 8:00 am - 9:00 am (location: Conference level)

Conference registration & breakfast

Pick up your conference badge in the Gleason Boardroom and join us for breakfast in Baker before taking your seat in Sinequa, the conference room.

Tuesday 9:00 am - 9:15 am (location: Sinequa)

Welcome and introductions

Tuesday 9:15 am - 9:45 am

Explicit markup: a fool’s errand or the next big thing?

B. Tommie Usdin, Mulberry Technologies

In 1998, at a Balisage predecessor conference, Brian Reid told us we couldn’t have the world we wanted. XML wouldn’t deliver. He used twenty-year-old slides, slides that he had originally presented at a conference in 1981, to make his point. I still want the world that Brian Reid told us we could not have; I still want Brian Reid to have been wrong. I still believe that separating meaning from format will enable our documents to be displayed in many forms and media, that a markup format that makes hierarchy explicit makes complex documents tractable, that when content creators author in systems that make declarative markup visible and use the author’s knowledge to add value to their content, we will be able to make documents sing! And I have the twenty-year-old slides to prove it.

Tuesday 9:45 am - 10:30 am

Implementing TEI standoff annotation in the browser

Hugh Cayless, Duke Collaboratory for Classics Computing (DC3)

Standoff markup allows you to add information to a text without modifying the source. Often this can be achieved by linking between different documents. Various mechanisms exist for handling the connections involved. But some cases such as named entity recognition appear to require inline markup. Could we do this with standoff markup too? The answer is yes, using the TEI Critical Apparatus model, but it isn’t completely straightforward.
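
As a rough illustration of the general idea of standoff annotation (not of the TEI Critical Apparatus approach the talk describes), the sketch below keeps named-entity annotations in a separate list that points into an unmodified source string by character offset. The Annotation class, the annotated_spans function, the sample sentence, and the labels are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """A standoff annotation: it points at the text instead of wrapping it."""
    start: int  # inclusive character offset into the source text
    end: int    # exclusive character offset into the source text
    kind: str   # hypothetical label, loosely modeled on TEI element names


def annotated_spans(text, annotations):
    """Resolve each annotation against the source, in document order."""
    ordered = sorted(annotations, key=lambda a: a.start)
    return [(a.kind, text[a.start:a.end]) for a in ordered]


# The source text is never edited; the annotations live beside it.
source = "Cicero wrote to Atticus from Rome."
notes = [
    Annotation(29, 33, "placeName"),
    Annotation(0, 6, "persName"),
    Annotation(16, 23, "persName"),
]

spans = annotated_spans(source, notes)
```

Because the annotations only reference offsets, several independent annotation sets can describe the same source without conflicting, at the cost of breaking if the source text is later edited.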

Tuesday 10:30 am - 11:00 am

Break

Tuesday 11:00 am - 11:45 am

Eating your own dog food

Ari Nordström

Declarative solutions generally—and XML specifically—invite experimentation, iterative development, and play. In this way they encourage the self-described “non-programmer” to build rich models, extensive workflows, and robust systems. But can you build the whole application this way? And if the application is critical to getting paid, do you have the courage to do so? We Swedes are a courageous lot.

Tuesday 11:45 am - 12:30 pm

Reserved for Late-breaking News

Tuesday 12:30 pm - 2:00 pm (location: Social Circle - lobby level)

Lunch

Please check computer bags, backpacks, briefcases, suitcases, and other bags and bundles with conference staff in the Gleason Boardroom. Lunch is a serve-yourself buffet with limited space.

Tuesday 2:00 pm - 2:45 pm

Interviews

to be determined

Balisage is a gathering of remarkable people! We have developed standards, tools, languages, and applications. We have written books, blogs, tweets, and code. We have changed organizational cultures—and in a few cases we have changed the world. In these interviews, we will get to know a bit about the work and lives of some of the remarkable people in our markup community.

Tuesday 2:45 pm - 3:30 pm

Reserved for Late-breaking News

Tuesday 3:30 pm - 4:00 pm

Break

Tuesday 4:00 pm - 4:45 pm

Application of Brzozowski derivatives to JSON Schema

Mary Holstege, MarkLogic Corporation

In 1964, Janusz Brzozowski defined a new technique for computing whether a string of symbols is in the language defined by an extended regular expression. Brzozowski derivatives have been used for content model validation in several XML schema processors; they can also be applied to the task of model validation for JSON Schema. As it turns out, applying them to JSON Schema requires several extensions to cover “type-tagged” expressions, which sheds light on certain interesting matching problems outside the original problem scope of JSON Schema validation.
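
For readers unfamiliar with the technique, here is a minimal sketch of Brzozowski-derivative matching over plain strings. It covers only the basic operators (symbol, concatenation, alternation, Kleene star), not the full extended regular expressions of the 1964 paper and not the JSON Schema extensions the talk describes. All class and function names are illustrative assumptions.

```python
from dataclasses import dataclass


class Regex:
    """Base class for regular-expression syntax-tree nodes."""


@dataclass(frozen=True)
class Empty(Regex):
    """Matches no string at all."""


@dataclass(frozen=True)
class Eps(Regex):
    """Matches only the empty string."""


@dataclass(frozen=True)
class Sym(Regex):
    char: str


@dataclass(frozen=True)
class Cat(Regex):
    left: Regex
    right: Regex


@dataclass(frozen=True)
class Alt(Regex):
    left: Regex
    right: Regex


@dataclass(frozen=True)
class Star(Regex):
    inner: Regex


def nullable(r):
    """Return True if r accepts the empty string."""
    if isinstance(r, (Eps, Star)):
        return True
    if isinstance(r, (Empty, Sym)):
        return False
    if isinstance(r, Cat):
        return nullable(r.left) and nullable(r.right)
    if isinstance(r, Alt):
        return nullable(r.left) or nullable(r.right)
    raise TypeError(f"unknown node: {r!r}")


def derive(r, c):
    """Return an expression for what may follow c in strings matched by r."""
    if isinstance(r, (Empty, Eps)):
        return Empty()
    if isinstance(r, Sym):
        return Eps() if r.char == c else Empty()
    if isinstance(r, Alt):
        return Alt(derive(r.left, c), derive(r.right, c))
    if isinstance(r, Star):
        return Cat(derive(r.inner, c), r)
    if isinstance(r, Cat):
        first = Cat(derive(r.left, c), r.right)
        # If the left part can be empty, c may also begin the right part.
        if nullable(r.left):
            return Alt(first, derive(r.right, c))
        return first
    raise TypeError(f"unknown node: {r!r}")


def matches(r, s):
    """Take one derivative per symbol, then test for nullability."""
    for c in s:
        r = derive(r, c)
    return nullable(r)


# (ab)*c
pattern = Cat(Star(Cat(Sym("a"), Sym("b"))), Sym("c"))
```

With this pattern, matches(pattern, "ababc") is true and matches(pattern, "abac") is false. A practical implementation would also simplify derivatives as it goes (for example, collapsing a concatenation with Empty to Empty), since without that the expressions grow with each input symbol.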

Tuesday 4:45 pm - 5:30 pm

Reserved for Late-breaking News

Tuesday 8:00 pm - 10:00 pm (location: Baker)

Balisage hospitality

Stop in to the Balisage Coffee and Conversation room. We’ll have coffee, a comfortable place to talk, and possibly a toy or two worth a look.

Wednesday, July 31, 2019

Wednesday 8:00 am - 9:00 am (location: Conference level)

Conference registration & breakfast

Pick up your conference badge in the Gleason Boardroom and join us for breakfast in Baker before taking your seat in Sinequa, the conference room.

Wednesday 9:00 am - 9:45 am

Text and markup processing languages, past, present, and future

Sam Wilmott

Programming language design is in continual flux, with significant new languages coming along every few years. In the field of text and markup programming languages, things seem stable at the moment, with XSLT in a dominant position and a few other languages filling in the gaps. But text and markup processing is no more exempt from change than any other field. What should the next language for this application domain look like? Can we make text and markup processing easier than it is now? What direction should we take? For the last ten years or so, I have been working on this problem. I have a plan.

Wednesday 9:45 am - 10:30 am

Reserved for Late-breaking News

Wednesday 10:30 am - 11:00 am

Break

Wednesday 11:00 am - 11:45 am

Graphical user interfaces in the X stack

Zahra Al-Awadai, Anne Brüggemann-Klein, Christina Grubmüller, & Philipp Ulrich, Technical University of Munich (TUM)

“XML Everywhere” isn’t just a slogan: it actually works, up and down the XML application stack. Recent developments, such as the inclusion of custom elements in HTML5, allow the declarative approach of XML to come into the browser/server interaction. XForms, supported by SVG and CSS, can serve as the basis for a graphical user interface. A custom WebSocket element can support client-to-client and server-push communication of XML data. Applications of State Chart XML (SCXML) mean that the “XML Everywhere” approach can be extended all the way to models of operations in an application. Interactive games offer living proof of the stack.

Wednesday 11:45 am - 12:30 pm

Multitasking algorithms in XForms

John M. Boyer, IBM Canada

Via declarative expressions, XForms simplifies interactive XML data processing, but XForms isn’t just a declarative language. When it’s needed, XForms authors can also rely on event-driven procedural scripting. Best of all, scripted data changes can automatically trigger additional updates from declarative expressions, so authors are free to use the best method for solving each interactive data processing need. With live demonstrations and markup discussions, this presentation will focus on advanced procedural techniques in XForms, event-driven methods for non-blocking procedures and non-preemptive multitasking, and the hybrid combination of procedural and declarative computations. Come to this presentation to see the full power of interactive XML data processing that you can access directly within current web browsers. There’s a lot more to XForms than you might have expected!

Wednesday 12:30 pm - 2:00 pm (location: Social Circle - lobby level)

Lunch

Please check computer bags, backpacks, briefcases, suitcases, and other bags and bundles with conference staff in the Gleason Boardroom. Lunch is a serve-yourself buffet with limited space.

Wednesday 2:00 pm - 2:45 pm

Interviews

to be determined

Balisage is a gathering of remarkable people! We have developed standards, tools, languages, and applications. We have written books, blogs, tweets, and code. We have changed organizational cultures—and in a few cases we have changed the world. In these interviews, we will get to know a bit about the work and lives of some of the remarkable people in our markup community.

Wednesday 2:45 pm - 3:30 pm

“With one voice”: streamlining character data for tokenization

Ashley M. Clark

Some full-text search and textual analysis tools operate exclusively on sequences of tokens. Deriving input for these tools from XML documents can be challenging and depends heavily on the encoding practices and assumptions which produced the XML. Does metadata information, for example, carry the same weight as the text? If a document includes annotations about nuances of the transcription, including those annotations may aid researchers attempting to find relevant documents, but may hinder a process that is performing textual analysis of the work authored. Rather than attempting to make all tools powerful enough to deal with these issues, a modular approach to tokenization has been developed.

Wednesday 3:30 pm - 4:00 pm

Break

Wednesday 4:00 pm - 4:45 pm

Do we really want to see markup?

James David Mason

Markup fanatics have long cried, “We need to see the markup!” Yet since the earliest stages of developing the SGML standard, there has been an urge even among standards developers to avoid having to write tags everywhere. The recent urge to create “Invisible XML” is but the latest symptom of a smoldering disease, from which I, too, suffer.

Wednesday 4:45 pm - 5:30 pm

Aparecium: an XQuery/XSLT library for invisible XML

C. M. Sperberg-McQueen, Black Mesa Technologies LLC

This paper introduces Aparecium, a library intended to make the use of “invisible XML” convenient for users of XSLT and XQuery. Invisible XML, a method for treating non-XML documents as if they were XML, holds great promise for immediately and easily bringing our array of XML technologies to bear on the non-XML data that we encounter (CSS, wiki markup, domain-specific notations, JSON, LaTeX, etc.). Aparecium uses an Earley parser to ensure that any context-free grammar can be used.

Wednesday 8:00 pm - 10:00 pm (location: Baker)

Balisage hospitality

Stop in to the Balisage Coffee and Conversation room. Will someone bring out a card game this evening?

Thursday, August 1, 2019

Thursday 8:00 am - 9:00 am (location: Conference level)

Conference registration & breakfast

Pick up your conference badge in the Gleason Boardroom and join us for breakfast in Baker before taking your seat in Sinequa, the conference room.

Thursday 9:00 am - 9:45 am

XForms Space Invaders

John J. Chelsom, Seven Informatics

The Model-View-Controller (MVC) paradigm is a design pattern for creating applications in which: the View (web page) interacts with the user; the Model controls manipulation of the data; and the Controller orchestrates the work of the view and the model. Implementing the classic arcade game Space Invaders in an XForms workbench proved to be a successful testbed for this approach. Key functionalities required for Space Invaders are an application “heartbeat” to control the speed/progression of the invaders; animated graphics for the invaders, the Mystery Ship, and laser fire; and the user-controlled laser cannon. The workbench was implemented using Orbeon Forms, an open source framework which supports XForms 1.1 with a number of custom extensions, including JavaScript actions, Attribute Value Templates on XHTML elements, and listeners for “keypress” events. Most of the extensions required are included in the draft XForms 2.0 specification (albeit with slightly modified syntax).

Thursday 9:45 am - 10:30 am

Reserved for Late-breaking News

Thursday 10:30 am - 11:00 am

Break

Thursday 11:00 am - 11:45 am

SCAP Composer: a DITA Open Toolkit plug-in for packaging security content

Joshua Lubell, National Institute of Standards and Technology

The Security Content Automation Protocol (SCAP) schema for source data stream collections standardizes the requirements for packaging XML security content into bundles for easy deployment. SCAP bundles must be self-contained (each bundle contains all necessary information without external references) and reversible (XML components must be unmodified so they can be rebundled). These requirements (along with very long identifiers) make authoring the content and bundling very difficult. SCAP Composer is an authoring product which uses a DITA specialized element type for source data stream collections that makes the authoring process easier. SCAP Composer takes an incremental approach to aiding SCAP content authors: it helps only with creating source data stream collections; it does not offer any help with creating the XML resources encapsulated in a data stream collection. SCAP Composer is implemented using the DITA Open Toolkit and can be used with any DITA authoring software that includes the Toolkit, or with a standalone Toolkit.

Thursday 11:45 am - 12:30 pm

Reserved for Late-breaking News

Thursday 12:30 pm - 2:00 pm (location: Social Circle - lobby level)

Lunch

Please check computer bags, backpacks, briefcases, suitcases, and other bags and bundles with conference staff in the Gleason Boardroom. Lunch is a serve-yourself buffet with limited space.

Thursday 2:00 pm - 2:45 pm

Interviews

to be determined

Balisage is a gathering of remarkable people! We have developed standards, tools, languages, and applications. We have written books, blogs, tweets, and code. We have changed organizational cultures—and in a few cases we have changed the world. In these interviews, we will get to know a bit about the work and lives of some of the remarkable people in our markup community.

Thursday 2:45 pm - 3:30 pm

Extending vocabularies: the rack and the weeds

Liam Quin, Delightful Computing

Markup languages such as XML, JSON, and SGML divide documents into two parts: markup and content. While in theory markup could be created ad hoc for every document, this would mean that markup had no meaning (and thus no value) to anyone but the creator of the document. In order to realize the value of marked-up documents for interchange and longevity, we create, write documentation for, and share markup vocabularies. Vocabularies are created in specific contexts and for specific purposes. Like all human constructs, they are flawed and need to be repaired and changed over time. As people bump up against the limitations of their markup vocabularies, they often want to extend those vocabularies. Understanding these processes requires sensitivity to the human needs involved and the social contexts in which people interact with and around the vocabularies. This paper characterizes some of these contexts and their properties, and in the light of this characterization describes changes to vocabularies both successful and unsuccessful.

Thursday 3:30 pm - 4:00 pm

Break

Thursday 4:00 pm - 4:45 pm

Reserved for Late-breaking News

Thursday 4:45 pm - 5:30 pm

Encoding

Allen H. Renear, University of Illinois at Urbana-Champaign

In their model of digital objects, David Dubin and others postulate three entity types (propositions, symbols, and documents) with three relationships: “expresses”, “encodes”, and “inscribes”. We can “express” an assertion with a sentence. We can also “inscribe” symbols in physical media. I’d like to investigate the cascade of “encodings” that we find in every digital computing system, and the articulation of those encodings that is bound up in everything we do. Encoding can be recursive, but do we really understand it? What is happening when we encode a sentence as a character string? A character as an integer? An integer as an octet? Is encoding a well-understood linguistic or mathematical relationship? Is encoding just a mapping (function)? Is it the same as the relationship between a name and its referent? Is it the same as the relationship between a sentence and the proposition it expresses? I don’t think so. So let’s explore some possibilities.

Thursday 8:00 pm - 10:00 pm (location: Baker)

Balisage hospitality

Stop in to the Balisage Coffee and Conversation room. We might be talking about markup or the organization of electronic materials, but we might just as easily be talking about astronomy, butterflies, scuba diving, antique cars, or ... something else entirely.

Friday, August 2, 2019

Friday 8:00 am - 9:00 am (location: Conference level)

Breakfast

Join us for breakfast in Baker before taking your seat in Sinequa, the conference room.

Friday 9:00 am - 9:45 am

The Open Security Controls Assessment Language (OSCAL): schema and metaschema

Wendell Piez, National Institute of Standards and Technology / Information Technology Laboratory

The Information Technology Lab at NIST is developing technical standards for documentation related to systems security. The Open Security Controls Assessment Language (OSCAL) defines lightweight schemas, along with related infrastructure, for tagging system security information to support routine tasks like crosschecking, validating against arbitrary constraints, and producing punchlists. OSCAL is not conceived as “another big XML application” but as a metaschema. This approach allows us to simplify the design and maintenance of schemas and related tooling; support generation of documentation; produce multiple parallel schemas for XML, JSON, and YAML; and construct conversion tools more easily. Documents and tools leverage basic HTML, or even Markdown, for simplicity even though it limits the complexity of what can be directly imported. Conversion is simplified by the metaschema approach, even when multiple schemas apply to a single data collection. We hope that these simplifications will lead not only to more documents but also to more useful documents.

Friday 9:45 am - 10:30 am

Loose-leaf publishing using Antenna House and CSS

Eliot Kimber, Contrext, LLC

Loose-leaf publishing is the ability to typeset and print only the pages in a document that have changed since its last publication. This presents many interesting challenges. We developed a loose-leaf publication system using Antenna House Formatter, CSS for pagination, and XSLT for post-processing the area tree into “change packages” which include only the changed pages. Both the CSS markup and the publication workflow warrant a closer look.

Friday 10:30 am - 11:00 am

Break

Friday 11:00 am - 11:45 am

Reese’s Peanut Butter Cups and eXist-db: integration of XML databases and content management systems in digital editions

David J. Birnbaum, University of Pittsburgh
Hugh Cayless, Duke University
Leif-Jöran Olsson, University of Gothenburg (Sweden)
Joseph Wicentowski, Office of the Historian, US Department of State
Emmanuelle Morlock, French National Center for Scientific Research (CNRS); History and Sources of the Ancient Worlds (HiSoMA) Research Center, Lyon (France)

We have identified four models for integrating digital edition content into eXist-db: TEI Publisher; the eXist-db app framework using HTML templating; the eXist-db app framework without HTML templating; and Apache and PHP mediating between the user and eXist-db, so that eXist-db provides only XML database services. We examine and compare these ways of conceptualizing and implementing the infrastructure for a digital edition. Each of them has advantages and disadvantages, primarily from the perspective of sustainability. Our considerations apply to edition frameworks generally and are therefore not specific to eXist-db.

Friday 11:45 am - 12:30 pm

Thinking, wishing, saying

C. M. Sperberg-McQueen, Black Mesa Technologies LLC

Can we have rules for our documents we cannot write down in a schema language? If a conformance requirement is not mechanically checkable, is it a conformance requirement? If a rule is not testable, is it a rule?

Friday 12:30 pm - 2:00 pm (location: Social Circle - lobby level)

Lunch

Please check computer bags, backpacks, briefcases, suitcases, and other bags and bundles with conference staff in the Gleason Boardroom. Lunch is a serve-yourself buffet with limited space.

Relax at the Cambria and enjoy talking about markup over lunch. For participants who must rush off, wrapping materials and bags are supplied so you can take your sandwich with you to enjoy in the cab or at the airport (but do not eat on Metro!).