Balisage 2013 Program

Tuesday, August 6, 2013

Tuesday 9:15 am - 9:45 am

The semantics of “semantic”

B. Tommie Usdin, Mulberry Technologies

There was a time when I knew what the word “semantic” meant. That was a long time ago. Since then many people, on many occasions, in many contexts, have corrected my misunderstanding of the meaning of semantic. Perhaps it means nothing, or everything. Or perhaps I’m simply misinformed.

Tuesday 9:45 am - 10:30 am

icXML: Accelerating a commercial XML parser using SIMD and multicore technologies

Nigel Medforth, International Characters, Inc., and Simon Fraser University; Dan Lin, Simon Fraser University; Kenneth Herdy, Simon Fraser University; Rob Cameron, Simon Fraser University and International Characters, Inc.; & Arrvindh Shriraman, Simon Fraser University

Earlier research prototypes have shown how SIMD (single-instruction multiple-data instructions) and multi-core parallelism can accelerate XML processing. We have tested how well those techniques work when fully integrated into a commercial XML parser. The Apache Software Foundation’s Xerces C++ parser was restructured for SIMD and multi-core parallelism, while retaining the existing application programming interface unchanged. SIMD techniques alone produced a 50% increase in parsing speed; when pipeline parallelism on dual core processors was added, improvements of 2x and beyond were realized. (Proceedings)

Tuesday 11:00 am - 11:45 am

Markup to generate markup to generate markup

Peter Flynn, Cork University & Silmaril Consultants

LaTeX relies for its extensibility on a library of over four thousand packages and document classes, which provide additional markup, additional layouts, and variant behavior. The ltxdoc class supplies features for maintaining these packages and classes in a literate programming style: code (with comments) and full end-user documentation derive from a single source. The syntax of ltxdoc, however, is complex, as documentation must be shielded from interpretation as code, and vice versa. The system presented here is an experiment in using XML (specifically DocBook 5) to mark up and maintain LaTeX classes and packages. XSLT 2 stylesheets generate the .dtx and .ins distribution files expected by end users. There are numerous benefits in automation and reusability of code, a number of areas where a customization layer for DocBook would be useful, and a few unresolved restrictions that package and class authors and maintainers would need to keep in mind when editing XML.

Tuesday 11:45 am - 12:30 pm

The FtanML markup language

Michael Kay, Saxonica

What if we could reinvent markup anew? What if XML and JSON were ‘the ones we built to throw away’ and we could start over to build our markup? Can we imagine what the world would be like if we didn’t have to design with an eye to the past? Michael Kay and Stephanie Haupt explored these questions with some advanced students at a Swiss summer program in 2012. That summer is over, but the exploration continues. FtanML is a rethink of markup from the ground up, with an associated schema language FtanGram and a query/transformation language FtanSkrit. What can this more-than-thought experiment teach us all?

Tuesday 2:00 pm - 2:45 pm

First Person: The allure of Gothic markup

Simon St. Laurent, O’Reilly Media

John Ruskin, markup theorist as well as art critic? Perhaps not, but we who practice document architecture can learn from his analysis of the Gothic way. The history of markup languages is littered with formalisms and schema languages designed to constrain and validate documents. Consider instead a markup community where documents don’t need to be restricted but are instead adaptable to customization for individual needs. The flexibility of Web tools, combined with implementation advice from architect Christopher Alexander, may serve some communities better than the rigidity of traditional markup approaches. Let’s try building Gothic cathedrals that allow for individual creativity rather than Brutalist apartment blocks of mass-produced documents.

Tuesday 2:45 pm - 3:30 pm

Programming in XPath 3.0

Dimitre Novatchev, Intentional Software

It’s a bird! It’s a plane! XPath is critically at the intersection of XQuery and XSLT. It is also the expression language of a number of other XML technologies. It is not, however, a full-fledged programming language. Or is it? Historical limitations in XPath 1.0 and 2.0 may have made them a weak substitute for a full-fledged programming language, but what about XPath 3.0? XPath 3.0 is the ideal language to write function libraries designed to be used both in XSLT and XQuery. Possessing variables, inline functions, closures, a simple mapping operator, strong typing, robust recursion, and the ability to create nodes, XPath 3.0 can truly sport an “S” on its chest.

Tuesday 4:00 pm - 4:45 pm

(LB) Marking up changes to ISO standards: A case study

Tristan Mitchell, Nigel Whitaker, both of DeltaXML

The ISO Standards Tags Set (ISOSTS) is a customization of NISO's Journal Article Tag Suite (JATS) developed for the International Standards Organization for authoring standards documents. As part of the authoring workflow used at ISO, they need to produce redline publications of a document in order to show changes between versions of a standard. Alongside Typefi, who provided the functionality for publishing the marked XML into PDF with changebars, we provided our XML comparison toolset to detect and mark the changes as required. The issues we faced while completing this work include representation of changes in the XML, comparison of tables, ignoring text formatting changes, and the use of processing instructions. We discuss the pros and cons of various format design decisions that can impact good comparison.

Tuesday 4:00 pm - 4:45 pm

Hypermedia services, loosely coupled

Jonathan Robie, Rob Cavicchio, Rémon Sinnema, & Erik Wilde, all of EMC

Your service has endpoints, my service has endpoints, but how do we communicate? We can document in great detail all the URIs and parameters involved in our services; this approach treats them like function signatures and invites the construction of tightly coupled services. But depending on out-of-band information to drive interaction runs against the grain of truly RESTful services. Instead, the interaction between services should be expressed by documenting the processing rules for the media types of the representations exchanged. By moving to a new description language we can make loosely coupled web services a thing of the present!

Tuesday 4:45 pm - 5:30 pm

(LB) What it is vs. how we shall: complementary agendas for data models and architectures

David Dubin, Megan Senseney, & Jacob Jett, all of University of Illinois

Data models play two kinds of role in information interchange. Descriptively, they offer domain models. Prescriptively, they propose plans of action. While stipulative definitions fill in a model's representational gaps, elegance and descriptive power reduce the need for arbitrary choices in standards and specifications. Proposals for modeling digital annotation serve to illustrate competing representational and cohortative agendas.

Tuesday 4:45 pm - 5:30 pm

My document object model can do more than yours

Alain Couthures, AgenceXML

Document object models, specifically the browser DOM, were designed to represent HTML and XML documents. Languages such as XPath were designed to access and traverse the DOM of HTML and XML documents. But suppose we wanted to bring the power and convenience of XML technologies like XPath to new data types. Could we extend the DOM to support CSV files? JSON? ZIP files? Yes we can! This paper explores a number of ways in which the DOM can be made to do more. We can loosen restrictions, describe new sequence types, and even define new XPath axes to make the DOM better and more useful.

Wednesday, August 7, 2013

Wednesday 9:00 am - 9:45 am

Processing XForms in HTML5-enabled browsers

Tobias Niedl & Anne Brüggemann-Klein, both of Technische Universität München

Forms technology for the World Wide Web has developed along two lines. The XForms strain has worked for a cleaner separation of concerns and supports more complex bindings between user interface and data. The HTML strain has focused on the user interface, defining new widgets and in HTML5 adding type definitions to form elements to enable native in-form validation. Some XForms implementations translate XForms elements into HTML widgets plus executable code. But HTML5 also defines new JavaScript APIs browsers should support. The new facilities of HTML5-enabled browsers can be used to support XForms near-natively. We explain how.

Wednesday 9:45 am - 10:30 am

The case for authoring and producing books in (X)HTML5

Sanders Kleinfeld, O’Reilly Media

Publishers find themselves caught between the steep learning curves and rigorous validation of production systems like DocBook and the anything-goes approach of the commercial word processors used by many of their authors. Can publishing requirements, particularly for electronic output, be met by simple online tools that are able to produce structured output without punishing authors? The resilience of Web browsers suggests that an HTML-based solution might be promising. A proposal for an HTMLBook standard applies new rules and semantics to (X)HTML5 to create a format that is easy to edit while also being ready to produce output.

Wednesday 11:00 am - 11:45 am

Semantic profiling using indirection

Ari Nordström, Condesign

Profiling is a publishing technique in which portions of the content are identified as relevant to various conditions. Publications are created by selecting the appropriate portions. In XML this is often implemented by marking nodes using attribute values as filtering conditions. When publishing, the nodes are only included if the publishing conditions match the publishing context. The profiles are sometimes also used as the basis for text generation. While useful, these techniques have a number of problems. For example, if the attribute values need to be changed, the new values usually require converting any “live” legacy documentation to the new values and changing the schema, stylesheets, etc. Supporting both the old and new profiles will not be possible. An abstraction or indirection layer solves this. The profile values are not used directly; instead they represent a specific “semantic profile”. The abstraction layer can be expressed using URNs that are matched to human-readable values when required.
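
The indirection layer described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the author's implementation: the `profile` attribute name, the `urn:x-profile:` identifiers, the labels, and the helper names `publish` and `label_for` are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical indirection layer: opaque URNs stand for "semantic profiles".
# The human-readable labels can be renamed here without editing any document.
PROFILE_LABELS = {
    "urn:x-profile:0001": "Model A",
    "urn:x-profile:0002": "Model B",
}

SOURCE = """<doc>
  <p>Common text.</p>
  <p profile="urn:x-profile:0001">Only for the first product.</p>
  <p profile="urn:x-profile:0002">Only for the second product.</p>
</doc>"""


def publish(xml_text, active_profiles):
    """Parse the document and drop every node whose (hypothetical)
    `profile` attribute names a URN outside the publishing context."""
    root = ET.fromstring(xml_text)
    for parent in list(root.iter()):
        for child in list(parent):
            urn = child.get("profile")
            if urn is not None and urn not in active_profiles:
                parent.remove(child)
    return root


def label_for(urn):
    """Resolve a profile URN to its current human-readable label,
    falling back to the URN itself when no label is registered."""
    return PROFILE_LABELS.get(urn, urn)


result = publish(SOURCE, {"urn:x-profile:0001"})
kept = [p.text for p in result.iter("p")]
```

Because documents carry only the URNs, changing "Model A" to another label touches the lookup table alone; the documents, schema, and filtering logic stay as they are.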

Wednesday 11:45 am - 12:30 pm

The XML info space

Hans-Jürgen Rennau, Traveltainment

XML-related standards imply an architecture of distributed information which integrates all accessible XML resources into a coherent whole. Attempting to capture the key properties of this architecture, the concept of an info space is defined. The concept is used as a tool for deriving desirable extensions of the standards. The proposed extensions aim at a fuller realization of the potential offered by the architecture. Main aspects are better support for resource discovery and the integration of non-XML resources. If not adopted by standards, the extensions may also be emulated by application-level design patterns and product-specific features. Knowledge of them might therefore be of immediate interest to application developers and product designers.

Wednesday 1:15 pm - 2:00 pm

Balisage Bluff: An Entertainment

Balisage Attendees

Balisage Bluff: markup-truth may be stranger than fiction! Participants will listen to short stories that involve markup, Montreal, or have some other connection to the conference. The audience will be challenged with identifying which stories are true (or close to it) and which are mostly fabricated.

Do you have a story to tell? Stories will be limited to 2 minutes, but even so there are a lot of Balisageurs with great tales to tell. Volunteer by sending email to, or by talking with Lynne Price, gamemaster, on site in Montreal. If there are more than ten volunteers by July 15, ten will be randomly selected. If we have more time in the actual session, volunteers will be recruited from the audience/participants.

Wednesday 2:00 pm - 2:45 pm

First Person: Where did all the document kids go?

Matt Patterson, Constituent Parts

When I began developing with XML technologies, there were a multitude of toolkits and implementations of XML parsers, multiple DOM (and DOM-like) implementations outside web browsers, and XSLT implementations everywhere. My current development environment seems impoverished in comparison. What happened? The population of web development tools, by contrast, has grown by leaps and bounds. Why is the one ecosystem contracting and the other growing? One should never underestimate the power of making things more accessible to the casual user.

Wednesday 2:45 pm - 3:30 pm

Invisible XML

Steven Pemberton, CWI, Amsterdam

What if you could see everything as XML? XML has many strengths for data exchange, strengths both inherent in the nature of XML markup and strengths that derive from the ubiquity of tools that can process XML. For authoring, however, other forms are preferred: no one writes CSS or JavaScript in XML. It does not follow, however, that there is no value in representing such information in XML. Invisible XML is a method for treating non-XML documents as if they were XML, enabling authors to write in a format they prefer while providing XML for processes that are more effective with XML content. There is really no reason why XML cannot be more ubiquitous than it is.

Wednesday 4:00 pm - 4:45 pm

Collecting and curating slide sets

Alan Bilansky, University of Illinois at Urbana-Champaign

Enormous numbers of presentations are created in PowerPoint, Open Office, KeyNote, and similar slideware every day. These slide decks are emailed, posted on the web, shared, and stored on filesystems throughout industry and academia. And yet, unlike many other cultural artifacts, we have no systematic way to archive and curate them. What are the semiotics of slides? Should we be archiving and curating the various aspects of slide decks: graphics, diagrams, photographs, and words, and the transitions between them? If so, how should this information be captured? The time has come! Save the decks! (Proceedings)

Wednesday 4:00 pm - 4:45 pm

Using XSD import, include, and redefine in the MailXML logistics system

Dianne Kennedy, IDEAlliance

The United States Post Office needs flexible means for exchanging messages with its logistics clients. MailXML, developed in collaboration with IDEAlliance, has evolved into a flexible suite of XSD schema modules. These modules have been constructed to exploit XSD's facilities for redefinition of components. The framework, in which new modules can be prototyped and added to the system without disrupting current services, has potential uses beyond its original application.

Wednesday 4:45 pm - 5:30 pm

Igel: Comparing document grammars using XQuery

C. M. Sperberg-McQueen, Black Mesa Technologies, and Oliver Schonefeld, Marc Kupietz, Harald Lüngen, & Andreas Witt, all of the Institut für Deutsche Sprache (IDS)

Igel is a small XQuery-based web application for examining a collection of document grammars; in particular, for comparing related document grammars to get a better overview of their differences and similarities. In its initial form, Igel reads only DTDs and provides only simple lists of constructs in them (elements, attributes, notations, parameter entities). Our continuing work is aimed at making Igel provide more sophisticated and useful information about document grammars and building the application into a useful tool for the analysis (and the maintenance!) of families of related document grammars.

Wednesday 4:45 pm - 5:30 pm

(LB) A data-driven approach using XForms for building a web forms design framework

Stephen Cameron, Collinta & William Velasquez,

In a project to build a web-based auto-generated forms framework, we needed to decide whether to use XForms for the 'designer' user-interface as well as for the generated forms. In trying to make this decision it became apparent that, at a fundamental level, there are two distinctly different means to develop web-based interfaces. These two means, or approaches, can be described as 'data-driven' or 'behavioural'. We suggest that the Model-View-Controller (MVC) design 'pattern', which is now becoming popular as terminology for describing the basis of several JavaScript web development frameworks, is of limited practical usefulness as it encompasses too many variants. In contrast, the distinction between 'data-driven' and 'behavioural' approaches seems to be more useful. In particular, it provides clarity in distinguishing the respective benefits of using 'XML technologies' (particularly XPath) versus other object-based alternatives for web application development. This distinction is illustrated using working examples from this on-going project. Some implications, such as the role of schema documents in the data-driven approach, the practicality of writing XML 'as code', and issues encountered with the 'XRX' architecture are also discussed.

Thursday, August 8, 2013

Thursday 9:00 am - 9:45 am

Indexing queries in Lux

Michael Sokolov, Safari Books Online

Query optimizers often mystify database users: sometimes queries run quickly and sometimes they don’t. An intuitive grasp of what will work well in an optimizer is often gained only after trial, error, inductive logic (i.e. educated guessing), and sometimes propitiatory sacrifice. This paper tries to lift the veil by describing work on Lux, a new indexed XQuery search engine built using Saxon and Lucene, which is freely available under an open-source license. Lux optimizes queries by rewriting them as equivalent (but usually faster) indexed queries, so its results are easier for a user to understand than the abstract query plans produced by some optimizers. Lucene-based QName and path indexes prove useful in speeding up XQuery execution by Saxon.

Thursday 9:00 am - 9:45 am

An extensible API for documents with multiple annotation layers

Nils Diewald & Maik Stührenberg, both of Universität Bielefeld

XML namespaces and standoff annotation are promising approaches to tackle overlapping multiple annotation layers in XML instances. But the creation and processing of standoff instances can be cumbersome, especially when the underlying text is modified after an annotation has been added. We present a powerful API capable of dealing with these tasks. It provides an extension mechanism to allow for the easy creation of modules corresponding to a certain namespace (and therefore markup language). As a working example, we use XStandoff (which combines standoff notation with a GODDAG-like data model), since it is a standoff format highly dependent on XML namespaces.
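
To make concrete why text changes complicate standoff annotation, here is a minimal Python sketch. It is not the API presented in the paper and not the XStandoff format: the `Span` type, the layer names, and the `insert_text` helper are hypothetical. It assumes annotations are stored as character offsets into a separate primary text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """A standoff annotation: a label over character offsets [start, end)."""
    layer: str
    label: str
    start: int
    end: int


def insert_text(text, spans, pos, new):
    """Insert `new` at offset `pos` and shift every annotation to match.

    Spans that end at or before `pos` are unchanged, spans that start at
    or after `pos` move right, and spans that contain `pos` grow.
    """
    n = len(new)
    shifted = []
    for s in spans:
        moved = s.start >= pos
        start = s.start + n if moved else s.start
        end = s.end + n if (moved or s.end > pos) else s.end
        shifted.append(Span(s.layer, s.label, start, end))
    return text[:pos] + new + text[pos:], shifted


text = "the quick brown fox"
spans = [
    Span("syntax", "np", 0, 19),     # the whole phrase
    Span("layer-a", "x", 4, 15),     # "quick brown"
    Span("layer-b", "y", 10, 19),    # "brown fox", overlaps layer-a
]

new_text, new_spans = insert_text(text, spans, 4, "very ")
covered = [new_text[s.start:s.end] for s in new_spans]
```

The two overlapping layers could not both be expressed as nested elements in one XML tree, which is the motivation for keeping them as separate offset-based layers.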

Thursday 9:45 am - 10:30 am

(LB) Sociology, history and overview of Rights Metadata Standards

Linda Burman, L. A. Burman Associates

Rights metadata has recently become a very hot topic. While rights management (copyright law) has been discussed and debated for many years, the term “rights management” has many different meanings to people in different roles and is applied to a wide variety of behaviors, workflows and systems. In publishing, rights have typically been the domain of legal staff and documented in paper contracts. Unsurprisingly, ‘uptake’ of rights metadata standards has been slow. However, now that digital asset and/or content management systems are becoming ubiquitous, users have immediate access to their digital assets and want to know what content they can repurpose and under what restrictions (platform, media type, distribution channel, geography, and so on) without having to phone a permissions manager for every question. Suddenly rights metadata vocabularies are becoming extremely useful. Unfortunately, most companies are still unaware that rights metadata standards, namely PRISM Usage Rights (PUR), PLUS (UsePlus), and ODRL, already exist. This overview of the existing standards will emphasize each one’s point of view: the type of content and company each is best suited for and being used by, and how to learn more about each standard.

Thursday 9:45 am - 10:30 am

Modeling overlapping structures: Graphs and serializability

Yves Marcoux, Université de Montréal; C.M. Sperberg-McQueen, Black Mesa Technologies; & Claus Huitfeldt, University of Bergen, Norway

Modeling overlapping structures (e.g. verse and line structures in poetry) in an XML environment which represents only cleanly embedded structures, is a familiar problem. Proposals to address this problem include XML solutions (based essentially on a layer of semantics) and non-XML ones such as TexMecs, a markup language that allows overlap (and other features). Overlap-only TexMecs documents have been shown to correspond to completion-acyclic node-ordered directed acyclic graphs. Elaborating on that result, we cast it in the setting of a strictly larger class of graphs, child-ordered directed graphs (CODGs), that includes multi-graphs and non-acyclic graphs, and show that — somewhat surprisingly — it does not hold in general for graphs with multiple roots. Second, we formulate a stronger condition, full-completion-acyclicity, that guarantees correspondence with an overlap-only document, even for graphs that have multiple roots. Full-completion-acyclicity can be checked with polynomial-time algorithms, which can compute oo-serialization of fully-completion-acyclic CODGs.

Thursday 11:00 am - 11:45 am

(LB) Could authors really write in XML one day?

Peter Flynn, Cork University & Silmaril Consultants

The learning curve for non-markup-expert authors to start writing and editing structured documents in XML is steep, and there are some specific barriers to the acceptance of editor interfaces. In exploring the reasons behind these barriers, we identified some changes that could be made to common interfaces to improve acceptability. This paper presents the results of usability tests on the modifications, and suggests how some aspects of structured editing software could be adapted to extend their use into additional areas and markets.

Thursday 11:00 am - 11:45 am

Poio API and GrAF-XML

Jonathan Blumtritt, University of Cologne; Peter Bouda, Centro Interdisciplinar de Documentação Linguística e Social; & Felix Rau, University of Cologne

Language documentation projects all over the world have accumulated a large and heterogeneous corpus of linguistic material. Because of its diversity, access to and analysis of the components is difficult, particularly for multimedia instances. The "Graph Annotation Framework" (GrAF), a standoff annotation method, is applied to utterance examples in time-aligned annotations of video samples. An easy-to-use programming interface defined in the Poio API, a project within the CLARIN framework ("Common Language Resources and Technology Infrastructure"), then greatly simplifies access without the need to deal with multiple input formats in the source material. GrAF-XML provides a basis for exchanging results among the various projects that analyze the corpus. (Proceedings)

Thursday 11:45 am - 12:30 pm

Where are all the bugs? Introspection in XQuery

Mary Holstege, MarkLogic

In a large and complex code base, it is infeasible to develop tests manually for every feature and every combination of features. The key to quality assurance in this context is automation and focus. Using XQuery introspection, one can examine a large XQuery code base to find smart places to focus testing. Based on the proposition that the set of functions, and the sequence types of parameters used by those functions, constitute vocabularies following classic Zipf distributions, we show that TF-IDF scoring over the terms in those vocabularies identifies areas of potential testing interest.
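
The scoring idea above can be illustrated with a small Python sketch. It is not the author's XQuery implementation: the module names, the `local:` function names, and the call lists are invented sample data, and the weighting shown (raw term frequency times the natural log of inverse document frequency) is one common TF-IDF variant chosen here as an assumption.

```python
import math
from collections import Counter

# Hypothetical sample data: for each module, the functions it calls.
CALLS = {
    "search.xqy": ["fn:count", "fn:count", "local:reindex", "fn:string"],
    "render.xqy": ["fn:string", "fn:concat", "fn:string"],
    "admin.xqy": ["fn:count", "local:audit", "fn:string"],
}


def tf_idf(calls):
    """Score each (module, function) pair as
    term frequency * log(number of modules / modules using the function)."""
    n_docs = len(calls)
    doc_freq = Counter()
    for terms in calls.values():
        doc_freq.update(set(terms))
    scores = {}
    for module, terms in calls.items():
        counts = Counter(terms)
        scores[module] = {
            term: count * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return scores


scores = tf_idf(CALLS)
# The highest-scoring function in a module is the one most distinctive to it.
most_distinctive = max(scores["search.xqy"], key=scores["search.xqy"].get)
```

A function used by every module scores zero, while one that appears in a single module scores highest there, which is how such scoring can point testing effort at unusual corners of a code base.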

Thursday 2:00 pm - 2:45 pm

First Person: WebVTT versus TTML

Andreas Tai, Institut für Rundfunktechnik, München

A clash of the “web culture” with the “XML culture” has resulted in a divergence in standards development for timed text and the more general domain of subtitling. Timed Text Markup Language (TTML), released by the W3C in 2010, has been rejected by the WHATWG in favor of the text-based format WebVTT. The broadcast community prefers the XML-based TTML, but the roll-your-own faction among Web developers has pushed for WebVTT. Is this a symptom of a divergence of the Web world from the world of markup that gave it birth? Stay tuned for more!

Thursday 2:45 pm - 3:30 pm

(LB) Interactive XSLT in the browser

O'Neil Delpratt & Michael Kay, both of Saxonica

Remember the dream of being able to process XML in the browser to write richly interactive applications? It's taken a long time coming, and a lot of people have given up waiting, but it is now a reality. With the open-source Saxon-CE engine, you can now write highly interactive applications in the browser to process XML content, without writing a single line of JavaScript. As a bonus, you get all the benefits of XSLT 2.0. During this talk we will demonstrate what can be achieved. And because Balisage audiences are interested in the theory as well as the practice, we'll also touch on some of the underlying concepts: how does one use a purely functional language to manipulate a stateful interactive dialogue with the user?

Thursday 4:00 pm - 4:45 pm

The New W3C Publishing Activity

Liam Quin, W3C

The W3C is involving publishers and people and organizations who provide tools for publishers in an effort to change the Web so that it's suitable for publishing. The Open Web Platform is changing the ways people do things. Proprietary desktop tools are being replaced by Web-based applications. At the same time ebooks are forcing publishers to come to terms with producing multiple output formats from their assets, so that "XML Early" and "XML First" are hot buzzwords in the industry. The EPUB3 format, defined by IDPF, uses XHTML and CSS, W3C Web technologies. The Open Web Platform doesn't meet the needs of publishers today. So W3C is working more closely with IDPF, with publishers and designers, and others, to change the Web so that it's suitable for publishing. Technical work on CSS has already begun and W3C is looking at internationalization, HTML, metadata, and workflow.

Thursday 4:45 pm - 5:30 pm

Transcending triples in semantic modeling

Micah Dubinko, MarkLogic

Can't documents and triples all just get along? Proponents of RDF semantic modeling often see the world through triples-shaped glasses. If they can't do it in triples, they don't know what to do: reification in particular has terrifying implications. Broadening a view of inference to transcend triples might overcome the constraints of triple-vision and point the way toward future solutions.

Friday, August 9, 2013

Friday 9:00 am - 9:45 am

(LB) General Architecture for Generation of Slide Presentations, including PowerPoint, from arbitrary XML Documents

Eliot Kimber, Contrext

PowerPoint slide decks are often required for training content authored in XML. Until recently, this was difficult for many users. With the development of the Apache POI library, it is now possible to reliably generate PowerPoint documents with a minimum of implementation effort. This paper presents a general architecture for generating slide presentations of any format from XML of any sort through the use of an intermediate format that abstracts the general structure of PowerPoint-type presentations. This general architecture allows the same source to potentially produce PowerPoint, Slidey, PDF, or any other presentation-optimized format from the same source with a minimum of implementation effort. The paper focuses on the specific challenge of producing PowerPoint using this architecture.

Friday 9:45 am - 10:30 am

What, when, where? Spatial and temporal annotations with XStandoff

Maik Stührenberg, Universität Bielefeld

In annotating non-textual sources (maps, images, and motion pictures), common practice employs standoff markup. XStandoff has been successfully used to annotate multiple hierarchies largely based on textual primary data. We have now extended it to support both spatial annotations (for images and maps) and spatial-temporal annotations (for video files). Practical applications might range from simple image feature extraction to something as complex and dynamic as representing three-dimensional eye-tracking data.

Friday 9:45 am - 10:30 am

Fat Markup: Trimming the myth one calorie at a time

David Lee, MarkLogic

JSON is lean and XML is fat, or so say some factions in the online community. Does this hold up in the real world? Tests of a corpus of several dozen varied documents, using a variety of browsers on many operating systems, show that care in markup design and choice in processing methods (for example, direct JavaScript vs. jQuery) may have more effect on speed and throughput than the actual markup language chosen. The myth that XML is more fat than JSON may belong in the same category as an assertion that the “<” and “>” characters are larger than the “{” and “}” characters due to their excessive pointiness.

Friday 9:45 am - 10:30 am

Some assembly required: XML semantics, digital preservation, and the construction of knowledge

Jerome McDonough, University of Illinois at Urbana-Champaign

If we think of meaning as emerging primarily from the interaction between a text, its markup, and a reader, we may be missing other influences on meaning that operate at a larger scale than a single document or even a collection. For the past six years, the “Preserving Virtual Worlds” projects have been investigating the preservation of computer games and interactive fiction. For computer games, identifying and collecting information is not simply an issue of documenting a particular file format; it becomes an exercise in knowledge representation and management. If highly complex, multimedia objects such as game software are going to survive in the long-term, archivists will need to collect, organize and preserve not just the objects that comprise the game, but a large body of information necessary to interpret those objects. Somehow, they will need to preserve people’s ability to understand the game.

Friday 11:00 am - 11:45 am

Decision making in XSL-FO formatting

Tony Graham, Mentea

XSL-FO 1.0 and 1.1 share a very linear processing model that makes it difficult to use the results of one formatting task to make layout decisions in other formatting tasks. But in real composition, good layout often requires the ability to postpone decisions for best fit, particularly when positioning tables and graphics. In the past, multi-pass workarounds have been necessary to allow decisions to depend on the sizes of objects in the formatted output. The requirements for XSL-FO 2.0 included many of the necessary abilities, but the Working Group’s charter expired before XSL-FO 2.0 was completed, leaving the specification unfinished. The Print and Page Layout Community Group at the W3C is working on innovative solutions to many of the delayed decision-making problems.

Friday 11:00 am - 11:45 am

Musical variants: Encoding, analysis and visualization

Johannes Kepper, University of Paderborn, Germany; Perry Roland, University of Virginia Library; & Daniel Röwenstrunk, University of Paderborn, Germany

Variation in music, like variation in texts, reflects the history of the cultural artifact. We propose a model for the encoding of variance in music that is based on traditional models and implemented using the data framework offered by the Music Encoding Initiative (MEI). Our model can be used to identify a portion of the musical text that varies among different sources, to identify the relations between sources (with their directionalities), and to illustrate the relationships between the encoded sources. The model is aligned with the Functional Requirements for Bibliographic Records (FRBR) and can be used to provide an overview of multiple variant sources and to inform an editor’s interpretation of the overall connections. The Freischütz Digital project will create a digital scholarly edition of Carl Maria von Weber’s opera Der Freischütz based on encodings of all relevant sources.

Friday 11:45 am - 12:30 pm

Markup and Canada’s national model building codes

Brent Nordin

Canada’s Building Codes are large, typographically complex documents that cover building and occupant safety for areas as diverse as plumbing, fire, and energy, as well as buildings. Over the years these documents have moved from typewriters, to desktop publishing, to SGML and Dynatext, to Arbortext XML with output on HTML and NXT CDs, and finally to the bright present. We have added a CMS to support life-cycle management of revisions and integrated the CMS with our XML library. Our current output formats of PDF and HTML show change tracking for revised material, offer side-by-side rendering in French and English, and more. In this case study, I share the solutions, tools, frustrations, and triumphs.

Friday 11:45 am - 12:30 pm

(LB) Transforming schemas: architectural forms for the 21st Century

John Cowan, LexisNexis

XML documents are typically transformed in three steps: validate, transform, validate. Architectural forms, a feature of the SGML-based hypermedia standard HyTime, use a combination of enhancements to DTDs and annotations in source documents to allow a two-step pipeline, whereby an SGML document could be automatically transformed using a specialized SGML parser, called an architectural engine, into another SGML document valid against a more general DTD known as the meta-DTD. This permitted document creators to conform to a general document architecture without having to constrain their own documents to every detail of a specific schema. Unfortunately, DTDs have not seen wide uptake in the XML world, and the few XML architectural engines that have been built have conformed more to the letter than to the spirit of architectural forms. Instead, the emphasis has been on the creation of comprehensive and complex schemas which attempt simultaneously to serve local needs and the needs of interchange. This work is an attempt to provide a modern architectural forms engine for documents described using the Examplotron schema language.

Friday 12:30 pm - 1:15 pm

Climbing the hill

C. M. Sperberg-McQueen, Black Mesa Technologies

Notes on making things better and on getting from here to where we want to be.

There is nothing so practical as a good theory