DITA Document Types: Enabling Blind Interchange Through Modular Vocabularies and Controlled Extension
Senior Solutions Architect
Really Strategies, Inc.
Interchange of XML documents depends in large part on the use of "compatible" vocabularies of element types and attributes, where "compatible" means "understandable and processable by all parties involved in the interchange". The traditional SGML and XML approach to interchange used "interchange" document types that defined a fixed set of element types and attributes to which all interchange parties agreed. History has demonstrated conclusively that this approach does not work. The DITA standard, which is expressly designed to enable blind interchange of DITA documents over the widest possible scope, avoids this failure by turning the problem around. Rather than making the document type the unit of vocabulary definition and then allowing unconstrained extension of it, DITA makes the units of vocabulary definition invariant modules, which documents combine to form complete document types. Extension is allowed through two controlled facilities: constraint modules and specialization. Together, these facilities ensure two preconditions for blind interchange: (1) all DITA documents are inherently and reliably processable to some degree by all general-purpose DITA processors, irrespective of the markup details, and (2) non-general-purpose DITA processors can quickly determine, from document instances alone, whether a given document may contain elements and attributes they do not know how to process. It is this aspect of DITA that distinguishes it from all other XML applications, and in particular from traditional "interchange" document types based on monolithic DTDs that allow unconstrained (and unconstrainable) extension and customization. This paper presents the details of the DITA vocabulary and constraint module system and explains how that mechanism ensures smooth and reliable blind interchange of documents.
The paper also argues that the DITA vocabulary module and constraint approach, if not the DITA-specific implementation details, could be applied to any markup application domain and thus to any tag set.
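The two preconditions above can be illustrated with a minimal Python sketch. It assumes the DITA 1.x conventions in which every element carries a @class attribute listing its specialization ancestry and the root element carries a @domains attribute naming the modules in use. The module name acme-d, the element type warning-para, and the function names are hypothetical, chosen only for illustration; a real processor would handle more cases, such as constraint declarations and attribute domains.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample: a topic that uses one module this processor does not know.
SAMPLE = """<topic id="t1" class="- topic/topic " domains="(topic hi-d) (topic acme-d)">
  <title class="- topic/title ">Example</title>
  <body class="- topic/body ">
    <p class="+ topic/p acme-d/warning-para ">Text</p>
  </body>
</topic>"""

# Assumption: the modules this (hypothetical) processor has been written to handle.
KNOWN_MODULES = {"topic", "hi-d"}


def declared_modules(domains_value):
    """Collect every module name found in a @domains value such as
    '(topic hi-d) (topic acme-d)'."""
    modules = set()
    for group in re.findall(r"\(([^)]*)\)", domains_value):
        modules.update(group.split())
    return modules


def unknown_modules(root, known=KNOWN_MODULES):
    """Precondition (2): from the instance alone, report declared modules
    that the processor does not recognize."""
    return declared_modules(root.get("domains", "")) - set(known)


def fallback_type(element, known_types):
    """Precondition (1): walk @class from the most specialized token to the
    least specialized and return the first module/type the processor knows."""
    tokens = element.get("class", "").split()[1:]  # drop the leading '-' or '+'
    for token in reversed(tokens):
        if token in known_types:
            return token
    return None


root = ET.fromstring(SAMPLE)
print(unknown_modules(root))
print(fallback_type(root.find(".//p"), {"topic/p", "topic/title"}))
```

In this sketch the specialized paragraph is still processable as an ordinary topic/p, while the check on @domains tells a specialized processor up front that the document uses a module it has no dedicated handling for.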