<?xml version="1.0" encoding="UTF-8"?><article xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0-subset Balisage-1.2" xml:id="HR-23632987-8973"><title>Axioms of Versioning</title><info><confgroup><conftitle>International Symposium on Versioning XML Vocabularies and Systems</conftitle><confdates>August 11, 2008</confdates></confgroup><abstract><para>The problems of language versioning can be better understood with the help of some formal axioms defining the relations among the extensions and semantics of languages. Such axioms allow us to specify what makes one language extensionally, syntactically, or semantically a subset, superset, or equivalent of another. The difference between syntactic and semantic compatibility makes clear how languages can grow in a forward-compatible way. The key to compatible versioning is to assign new semantics in the new version of a language for syntax that was already accepted in the prior version, but to which the prior version assigned no semantics.</para></abstract><author><personname><firstname>Marc</firstname><surname>de Graauw</surname></personname><personblurb><para>Marc de Graauw is an independent consultant and has worked in IT for 20 years; as an IT architect, he has contributed to digital exchanges in health care, criminal justice, insurance and social security. Marc is a frequent speaker at international conferences and regularly writes articles on health care, versioning and REST-vs-SOA. He lives and works in Amsterdam, the Netherlands.</para></personblurb><affiliation><orgname>Independent</orgname></affiliation><email>marc@marcdegraauw.com</email></author><legalnotice><para>Copyright Marc de Graauw.  This work is licensed under a Creative Commons Attribution-No Derivative Works 3.0 Unported License.</para></legalnotice><keywordset role="author"><keyword>versioning</keyword><keyword>semantics</keyword><keyword>compatibility</keyword></keywordset></info><section><title>Introduction</title><para>What is compatibility anyway? What does it mean when we say a version of a language is
         backward and/or forward compatible with another language? What does it mean when we say this
         of a document, or message, or schema, or an application? Does the notion of compatibility
         apply to languages at all, or does it only apply to documents or applications? And what is
         the difference between syntactical an semantical compatibility? </para><para>Questions like these have lead me to explore the notion of compatibility in a more formal
         way. For efforts in achieving back- and forward compatibility in real life applications, a
         framework for understanding key notions in compatible versioning can be very useful. Key to
         understanding the working of advanced compatibility mechanisms is the semantical equivalence
         set. In this paper we shall define what it is, and explain why this notion is central to
         understanding compatibility.</para><para>I will focus on message exchanges, since that is my main area of expertise, but the
         principles apply much wider. In general, they should apply to most computer-understandable
         document formats for data storage and/or exchange. More specific, I do not intend to cover
         natural language, other non-computer languages, programming languages or specifically
         multi-document data stores such as databases.</para></section><section><title>Languages, Sets and Extensions</title><para>Consider a set U of language specifications in the above field. </para><para>It's easy to define for each language specification the set of conforming document
         instances extensionally, as a set. For instance, the extension of UBL 1.0 is the set of all
         possible conforming UBL 1.0 instances, or the extension of HL7v3, May 2007 ballot is the set
         of all possible conforming HL7v3, May 2007 instances. (Often, language specifications for
         message exchange have many flavors and variations. HL7v3 has many versions ("ballots") and
         even those are hardly ever implemented as they are, but instead are localized to a
         particular national environment before implementation.) </para><figure xml:id="Bal2008graa030601" floatstyle="1" xreflabel="Bal2008graa030601"><mediaobject><imageobject><imagedata format="png" fileref="../../../vol2/graphics/deGraauw01/deGraauw01-001.png" width="100%"/></imageobject><caption><para>Extension ELx of a Language</para></caption></mediaobject></figure><para>If we show this pictorially, we get the above figure. Lx is a language specification which
         has all blue rectangles as conforming documents. The extension of Lx, ELx, are all blue
         rectangles. The complement (everything not in ELx) are all non-rectangular blue shapes, and
         all shapes in other colors.</para><figure xml:id="Bal2008graa030602" floatstyle="1" xreflabel="Bal2008graa030602"><mediaobject><imageobject><imagedata format="png" fileref="../../../vol2/graphics/deGraauw01/deGraauw01-002.png" width="100%"/></imageobject><caption><para>ELx as Subset of ELy</para></caption></mediaobject></figure><para>The extension of the language Ly consisting of all rectangles in any color is a superset
         of the extension of Lx; for shorthand, we say Ly is a superlanguage of Lx.</para><figure xml:id="Bal2008graa030603" floatstyle="1" xreflabel="Bal2008graa030603"><mediaobject><imageobject><imagedata format="png" fileref="../../../vol2/graphics/deGraauw01/deGraauw01-003.png" width="100%"/></imageobject><caption><para>ELx as Subset of ELz</para></caption></mediaobject></figure><para>Likewise, Lz, the language of all blue shapes, is a superlanguage of Lx.</para><figure xml:id="Bal2008graa030604" floatstyle="1" xreflabel="Bal2008graa030604"><mediaobject><imageobject><imagedata format="png" fileref="../../../vol2/graphics/deGraauw01/deGraauw01-004.png" width="100%"/></imageobject><caption><para>ELx as Intersection of ELz and ELy</para></caption></mediaobject></figure><para>And Lx is (extensionally) also the intersection of Ly and Lz.</para><para>Starting with a rather loose notion of "conformance" (let's assume for the moment this
         notion is unproblematic) it's easy to define relation between languages extensionally. A
         language Lx can be an extensional sublanguage of Ly: if every document conforming to Lx is
         also a conforming Ly document. If the reverse is true, all conforming Ly documents are also
         conforming Lx documents, Lx is an extensional superlanguage of Ly. If both are true, Lx and
         Ly are extensionally equivalent. Note that we only speak of extensional equivalence: for all
         we know, Lx may be about pasta recipes and Ly about reggae albums: if by some weird
         happenstance both languages have encoded their information in similar data formats, we can
         still compare the pasta and reggae languages extensionally.</para></section><section><title>Accepting and Rejecting Documents</title><figure xml:id="Bal2008graa030605" floatstyle="1" xreflabel="Bal2008graa030605"><mediaobject><imageobject><imagedata format="png" fileref="../../../vol2/graphics/deGraauw01/deGraauw01-005.png" width="100%"/></imageobject><caption><para>Accept and Reject Behavior of Px</para></caption></mediaobject></figure><para>Applications can read and write (produce and consume, send and receive) documents. For now
         we still suppose an application reads and writes the same set of documents. In the above
         picture, the implementation Px1 of language Lx accepts all blue rectangles (ELx), and
         rejects all other documents. So through accepting and rejecting, Px1 decides which documents
         are syntactically conformant with Lx (as implemented in Px1).</para></section><section><title>From Syntax to Semantics</title><para>When we move from pure extensions, we must consider the semantics of a language. This is a
         difficult notion. At one extreme, a language may not consider semantics at all, and just
         define a syntactical data format, and leave all semantics up to applications. It's often
         said that "XML is just syntax". But if this is true, XML itself is not very interesting, and
         the real-world applications of XML, such as in UBL or HL7v3, certainly must include some
         semantics. At its highest level, those semantics are descriptions in natural language,
         defining for instance what an invoice or a medication prescription is. At a lower level,
         those semantics may be expressed in more formal ways. </para><para>One way to (try to) express the semantics of messages in a messaging system with respect
         to forward and backward compatibility is to use formal semantics. This is – in my opinion –
         the wrong way. For simple (abstracted, theoretical) systems this might work. One would have
         to model the system and the encapsulated exchanged information using a language such as OWL.
         For real life purposes, this is hardly feasible. To make an OWL description of the
         underlying semantics of UBL or HL7v3 is a (much) larger undertaking than specifying a decent
         versioning mechanism to provide back- and forward compatibility to a reasonable degree: so
         using formal semantics to provide a framework and definitions for the notion of
         compatibility defeats the problem.</para></section><section><title>Semantics and Behavior</title><para>A language specification of the kind under consideration should enable a capable engineer
         to implement a conforming application: an application which presumably can read an write
         conforming documents (let's assume for the moment the application can read every document it
         writes and vice versa). The application as a total then exhibits some behavior: if we use a
         word processor to read a document, we expect the word processor to display the stored text
         and formatting. If we use a medical application to send a prescription, we expect the
         receiving application to display the intended medication and applicable quantity and dosage. </para><para>Languages may not define behavior very strictly. HL7v3 even goes at great length to stress
         is does not restrict application behavior at all, but I do not think this really holds.
         Obviously there are a great deal of application behaviors which are at odds with HL7v3
         specifications, such as displaying pasta recipes instead of medication prescriptions.</para><para>The behavior of a message endpoint is not unproblematic. It's usually best to consider
         some message endpoint to a certain degree as a black box; to consider only what's going in
         and coming out, not what is inside. How it works is not relevant; what it does is relevant.
         But when sending messages, there often is no direct behavior (no response) associated. When
         I send you a message with my new home address, I may expect no response (or just a simple
         ask). I do expect future mail to be delivered to my new home address, however. the point is,
         there may never be new mail to me, so there may never be a response message showing the
         desired behavior. </para><para>On a higher level, the behavior of a corresponding message endpoint may not be
         deterministic at all. When a general physician sends a medication prescription to a
         pharmacist, the pharmacist may overrule the general physician's decision: this is an
         independent professional judgment, based on the pharmacists own competence. </para><para>So the behavior of a message endpoint is not unproblematic. I believe the solution is
         basically to take humans out of the box. We must consider the behavior of a receiving
         computer system when judging the behavior for a message, not the humans interpreting and
         acting on the information. The behavior of the application is also in need of human
         interpretation. For instance, a part of a medication prescription may be:</para><para><code>&lt;code code=”27” codeSystem=”2.16.840.1.113883.2.4.4.5” /&gt;</code></para><para>and the receiving application may display:</para><para><code>"Dissolve in water"</code></para><para>which is a faithful rendering of the information sent. However, instead of "Dissolve in
         water" the receiving application may also display some local code. In general, it is not
         possible to judge whether a receiving application behaves well without the judgment of a
         professional well acquainted with the messaging framework as well as the receiving
         application.</para><para>So where does this leave us with behavior as a basis for a framework for compatible
         versioning? If we take a look a practical aspects, behavior is a near perfect fit. The most
         practical aspect of a framework for compatible versioning is testing the entire messaging
         chain. And especially for this aspect, behavior as a basis for semantic compatibility is
         perfect. The fact that there may – in real life – never be a response message showing the
         desired behavior is not relevant. Relevant is the fact that such behavior is a testable
         condition of the messaging chain. In a test, we can send an "update address" message and
         next test the printing of an address label. Likewise, the fact that behavior is not
         deterministic with a human "in the box" is not a problem: in testing, we generally want a
         professional judging whether the system does what it is supposed to do. Again, behavior as a
         basis for semantic compatibility is a perfect fit. </para><para>If we have a certain implementation of a language specification, the reading of a document
         should result in a changed application state: for instance, the text on the screen has
         changed, or its internal data store has been updated, resulting in specific behavior if the
         data store is later queried. Let's assume, as most often is the case, that the behavior (the
         state change) is deterministic: if the application is in the same state before reading a
         document, then reading the document always results in the same end-state.</para></section><section><title>Behavior of Language Processors</title><figure xml:id="Bal2008graa030606" floatstyle="1" xreflabel="Bal2008graa030606"><mediaobject><imageobject><imagedata format="png" fileref="../../../vol2/graphics/deGraauw01/deGraauw01-006.png" width="100%"/></imageobject><caption><para>Behavior of Px</para></caption></mediaobject></figure><para>We can show the behavior of a processor Px1 pictorially as in the above figure: for every
         document in ELx, Px1 exhibits some particular behavior. Px1 is partitioned, as it were, in
         behavioral compartments, one for each document.</para><para>Of course it is quite common for messaging systems to have different behaviors for exactly
         the same document: for instance, a common property of reliable messaging is for an order to
         be executed, and a duplicate to be rejected (unless the original was lost, which is the
         raison d'être of reliable messaging). This is no serious obstacle to the model: for testing
         purposes, it is sufficient (and desirable) to assume a fixed start state for any sequence of
         messages, which makes the responses deterministic. For real life systems, if we wanted to
         model this aspect, we could assume the behavior to be determined by the message and (some
         aspect of) the state of the processor. Like before, I believe a framework in which testable
         scenario's can be described sufficient.</para><para>Next we need to make some idealizations to get rid of all the processors. We'll simply
         assume every language specification is flawless (not hard to imagine, is it?) and there is a
         perfect processor for every language; we'll take that processor as an exemplar of the
         language. And we no longer assume a processor makes the same set of documents as it accepts.</para></section><section><title>Syntactical Compatibility</title><para>This allows us to define syntactic compatibility: a (language) processor is syntactically
         compatible with another (language) processor if they are able to exchange documents without
         syntactic failure: if they are able to accept each other's documents. A common case is when
         they implement the same language (version). </para><figure xml:id="Bal2008graa030608" floatstyle="1" xreflabel="Bal2008graa030608"><mediaobject><imageobject><imagedata format="png" fileref="../../../vol2/graphics/deGraauw01/deGraauw01-007.png" width="100%"/></imageobject><caption><para>Behavior with Disjunct ELX and ELy</para></caption></mediaobject></figure><para>But the processors need not produce the same documents at all. Another common case is
         where one processor makes and sends orders, but does not receive orders, and the other
         processor receives orders and makes order confirmations, but no orders. Even though the
         processors make and accept different documents, it makes perfect sense to say they are
         syntactically compatible. So we can say a processor Px is compatible with a processor Py if
         Px makes only documents which Py accepts, and Py makes only documents which Px accepts. (It
         would be more common to say Px and Py are partial implementations of one and the same
         language, but that does not change the point.</para><figure xml:id="Bal2008graa030609" floatstyle="1" xreflabel="Bal2008graa030609"><mediaobject><imageobject><imagedata format="png" fileref="../../../vol2/graphics/deGraauw01/deGraauw01-008.png" width="100%"/></imageobject><caption><para>Behavior and Forward Compatiblity of Px and Px+1</para></caption></mediaobject></figure><para>For versioning, the interesting case is twofold: for backward syntactical compatibility, a
         processor Px+1 should accept all documents produced by Px. and for forward syntactical
         compatibility, a processor Px should accept all documents produced by Px+1. The above
         picture highlights one variant, where Px and Px+1 both accept the same set of documents, but
         make different sets.</para><para>The accept/reject reaction to a document is fundamental to the entire compatibility and
         versioning problem. However, the accept/reject distinction is not syntactical in nature. It
         is a behavioral aspect of a processor: accepting or rejecting is a response to a document,
         it attaches meaning to the document: acceptable! (or not). So the basic aspect which decides
         syntactical conformance in a sender/receiver (writer/reader, producer/consumer)
         configuration is itself semantical in nature.</para><para>The key to enhance (future) compatibility on a syntactic level is to make processor accept
         more than they produce. In other words, a processor should not reject everything it could
         not produce itself, and should not reject everything it does not completely understand.
         Next, we'll see what to do with that un-understood "extra".</para></section><section><title>Semantical Equivalence Sets</title><figure xml:id="Bal2008graa030610" floatstyle="1" xreflabel="Bal2008graa030610"><mediaobject><imageobject><imagedata format="png" fileref="../../../vol2/graphics/deGraauw01/deGraauw01-009.png" width="100%"/></imageobject><caption><para>Behavior of Px</para></caption></mediaobject></figure><para>In the above figure, we have a processor for a language which makes only blue rectangles,
         but accepts all rectangles. If we now associate some particular behavior which each class of
         documents, we get semantical equivalence sets. Not every document is normally associated
         with some unique behavior: we often have syntactical redundancy, as in C's i=1+1 and i+=1 or
         attribute order in XML: so there is a set of documents which excite the same behavior in a
         receiver. The pattern of syntactical forward compatibility, where a receiver accepts more
         than it can act upon, achieves the same: there is a set of documents for which trigger the
         same behavior: since the receiver does not understand the extra syntactical components, it
         cannot act upon them, and it therefore must act the same, whatever the extra content.</para></section><section><title>Semantics and Compatibility</title><figure xml:id="Bal2008graa030611" floatstyle="1" xreflabel="Bal2008graa030611"><mediaobject><imageobject><imagedata format="png" fileref="../../../vol2/graphics/deGraauw01/deGraauw01-010.png" width="100%"/></imageobject><caption><para>Behavior of Px+1</para></caption></mediaobject></figure><para>If a language Lx has a successor Lx+1, Lx+1 may specify additional behavior for documents
         accepted, but not produced by Lx: in the above figure, green and red rectangles. This has
         two consequences: - for every document produced by Px, Px+1 will behave as Px expects: it is
         therefore safe for Px to send messages to Px+1 - for documents produced by Px+1 which Px
         cannot produce (again the green and red rectangles) Px will exhibit the same behavior as for
         other documents in the same semantical equivalence set: the language designer of Lx+1 will
         know what behavior this is, and if it is deemed acceptable, it is safe for Px+1 to send
         messages to Px (if the behavior is not deemed acceptable, Px+1 must make sure in some way
         that Px will reject those documents, maybe through a "mustUnderstand" flag).</para></section><section><title>Conclusion</title><para>The basics of compatible versioning thus are: 1. make sure Px accepts more documents than
         it produces (or fully understands), and, 2. partition the semantical equivalence sets of Px
         into smaller, more refined semantical equivalence sets for Px+1 with additional (new)
         behavior.</para></section><appendix xml:id="mul88"><title>Definitions
      </title><para>This Appendix contains informal definitions of the concepts of compatibility.
      </para><orderedlist><listitem><para>
               Two languages are <emphasis>extensionally equivalent</emphasis> if they accept the same set of documents.
            </para></listitem><listitem><para>
               A language is an <emphasis>extensional sublanguage </emphasis>of a second language if all documents accepted by the first language are also accepted by the second.
            </para></listitem><listitem><para>
               If two processors behave the same for every document which belongs to a language Lx, the processors are <emphasis>behaviourally equivalent</emphasis> for Lx.
            </para></listitem><listitem><para>
               Two languages are <emphasis>syntactically compatible</emphasis> if they accept the documents produced by each other.
            </para></listitem><listitem><para>
               A language change is <emphasis>syntactically backward compatible</emphasis> if a new receiver accepts all documents produced by an older sender.
            </para></listitem><listitem><para>
               A language change is <emphasis>syntactically forward compatible</emphasis> if an old receiver accepts all documents produced by a new sender.
            </para></listitem><listitem><para>
               If two languages take the same documents as input, and their processors behave the same for every document, the languages are <emphasis>semantically equivalent</emphasis>.
            </para></listitem><listitem><para>
               The<emphasis> semantical equivalence set </emphasis>of a document d is the set of documents which make a processor behave the same as d.
            </para></listitem><listitem><para>
               A language is a <emphasis>semantical superlanguage</emphasis> of a second language if for all documents produced by the second language, processors of the first language behave the same as processors of the second language.
            </para></listitem><listitem><para>
               A later version of a language is <emphasis>semantically backward compatible</emphasis> with an earlier version if the later version is a semantical superlanguage of the earlier one (an old sender may expect a newer, but semantically backward compatible, receiver to behave as the sender intended).
            </para></listitem><listitem><para>
               An earlier language is <emphasis>semantically forward compatible</emphasis> with a later language  iff the later language is a semantical sublanguage of the earlier one (this is only possible if a language loses semantics).
            </para></listitem><listitem><para>
               A later language <emphasis>semantically extends</emphasis> and earlier language if the later language  introduces new behaviour for some documents accepted, but not produced by the earlier one.
            </para></listitem></orderedlist></appendix><appendix><title>Formal Axiomatization</title><para>This appendix contains a formal account of the principles which are introduced informally in the main article.</para><orderedlist numeration="arabic"><listitem><para>Let U be a set of (specifications of) software processable languages {L1, L2, ...
               Ln} <orderedlist numeration="lowerroman"><listitem><para>This axiomatization does not concern itself with natural language</para></listitem></orderedlist>
            </para></listitem><listitem><para>The <emphasis>extension</emphasis> of a language Lx is a set of conforming
               document instances ELx = {Dx1, Dx2, ... Dxn} <orderedlist numeration="lowerroman"><listitem><para>Iff ELx = ELy then Lx and Ly are extensionally equivalent <itemizedlist><listitem><para>Lx and Ly may still be different: they may define different
                                 semantics for the same syntactical construct</para></listitem></itemizedlist>
                     </para></listitem><listitem><para>Iff ELx ⊂ ELy then Lx is an extensional sublanguage of
                     Ly</para></listitem><listitem><para>Iff ELx ⊃ ELy then Lx is an extensional superlanguage of
                     Ly</para></listitem><listitem><para>D is the set of all possible documents; or the union of all ELx where Lx
                        ∈ U </para></listitem></orderedlist>
            </para></listitem><listitem><para>For each Lx ∈ U there is a set of <emphasis>processors</emphasis> Px =
               {Px1, Px2, ... Pxn} which implement Lx <orderedlist numeration="lowerroman"><listitem><para>Each Pxy exhibits behaviour as defined in Lx</para></listitem><listitem><para>Processors can produce and consume documents</para></listitem><listitem><para>Each Pxy produces only documents it can consume itself</para></listitem><listitem><para>At consumption, Pxy may accept or reject documents</para></listitem></orderedlist>
            </para></listitem><listitem><para>The <emphasis>behaviour</emphasis> of a processor Pxy of language Lx is a function
               Bxy <orderedlist numeration="lowerroman"><listitem><para>The function Bxy takes as argument a document, and it's value is a
                        processor state <itemizedlist><listitem><para>We assume a single processor state before each function
                                 execution, alternatively we could assume a &lt;state,
                                 document&gt; tuple as function argument</para></listitem></itemizedlist>
                     </para></listitem><listitem><para>If for two processors Pxy and Pxz for language Lx for a document d Bxy(d)
                        = Bxz(d) then the two processors behave similar for d <itemizedlist><listitem><para>Two processor states for language Lx are deemed equivalent if a
                                 human with thorough knowledge of language specification Lx
                                 considers the states equivalent. Details may vary insofar as the
                                 language specification allows it.</para></listitem><listitem><para>Processor equivalence is not intended to be formally or
                                 computably decidable; though in some cases it could be.</para></listitem></itemizedlist>
                     </para></listitem><listitem><para>If ∀d ( d ∈ ELx → Bxy(d) = Bxz(d) ) then
                        Pxy and Pxz are behaviourally equivalent for Lx <itemizedlist><listitem><para>If two processors behave the same for every document which
                                 belongs to a language Lx, the processors are behaviourally
                                 equivalent for Lx.</para></listitem></itemizedlist>
                     </para></listitem></orderedlist>
            </para></listitem><listitem><para>An <emphasis>ideal language</emphasis> specifies all aspects of desired processor
               behaviour completely and unambiguously; assume all languages in U are ideal</para></listitem><listitem><para>A processor Px is an <emphasis>exemplary processor</emphasis> of a language Lx if
               it fully implements language specification Lx; assume all processors for all
               languages in U are exemplary <orderedlist numeration="lowerroman"><listitem><para>Because they are (defined to be) exemplary, every two processors for a
                        language Lx are behaviourally equivalent</para></listitem><listitem><para>ELx = { d is a document ∧ Px accepts d }</para></listitem><listitem><para>The complement of ELx is the set of everything (normally, every document)
                        which is rejected by Px</para></listitem><listitem><para>The make set MLx = { d is a document ∧ Px can produce d
                     }</para></listitem></orderedlist>
            </para></listitem><listitem><para>A language Lx is <emphasis>syntactically compatible</emphasis> with Ly iff MLx
               ⊂ ELy and MLy ⊂ ELx <itemizedlist><listitem><para>Two languages are syntactically compatible if they accept the documents
                        produced by each other.</para></listitem></itemizedlist>
               <orderedlist numeration="lowerroman"><listitem><para>A language Ln+1 is <emphasis>syntactically backward compatible</emphasis>
                        with Ln iff MLn ⊂ ELn+1 and Ln+1 is a successor of Ln <itemizedlist><listitem><para>A language change is syntactically backward compatible if a new
                                 receiver accepts all documents produced by an older sender.</para></listitem></itemizedlist>
                     </para></listitem><listitem><para>A language Ln is <emphasis>syntactically forward compatible</emphasis>
                        with Ln+1 iff MLn+1 ⊂ ELn and Ln+1 is a successor of Ln <itemizedlist><listitem><para>A language change is syntactically forward compatible if an old
                                 receiver accepts all documents produced by a new sender.</para></listitem></itemizedlist>
                     </para></listitem></orderedlist>
            </para></listitem><listitem><para>A document d can be a member of the extension of any number of languages</para><orderedlist numeration="lowerroman"><listitem><para>Px is an (exemplary) processor of Lx, Py is an (exemplary) processor of
                     language</para></listitem><listitem><para>Two languages Lx and Ly are <emphasis>semantically equivalent</emphasis> iff
                     ELx = ELy ∧ ∀d ( (d ∈ ELx ) → Bx(d) =
                     By(d) )</para><itemizedlist><listitem><para>If two languages Lx and Ly take the same documents as input, and their
                           exemplary processors behave the same for every document, the languages
                           are semantically equivalent.</para></listitem><listitem><para>Two languages can only be compared if their exemplary processors are
                           similar enough to be compared. </para></listitem><listitem><para>Not every two languages can be compared.</para></listitem><listitem><para>"Semantic" should not be interpreted in the sense of "formal
                           semantics".</para></listitem></itemizedlist></listitem></orderedlist></listitem><listitem><para>The <emphasis>semantical equivalence set</emphasis> of a document d for Lx = { y
               ∈ ELx | Bx(d) = Bx(y) } <orderedlist numeration="lowerroman"><listitem><para>Or: SLx,d = { y ∈ ELx | Bx(d) = Bx(y) } <itemizedlist><listitem><para>The semantical equivalence set of a document d is the set of
                                 documents which make a processor behave the same as d</para></listitem><listitem><para>Semantical equivalence occurs for expressions which are
                                 semantically equivalent, such as i = i + 1 and i += 1 for C, or
                                 different attribute order in XML etc.</para></listitem></itemizedlist>
                     </para></listitem><listitem><para>d ∈ SLx,d</para></listitem><listitem><para>Any two semantical equivalence sets of Lx are necessarily disjunct</para><itemizedlist><listitem><para>If z ∈ SLx,e were also z ∈ SLx,d then every
                              member of SLx,e would be in SLx,d and vice versa and thus SLx,d =
                              SLx,e</para></listitem></itemizedlist></listitem></orderedlist>
            </para></listitem><listitem><para>A language Ly is a <emphasis>semantical superlanguage</emphasis> of Lx iff
               ∀d ( d ∈ MLx → By(d) = Bx(d) ) <orderedlist numeration="lowerroman"><listitem><para>For all documents produced by Px, Py behaves the same as Px <itemizedlist><listitem><para>Equivalence in this case should be decided based on Lx; if Ly
                                 makes behavioural distinctions which are not mentioned in Lx,
                                 behaviour is still the same as far as Lx is concerned</para></listitem></itemizedlist>
                     </para></listitem><listitem><para>It follows: ∀d ( d ∈ MLx → ∃SLy,d
                        ( SLy,d ⊂ ELy ∧ ( SLx,d ∩ MLx ) ⊂
                        SLy,d ∧ By(d) = Bx(d) ) ) <itemizedlist><listitem><para>For any document produced by Px, the part of its semantical
                                 equivalence set which Px can actually produce, is a subset of the
                                 semantical equivalence set of Py for this document</para></listitem></itemizedlist>
                     </para></listitem><listitem><para>For all d ∈ ELx ∧ d ∉ MLx there may be many
                        equivalence sets in Ly for which By(d) ≠ Bx(d) <itemizedlist><listitem><para>In other words: for documents accepted but not produced by Px,
                                 Ly may define additional behaviours</para></listitem></itemizedlist></para></listitem><listitem><para>Lx is a semantical sublanguage of Ly iff Ly is a semantical superlanguage
                        of Lx</para></listitem></orderedlist>
            </para></listitem><listitem><para>A language Ln+1 is <emphasis>semantically backward compatible</emphasis> with Ln
               iff Ln+1 is a semantical superlanguage of Ln and Ln+1 is a successor of Ln
                  <orderedlist numeration="lowerroman"><listitem><para>An old sender may expect a newer, but semantically backward compatible,
                        receiver to behave as the sender intended</para></listitem><listitem><para>A language Ln is <emphasis>semantically forward compatible</emphasis>
                        with Ln+1 iff Ln+1 is a semantical sublanguage of Ln and Ln+1 is a successor
                        of Ln</para></listitem><listitem><para>Semantic forward compatibility is only possible if a language loses
                        semantics; i.e. its processors exhibit less functionality, and produce less
                        diverse documents</para></listitem><listitem><para>A processor cannot understand what it does not know about yet</para></listitem></orderedlist>
            </para></listitem><listitem><para>A language Ln+1 <emphasis>semantically extends</emphasis> Ln iff ∃d ( d
               ∈ MLn+1 ∧ d ∈ ELn ∧ BLn+1 ≠ BLn ) <itemizedlist><listitem><para>Ln+1 introduces new behaviour for some documents accepted, but not
                        produced by Ln</para></listitem></itemizedlist>
               <orderedlist numeration="lowerroman"><listitem><para>Assume Ln is syntactically forward compatible with Ln+1, Ln+1 is
                        semantically backward compatible with Ln, and Ln+1 semantically extends Ln <itemizedlist><listitem><para>Pn accepts all documents produced by Pn+1, or: MLn+1 ⊂
                                 ELn</para></listitem><listitem><para>For documents produced by Pn, Pn+1 behaves as Pn would expect,
                                 or: ∀d( d ∈ MLn → Bn+1(d) = Bn(d)
                              )</para></listitem><listitem><para>But for some documents accepted, but not produced by Pn, Pn does
                                 not behave as Pn+1 would, or: ∃d ( d ∈ MLn+1
                                 ∧ d ∈ ELn ∧ d ∉ MLn
                                 ∧ BLn+1 ≠ BLn )</para></listitem><listitem><para>New senders send documents to old receivers, which do not behave
                                 as new receivers would behave</para></listitem></itemizedlist></para></listitem><!-- --><!-- --><listitem><para>∀d (d ∈ MLn+1 ∧ d ∈ ELn
                        ∧ d ∉ MLn → ∃d'( d' ∈ MLn
                        ∧ Bn(d') = Bn(d) ) <itemizedlist><listitem><para>For every document d, accepted but not produced by Pn, there is
                                 a document d' for which Pn behaves the same as for d, or: d'
                                 ∈ SLn,d (the semantical equivalence set of d for
                              Ln)</para></listitem></itemizedlist>
                     </para></listitem><listitem><para>We can assume Ln <emphasis>transforms document</emphasis> d to document
                        d' <itemizedlist><listitem><para>If a language Ln accepts a document which it would not produce,
                                 it transforms it to a document which it could produce</para></listitem><listitem><para>There has to be no literal transformation</para></listitem></itemizedlist>
                     </para></listitem><listitem><para>We now have: Bn(d') = Bn(d) ∧ Bn+1(d') ≠ Bn+1(d) <itemizedlist><listitem><para>Pn behaves the same for d and the transformed d'; but Pn+1
                                 behaves differently</para></listitem></itemizedlist>
                     </para></listitem><listitem><para>Iff it is acceptable that Bn(d') = Bn(d) for transformed documents, then
                        Pn <emphasis>may ignore</emphasis> the semantical extension responsible for
                        Bn+1(d') ≠ Bn+1(d) <itemizedlist><listitem><para>The fact that Ln is syntactically forward compatible with Ln+1
                                 enables Pn+1 to introduce new semantics without breaking existing
                                 Pn</para></listitem><listitem><para>We could say Ln is <emphasis>partially semantically forward
                                    compatible</emphasis> with Ln+1; it only ignores semantics it
                                 may ignore</para></listitem><listitem><para>New senders can interact with old receivers; old receivers may
                                 ignore what they do not understand</para></listitem></itemizedlist></para></listitem><listitem><para> Iff it is not acceptable that Bn(d') = Bn(d) for transformed documents,
                        then Pn <emphasis>must understand</emphasis> the semantical extension
                        responsible for Bn+1(d') ≠ Bn+1(d) <itemizedlist><listitem><para>If Pn must understand d, then Pn should not process d</para></listitem><listitem><para>A Pn+1 must not send d to a Pn; or Pn should be adapted so that
                                 Pn rejects d (or: d ∉ ELn)</para></listitem><listitem><para>If a Pn+1 can send d to a Pn, then Ln should not be (completely)
                                 syntactically forward compatible with Ln+1</para></listitem><listitem><para>A processor should not accept what it must understand, yet
                                 cannot understand</para></listitem></itemizedlist>
                     </para></listitem></orderedlist>
            </para></listitem></orderedlist></appendix><bibliography><title>Bibliography</title><bibliomixed>UBL. <link xlink:type="simple" xlink:show="new" xlink:actuate="onRequest">http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=ubl</link></bibliomixed><bibliomixed>HL7v3. <link xlink:type="simple" xlink:show="new" xlink:actuate="onRequest">http://www.hl7.org/</link></bibliomixed><bibliomixed>Versioning XML with XML Schema, David Orchard, XTech 2008. <link xlink:type="simple" xlink:show="new" xlink:actuate="onRequest">http://2008.xtech.org/public/schedule/detail/530</link></bibliomixed><bibliomixed>Extending and Versioning Languages: Terminology, Draft TAG Finding 13 November 2007. <link xlink:type="simple" xlink:show="new" xlink:actuate="onRequest">http://www.w3.org/2001/tag/doc/versioning-20071113.html</link></bibliomixed><bibliomixed>Extending and Versioning Languages: Strategies, Draft TAG Finding 17 September 2007. <link xlink:type="simple" xlink:show="new" xlink:actuate="onRequest">http://www.w3.org/2001/tag/doc/versioning-strategies-20070917.html</link>.</bibliomixed></bibliography></article>
