The Clinical Document Architecture (CDA) Release 2 is derived from the Health Level Seven (HL7) Reference Information Model (RIM). It defines a formal model and semantics for concepts found in a clinical document — acts, such as procedures, substance administrations, and observations of laboratory findings; entities such as people, places, devices, and drugs; relationships such as participation and causality; and document metadata.
CDA is undeniably complex, covering as it does the full range of clinical documents for multiple countries and regions. Its derivation from a Unified Modeling Language (UML) representation of healthcare concepts reflects that complexity.
A CDA Implementation Guide (IG) defines a specific implementation of CDA by specifying how to use these building blocks to express a particular kind of clinical document — a report on infection in hemodialysis patients, or the history and physical taken during a visit to a physician.
The CDA model is instantiated as one W3C Schema (version 1.0) that covers all these related clinical report types (also called implementations). The model is very expressive but there are technology-determined gaps between the UML of the model and the XML Schema expression of it, and then further gaps between the generic XML Schema that covers all clinical documents and the specific type of document described in a given implementation guide for a specific purpose. Some of these gaps could perhaps be filled if CDA used W3C XSD 1.1, or RELAX NG, to define the schema, but neither of those options is likely to be possible in the near term, given the practicalities of implementations of complex standards that are used across the world in critical healthcare systems.
People who are guaranteeing to a healthcare organization that the documents they deliver contain the right information for a specific purpose, and expressed using the right syntax, need to know that the validation we provide for testing will pass all good files and fail all bad files. This means we have to test our validation mechanism, and that mechanism has to be in addition to the basic CDA.xsd (1.0) validation.
The validation mechanism we choose has to fulfill a number of criteria. It must be easy for non-XML experts to use to test the files that come out of their implementations, and it must be usable in many contexts. It must cover, as far as possible, the gaps between the prose definitions in a specific implementation guide and the generic XSD schema that serves a large number of implementation guides. The goal is to make validation available in appropriate forms to guarantee the quality of the XML documents that the public health organizations receive from hospitals and other healthcare organizations.
This paper introduces Clinical Document Architecture (CDA) concepts to show why Schematron validation is needed to supplement schema validation, discusses how we currently produce and test the Schematron validation, and explores some challenges. We are interested in other approaches to quality management and testing that we could investigate to supplement our current methods.
The documentation examples used in this paper are taken from the HL7 Implementation Guide for CDA® Release 2 - Level 3: Healthcare Associated Infection Reports, Release 7 (US Realm) (HAI R7 IG), available at http://www.hl7.org/dstucomments/.
Much of this work was carried out for the Lantana Consulting Group.
The authors appreciate the comments made by the anonymous peer reviewers as well as Rick Geimer and Liora Alschuler from Lantana.
Aspects of the CDA Model
Those seeking to represent clinical documents in XML face the same choices as in many other areas: where do we place the line between a centrally-mandated model that omits data that only a few participants need to record, and a free-for-all in which no two documents are modelled in the same way? CDA addresses this in a two-pronged approach — the abstract model is expressed as elements; attribute values refine the meaning, with heavy reliance on public vocabularies. These vocabularies are slightly different than the ones often assumed in an XML context; they are not concepts described in an XML schema but rather an ontology or set of codes that can be (and are) described in many different formats. In many ways they play a role similar to that played by elements in some schemas -- they associate semantic meaning with the data. Healthcare professionals who specialize in vocabulary can spend just as long arguing over the precise meaning of a term that is to be defined in a codeset, or over which one to use in which context, as XML schema designers spend in arguing over what name to give a particular element in its context. We will come back to vocabularies again throughout this paper.
What does it mean to say that attribute values refine the meaning of an element?
CDA has two aspects: a text-heavy document aspect, called the narrative block, that is recorded in HTML-like elements; and an interoperable, machine-processable aspect with more precise semantics, called clinical statements or coded entries. Coded entries do not record presentational aspects such as section, paragraph, table, list, or figure. Rather, they record specializations of the abstract concepts of entities, acts, and relationships. These specializations have XML names: a participant is a specialization of the entity concept; procedure and substanceAdministration are specializations of an act; component-of, is-reason-for, is-cause-of are types of relationship. Some of these examples are elements, others are attributes.
In many applications of XML, attribute values are not central to interpretation. Some of us were taught, when first learning a pointy-bracket syntax, that an element name classifies the content and an attribute provides additional, secondary information:
<animal coatColor="brown">dog</animal>

In CDA, attributes have a stronger role: rather than providing supplementary information, they usually continue refining the taxonomic distinctions made by elements.
In the procedure element below, the code element refines its parent's meaning by specifying the kind of procedure, using a value from a specific vocabulary. The vocabulary is identified in the codeSystem attribute by a dot-notation object identifier (OID):
<procedure>
  <!-- ID of procedure -->
  <id root="2.16.840.1.113822.214.171.124.126.96.36.199" extension="232323"/>
  <code codeSystem="2.16.840.1.113883.6.96" code="423827005" displayName="Endoscopy"/>
</procedure>
The refinement of meaning can have multiple levels. The previous example captures "A procedure; what kind of procedure? An endoscopy." The next example shows a waterfall-like nesting of questions and answers: "A participant; what kind of participant? A location; what kind of location? A service delivery location; what kind of service delivery location? A Medical/Surgical Critical Care unit."
<participant typeCode="LOC">
  <associatedEntity classCode="SDLOC">
    <!-- ID of facility -->
    <id root="2.16.840.1.1138188.8.131.52.184.108.40.206" extension="9W"/>
    <code codeSystem="2.16.840.1.113883.6.259" codeSystemName="HL7 Healthcare Service Location Code" code="1029-8" displayName="Medical/Surgical Critical Care"/>
  </associatedEntity>
</participant>
CDA has two further, more general specializations of the act concept: the observation and act elements.
<observation classCode="OBS" moodCode="EVN" negationInd="false">
  <code codeSystem="2.16.840.1.113883.6.96" code="50373000" displayName="Body Height"/>
  <value xsi:type="PQ" value="180" unit="cm"/>
</observation>

This captures "An observation; of what? A body height; what height was observed? 180cm."
A consequence of the semantic role of attributes in CDA XML is that the words "value" and "code" have several usages: the value element, its value attribute, the value of that attribute (which may be a code), the value of the code element's code attribute (which is always a code), or -- which is usually clear from context -- the value of some other attribute. (Ordinary speech is similarly challenged in distinguishing between the abstract concepts, which are UML classes, such as Act, and XML elements in the CDA schema, such as act.) To cut through that confusion, don't think about the XML first! Focus on the clinical content -- what is being expressed? -- and consider the XML elements and attributes as packaging.
What is to be expressed? A body height. That's an observation. An observation of what? body height (code element: code attribute). What was observed? 180cm. (value element: datatype is physical quantity, value is 180, unit is cm)
Uses of Attribute Values
In CDA, attribute values can have implications for the node tree, primarily through alternatives and through conditional requirements.
The range of relationships in clinical content goes far beyond child containment, so the model interposes a wrapper that can carry information about the relationship. Here we’re recording the micro-organism cause of a positive blood culture:
<observation classCode="OBS" moodCode="EVN" negationInd="false">
  <code code="ASSERTION" codeSystem="2.16.840.1.113883.5.4"/>
  <statusCode code="completed"/>
  <value xsi:type="CD" codeSystem="2.16.840.1.113883.6.277" code="1955-4" displayName="Positive blood culture"/>
  <entryRelationship typeCode="CAUS" inversionInd="true">
    <observation classCode="OBS" moodCode="EVN">
      ...
    </observation>
  </entryRelationship>
</observation>
One powerful attribute, @moodCode, expresses something akin to mood in English verbs: it can change the sense of a substanceAdministration element from prescription (an intent) to application (an event).
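A minimal sketch of the contrast (SBADM is the standard CDA class code for a substance administration; the elided child elements, such as consumable and doseQuantity, would be the same in both moods):

```xml
<!-- An intent: the substance is prescribed, to be administered later -->
<substanceAdministration classCode="SBADM" moodCode="INT">
  ...
</substanceAdministration>

<!-- An event: the substance was actually administered -->
<substanceAdministration classCode="SBADM" moodCode="EVN">
  ...
</substanceAdministration>
```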
To record that something did not happen or was not done, CDA provides a negation mechanism — this is also an attribute value. This patient experienced no adverse reaction:
<observation classCode="OBS" moodCode="EVN" negationInd="true">
  <code codeSystem="2.16.840.1.113883.5.4" code="ASSERTION"/>
  <statusCode code="completed"/>
  <value xsi:type="CD" codeSystem="2.16.840.1.113883.6.96" code="281647001" displayName="Adverse reaction"/>
</observation>
This has great expressive power when used in combination with relationships (the cause of the fever was not the bacterium).
Of course, that is not the same as not knowing whether the cause of the fever was the bacterium....
Unlike many paper forms and database tables, CDA makes a strong distinction between a value and the reason a value is not recorded. Such reasons are recorded in a @nullFlavor attribute. Here, we haven’t asked for the patient’s birthdate (perhaps the patient arrived unconscious and without his wallet):
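A minimal sketch of such a record (assuming the standard HL7 null flavor code NASK, "not asked"; the surrounding patient element is abbreviated):

```xml
<patient>
  <!-- No birthdate value; instead, the reason it is absent -->
  <birthTime nullFlavor="NASK"/>
</patient>
```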
For an elderly person living in a remote village, the appropriate nullFlavor might be “UNK” — the question was asked, but the answer wasn’t known.
The Price of Power
The price of this expressive power and interoperability is, of course, complexity. Nevertheless this provides a reasonably concise expression of the very large world of clinical documents that is the model's scope. Every layer has a role; any collapsing of the model leaves some body of information out. The act relationships and moods are elegant: convert an intent to administer brachytherapy into a report of having done it by changing the moodCode from INT to EVN, and look forward to processing a volume of XML documents to compare the number of intents to the number of events, or to report on the average elapsed time between intent and event.
Vocabulary for those Attribute Values
Controlled and widely-used vocabularies are crucial to making this approach work. There are several vocabularies covering every aspect of healthcare, from units of measure to precise descriptions of body parts as a surgeon would view them. These public vocabularies are essential for interoperability.
There are many public vocabularies in the healthcare realm: for example, SNOMED CT is a core general terminology with more than 311,000 active concepts organized into hierarchies, commonly used for clinical findings and body parts; RxNorm provides normalized names for clinical drugs and ingredients. There are, of course, overlapping vocabularies with concepts that almost, but not quite, agree with each other, so in practice many healthcare systems need to support multiple vocabularies to cover all the cases.
These vocabularies are made available under differing licensing terms, and in different formats. For Schematron testing purposes we create custom XML files containing only the terms (codes) relevant to the specific implementation guide. The format we use has entries like this:
<system>
  <code value="413495001" displayName="ASA physical status class 1" NHSNdisplayName="Normally healthy patient" codeSystem="2.16.840.1.113883.6.96"/>
  <code value="413496000" displayName="ASA physical status class 2" NHSNdisplayName="Patient with mild systemic disease" codeSystem="2.16.840.1.113883.6.96"/>
  <code value="413497009" displayName="ASA physical status class 3" NHSNdisplayName="Patient with severe systemic disease, not incapacitating" codeSystem="2.16.840.1.113883.6.96"/>
  <code value="413498004" displayName="ASA physical status class 4" NHSNdisplayName="Patient with incapacitating systemic disease, constant threat to life" codeSystem="2.16.840.1.113883.6.96"/>
  <code value="413499007" displayName="ASA physical status class 5" NHSNdisplayName="Moribund patient, &lt; 24-hour life expectancy" codeSystem="2.16.840.1.113883.6.96"/>
</system>
The many-digit numbers are globally unique object identifiers (usually abbreviated as OIDs). These identifiers are the preferred method of identifying objects in HL7 standards such as CDA, and are used for everything from sets of vocabulary (e.g., the value set definition above) to chunks of the implementation guide, known as templates (referenced in the pattern id in the Schematron snippet below). HL7 has an OID registry, available at http://www.hl7.org/oid/index.cfm, with more information about the design and use of OIDs.
One thing to note about OIDs: they have a tree structure, in which the left-most number is considered the root and the right-most number a leaf node. OIDs are assigned to organizations at a particular sub-tree level; how an organization arranges its sub-tree is up to that organization. It may choose to impose a logical structure on its OIDs, or not.
Constraints, Value Sets, Alternatives and Conditionals
In any healthcare record there are rules about which information must be present. In CDA convention these are represented as constraints on the CDA model. The constraints have a formal prose representation that is published in a document called an Implementation Guide because it defines an implementation of CDA. For example, an observation representing an adverse reaction:
5. SHALL contain [1..1] code (CONF:11542).
  a. This code SHALL contain [1..1] @code, which SHALL be selected from ValueSet 2.16.840.1.114220.127.116.1191 NHSNAdverseReactionTypeCode DYNAMIC (CONF:4698).

We generate this prose representation from a database. A constraint is associated with a context (observation) and is recorded in data such as "conformance verb" (SHALL), "value", "value conformance", and "value set".
A value set is a set of coded concepts, drawn from one or more public vocabularies, that are appropriate for the context. In the example above, the value set members are types of adverse reaction. The concepts in the previous example, showing patient status, are members of a value set named ASAClassCode.
Constraints that express alternatives are common in some implementations of CDA.
One necessary usage is to require that a code element contain either (a) both the code and codeSystem attributes OR (b) a nullFlavor attribute.

Value-driven conditional rules arise for specific content situations; for example, if the procedure being recorded was a cesarean (indicated by the relevant code attribute value), the report must also specify the estimated maternal blood loss.
Why Schematron Validation Is Needed to Supplement Schema Validation
As we've seen, much of the meaning of a CDA document resides in an element's attribute values, which are used to:

- refine the meaning of those elements (rather than merely to describe the object, as in many other applications of XML),
- expand the varieties of relationship beyond what's available from the XML tree,
- vary the verb mood,
- switch the subject and object of a compound expression, and
- explain the absence of a value.
These tools can build remarkably complex sentences. “Marie’s grandmother, who is her legal guardian, said Marie had pneumonia when she was six, which went untreated and is a possible explanation for the scarring on her lungs; however, Marie’s mother denied this and her father was unsure.”
In any healthcare record there are report-specific rules about which data must be present. Since so much of the content in CDA is recorded in attribute values, these rules amount to value dependencies, which are not adequately expressible in W3C Schema validation. Some of the report-specific rules could be tested with a custom W3C Schema, but not all, and, in practice, many of the most important report-dependent rules cannot be checked by even a custom W3C Schema.
The two main problem areas for validation are alternatives and value-dependent conditionals.
As we saw above, one commonly-used construct in CDA is to require that a code element contain either the code and codeSystem attributes (with optional displayName and codeSystemName attributes), OR a nullFlavor attribute. The CDA Schema allows all the relevant attributes to appear on the code element, in any combination. As a result, a valid document instance might populate the code attribute without the codeSystem attribute, or populate both the code and nullFlavor attributes. Both combinations are inherently meaningless, but the CDA Schema can't check for them.
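Such an alternative is straightforward to state in Schematron; a hand-coded sketch (in the Schematron 1.5 / XPath 1.0 style used elsewhere in this paper, with a generic rule context chosen for illustration):

```xml
<sch:rule context="cda:code">
  <sch:assert test="(@code and @codeSystem and not(@nullFlavor))
                    or (@nullFlavor and not(@code) and not(@codeSystem))">
    A code element shall have either both @code and @codeSystem,
    or a @nullFlavor, but not a mixture of the two.
  </sch:assert>
</sch:rule>
```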
The conditional rules that arise for specific content situations can be expressed as
if [some XPath] then [some other XPath]

For example,
If procedure/code/@code="1234" (a specific type of procedure), then performer/id must be present.

Since the type of procedure is recorded as an attribute value, even a custom XML Schema can't check this requirement.
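In Schematron the implication can be sketched by putting the condition into the rule context (reusing the hypothetical code value 1234 from the example above):

```xml
<!-- The rule fires only for procedures of the specified type -->
<sch:rule context="cda:procedure[cda:code/@code='1234']">
  <sch:assert test="cda:performer/cda:id">
    A procedure of this type shall record the id of its performer.
  </sch:assert>
</sch:rule>
```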
Schematron covers the gap. We use a two-step validation approach: first against the CDA XML Schema file (CDA.xsd), and then against a Schematron file and custom vocabulary file that tests the rules that cannot be expressed in the CDA.xsd. Currently we are using Schematron 1.5 and XPath 1, for compatibility/historical reasons; we are gradually moving to ISO Schematron and thence to XPath 2.
Creating the Schematron Validation
A typical Schematron section looks like this:
<sch:pattern id="p-2.16.840.1.113818.104.22.168.6.41-errors" name="p-2.16.840.1.113822.214.171.124.6.41-errors">
  <sch:rule context="cda:observation[cda:templateId/@root='2.16.840.1.1138126.96.36.199.6.41']">
    <sch:assert test="count(cda:statusCode[@code='completed'])=1">shall contain 1..1 statusCode=completed "Completed" (CodeSystem: 2.16.840.1.113883.5.14) (CONF:2282)</sch:assert>
    <sch:assert test="count(cda:code[@codeSystem='2.16.840.1.113883.6.1'][@code='41852-5'])=1">shall contain 1..1 code=41852-5 "Microorganism Identified" (CodeSystem: 2.16.840.1.113883.6.1) (CONF:2281)</sch:assert>
    <sch:assert test="@classCode='OBS'">shall contain 1..1 @classCode=OBS "Observation" (CodeSystem: 2.16.840.1.113883.5.6) (CONF:2279)</sch:assert>
    <sch:assert test="@moodCode='EVN'">shall contain 1..1 @moodCode=EVN "Event" (CodeSystem: 2.16.840.1.113883.5.1001) (CONF:2280)</sch:assert>
    <sch:assert test="cda:value[@xsi:type='CD']">shall contain 1..3 value, which SHALL be selected from ValueSet 2.16.840.1.114188.8.131.5294 STATIC (CONF:2283)</sch:assert>
  </sch:rule>
</sch:pattern>
This shows the standard set of attribute/attribute value testing, and an example of the vocabulary testing that is such an important part of testing the validity of CDA documents.
Testing Schematron Validation
The first reaction of most people is to wonder why you test the Schematron, when everyone knows the Schematron is used to test the document instances. The second reaction is to say “of course” — we need to ensure that the combination of XML Schema and Schematron that we create correctly confirms the validity of all the files that conform to the constraints, while flagging any errors.
We also need to ensure that all Schematron error messages point to the real error that needs to be fixed. Many of the parser error messages that arise from validation errors are difficult to understand, particularly for those who are not XML experts. This adds a testing requirement — not only does the Schematron have to fail a bad test file, but it has to fail it with an understandable message.
As we discussed earlier, the CDA schema does not catch quite all of the CDA errors people might make, and of course it does not validate rules that constrain CDA; so our testing concentrates on those aspects of the Schematron validation.
The database that stores the constraints and exports prose for the Implementation Guide also generates the basic Schematron validation file. We take advantage of the fact that Schematron error messages can be tailored: the error message we generate cites the constraint prose just as it appears in the Implementation Guide.
In general the items that the CDA XML Schema doesn't validate, such as alternative attributes and value-dependent conditionals, currently can't be recorded as computable constraints and thus can't be generated from the database. These aspects of validation are written by hand, and therefore need to be thoroughly tested before delivering the resultant Schematron file to the customer. Since the automatic generation process is constantly being developed and upgraded, we also need to do regression testing on new releases of the Schematron export.
When we are creating Schematron for a client, as opposed to testing the Schematron generation system upgrades, the automatically-generated Schematron requires only spot checks to make sure nothing went wrong with the generation. The hand-coded portion requires far more in-depth testing.
The Testing Process
We use the phrase “good-test files” for the test files that ought to pass validation. “Bad-test files” are those which should throw errors.
The first step is to create good-test files — combinations of elements, attributes, and attribute values that are allowed. Once the Schematron passes those correctly, it’s time to ensure it fails the bad-test files, and fails them in the correct way. This is where the bulk of the testing work comes in.
We take the good test-files as a base, and make them incorrect by deleting a required element, or setting an attribute value that is not allowed at that point. The Schematron has to fail all the bad-test files, with a reasonable error message, at the right spot. Then we create more bad-test files, taking the documentation of the report type as a guide, to cover any combination of wrong choices that a user could make.
We don’t want to go overboard with the number of test files though, as we do need, at some stage, to ship the Schematron. Thus we are constantly looking for ways to ensure better and more time-efficient coverage of the possible error conditions while maintaining confidence of complete coverage. One approach is to create a complete set of alternatives for each constraint we want to test. That gives us perfect confidence on coverage (with some challenges for tracking which file tests which condition) but requires a very large number of test files. We add efficiencies by making carefully-chosen assumptions such as: The mechanism that generates a test for membership in a value set is applied in the same way wherever used; we will only test it once. This significantly reduces the number of test files we must create, but having a dozen or so such rules complicates the business of drawing up a test list.
Creating the good-test files requires a solid understanding of the report rules. In practice, this means that creating the good-test files also tests the quality of the documentation. The best way to test this is for the person who creates the test files to be someone other than the person who wrote the documentation.
The prose representation of a CDA constraint has precise meanings for every part of the sentence. The constraint is documented in reference to an XML fragment in CDA, and the element and attribute combination are defined in terms of XPaths. Each constraint has a conformance number (the 10304 and 10907 listed here). In the following snippet, the topic is the administration of a drug:
SHALL contain consumable/manufacturedProduct/manufacturedMaterial/code (CONF:10304).
  a. In an Evidence of Infection (Dialysis) Report,
    i. If the antimicrobial started was Vancomycin, the value of @code SHALL be '11124' Vancomycin [CodeSystem: 2.16.840.1.113883.6.88 RxNorm].
    ii. Otherwise, the value of @nullFlavor SHALL be 'NI'. (CONF:10907)
This specifies: in the context of this substance administration, there must be a code element at the bottom of that XPath chain. If the report is of type “Evidence of Infection (Dialysis)”, we're only collecting statistics about one drug: if the antimicrobial started was Vancomycin, then the code element must have an xsi:type attribute with value ‘CE’, a code attribute with value ‘11124’, and a codeSystem attribute with value ‘2.16.840.1.113883.6.88’. (It may also have a displayName attribute with value ‘Vancomycin’ and/or a codeSystemName attribute with value ‘RxNorm’.) Otherwise the code element has neither of those attributes; instead it must have the attribute nullFlavor with value ‘NI’.
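A hand-coded Schematron test for CONF:10907 might be sketched as follows (the context XPath is abbreviated and illustrative; a real rule would also be anchored to the template for the Evidence of Infection (Dialysis) report type):

```xml
<sch:rule context="cda:consumable/cda:manufacturedProduct/cda:manufacturedMaterial/cda:code">
  <sch:assert test="(@code='11124' and @codeSystem='2.16.840.1.113883.6.88' and not(@nullFlavor))
                    or (@nullFlavor='NI' and not(@code) and not(@codeSystem))">
    If the antimicrobial started was Vancomycin, @code shall be '11124'
    (RxNorm); otherwise @nullFlavor shall be 'NI' (CONF:10907).
  </sch:assert>
</sch:rule>
```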
To test this, we need a good-test file that shows when Vancomycin was started, a good-test file for when some other antimicrobial was started, and a good-test file for when no IV antimicrobial was started. That makes three separate good-test files, and at least as many bad-test files to cover the ways in which things can go wrong.
This is a relatively simple constraint; we have some that run to several alternative branches, each with a number of options. Independent values can be tested separately (recording which antimicrobial was started does not depend on whether the person administering it washed their hands), but some values interact even though widely separated in the CDA document (maternal blood loss after a caesarian section cannot logically be recorded for a male patient, nor can a male patient logically be treated in a pre-natal ward).
Ensuring reasonable test coverage is one of the big issues we face. For one project we created 91 good and 133 bad test files by hand to test one medium-sized chunk of a report. Not surprisingly, we’ve started looking into ways to generate test files as variations of base sample files to save time. We also don’t want to create more test files than we need; testing unnecessary combinations costs time and effort, delaying delivery without benefitting the customer.
We need to itemize the combinations and variations, determine which we will test, and ensure that at least one good and one bad test file exists for each of those combinations and variants that can't be tested by the CDA Schema. We need to create a few bad test files to ensure the CDA.xsd validation is triggered by the validation process, but not for every possible error. We need to be sure that the error message thrown by the Schematron for each bad file is reasonable; this involves keeping track of the file name and the error message(s) through iterative development cycles.
Currently we keep track of the files in a spreadsheet, using carefully-constructed filenames that indicate what is being tested. (The file-naming conventions aren’t strictly necessary, but they are a significant practical help to the person creating the test files and the person checking the test files.) The test files are commented to show what is being tested, and where the error (if a bad test file) is. The comments reference the conformance numbers in the documentation. We track and improve error messages that are inaccurate or unhelpful. The spreadsheet goes through iterations to match the Schematron development iterations.
We have a number of challenges apart from creating the test files. Many of these are typical testing challenges, such as how to assess the results quickly, and what the best type of testing system is (which depends to some extent on what the various developers are used to).
A bigger issue is the best way to indicate how many errors should be present in a given bad test file. Any given single error in an XML document can potentially be the cause of more than one error message. When the Schematron finds a different number of errors in an XML file to the expected number, it could be due to error(s) in the Schematron, or error(s) in the XML. Once the XML file has been checked, what remains are the Schematron error(s). And the number of those may change as the Schematron is developed.
There are a few ways to tell the error-testing system how many errors go with a particular file. Tony Graham published a poster at XML Prague 2012 (http://www.mentea.net/resources/schematron-testing-framework.pdf) discussing his framework, which uses processing instructions in the XML file itself (along with XProc). We use a spreadsheet with the filename and expected number of errors (along with some custom Java code that invokes the standard JUnit testing framework). Both systems then use Ant to run the tests against all the test files and report results. If the number of expected errors remains stable, either method should work well.
We have shown why we need to test Schematron files in the context of healthcare standards, how we use the standard testing methods of good files and bad files, and discussed some of the challenges we find. We welcome feedback, suggestions for improvement in the process, and comments.