Balisage Paper: Quality assurance in the XML world: Beyond validation
Dale Waldt’s nearly 30 years of working with structured content have focused primarily on leading the development of XML and Web systems that meet business and accuracy goals. Dale currently manages a large team of schema developers building semantically rich XML content to support robust online delivery of more than 100 TB of legal information. Content Architecture has more than 20 schema developers who work closely with the repository and delivery application development groups. Dale works closely with all stakeholders and developers to create processes and schedules that meet the business, technical, and quality requirements for all uses of the content. Before that, Dale spent two years as a Senior Analyst at the Gilbane Group, writing and consulting on XML and DITA systems. Earlier, Dale spent 10 years consulting to state legislatures, federal agencies, and organizations with complex or regulated content in the pharmaceutical, legal, and technical documentation markets, helping them adopt XML publishing systems. Before that, Dale was VP Product Systems Development for RIA, the legal and tax division of Thomson-Reuters. Dale has also worked for the US IRS and for standards organizations (OASIS, ISO, CSA, and others), and has taught, written, and spoken widely on XML and related technologies around the world.
Copyright © 2012 Dale Waldt
Validation of XML documents typically provides feedback in binary, yes/no form. This avoids the ambiguity, manual intervention, and increased cost of other approaches. But it may not be enough to make XML applications efficient, accurate, or semantically rich. How do you ensure that the correct element and attribute types are applied to the appropriate content chunks? That XML documents are accurate and current? That your XML has a level of semantic richness appropriate to your business goals? How do you control quality over large collections? How do you resolve conflicting organizational goals for information integration and ensure that content and schemas help the enterprise as a whole? Conceptual and physical models, model/schema traceability, and effective stakeholder review can all help. Schematron, document comparison (diff) tools, and statistical methods can also help, but may raise QA questions of their own. Improvements in requirements gathering and QA processes can produce visible results; concrete examples will be discussed.
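As a rough illustration of the statistical angle mentioned above, the following Python sketch profiles element usage across a collection. It is not taken from the paper: the sample documents, the `cite` element, and the helper names `element_profile` and `coverage` are all hypothetical, chosen only to show how two documents that would both pass a permissive schema can still differ in semantic richness.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def element_profile(xml_text):
    """Count how often each element type occurs in one document."""
    root = ET.fromstring(xml_text)  # raises ParseError if not well-formed
    return Counter(el.tag for el in root.iter())


def coverage(docs, tag):
    """Fraction of documents that use the given element at least once."""
    used = sum(1 for doc in docs if element_profile(doc)[tag] > 0)
    return used / len(docs)


# Hypothetical sample collection. Both documents would be valid against a
# schema that makes <cite> optional, yet only the first marks up its citation.
docs = [
    "<case><p>See <cite>Smith v. Jones</cite>.</p></case>",
    "<case><p>See Smith v. Jones.</p></case>",
]

print(element_profile(docs[0]))
print(coverage(docs, "cite"))
```

A low coverage figure does not prove an error; it only points a reviewer at documents where optional semantic markup may have been skipped, which is the kind of question a yes/no validation result cannot raise.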