Comparing and diffing XML schemas
Priscilla Walmsley
Datypic
Schemas evolve over time, and comparing versions of a schema automatically makes it possible to give implementers detailed, accurate documentation of what changed. Automatically “diffing” schemas is also an effective quality control technique: it verifies that no inadvertent changes were made and that all changes are backward compatible (if that is a goal).
Because the same content model can be expressed in a variety of ways, and because advanced schema features may have been used, it is necessary to go beyond simple text diffing or even XML diffing. By first “canonicalizing” schemas to make them easier to compare, and then cataloging the differences between them, we can answer questions like “Is this schema backward compatible?” and “Is this schema a subset or superset of another schema?”
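The canonicalize-then-catalog idea can be illustrated with a minimal Python sketch. It is not the method described in this paper, only a toy under stated assumptions: each named element declaration is reduced to a (type, minOccurs, maxOccurs) triple with defaults made explicit, element names are assumed to be unique across the schema, type names are compared as plain strings, and ordering, attributes, and references are ignored. The function names and the two sample schemas are hypothetical.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Two hypothetical schema versions. They differ textually on "id"
# (explicit vs. defaulted minOccurs) but not semantically.
OLD_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="order"><xs:complexType><xs:sequence>
    <xs:element name="id" type="xs:string" minOccurs="1"/>
    <xs:element name="item" type="xs:string"/>
  </xs:sequence></xs:complexType></xs:element>
</xs:schema>"""

NEW_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="order"><xs:complexType><xs:sequence>
    <xs:element name="id" type="xs:string"/>
    <xs:element name="item" type="xs:string" maxOccurs="unbounded"/>
    <xs:element name="note" type="xs:string" minOccurs="0"/>
  </xs:sequence></xs:complexType></xs:element>
</xs:schema>"""


def canonicalize(xsd_text):
    """Reduce a schema to {element name: (type, minOccurs, maxOccurs)}.

    Occurrence defaults (1 and 1) are filled in so that an explicit
    minOccurs="1" and an absent minOccurs compare as equal.
    """
    decls = {}
    for el in ET.fromstring(xsd_text).iter(XS + "element"):
        name = el.get("name")
        if name is None:  # ref="..." particles are skipped in this sketch
            continue
        max_occurs = el.get("maxOccurs", "1")
        decls[name] = (
            el.get("type"),  # None for anonymous types
            int(el.get("minOccurs", "1")),
            float("inf") if max_occurs == "unbounded" else int(max_occurs),
        )
    return decls


def diff_schemas(old, new):
    """Catalog added, removed, and changed element declarations."""
    common = set(old) & set(new)
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),
        "changed": sorted(n for n in common if old[n] != new[n]),
    }


def is_backward_compatible(old, new):
    """Conservative heuristic: every old instance should remain valid."""
    d = diff_schemas(old, new)
    if d["removed"]:
        return False
    if any(new[n][1] > 0 for n in d["added"]):  # new required element
        return False
    for n in d["changed"]:
        old_type, old_min, old_max = old[n]
        new_type, new_min, new_max = new[n]
        # Only a widened occurrence range with the same type is accepted.
        if old_type != new_type or new_min > old_min or new_max < old_max:
            return False
    return True


if __name__ == "__main__":
    old, new = canonicalize(OLD_XSD), canonicalize(NEW_XSD)
    print(diff_schemas(old, new))
    print(is_backward_compatible(old, new))
```

A plain text or XML diff would flag the "id" declaration as changed, whereas the canonical form treats both versions as identical. A realistic comparison would also need to resolve named types, groups, and namespace prefixes before cataloging differences.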