How to cite this paper

Chelsom, John J., and Jay H. Chelsom. “Scaling XML Using a Beowulf Cluster.” Presented at Balisage: The Markup Conference 2018, Washington, DC, July 31 - August 3, 2018. In Proceedings of Balisage: The Markup Conference 2018. Balisage Series on Markup Technologies, vol. 21 (2018).

Balisage: The Markup Conference 2018
July 31 - August 3, 2018

Balisage Paper: Scaling XML using a Beowulf cluster

John J. Chelsom

Seven Informatics Ltd.

John Chelsom is CEO of Seven Informatics Ltd. He trained as an electrical engineer before gaining a PhD in artificial intelligence in medicine. He has been a Visiting Professor in Health Informatics at City University, London and the University of Victoria, Canada. As Managing Director of CSW Group from 1993 to 2008, John was responsible for implementation of XML workflow and production systems for many major organisations, including the British Medical Journal, Jaguar Cars, and the Royal Pharmaceutical Society.

The Case Notes product developed by CSW was based on XML and other open standards. In 2003 the UK government chose Case Notes as the primary clinical system in the national architecture for a shared electronic health record covering the 55 million citizens in England. In 2000, John founded the XML Summer School and continues as a board member and lecturer at this annual event.

Since 2010 he has been the lead architect of the open source cityEHR product, an XRX (XForms, REST, XQuery) health records system currently used in a number of hospitals in England.

Jay H. Chelsom

Abingdon School

Copyright ©2018 by the authors. Used with permission.


We describe a series of experiments that test the performance and scalability of an XML records system deployed on a Beowulf cluster of open source XML databases. Using the open source cityEHR health records system as an example, we first ran experiments to determine the feasibility and optimal size of database instances running on Raspberry Pi and low-cost Intel computers. We then describe the implementation of a Data Access Layer, built on XForms submissions, that encapsulates all database access for create, read, query and delete operations. Finally, we present the results of testing the scalability and performance of this implementation on clusters of one to sixteen physical database nodes. We conclude that Beowulf clustering provides an effective and cost-efficient mechanism for scaling XML records systems.