Efficient scripting

David A. Lee

Principal senior software engineer

Epocrates, Inc.

Norman Walsh

Principal Technologist in the Information & Media group

Mark Logic Corporation

Copyright © 2009 David A. Lee and Norman Walsh. Used by permission.

International Symposium on Processing XML Efficiently: Overcoming Limits on Space, Time, or Bandwidth
August 10, 2009


The efficiency and performance of individual XML operations such as parsing, processing (XSLT, XQuery), and serialization, and the merits of different in-memory document representations, have been widely discussed. However, real-world use cases often involve many operations orchestrated by a scripting environment. The performance of the scripting environment can often overshadow any performance gains in the individual operations. In an exploration of real-world scripting, we compare the performance of several scripting languages and techniques on a set of typical XML operations, such as generating a table of contents and conditionally accessing non-XML files identified in XML documents. Based on the performance results, we suggest best practices for scripting XML processes. The scripting languages compared include DOS Shell (CMD.EXE), Linux Shell (bash), XMLSH, and XProc (Calabash). These are run (where possible) on multiple operating systems: Windows XP, Linux, and Mac OS.