You Pull, I’ll Push: on the Polarity of Pipelines
Pipelines provide an excellent way of structuring XML applications, simplifying complex
processing tasks and enabling the reuse of generic components, using a variety of technologies.
Efficient pipelines often pass data from one stage to the next as a sequence of
events, representing the structure of the tree by notifying
startElement, endElement, and similar transitions.
The control flow in a pipeline can run either with the data flow (push polarity) or against it (pull polarity). Performance problems occur when components with different polarity need to be integrated into the same pipeline. Traditionally this problem is handled either by buffering the data in memory (leading to scalability problems as well as increased latency), or by using multiple threads, which introduces coordination overheads.
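The two polarities can be illustrated with a minimal Python sketch. The event tuples, the `events` source, and the `pull_uppercase` and `push_uppercase` stages are hypothetical names introduced only for illustration; they are not taken from any XML library or from the implementation discussed in this paper. Both stages apply the same transformation, and only the direction of control differs.

```python
from typing import Callable, Iterable, Iterator, Tuple

# Hypothetical event representation: (kind, element name).
Event = Tuple[str, str]


def events() -> Iterator[Event]:
    """Illustrative source: the tree <a><b/></a> as a sequence of events."""
    yield ("startElement", "a")
    yield ("startElement", "b")
    yield ("endElement", "b")
    yield ("endElement", "a")


def pull_uppercase(source: Iterable[Event]) -> Iterator[Event]:
    """Pull polarity: the consumer asks for the next event, so control
    runs against the data flow."""
    for kind, name in source:
        yield (kind, name.upper())


def push_uppercase(downstream: Callable[[Event], None]) -> Callable[[Event], None]:
    """Push polarity: the producer calls the stage with each event, so
    control runs with the data flow."""
    def receive(event: Event) -> None:
        kind, name = event
        downstream((kind, name.upper()))
    return receive


# Pull: the final consumer (list) drives the whole pipeline.
pulled = list(pull_uppercase(events()))

# Push: the source loop drives the pipeline, calling each stage in turn.
pushed = []
stage = push_uppercase(pushed.append)
for event in events():
    stage(event)
```

Both pipelines produce the same events here because every stage shares one polarity. A conflict arises when, for example, a stage written in the push style must feed a stage written in the pull style.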
This paper looks at a different way of managing polarity conflicts, by applying the concepts of program inversion developed during the days of batch magnetic tape data processing. It specifically examines how these concepts can be applied to the compilation of XSLT stylesheets, both single-phase and multi-phase.