Extending XQuery with pattern matching over XML, HTML and JSON, and its usage for data mining
Benito van der Zander
Institute for Theoretical Computer Science
Graduate School for Computing in Medicine and Life Sciences, University of Lübeck
Pattern matching in a broad sense is a common feature of modern functional programming languages: it answers the question of whether one complex structured object has the same form as another, for some definition of “the same”. In XQuery, path expressions, switch, and typeswitch statements are often described as performing pattern matching, but these are merely impoverished flavors of matching when compared to the real thing. We describe a syntax for general pattern matching based on regular expressions for XML/HTML/JSONiq trees, how these patterns are matched against input data, and how this pattern matching can be integrated into the syntax and semantics of the XQuery language. Finally, we summarize real-world experience using it for large-scale data mining of library web catalogs.