Table of Links
Abstract and 1 Introduction
2 Background
3 Approach and 3.1 Differential Testing for XML Processors
3.2 XPath Expression Generation
3.3 XML Generation
4 Evaluation
4.1 Effectiveness
4.2 Efficiency
4.3 Comparison to the State of the Art
4.4 Analysis of BaseX Historical Bug Reports
5 Related Work
6 Conclusion, Acknowledgments, and References
While various approaches related to our work exist, to the best of our knowledge, we propose the first general approach to testing XML processors for logic bugs. As discussed above, the most closely related work tested the index support of SQL Server in the context of XPath and XQuery [41]. To the best of our knowledge, it is the only prior work that has tackled the test-oracle problem for XML processors, but it is limited in scope.
Testing XPath functionality. Various benchmarks and test suites for XPath implementations have been proposed, the most representative being XPathMark and the W3C qt3 test suite. XPathMark [25] is a benchmark for testing XML processors' XPath 1.0 functionality, containing both correctness and performance tests. The W3C qt3 test suite, developed by the W3C XQuery and XSLT Working Groups [19], contains around 30,000 tests for XPath and XQuery that target XPath 3.0 and later versions and cover a broad range of functions and expressions.
XML-related automated synthetic data generation. Previous works have proposed approaches for automatically generating XML-related data, such as XML documents, XPath expressions, and XQuery expressions. Aboulnaga et al. proposed an XML document generator that produces synthetic, but complex, structured XML data by introducing recursion and repetition in tag name assignment and by controlling the element frequency distribution [20]. Rychnovský and Holubová proposed an approach to generate XML documents related to given XPath queries from a specific XML schema to improve query efficiency [37], which is useful for developers who create micro-benchmarks for testing performance over certain XPath expressions. XQGen [42] is a tool for generating XPath queries that conform to a given XML schema, allowing users to specify multiple parameters, such as the percentage of empty queries desired and the percentage of queries with predicates. XPath expressions generated by XQGen include only direct node tests, without complex expressions such as axes or function transformations. Similarly, the XQuery generator designed by Todic and Uzelac [41] includes XQuery FLWOR expressions, but its logic predicates consist only of simple operations, such as value comparisons. None of these works tackled the test-oracle problem, and, as indicated by the results in Section 4.3, given their different focus, they cannot be effectively combined with a differential testing oracle.
Targeted test case generation. Many testing tools guide their test case generation process to improve testing efficiency, because purely random approaches, such as the random byte mutation used in fuzzing, generate a large proportion of invalid queries [47]. DynSQL [27] guides the fuzzing process of DBMSs towards increased code coverage and high statement validity. APOLLO [28] is a system for detecting performance regression bugs in DBMSs; it increases the probability of including components from previously encountered performance issues. Cynthia [39] was proposed to test Object Relational Mappers (ORMs) and generates targeted databases that depend on the generated abstract SQL queries, so that the queries are likely to return non-empty results. Query Plan Guidance (QPG) [22] guides testing towards exploring more unique query plans.
Pivoted Query Synthesis. The targeted node in XPress was inspired by the pivot row in Pivoted Query Synthesis (PQS) [36], which was originally proposed to test relational DBMSs. Both PQS and XPress select a random element (a row in the database for PQS, a node in an XML document for XPress) and then generate a query that is guaranteed to fetch that element. However, both the purpose and the use of the targeted node and the pivot row differ. In PQS, the pivot row is used both for test-case generation and to construct the test oracle: PQS evaluates an expression and ensures that it evaluates to true for the pivot row, so that the expression can be used in a query that is guaranteed to fetch the row. Doing so requires a naive reimplementation of all the DBMS operators that should be tested, which incurs a high implementation effort, as highlighted in follow-up work [? ]. In XPress, the targeted node is used only for test-case generation, to improve testing efficiency and to ensure non-empty intermediate results. To this end, XPress uses the XML processor itself to determine the result of the expression, rather than requiring a reimplementation of operators. In addition, for predicate rectification, XPress provides operator-specific rules rather than relying on a generic one, aiming to generate more interesting test cases. The high-level idea of a pivot element has also inspired other works; for example, recent work on Android testing introduced the concept of a pivot layout [40].
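The targeted-node idea described above can be illustrated with a minimal Python sketch. It is not XPress's implementation: the function names `pick_target` and `rectify_query`, the sample document, and the single generic rectification rule are assumptions made for illustration, and Python's `xml.etree.ElementTree` (which supports only a limited XPath subset) stands in for the XML processor under test. The sketch shows only the central point that the processor itself, rather than a reimplementation of its operators, decides whether a candidate predicate selects the targeted node.

```python
import random
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
DOC = "<lib><book id='1' lang='en'/><book id='2'/><cd id='3' lang='de'/></lib>"


def pick_target(root, rng):
    """Select a random non-root element as the targeted node."""
    # root.iter() yields the root first; skip it because a './/tag'
    # path in ElementTree only matches descendants of the root.
    return rng.choice(list(root.iter())[1:])


def selects(root, query, target):
    """Ask the processor (ElementTree here) whether query returns target."""
    return any(node is target for node in root.findall(query))


def rectify_query(target, attr, value, root):
    """Return a path that is guaranteed to select the targeted node.

    A candidate predicate [@attr='value'] is evaluated by the processor.
    If the target is not selected, the predicate is rewritten with the
    target's own attribute value, or dropped if the attribute is absent.
    This single generic rule is a simplification; values containing
    quotes are not handled.
    """
    candidate = f".//{target.tag}[@{attr}='{value}']"
    if selects(root, candidate, target):
        return candidate
    if attr in target.attrib:
        return f".//{target.tag}[@{attr}='{target.get(attr)}']"
    return f".//{target.tag}"


root = ET.fromstring(DOC)
target = pick_target(root, random.Random(0))
query = rectify_query(target, "lang", "fr", root)
# The rectified query always has a non-empty result containing the target.
assert selects(root, query, target)
```

In a differential-testing setting, a query produced this way would then be sent to several XML processors and their results compared. The guarantee that the targeted node is selected only keeps intermediate results non-empty; it does not act as the oracle.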
Authors:
(1) Shuxin Li, Southern University of Science and Technology, China (work done during an internship at the National University of Singapore);
(2) Manuel Rigger, National University of Singapore, Singapore.