
Registered user since Mon 10 Jul 2017
Contributions
View general profile
Registered user since Mon 10 Jul 2017
Contributions
Research Papers
Thu 13 Oct 2022 16:00 - 16:20 at Banquet A - Technical Session 31 - Code Similarities and Refactoring Chair(s): Hua MingAn Object-Relational Mapping (ORM) provides an object-oriented interface to a database and facilitates the development of database-backed applications. In an ORM, programmers do not need to write queries in a separate query language such as SQL, they instead write ordinary method calls that are mapped by the ORM to database queries. This added layer of abstraction hides the significant performance cost of database operations, and misuse of ORMs can lead to far more queries being generated than necessary. Of particular concern is the infamous “N+1 problem”, where an initial query yields N results that are used to issue N subsequent queries. This anti-pattern is prevalent in applications that use ORMs, as it is natural to iterate over collections in object-oriented languages. However, iterating over data that originates from a database and calling an ORM method in each iteration may result in suboptimal performance. In such cases, it is often possible to reduce the number of round-trips to the database by issuing a single, larger query that fetches all desired results at once.
We propose an approach for automatically refactoring applications that use ORMs to eliminate instances of the “N+1 problem”, which relies on static analysis to detect data flow between ORM API calls. We implement this approach in a tool called Reformulator, targeting the Sequelize ORM in JavaScript, and evaluate it on 8 JavaScript projects. We found 44 N+1 query pairs in these projects, and Reformulator refactored all of them successfully, resulting in improved performance (up to 7.67x) while preserving program behavior. Further experiments demonstrate that the relative performance improvements grew as the database size was increased (up to 38.58x), and that front-end page load times were improved.
(Abstract of Original Paper)
An Object-Relational Mapping (ORM) provides an object-oriented interface to a database and facilitates the development of database-backed applications. In an ORM, programmers do not need to write queries in a separate query language such as SQL, they instead write ordinary method calls that are mapped by the ORM to database queries. This added layer of abstraction hides the significant performance cost of database operations, and misuse of ORMs can lead to far more queries being generated than necessary. Of particular concern is the infamous “N+1 problem”, where an initial query yields N results that are used to issue N subsequent queries. This anti-pattern is prevalent in applications that use ORMs, as it is natural to iterate over collections in object-oriented languages. However, iterating over data that originates from a database and calling an ORM method in each iteration may result in suboptimal performance. In such cases, it is often possible to reduce the number of round-trips to the database by issuing a single, larger query that fetches all desired results at once.
We propose an approach for automatically refactoring applications that use ORMs to eliminate instances of the “N+1 problem”, which relies on static analysis to detect data flow between ORM API calls. We implement this approach in a tool called Reformulator, targeting the Sequelize ORM in JavaScript, and evaluate it on 8 JavaScript projects. We found 44 N+1 query pairs in these projects, and Reformulator refactored all of them successfully, resulting in improved performance (up to 7.67x) while preserving program behavior. Further experiments demonstrate that the relative performance improvements grew as the database size was increased (up to 38.58x), and that front-end page load times were improved.
(Artifact Abstract)
This artifact presents a Docker image equipped with the source code for Reformulator, a refactoring and data flow analysis tool, as well as eight pre-configured ORM-backed JavaScript web applications that can be run out-of-the-box. The artifact contains instructions and all the necessary material to reproduce the evaluation in our paper, as well as the data used to prepare the tables in the paper.
DOITool Demonstrations
Tue 11 Oct 2022 14:50 - 15:00 at Ballroom C East - Technical Session 5 - Code Analysis Chair(s): Vahid AlizadehDynamic taint analysis (DTA) is a popular approach to help protect JavaScript applications against injection vulnerabilities. In 2016, the ECMAScript 7 JavaScript language standard introduced many language features that most existing DTA tools for JavaScript do not support, e.g., the async/await keywords for asynchronous programming. We present Augur, a high-performance dynamic taint analysis for ES7 JavaScript that leverages VM-\textit{supported} instrumentation. Integrating directly with a public, stable instrumentation API gives Augur the ability to run with high performance inside the VM and remain resilient to language revisions. We extend the abstract-machine approach to DTA with semantics to handle asynchronous function calls. In addition to providing the classic DTA use case of injection vulnerability detection, Augur is highly configurable to support any type of taint analysis, making it useful outside of the security domain. We evaluated Augur on a set of 20 benchmarks, and observed a median runtime overhead of only 1.77×. We note a median performance improvement of 298% compared to the previous state-of-the-art Ichnaea.
Tool demo: https://www.youtube.com/watch?v=GczQ-2A58LE
Link to open source code repository: https://github.com/nuprl/augur