Daniel Zimmermann (FZI Research Center for Information Technology), Patrick Deubel (FZI Research Center for Information Technology), Anne Koziolek (Karlsruhe Institute of Technology)
Evaluating the Effectiveness of Neuroevolution for Automated GUI-Based Software Testing
As software systems become increasingly complex, testing has become an essential part of the development process to ensure the quality of the final product. However, manual testing is costly and time-consuming because it requires human intervention. This constrains the number of test cases that can be run within a given timeframe and, as a result, limits how quickly defects can be detected. Automated testing, on the other hand, can reduce the cost and time associated with testing, but traditional approaches have limitations. These include the inability to explore the software's state space thoroughly or to process the high-dimensional input space of graphical user interfaces (GUIs). In this study, we propose a new approach for automated GUI-based software testing that utilizes neuroevolution (NE), a branch of machine learning that employs evolutionary algorithms to train artificial neural networks with multiple hidden layers of neurons. NE offers a scalable alternative to established deep reinforcement learning methods, with greater robustness to hyperparameter choices and better handling of sparse rewards. The agents are trained to explore the software under test and identify errors, with rewards tied to achieved test coverage. We evaluate our approach on a realistic benchmark software application and compare it to monkey testing, a widely adopted automated software testing method.
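The abstract above only sketches the approach; as an illustration, the following Python snippet shows what a minimal neuroevolution loop with a coverage-based fitness function could look like. It is a hedged sketch, not the authors' implementation: the GUI environment (`GuiEnvironmentStub`), the coverage proxy, and the network dimensions are hypothetical placeholders standing in for a real application under test.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a GUI application under test: it exposes an
# observation vector describing the current screen and counts distinct
# (state, action) pairs as a crude proxy for test coverage.
class GuiEnvironmentStub:
    def __init__(self, n_widgets=8, n_actions=4):
        self.n_widgets, self.n_actions = n_widgets, n_actions
        self.visited = set()
        self.state = rng.random(self.n_widgets)

    def step(self, action):
        self.visited.add((tuple(np.round(self.state, 1)), action))
        self.state = rng.random(self.n_widgets)
        return self.state

    def coverage(self):
        return len(self.visited)

# Tiny feed-forward policy: observation -> index of the GUI action to execute.
def init_params(n_in=8, n_hidden=16, n_out=4):
    return [rng.normal(0, 0.5, (n_in, n_hidden)), rng.normal(0, 0.5, (n_hidden, n_out))]

def act(params, obs):
    hidden = np.tanh(obs @ params[0])
    return int(np.argmax(hidden @ params[1]))

def fitness(params, episode_len=50):
    env = GuiEnvironmentStub()
    obs = env.state
    for _ in range(episode_len):
        obs = env.step(act(params, obs))
    return env.coverage()  # reward the agent for exercising more of the app

# Simple elitist evolution loop: keep the best networks, mutate them with
# Gaussian noise, and repeat for a fixed number of generations.
def evolve(generations=20, pop_size=30, elite=5, sigma=0.1):
    population = [init_params() for _ in range(pop_size)]
    best = population[0]
    for gen in range(generations):
        scored = sorted(population, key=fitness, reverse=True)
        best = scored[0]
        population = [
            [w + rng.normal(0, sigma, w.shape) for w in scored[i % elite]]
            for i in range(pop_size)
        ]
        print(f"generation {gen}: best coverage = {fitness(best)}")
    return best

if __name__ == "__main__":
    evolve()
```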
Automated Android testing approaches often fail to interact properly with complex UIs consisting of multiple related elements. For instance, to trigger a state transition in a form-based UI, one must first fill out all input fields and then click the submit button. Test generators, however, usually interact with the fields and the button in arbitrary order, rarely triggering the corresponding state transition and thus achieving lower code coverage overall. One way to overcome this problem is to define motif actions, which allow test generators to interact not just with individual UI elements but with combinations of UI elements related through common patterns of interaction sequences. We designed 12 such motif actions for common scenarios and integrated them into the Android test generation tool MATE. Our experiments demonstrate that these motif actions are applicable to a wide range of apps (86.5% of a sample of 551 apps). Motif actions are particularly useful on complex apps, where our experiments on 109 such apps demonstrate an average increase of 2.19% in activity coverage and 2% in line coverage.
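To make the fill-the-form-then-submit example concrete, the sketch below shows one way a motif action could be expressed in Python. The `Widget` model, `Driver` API, and `fill_form_and_submit` function are hypothetical illustrations, not MATE's actual interfaces; the point is only the contrast with a generator that picks individual widgets in arbitrary order.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical widget model and UI driver; real tools such as MATE
# implement their own widget abstraction and action executor.
@dataclass
class Widget:
    widget_id: str
    widget_type: str  # e.g. "EditText" or "Button"

class Driver:
    def type_text(self, widget: Widget, text: str) -> None:
        print(f"typing '{text}' into {widget.widget_id}")

    def click(self, widget: Widget) -> None:
        print(f"clicking {widget.widget_id}")

def fill_form_and_submit(driver: Driver, widgets: List[Widget]) -> None:
    """Motif action: fill every input field on the current screen, then press submit.

    A plain random generator would instead interact with one widget at a time in
    arbitrary order and rarely reach the state transition behind the submit button.
    """
    fields = [w for w in widgets if w.widget_type == "EditText"]
    submit: Optional[Widget] = next(
        (w for w in widgets if w.widget_type == "Button"), None
    )
    for field in fields:
        driver.type_text(field, "generated input")  # input generation is app-specific
    if submit is not None:
        driver.click(submit)

if __name__ == "__main__":
    screen = [
        Widget("username", "EditText"),
        Widget("password", "EditText"),
        Widget("login", "Button"),
    ]
    fill_form_and_submit(Driver(), screen)
```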