
Registered user since Mon 21 Mar 2022
Contributions
Research Papers
Wed 12 Oct 2022 17:00 - 17:20 at Banquet A - Technical Session 18 - Testing II Chair(s): Darko Marinov

Question answering (QA) software uses information retrieval and natural language processing techniques to automatically answer questions posed by humans in natural language. Like other AI-based software, QA software may contain bugs. To test QA software automatically without human labeling, previous work extracts facts from question-answer pairs and generates new questions to detect QA software bugs. However, the generated questions can be ambiguous, confusing, or syntactically chaotic, which makes them unanswerable for QA software. As a result, a large proportion of the reported bugs are false positives. In this work, we propose QATest, a sentence-level, mutation-based metamorphic testing tool for QA software. To eliminate false positives and achieve precise automatic testing, QATest leverages five Metamorphic Relations (MRs) as well as semantics-guided searching and enhanced test oracles. Our evaluation on three QA datasets demonstrates that QATest outperforms the state of the art in both the quantity (8,133 vs. 6,601 bugs) and the quality (97.67% vs. 49% true positive rate) of the reported bugs. Moreover, the test inputs generated by QATest reduce the MR violation rate from 44.29% to 20.51% when used to fine-tune the QA software under test.
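
To make the idea of a sentence-level metamorphic relation concrete, here is a minimal Python sketch. It is not QATest's implementation, and the abstract does not say what its five MRs are. The relation used here (appending an unrelated sentence to the context should leave the answer unchanged) is an assumed example, and the two toy QA functions and all names are hypothetical stand-ins for real QA software.

```python
from typing import Callable, List

# Assumed interface: a QA system maps (question, context) to an answer string.
QAModel = Callable[[str, str], str]


def split_sentences(text: str) -> List[str]:
    """Naive sentence splitter, good enough for this illustration."""
    return [s.strip() for s in text.split(".") if s.strip()]


def overlap_qa(question: str, context: str) -> str:
    """Toy QA system: return the context sentence sharing the most words with the question."""
    q_words = set(question.lower().rstrip("?").split())
    return max(
        split_sentences(context),
        key=lambda s: len(q_words & set(s.lower().split())),
    )


def last_sentence_qa(question: str, context: str) -> str:
    """Deliberately buggy toy QA system: always return the last context sentence."""
    return split_sentences(context)[-1]


def append_sentence(context: str, extra: str) -> str:
    """Sentence-level mutation (assumed example): append an unrelated sentence."""
    return context.rstrip() + " " + extra


def violates_mr(model: QAModel, question: str, context: str, extra: str) -> bool:
    """Hypothetical MR: the answer must not change after the mutation."""
    original = model(question, context)
    mutated = model(question, append_sentence(context, extra))
    return original != mutated


CONTEXT = "Paris is the capital of France. Berlin is the capital of Germany."
QUESTION = "What is the capital of France?"
EXTRA = "Bananas are yellow."

for name, model in [("overlap_qa", overlap_qa), ("last_sentence_qa", last_sentence_qa)]:
    print(name, "violates MR:", violates_mr(model, QUESTION, CONTEXT, EXTRA))
```

No human-labeled answer is needed in this sketch: the check only compares the system's output before and after the mutation, which is what lets metamorphic testing run without ground-truth labels.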
Pre-print