NIER Track
Thu 13 Oct 2022 13:50 - 14:00 at Banquet A - Technical Session 27 - Dynamic and Concolic Analysis. Chair(s): ThanhVu Nguyen

Analysis of data is the foundation of multiple scientific disciplines, and today it takes the form of complex and diverse scientific workflows that often involve exploratory analyses. Such analyses are a special case among traditional data engineering workflows: results may be hard to interpret, it may be hard to judge whether they are correct, and experimentation is a central theme. Input data, or assumptions made about it, may be incorrect and may need to be refined, a problem analogous to fault localization in software engineering. Typical fault localization techniques assume that a fault has been identified, usually by an oracle in the form of a test. The workflows we target, however, are usually explorative, which makes it hard, if not impossible, to define tests that specify correct behaviour, even though spotting irregularities is highly desired. To this end, we advocate data input reduction such that a specified outcome is preserved, aiding debugging. In our proposal, reductions are used as debug hypotheses for data. We outline our bold vision on building engineering support for outcome-preserving input reduction within data analysis workflows, and report on preliminary results.
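To make the idea of outcome-preserving input reduction concrete, the following is a minimal sketch and not the authors' method. It uses a simplified greedy chunk-removal loop in the spirit of delta debugging; the names `reduce_input`, `preserves_outcome`, `readings`, and `mean_exceeds_100` are hypothetical and chosen only for illustration. The "outcome" is assumed to be expressible as a predicate over a subset of the input rows.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def reduce_input(
    rows: Sequence[T],
    preserves_outcome: Callable[[List[T]], bool],
) -> List[T]:
    """Shrink `rows` while `preserves_outcome` keeps returning True.

    Simplified greedy variant of delta debugging: try to drop chunks of
    decreasing size, keeping a removal whenever the outcome still holds.
    """
    current = list(rows)
    if not preserves_outcome(current):
        raise ValueError("the full input does not exhibit the outcome")

    chunk = max(len(current) // 2, 1)
    while True:
        i, removed_any = 0, False
        while i < len(current):
            candidate = current[:i] + current[i + chunk:]
            if preserves_outcome(candidate):
                # The outcome survives without this chunk, so drop it.
                current, removed_any = candidate, True
            else:
                # The chunk is needed; move on to the next one.
                i += chunk
        if chunk > 1:
            chunk //= 2
        elif not removed_any:
            # No single row can be removed any more.
            return current


# Hypothetical example: an analysis reports a suspiciously high mean.
readings = [10, 12, 11, 9, 5000, 13, 10, 8]


def mean_exceeds_100(values: List[int]) -> bool:
    # The outcome to preserve; an empty input cannot exhibit it.
    return bool(values) and sum(values) / len(values) > 100


reduced = reduce_input(readings, mean_exceeds_100)
print(reduced)
```

In this toy case the reduced input is the single reading 5000, which serves as a debug hypothesis: the surprising mean is explained by that one row. A real workflow would likely need a more expensive predicate (re-running part of the analysis) and a reduction strategy aware of the data's structure.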