Contributions
Research Papers
Tue 12 Sep 2023 15:30 - 15:42 at Room C - Testing AI Systems 3. Chair(s): Mike Papadakis

There has been increasing interest in enhancing the fairness of machine learning (ML). Despite the growing number of fairness-improving methods, we lack a systematic understanding of the trade-offs among the factors considered in the ML pipeline when fairness-improving methods are applied. This understanding is essential for developers to make informed decisions about providing fair ML services. Nonetheless, analyzing the trade-offs is extremely difficult when multiple fairness parameters and other crucial metrics are involved, coupled, and even in conflict with one another.
This paper uses causality analysis as a principled method for analyzing trade-offs between fairness parameters and other crucial metrics in ML pipelines. To practically and effectively conduct causality analysis, we propose a set of domain-specific optimizations to facilitate accurate causal discovery and a unified, novel interface for trade-off analysis based on well-established causal inference methods. We conduct a comprehensive empirical study using three real-world datasets on a collection of widely used fairness-improving techniques. Our study obtains actionable suggestions for users and developers of fair ML. We further demonstrate the versatile usage of our approach in selecting the optimal fairness-improving method, paving the way for more ethical and socially responsible AI technologies.
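The abstract gives no implementation detail, but the core idea of quantifying trade-offs with causal inference can be illustrated with a small sketch. The Python example below estimates the effect of applying a hypothetical fairness mitigation on two metrics via linear backdoor adjustment; the column names, the single confounder, and the synthetic data are all assumptions for illustration, not the paper's actual interface.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical pipeline log: one row per training run. "mitigation" flags
# whether a fairness-improving method was applied; "group_imbalance" is a
# pipeline factor assumed to confound both the mitigation decision and
# the resulting metrics.
rng = np.random.default_rng(0)
n = 500
group_imbalance = rng.uniform(0.1, 0.9, n)
mitigation = (rng.uniform(size=n) < group_imbalance).astype(float)
accuracy = 0.90 - 0.03 * mitigation - 0.05 * group_imbalance + rng.normal(0, 0.01, n)
dp_gap = 0.20 - 0.12 * mitigation + 0.10 * group_imbalance + rng.normal(0, 0.01, n)
runs = pd.DataFrame({"mitigation": mitigation, "group_imbalance": group_imbalance,
                     "accuracy": accuracy, "dp_gap": dp_gap})

def backdoor_ate(df, treatment, outcome, adjust):
    """Average treatment effect via linear backdoor adjustment: regress the
    outcome on the treatment plus the adjustment set and read off the
    treatment coefficient."""
    X = df[[treatment] + adjust].to_numpy()
    y = df[outcome].to_numpy()
    return LinearRegression().fit(X, y).coef_[0]

# Trade-off query: the mitigation's estimated effect on each metric,
# holding the assumed confounder fixed.
for metric in ("accuracy", "dp_gap"):
    ate = backdoor_ate(runs, "mitigation", metric, ["group_imbalance"])
    print(f"estimated effect of mitigation on {metric}: {ate:+.3f}")
```

Running this recovers the planted trade-off (accuracy drops by roughly 0.03 while the demographic-parity gap shrinks by roughly 0.12), the kind of coupled, conflicting effect such a trade-off interface is meant to surface.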
Debugging performance anomalies in databases is challenging. Causal inference techniques enable qualitative and quantitative root cause analysis of performance degradations. Nevertheless, causality analysis is challenging in practice, particularly due to limited observability. Recently, chaos engineering (CE) has been applied to test complex software systems. CE frameworks mutate chaos variables to inject catastrophic events (e.g., network slowdowns) to stress-test these software systems. The systems under chaos stress are then tested (e.g., via differential testing) to check whether they retain normal functionality, such as returning correct SQL query outputs even under stress.
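As a concrete, purely hypothetical picture of a chaos experiment with a differential check, the Python sketch below mutates one chaos variable, an injected per-query delay, against an in-memory SQLite database and verifies that query results stay identical to the unstressed baseline. A real CE framework would inject faults at the network or OS level; the in-process simulation here is only illustrative.

```python
import sqlite3
import time

# Toy stand-in for a CE experiment: "chaos variables" parameterise an
# injected fault (here, artificial latency before each query), and a
# differential check asserts the stressed system still returns the same
# rows as the unstressed reference.
CHAOS_VARS = {"net_delay_ms": 0}

def run_query(conn, sql):
    time.sleep(CHAOS_VARS["net_delay_ms"] / 1000.0)  # injected slowdown
    return sorted(conn.execute(sql).fetchall())

def chaos_experiment(sql, delays_ms=(0, 50, 200)):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t(id INTEGER, v TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?)",
                     [(i, f"row{i}") for i in range(100)])
    baseline = run_query(conn, sql)          # unstressed reference run
    for d in delays_ms:
        CHAOS_VARS["net_delay_ms"] = d       # mutate the chaos variable
        t0 = time.perf_counter()
        result = run_query(conn, sql)
        elapsed = time.perf_counter() - t0
        ok = result == baseline              # differential check
        print(f"delay={d:4d}ms ok={ok} latency={elapsed * 1000:.1f}ms")

chaos_experiment("SELECT id, v FROM t WHERE id % 7 = 0")
```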
To date, CE has mainly been employed to aid software testing. This paper identifies a novel use of CE: diagnosing performance anomalies in databases. Our framework, PERFCE, has two phases, offline and online. The offline phase learns statistical models of a database using both passive observations and proactive chaos experiments. The online phase diagnoses the root cause of performance anomalies on the fly, from both qualitative and quantitative perspectives. In our evaluation, PERFCE outperforms prior work on synthetic datasets and is highly accurate and moderately expensive when analyzing real-world (distributed) databases such as MySQL and TiDB.
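The abstract only names the two phases, but their division of labor can be sketched. In the toy Python example below, the offline phase fits a linear latency model from passive observations plus chaos-perturbed runs (which widen each metric's observed range), and the online phase attributes an anomalous latency to the metric whose deviation contributes most. The metric names, the linear model, and the attribution rule are assumptions for illustration, not PERFCE's actual design.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Offline phase (sketch): learn a statistical model of query latency from
# system metrics, combining passive observations with proactive chaos
# experiments that push each metric beyond its normal operating range.
rng = np.random.default_rng(1)
metrics = ["cpu_util", "disk_io", "net_delay"]
X = np.vstack([rng.uniform(0, 1, (300, 3)),      # passive observations
               rng.uniform(0, 3, (100, 3))])     # chaos experiments
latency = X @ np.array([20.0, 5.0, 60.0]) + rng.normal(0, 2, len(X))
model = LinearRegression().fit(X, latency)
baseline = X[:300].mean(axis=0)                  # normal-operation profile

# Online phase (sketch): given an anomalous sample, attribute the latency
# deviation to each metric's departure from its baseline and rank the
# candidates, a quantitative complement to the qualitative root cause.
anomaly = np.array([0.5, 0.4, 2.5])              # net_delay is abnormal
contrib = model.coef_ * (anomaly - baseline)
for name, c in sorted(zip(metrics, contrib), key=lambda p: -abs(p[1])):
    print(f"{name}: contributes {c:+.1f} ms to the slowdown")
```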