Understanding Automatically-Generated Patches Through Symbolic Invariant Differences
Developer trust is a major barrier to the deployment of automatically-generated patches. Understanding the effect of a patch is a key element of that trust. We find that differences in sets of formal invariants characterize patch differences and that implication-based distances in invariant space characterize patch similarities. When one patch is similar to another it often contains the same changes as well as additional behavior; this pattern is well-captured by logical implication. We can measure differences using a theorem prover to verify implications between invariants implied by separate programs. Although effective, theorem provers are computationally intensive, but we find that string distance is an efficient heuristic for our implication-based distance measurements. We propose to use distances between patches to construct a hierarchy highlighting patch similarities. We evaluate our approach on over 300 patches and find that it correctly categorizes programs into semantically similar clusters. Clustering programs reduces human effort by reducing the number of semantically distinct patches that must be considered by over 50%, thus reducing the time required to establish trust in automatically generated repairs.