RENN: Efficient Reverse Execution with Neural-Network-assisted Alias Analysis
Reverse execution and coredump analysis have long been used to diagnose the root cause of software crashes. Each of these techniques, however, face inherent challenges, such as insufficient capability when handling memory aliases. Recent works have used hypothesis testing to address this drawback, albeit with high computational complexity, making them impractical for real world applications. To address this issue, we propose a new deep neural architecture, which could significantly improve memory alias resolution. At the high level, our approach employs a recurrent neural network (RNN) to learn the binary code pattern pertaining to memory accesses. It then infers the memory region accessed by memory references. Since memory references to different regions naturally indicate a non-alias relationship, our neural architecture can greatly reduce the burden of doing hypothesis testing to track down non-alias relation in binary code.
Different from previous researches that have utilized deep learning for other binary analysis tasks, the neural network proposed in this work is fundamentally novel. Instead of simply using off-the-shelf neural networks, we designed a new recurrent neural architecture that could capture the data dependency between machine code segments.
To demonstrate the utility of our deep neural architecture, we implement it as RENN, a neural network-assisted reverse execution system. We utilize this tool to analyze software crashes corresponding to 40 memory corruption vulnerabilities from the real world. Our experiments show that RENN can significantly improve the efficiency of locating the root cause for the crashes. Compared to a state-of-the-art technique, RENN has 36.25% faster execution time on average, detects an average of 21.35% more non-alias pairs, and successfully identified the root cause of 12.5% more cases.