Program semantics learning is a vital problem in various AI for SE applications i.g., clone detection, code summarization. Learning to represent programs with Graph Neural Networks (GNNs) has achieved state-of-the-art performance in many applications i.g, vulnerability identification, type inference. However, currently, there is a lack of a unified framework with GNNs for distinct applications. Furthermore, most existing GNN-based approaches ignore global relations with nodes, limiting the model to learn rich semantics. In this paper, we propose a unified framework to construct two types of graphs to capture rich code semantics for various SE applications.
Program Display Configuration
Wed 23 Sep
Times are displayed in time zone: (UTC) Coordinated Universal Time
Ethereum has become a widely used platform to enable secure, Blockchain-based financial and business transactions. However, many identified bugs and vulnerabilities in smart contracts have led to serious financial losses, which raises serious concerns about smart contract security. Thus, there is a significant need to better maintain smart contract code and ensure its high reliability. In this research: (1) Firstly, we propose an automated deep learning based approach to learn structural code embeddings of smart contracts in Solidity, which is useful for clone detection, bug detection and contract validation on smart contracts. We apply our approach to more than 22K solidity contracts collected from the Ethereum blockchain, results show that the clone ratio of solidity code is at around 90%, much higher than traditional software. % Our work reveals homogeneous of the Ethereum ecosystem. We collect a list of 52 known buggy smart contracts belonging to 10 kinds of common vulnerabilities as our bug database. Our approach can identify more than 1000 clone related bugs based on our bug databases efficiently and accurately. (2) Secondly, according to developers’ feedback, we have implemented the approach in a web-based tool, named SmartEmbed, to facilitate Solidity developers for using our approach. Our tool can assist Solidity developers to efficiently identify repetitive smart contracts in the existing Ethereum blockchain, as well as checking their contract against a known set of bugs, which can help to improve the users’ confidence in the reliability of the contract. We optimize the implementations of SmartEmbed which is sufficient in supporting developers in real-time for practical uses. The Ethereum ecosystem as well as the individual Solidity developer can both benefit from our research. SmartEmbed website: http://www.smartembed.tools Demo video: https://youtu.be/o9ylyOpYFq8 Replication package: https://github.com/beyondacm/SmartEmbed
Cryptographic algorithms are widely used to protect data privacy in many aspects of daily lives from smart card to cyber-physical systems. Unfortunately, programs implementing cryptographic algorithms may be vulnerable to practical power side-channel attacks, which may infer private data via statistical analysis of the correlation between power consumptions of an electronic device and private data. To thwart these attacks, several masking schemes have been proposed. However, programs that rely on secure masking schemes are not secure a priori. Although some techniques have been proposed for formally verifying masking countermeasures and for quantifying masking strength, they are currently limited to Boolean programs and suffer from low accuracy. In this work, we propose an approach for formally verifying masking countermeasures of arithmetic programs. Our approach is more accurate for arithmetic programs and more scalable for Boolean programs comparing to the existing approaches. We have implemented our methods in a verification tool QMVerif which has been extensively evaluated on cryptographic benchmarks including full AES, DES and MAC-Keccak. The experimental results demonstrate the effectiveness and efficiency of our approach, especially for compositional reasoning.
Smart contracts are Turing-complete programs running on the blockchain. They cannot be modified, even when bugs are detected. The Selfdestruct function is the only way to destroy a contract on the blockchain system and transfer all the Ethers on the contract balance. Thus, many developers use this function to destroy a contract and redeploy a new one when bugs are detected. In this paper, we propose a deep learning-based method to find security issues of Ethereum smart contracts by finding the updated version of a destructed contract. After finding the updated versions, we use open card sorting to find security issues.
Program semantics learning is a vital problem in various AI for SE applications i.g., clone detection, code summarization. Learning to represent programs with Graph Neural Networks (GNNs) has achieved state-of-the-art performance in many applications i.g, vulnerability identification, type inference. However, currently, there is a lack of a unified framework with GNNs for distinct applications. Furthermore, most existing GNN-based approaches ignore global relations with nodes, limiting the model to learn rich semantics. In this paper, we propose a unified framework to construct two types of graphs to capture rich code semantics for various SE applications.
The data race problem is common in the interrupt-driven program, and it is difficult to find as a result of complicated interrupt interleaving. Static analysis is a mainstream technology to detect those problems, however, the synchronization mechanism of interrupt is hard to be processed by the existing method, which brings many false alarms. Eliminating false alarms in static analysis is the main challenge for precisely data race detection. In this paper, we present a framework of static analysis combined with program verification, which performs static analysis to find all potential races, and then verifies every race to eliminate false alarms. The experiment results on related race benchmarks show that our implementation finds all race bugs in the phase of static analysis, and eliminates all false alarms through program verification.
Prior study has identified common anti-patterns in automated repair for C programs. In this work, we study if the same problems exist in Java programs. We performed a manual inspection on the plausible patches generated by Java automated repair tools. We integrated anti-patterns in jGenProg2 and evaluated on Defects4J benchmark. The result shows that the average repair time is reduced by 22.6% and the number of generated plausible patches is reduced from 67 to 29 for 14 bugs in total. Our study provided evidence about the effectiveness of applying anti-patterns in future Java automated repair tools.
This article presents a description of a system for the automatic generation of predictive diagnostic models of CNC machine tools. This system allows machine tool maintenance specialists to select and operate models based on LSTM neural networks to determine the state of elements of CNC machines. Examples of changes in the accuracy of the models used during operation are given to determine the state of the cutting tool (more than 95%) and the bearings of electric motors (more than 91%).
One recent promising direction in reducing costs of mutation analysis is to identify redundant mutations. We propose a technique to discover redundant mutations by proving subsumption relations among method-level mutation operators using weak mutation testing. We conceive and encode a theory of subsumption relations in Z3 for 40 mutation targets (mutations of an expression or statement). Then we prove a number of subsumption relations using the Z3 theorem prover, and reduce the number of mutations in a number of mutation targets. MuJava-M includes some subsumption relations in MuJava. We apply MuJava and MuJava-M to 187 classes of 17 projects. Our approach correctly discards mutations in 74.97% of the cases, and reduces the number of mutations by 72.52%.