Diversity and Inclusion
Women at ASE are invited to meet up and talk over breakfast at the conference hotel.
To sign up for the breakfast please add the breakfast to your ASE conference registration.
This event is part of a series of events focused on diversity and inclusion; to see a full list of events planned, check out the list of Diversity and Inclusion activities.
Automatic program repair (APR) reduces the burden of debugging by directly suggesting likely fixes for the bugs. We believe scalability, applicability, and accurate patch validation are among the main challenges for building practical APR techniques that the researchers in this area are dealing with. In this paper, we describe the steps that we are taking toward addressing these challenges.
When a program is nondeterministic, it is difficult to test and debug. Nondeterminism occurs even in sequential programs: for example, as a result of iterating over the elements of a hash table. This seemingly innocuous and frequently used operation can result in diverging test results.
We have created a type system that can express determinism specifications in a program. The key ideas in the type system are type qualifiers for nondeterminism, order-nondeterminism, and determinism. While state-of-the-art nondeterminism detection tools unsoundly rely on observing runtime output, our approach verifies determinism at compile time, thereby providing stronger soundness guarantees.
We implemented our type system for Java. Our type checker, the Determinism Checker, warns if a program is nondeterministic or verifies that the program is deterministic. In a case study of a 24,000-line software project, it found previously-unknown nondeterminism errors in a program that had been heavily vetted by its developers, who were greatly concerned about nondeterminism errors.
Until 2017, Android smartphones occupied approximately 87% of the smartphone market. The vast market also promotes the development of Android malware. Nowadays, the number of malware targeting Android devices found daily is more than 38,000. With the rapid progress of mobile application programming and anti-reverse-engineering techniques, it is harder to detect all kinds of malware. To address challenges in existing detection techniques, such as data obfuscation and limited codes coverage, we propose a detection approach that directly learns features of malware from Dalvik bytecode based on deep learning technique (CNN). The average detection time of our model is 0.22 seconds, which is much lower than other existing detection approaches. In the meantime, the overall accuracy of our model achieves over 89%.
Providing user satisfaction is a major concern for on-demand multimedia service providers and Internet carriers in Wireless Communications. Traditionally, user satisfaction was measured objectively in terms of throughput and latency. Nowadays the user satisfaction is measured using subjective metrices such as Quality of Experience (QoE). Recently, Smart Media Pricing (SMP) was conceptualized to price the QoE rather than the binary data traffic in multimedia services. In this research, we have leveraged the SMP concept to chalk up a QoE-sensitive multimedia pricing framework to allot price, based on the user preference and multimedia quality achieved by the customer. We begin by defining the utility equations for the provider-carrier and the customer. Then we translate the profit maximizing interplay between the parties into a two-stage Stackelberg game. We model the user personal preference using Prelec weighting function which follows the postulates Prospect Theory (PT). An algorithm has been developed to implement the proposed pricing scheme and determine the Nash Equilibrium. Finally, the proposed smart pricing scheme was tested against the traditional pricing method and simulation results indicate a significant boost in the utility achieved by the mobile customers.
Designing usable APIs is critical to developers’ productivity and software quality but is quite difficult. In this paper, I focus on “boilerplate” code, sections of code that have to be included in many places with little or no alteration, which many experts in API design have said can be an indicator of API usability problems. I investigate what properties make code count as boilerplate, and present a novel approach to automatically mine boilerplate code from a large set of client code. The technique combines an existing API usage mining algorithm, with novel filters using AST comparison and graph partitioning. With boilerplate candidates identified by the technique, I discuss how this technique could help API designers in reviewing their design decisions and identifying usability issues.
Machine image sniping is a difficult-to-detect security vulnerability in cloud computing code. When programmatically initializing a machine, a developer must specify which machine image (operating system and file system) to use as the basis for the new machine. The developer should restrict the search to only those machine images which their organization controls: otherwise, an attacker can insert a similar but malicious image into the public database, where it might be selected instead of the image intended by the developer when initializing a new machine. We present a lightweight type and effect system that detects requests to a cloud provider that are vulnerable to an image sniping attack, or proves that no vulnerable request exists in a codebase. We prototyped our type system for Java programs that request Amazon Web Services machines, and evaluated it on more than 500 codebases, detecting 12 vulnerable requests with only 3 false positives.
Quality control is a challenge of crowdsourcing, especially in software testing. As having different levels of skills, some crowdworkers may not complete assigned tasks well. Therefore, it is necessary to propose some auxiliary methods to assist crowdworkers, raising bug report quality. This paper proposes a novel method, namely CroReG, to generate crowdsourcing reports based on bug screenshots. CroReG leverages image understanding techniques to analyze bug screenshots. Image understanding techniques including deep learning and optical character recognition (OCR) techniques are used comprehensively to analyze the screenshots and then CroReG can automatically generate corresponding bug reports. The preliminary experiment results show that CroReG can effectively generate bug reports containing accurate screenshot captions and providing guidance for crowdworkers.
From the point of view of a software engineer, having safe and optimal controllers for real life systems like cyber physical systems is a crucial requirement before deployment. Given the mathematical model of these systems along with their specifications, model checkers can be used to synthesize controllers for them. The given work proposes novel approaches for making controller analysis easier by using machine learning to represent the controllers synthesized by model checkers in a succinct manner, while also incorporating the domain knowledge of the system. It also proposes the implementation of a visualization tool which will be integrated into existing model checkers. A lucid controller representation along with a tool to visualize it will help the software engineer debug and monitor the system much more efficiently.
In recent years, the extensive application of the Python language in the field of artificial intelligence has made its analysis work more and more valuable. Many static analysis algorithms need to rely on the construction of call graphs. Therefore, the call graph construction for Python language deserves more attention. In this paper, we did a comparative empirical analysis of several widely used Python static call graph tools both quantitatively and qualitatively. Experiments show that the existing Python static call graph tools have a large difference in the construction effectiveness, and there is still room for improvement.
This paper presents a machine learning classifier designed to identify SQL injection vulnerabilities in PHP code. Both classical and deep learning based machine learning algorithms were used to train and evaluate classifier models using input validation and sanitization features extracted from source code files. On ten-fold cross validations a model trained using Convolutional Neural Network(CNN) achieved the highest precision (95.4%), while a model based on Multilayer Perceptron (MLP) achieved the highest recall (63.7%) and the highest f-measure (0.746).
It has been long suggested that commit messages can greatly facilitate code comprehension. However, developers may not write good commit messages in practice. Neural machine translation (NMT) has been suggested to automatically generate commit messages. Despite the efforts in improve NMT algorithms, the quality of the generated commit messages is not yet satisfactory. This paper, instead of improving NMT algorithms, suggests that proper preprocessing of code changes into concise inputs is quite critical to train NMT. We approach it with semantic analysis of code changes. We collect a real-world dataset with 50k+ commits of popular Java projects, and verify our idea with comprehensive experiments. The results show that preprocessing inputs with code semantic analysis can improve NMT significantly. This work sheds light to how to apply existing DNNs designed by the machine learning community, e.g., NMT models, to complete software engineering tasks.