Tenure is a machine. It has parts. The process can be broken up and each part systematically simplified.
No, it won’t make it a breeze. But yes, it will make it manageable and (I hope) fun.
Full prof, ex-nurse, rocketman, taxi-driver, journalist (it all made sense at the time).
The growth in the popularity of smart contracts has been accompanied by a rise in security attacks targeting them, which have led to financial losses of millions of dollars and an erosion of trust. To enable developers to discover vulnerabilities in smart contracts, several static analysis tools have been proposed. However, despite the numerous bug-finding tools, security vulnerabilities abound in smart contracts, and developers still rely on finding vulnerabilities manually. Our goal in this dissertation study is to expand the space of security vulnerability detection by proposing effective static analysis approaches for smart contracts. We study the effectiveness of existing static analysis tools and propose detection approaches that analyze how contract code depends on user inputs that lead to security vulnerabilities. Our evaluation shows that existing static analysis tools for smart contracts produce significant false negatives and false positives. Further, the results show that our first vulnerability detection approach achieves a significant improvement in detection effectiveness compared to prior work.
Doctoral Symposium
Mon 10 Oct 2022 10:30 - 11:00 at Ambassador A - Session 2 - AI & Software Engineering
Understanding binary code is an essential but complex software engineering task for reverse engineering, malware analysis, and compiler optimization. Unlike source code, binary code has limited semantic information, which makes it challenging for human comprehension. At the same time, compiling source to binary code, or transpiling among different programming languages (PLs) can provide a way to introduce external knowledge into binary comprehension. We propose to develop Artificial Intelligence (AI) models that aid human comprehension of binary code. Specifically, we propose to incorporate domain knowledge from large corpora of source code (e.g., variable names, comments) to build AI models that capture a generalizable representation of binary code. Lastly, we will investigate metrics to assess the performance of models that apply to binary code by using human studies of comprehension.
Yifan is a researcher focusing on AI for Software Engineering (AI4SE), Graph Data Mining, and Domain Generalization. He is currently pursuing a Ph.D. in Computer Science at Vanderbilt University, affiliated with the Institute for Software Integrated Systems.
We are hiring Ph.D. students, postdocs, and research interns. Feel free to check the recruitment documents if you are interested in:
Prof. Leach’s Lab: https://kjl.name/recruitment.pdf
Prof. Huang’s Lab: https://yuhuang-lab.github.io/index_files/Huang-Recruitment.pdf
Mon 10 Oct 2022 11:00 - 11:30 at Ambassador A - Session 2 - AI & Software Engineering
Background: Automated intelligent toolchains are widely used in software engineering to deploy automated program repair techniques, or in software security to identify vulnerabilities. Overall Research Problem: Most studies of automated intelligent toolchains report uncertainty and evaluations only for the individual components of the chain. How do we calculate the uncertainty and error propagation across the overall automated toolchain? Approach: I plan to replicate research case studies to collect data and design a methodology to reconstruct the overall correctness metrics of the toolchains, or to identify missing variables. Further confirmatory experiments with humans will be performed. Finally, I will implement an artifact to automate the overall assessment of automated toolchains. Current Status: A preliminary validation of published studies showed promising results.
Mon 10 Oct 2022 11:30 - 12:00 at Ambassador A - Session 2 - AI & Software Engineering
Software developers often have to make many decisions. The underlying logic behind these decisions, also called design rationale, represents beneficial and valuable information. In the past, researchers have tried to automatically extract and exploit this information. However, prior techniques are only applicable to specific contexts, and there has been insufficient progress toward an automated end-to-end rationale extraction and management system. In this research project, we propose to use Natural Language Processing (NLP) and Machine Learning (ML) techniques to create a system for the automated extraction, structuring, and management of design rationale. This system would support and ensure the consistency and coherence of the development process.
Mon 10 Oct 2022 13:30 - 14:00 at Ambassador A - Session 3 - Human Factors & Testing
Toxic and unhealthy conversations among developers may reduce the professional harmony and productivity of Free and Open Source Software (FOSS) projects. For example, toxic code review comments may provoke pushback from an author asked to complete suggested changes. A toxic exchange with another person may hamper future communication and collaboration. Research also suggests that toxicity disproportionately impacts newcomers, women, and other participants from marginalized groups. Therefore, toxicity is a barrier to promoting diversity, equity, and inclusion. Since toxic communications are not uncommon among FOSS communities and may have serious repercussions, the primary objective of my proposed dissertation is to automatically identify and mitigate toxicity during developers’ textual interactions. Toward this goal, I aim to: i) build an automated toxicity detector for the Software Engineering (SE) domain, ii) identify the notion of toxicity across demographics, and iii) analyze the impacts of toxicity on the outcomes of Open Source Software (OSS) projects.
Mon 10 Oct 2022 14:00 - 14:30 at Ambassador A - Session 3 - Human Factors & Testing
Contemporary software development organizations are dominated by straight males and lack diversity. As a result, people from other demographics, such as women and LGBTQ+ individuals, often encounter bias, sexism, and misogyny. Due to negative experiences, many women switch careers. Therefore, biases pose barriers to promoting diversity and inclusion. To benefit from diverse pools of talent and reduce the attrition rate of minorities, we need to identify the degree and effect of various biases and develop mitigation strategies. Therefore, my dissertation study aims at promoting diversity and inclusion among software development organizations by identifying the manifestation, magnitude, and frequency of various gender biases. For this purpose, I plan to i) investigate the effect of contributors’ gender on the code review process of Free/Libre Open Source Software (FLOSS) projects, ii) investigate the frequency of different dimensions of gender bias and their effects, and iii) develop a tool to identify sexist, misogynistic, and derogatory (SMD) texts.
Mon 10 Oct 2022 14:30 - 15:00 at Ambassador A - Session 3 - Human Factors & Testing
The use of non-traditional computing devices is growing rapidly. One paradigm of interest is that of chemical reaction networks (CRNs), which can model and use chemical interactions for computation. These CRNs are used to develop programs at the nanoscale, for applications such as intelligent drug delivery. In practice, these programs are developed in simulation environments and then compiled into physical systems. A challenge when designing CRNs for computation is the lack of techniques to verify and validate correctness. In this work, I adapt software testing and repair techniques for use in this domain. Initial work has proposed a testing framework to handle the challenges posed by these types of programs, which include CRNs that exhibit stochastic behavior and are capable of distributed computation. I further propose extending this work to implement automated program repair of CRN models and automated test generation via program invariants. Future work will develop a notion of fault localization for these programs, develop a theory of mutation generation, and address issues regarding flakiness present in this computing paradigm.
Refactoring code manually can be complex. Several refactoring tools have been developed to reduce the effort needed to create more readable, adaptable, and maintainable code. However, most of them still provide feedback, assistance, and support too late for developers to act on while improving their software. That’s where the concept of Live Refactoring comes in. We believe that immediately and continuously suggesting refactoring candidates in the code will help reduce this problem. Therefore, we prototyped a Live Refactoring Environment that identifies, recommends, and applies Extract Method refactorings. We carried out an empirical experiment showing that our approach helped developers produce higher-quality code and improved their refactoring experience.
In the context of user interface-oriented software development, translating a GUI into code requires sufficient knowledge to identify visual elements and to code them for one or more platforms. Other issues are also important, such as reuse, componentization, and understanding the behavior of trivial visual elements. This is a repetitive and tedious task that could be automated. Automating this task poses many challenges, which depend on the starting point (hand-drawn or high-fidelity images), the detection and recognition of visual elements from images, their data representation, and the code generation itself. This work aims to build a model that automates the process of code generation from images, making it possible to infer which visual elements are reusable across a range of GUIs and to navigate between GUIs in the same application. This study is being conducted through Design Science Research (DSR), and so far some classes of problems and artifacts have been found that help to understand how to extract and represent data from GUIs and generate web, Android, and iOS application code. The open questions concern how to identify elements individually and in groups, how to represent these visual elements in a way that code generators can use, and which code generator is most efficient for this type of problem. The answers to these questions are part of the next steps of this work.
Modern code review (MCR) is a widely adopted software quality assurance practice in the contemporary software industry. As software developers spend significant amounts of time on MCR activities, even a small improvement in MCR effectiveness will yield significant savings. As most MCR activities depend heavily on manual work, there are significant opportunities to improve effectiveness through tool support. To address these challenges, the primary objective of my proposed dissertation is to improve the effectiveness of modern code reviews by automating reviewer selection and bug identification. Toward this goal, I propose three studies. The first study aims to investigate the notion of useful MCRs and the factors influencing MCR usefulness. The second study aims to develop a reviewer recommendation system that leverages a reviewer’s prior history of providing useful feedback under similar contexts. Finally, the third study aims to improve the effectiveness of static analysis tools by leveraging bugs identified during prior reviews.