While CUDA has been the dominant parallel computing platform and programming model for general-purpose GPU computing, CUDA synchronization poses significant challenges for GPU programmers due to its intricate parallel computing mechanism and coding practices. In this paper, we propose AuCS, the first general framework to automate synchronization for CUDA kernel functions. AuCS transforms the original LLVM-level control flow graph of a CUDA program in a semantics-preserving manner to explore possible locations for barrier functions. Accordingly, AuCS develops mechanisms to place barrier functions correctly, automating synchronization in multiple erroneous (and hard-to-detect) synchronization scenarios, including data races, barrier divergence, and redundant barrier functions. To evaluate the effectiveness and efficiency of AuCS, we conduct an extensive set of experiments, and the results suggest that AuCS can automate synchronization in 20 out of 24 erroneous synchronization scenarios.
Thu 14 Nov
10:40 - 11:00 Talk | MAP-Coverage: a Novel Coverage Criterion for Testing Thread-Safe Classes | Zan Wang (College of Intelligence and Computing, Tianjin University), Yingquan Zhao (College of Intelligence and Computing, Tianjin University), Shuang Liu (College of Intelligence and Computing, Tianjin University), Jun Sun (Singapore Management University, Singapore), Xiang Chen (School of Information Science and Technology, Nantong University), Huarui Lin (College of Intelligence and Computing, Tianjin University)
11:00 - 11:20 Talk | Automating Non-Blocking Synchronization In Concurrent Data Abstractions | Jiange Zhang (University of Colorado Colorado Springs), Qing Yi (University of Colorado Colorado Springs), Damian Dechev (University of Central Florida) | Pre-print
11:20 - 11:40 Talk | Automating CUDA Synchronization via Program Transformation | Mingyuan Wu (Southern University of Science and Technology), Lingming Zhang (The University of Texas at Dallas), Cong Liu (Eindhoven University of Technology), Shin Hwei Tan (Southern University of Science and Technology), Yuqun Zhang (Southern University of Science and Technology)
11:40 - 12:00 Talk | Efficient Transaction-Based Deterministic Replay for Multi-threaded Programs | Ernest Bota Pobee (City University of Hong Kong), Xiupei Mei (City University of Hong Kong), Wing-Kwong Chan (City University of Hong Kong, Hong Kong)
12:00 - 12:10 Demonstration | VeriSmart 2.0: Swarm-Based Bug-Finding for Multi-Threaded Programs with Lazy-CSeq | Bernd Fischer (Stellenbosch University), Salvatore La Torre (Università degli Studi di Salerno), Gennaro Parlato (University of Molise)
12:10 - 12:20 Demonstration | ConVul: An Effective Tool for Detecting Concurrency Vulnerabilities | Ruijie Meng (University of Chinese Academy of Sciences), Biyun Zhu (University of Chinese Academy of Sciences), Hao Yun (University of Chinese Academy of Sciences), Haicheng Li (University of Chinese Academy of Sciences), Yan Cai (Institute of Software, Chinese Academy of Sciences), Zijiang Yang (Western Michigan University)