Reproducible experiments are an important pillar of well-founded research. Having benchmarks that are publicly available and representative of real-world applications is an important step towards that goal: they allow us to measure the results of a tool in terms of its precision, recall, and overall accuracy. Having such benchmarks is different from having a corpus of programs; a benchmark needs labelled data that can serve as ground truth when measuring precision and recall.
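
For illustration only, here is a minimal sketch (in Python, with hypothetical tool reports and labels) of how labelled ground truth makes precision and recall measurable for, say, a bug-finding tool:

    # Illustrative sketch only; the tool reports and ground-truth labels below are hypothetical.
    def precision_recall(reported, ground_truth):
        """Score a tool's findings against a benchmark's labelled ground truth."""
        true_positives = len(reported & ground_truth)
        precision = true_positives / len(reported) if reported else 1.0
        recall = true_positives / len(ground_truth) if ground_truth else 1.0
        return precision, recall

    # A bug finder reports warnings at (file, line) locations;
    # the benchmark labels the locations of the real bugs.
    reported = {("parser.c", 42), ("lexer.c", 7), ("eval.c", 99)}
    ground_truth = {("parser.c", 42), ("eval.c", 101)}

    precision, recall = precision_recall(reported, ground_truth)
    print(f"precision={precision:.2f} recall={recall:.2f}")  # precision=0.33 recall=0.50

With an unlabelled corpus, only the raw number of reports could be counted; without the ground-truth labels, true and false positives are indistinguishable.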

With the growing adoption of Artifact Evaluation Committees at most PL/SE conferences, reproducibility studies are making their way into the calls for papers of top conferences such as ECOOP and ISSTA. In some domains there are established benchmarks used by the community; in other domains, however, the lack of a benchmark prevents researchers from measuring the true value of a newly developed technique.

BenchWork aims to provide a platform for researchers and practitioners to share their experiences and ideas, discussing key lessons from the PL and SE communities in order to improve the benchmarks that are currently available, to start or continue the discussion on developing new benchmarks, and to examine their role in research and industry.

Supported By

Oracle Labs

Talks

  • A Micro-Benchmark for Dynamic Program Behaviour
  • Analyzing Duplication in JavaScript
  • AndroZoo: Lessons Learnt After 2 Years of Running a Large Android App Collection
  • Benchmarking WebKit
  • Building a Node.js Benchmark: Initial Steps
  • In Search of Accurate Benchmarking
  • InspectorClone: Evaluating Precision of Clone Detection Tools
  • Opening Remarks
  • Performance Monitoring in Eclipse OpenJ9
  • Real World Benchmarks for JavaScript
  • The Architecture Independent Workload Characterization
  • Towards a Data-Curation Platform for Code-Centric Research

Call for Talks

We welcome contributions in the form of talk abstracts on (but not limited to) the following topics:

  • Experiences with benchmarking in the areas of program-analysis (e.g., finding bugs, measuring points-to sets)
  • Experiences with benchmarking of virtual machines (e.g., measuring memory management overhead)
  • Experiences with benchmarking in the areas of software engineering (e.g., clone detection, testing techniques)
  • Infrastructure related to support of a benchmark over time, across different versions of the relevant programs
  • Metrics that are valuable in the context of incomplete programs
  • Support for dynamic analysis, where the benchmark programs need to be run
  • Automating the creation of benchmarks
  • Licensing issues
  • What types of programs should be included in program-analysis benchmarks?
  • What type of analysis do you perform?
  • What build systems does your tool support?
  • What program-analysis benchmarks do you typically use? What are their pros and cons?
  • What are the useful metrics to consider when creating program-analysis benchmarks?
  • How can we handle incomplete code in benchmarks?
  • How can program-analysis benchmarks provide good support for dynamic analyses?
  • How can we automate the creation of program-analysis benchmarks?

Wed 18 Jul
Times are displayed in time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

11:00 - 12:30: Real-World Benchmarking (BenchWork) at Hanoi

11:00 - 11:10
Opening Remarks
Karim Ali (University of Alberta), Cristina Cifuentes (Oracle Labs)

11:10 - 11:40
Real World Benchmarks for JavaScript

11:40 - 12:00
In Search of Accurate Benchmarking
Edd Barrett (King's College London), Sarah Mount (King's College London), Laurence Tratt (King's College London)

12:00 - 12:30
AndroZoo: Lessons Learnt After 2 Years of Running a Large Android App Collection
Kevin Allix (University of Luxembourg)

14:00 - 15:30: JavaScript & Dynamic Behaviour (BenchWork) at Hanoi

14:00 - 14:30
Benchmarking WebKit

14:30 - 14:50
Analyzing Duplication in JavaScript
Petr Maj (Czech Technical University), Celeste Hollenbeck (Northeastern University, USA), Shabbir Hussain (Northeastern University), Jan Vitek (Northeastern University)

14:50 - 15:10
Building a Node.js Benchmark: Initial Steps
Petr Maj (Czech Technical University), François Gauthier (Oracle Labs), Celeste Hollenbeck (Northeastern University, USA), Jan Vitek (Northeastern University), Cristina Cifuentes (Oracle Labs)

15:10 - 15:30
A Micro-Benchmark for Dynamic Program Behaviour
Li Sui (Massey University, New Zealand), Jens Dietrich (Massey University), Michael Emery (Massey University), Amjed Tahir (Massey University), Shawn Rasheed (Massey University)

16:00 - 17:40: Software Engineering & Compilers (BenchWork) at Hanoi

16:00 - 16:30
InspectorClone: Evaluating Precision of Clone Detection Tools

16:30 - 16:50
Towards a Data-Curation Platform for Code-Centric Research
Ben Hermann (University of Paderborn), Lisa Nguyen Quang Do (Paderborn University), Eric Bodden (Heinz Nixdorf Institut, Paderborn University and Fraunhofer IEM)

16:50 - 17:10
The Architecture Independent Workload Characterization
Beau Johnston (Australian National University)

17:10 - 17:40
Performance Monitoring in Eclipse OpenJ9

BenchWork has limited funding to support travel, accommodation, or registration for students who are studying programming languages or software engineering and want to participate in the workshop. Funding will be available to students who have not received travel support from other sources.

To apply, please email Karim Ali with your name and affiliation, your supervisor's name, the topic of your Master's or PhD studies, the type of funding requested (travel, accommodation, or registration), and the cost. The application deadline is July 1st.