ASE 2020
Mon 21 - Fri 25 September 2020 Melbourne, Australia
Wed 23 Sep 2020 17:10 - 17:30 at Kangaroo - Software Engineering for AI (3) Chair(s): Iftekhar Ahmed

Deep learning (DL) training algorithms utilize nondeterminism to improve models' accuracy and training efficiency. Hence, multiple identical training runs (e.g., identical training data, algorithm, and network) produce different models with different accuracy and training time. In addition to these algorithmic factors, DL libraries (e.g., TensorFlow and cuDNN) introduce additional variance (referred to as implementation-level variance) due to parallelism, optimization, and floating-point computation. This work is the first to study the variance of DL systems and the awareness of this variance among researchers and practitioners. Our experiments on three datasets with six popular networks show large overall accuracy differences among identical training runs. Even after excluding weak models, the accuracy difference is still 10.8%. In addition, implementation-level factors alone cause the accuracy difference across identical training runs to be up to 2.9%, the per-class accuracy difference to be up to 52.4%, and the training time to convergence difference to be up to 145.3%. All core (TensorFlow, CNTK, and Theano) and low-level libraries exhibit implementation-level variance across all evaluated versions. Our researcher and practitioner survey shows that 83.8% of the 901 participants are unaware of or unsure about any implementation-level variance. In addition, our literature survey shows that only 19.5±3% of papers in recent top software engineering (SE), AI, and systems conferences use multiple identical training runs to quantify the variance of their DL approaches. This paper raises awareness of DL variance and directs SE researchers to challenging tasks such as creating deterministic DL libraries for debugging and improving the reproducibility of DL software and results.
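The sketch below illustrates the kind of measurement the abstract advocates: repeating identical training runs and reporting the spread in test accuracy. It is not the paper's experimental harness; the dataset (MNIST), the small dense network, and the run count of five are illustrative assumptions.

```python
# Illustrative sketch (assumptions: MNIST, a small dense network, 5 runs):
# quantify accuracy variance across identical training runs with Keras.
import tensorflow as tf

def build_model():
    # Same architecture for every run: identical data, network, hyperparameters.
    return tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

accuracies = []
for run in range(5):  # identical training runs
    tf.keras.backend.clear_session()
    model = build_model()
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(x_train, y_train, epochs=3, verbose=0)
    _, acc = model.evaluate(x_test, y_test, verbose=0)
    accuracies.append(acc)

print("per-run accuracy:", [f"{a:.4f}" for a in accuracies])
print(f"max accuracy difference: {max(accuracies) - min(accuracies):.4f}")
```

Even this simple setup typically yields different accuracies on each run; reporting the spread (rather than a single number) is the practice the paper finds missing in most surveyed publications.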

Wed 23 Sep
Times are displayed in time zone: (UTC) Coordinated Universal Time

17:10 - 18:10: Software Engineering for AI (3) - Research Papers / Tool Demonstrations at Kangaroo
Chair(s): Iftekhar Ahmed (University of California at Irvine, USA)
17:10 - 17:30
Talk
Problems and Opportunities in Training Deep Learning Software Systems: An Analysis of Variance (ACM Distinguished Paper)
Research Papers
Viet Hung Pham (University of Waterloo), Shangshu Qian (Purdue University), Jiannan Wang (Purdue University), Thibaud Lutellier (University of Waterloo), Jonathan Rosenthal (Purdue University), Lin Tan (Purdue University, USA), Yaoliang Yu (University of Waterloo), Nachiappan Nagappan (Microsoft Research)
Pre-print
17:30 - 17:50
Talk
NeuroDiff: Scalable Differential Verification of Neural Networks using Fine-Grained Approximation
Research Papers
Brandon Paulsen (University of Southern California), Jingbo Wang (University of Southern California), Jiawei Wang (University of Southern California), Chao Wang (USC)
Pre-print
17:50 - 18:00
Talk
RepoSkillMiner: Identifying software expertise from GitHub repositories using Natural Language Processing
Tool Demonstrations
Efstratios Kourtzanidis (University of Macedonia), Alexander Chatzigeorgiou (University of Macedonia), Apostolos Ampatzoglou (University of Macedonia)
Pre-print, Media Attached, File Attached