Supplemental Material for ESEC/FSE 2015 paper: Learning Performance-Influence Models for Highly Configurable Systems

This page provides supplemental material for the ESEC/FSE 2015 submission. In this submission, we propose a novel approach that, for the first time, combines binary and numeric configuration options in performance analysis. We use adapted machine-learning and sampling techniques to learn a performance-influence model, which consists of terms that describe the performance influences of individual configuration options and of their interactions. In two experiments, we evaluated learning accuracy (in terms of prediction error), correctness, and measurement effort.

First Study: Synthetic Models

In the first experiment, we use synthetic models derived from real-world performance models. Because the real-world models contain only binary options, we enriched them with numeric options. The table below lists these synthetic models together with the learning data. The learning data records each iteration of our learning approach, in which we begin with a simple model and extend it with more complex terms (i.e., interactions and more complex functions). The log file shows the learned model and its error at each step. We also provide the variability model for use with our tool SPL Conqueror (https://github.com/nsiegmun/SPLConqueror), which implements our approach and is open source.
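The iterative scheme described above (start with a simple model, then add terms step by step) can be sketched as a greedy forward selection. This is a minimal illustration and not the SPL Conqueror implementation: the function names `term_value`, `fit`, and `learn`, the squared-error criterion, the stopping threshold, and the toy data with options `A` and `N` are all assumptions made for this example.

```python
def term_value(term, config):
    """A term is a tuple of option names; its value is the product of their values."""
    value = 1.0
    for option in term:
        value *= config[option]
    return value


def fit(terms, configs, y):
    """Least-squares fit via the normal equations.

    Returns (coefficients, sum of squared errors), or None if the terms are collinear.
    """
    cols = [[term_value(t, c) for c in configs] for t in terms]
    k = len(cols)
    # Augmented normal equations [X^T X | X^T y].
    a = [[sum(p * q for p, q in zip(cols[i], cols[j])) for j in range(k)]
         + [sum(p * q for p, q in zip(cols[i], y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-9:
            return None
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * p for x, p in zip(a[r], a[col])]
    coeffs = [a[i][k] / a[i][i] for i in range(k)]
    preds = [sum(b * v[n] for b, v in zip(coeffs, cols)) for n in range(len(y))]
    sse = sum((p - t) ** 2 for p, t in zip(preds, y))
    return coeffs, sse


def learn(configs, y, options, max_terms=6, min_improvement=1e-6):
    """Start with a constant term and greedily add the term that lowers the error most."""
    model = [()]  # the empty term stands for the constant base performance
    coeffs, error = fit(model, configs, y)
    while len(model) < max_terms:
        # Candidates: every existing term extended by one more option.
        candidates = {tuple(sorted(t + (o,))) for t in model for o in options} - set(model)
        best = None
        for cand in sorted(candidates):
            result = fit(model + [cand], configs, y)
            if result is not None and (best is None or result[1] < best[2]):
                best = (cand, result[0], result[1])
        if best is None or error - best[2] < min_improvement:
            break
        model.append(best[0])
        coeffs, error = best[1], best[2]
    return model, coeffs, error


# Toy data (assumed): binary option A, numeric option N, with one interaction.
configs = [{"A": a, "N": n} for a in (0, 1) for n in (0, 1, 2, 3)]
y = [10 + 5 * c["A"] + 2 * c["N"] - 3 * c["A"] * c["N"] for c in configs]
model, coeffs, error = learn(configs, y, ["A", "N"])
for term, coeff in zip(model, coeffs):
    print(round(coeff, 3), "*", " * ".join(term) or "root")
```

Because each candidate extends an existing term by one option, interactions and higher-order numeric terms only become available after their simpler parts are in the model, which mirrors the "simple first, more complex later" progression recorded in the log files.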

System	Synthetic Model	Log-File	Variability Models
AJStats	11899,3227983062 * root + 2983,87559778446 * CodeFormatter + 10 * Num1 + -1,5 * Num1 * InterTypeDeclare	DL	DL
Apache	1206,25 * root + 935,625000000001 * KeepAlive + -173,125 * HostnameLookups + -186,875000000001 * AccessLog + 188,90625 * InMemory + -104,375000000001 * ExtendedStatus + 42,499999999999 * FollowSymLinks + 51,2499999999999 * EnableSendfile + 2 * Num1 * Num1 + 3 * Num2 + -2,5 * Num1 * Num2	DL	DL
BDB C	0,05 * PAGESIZE + -0,12 * CACHESIZE + 2,4 * HAVE_HASH + 29,88 * HAVE_CRYPTO * HAVE_HASH + 0,35 * DIAGNOSTIC + 0,12 * HAVE_STATISTICS + 7,2 * HAVE_CRYPTO + 0,5 * PAGESIZE * HAVE_CRYPTO + 5,5 * HAVE_CRYPTO * HAVE_VERIFY + 0,12 * HAVE_HASH * HAVE_REPLICATION + 5,63 * HAVE_CRYPTO * HAVE_STATISTICS + 0,26 * HAVE_CRYPTO * HAVE_HASH * HAVE_SEQUENCE + 8,04 * HAVE_HASH * HAVE_VERIFY * HAVE_STATISTICS + -2,37 * HAVE_CRYPTO * HAVE_HASH * HAVE_REPLICATION * HAVE_VERIFY + -0,59 * HAVE_CRYPTO * HAVE_HASH * HAVE_REPLICATION + -0,15 * CACHESIZE * HAVE_CRYPTO + 1 * root	DL	DL
BDB J	98599,5916666667 * root + 237630,811666667 * Finest + -46072,028333333 * S100MiB + -189466,115 * S100MiB * Finest + 2 * Num1 + 4 * Num2 + 3,5 * Num3 + 1,5 * Num4	DL	DL
Clasp	173554,681081086 * root + 318523,818532818 * heuristicUnit + -103411,870761673 * eq + -24600,5000000002 * heuristicVsids + -11816,7857142856 * heuristicVmtf + -33557,8961038976 * heuristic + -95375,3513513509 * heuristicUnit * satPreproYes + 3990,79729729646 * transExt * satPreproYes + -136928,416666666 * eq * heuristicUnit + 12309,4990990994 * eq * satPreproYes + 33925,0833333346 * eq * heuristic + -643,428571428088 * backprop * heuristicVsids + -11876,2857142853 * backprop * heuristicUnit + 1620,24242424222 * eq * backprop + -7205,2500000002 * eq * heuristicBerkmin + -2 * Num1 * Num2 + 10 * Num3 * Num4	DL	DL
LLVM	207 * time_passes + 16 * gvn + 16 * licm + 12 * instcombine + 14 * inline + 3,5 * time_passes * Num1 * Num2 + 5,5 * gvn * licm * Num1 * Num1 + -3,7 * instcombine * inline * Num2	DL	DL
lrzip	43838 * level + 2218747 * compressionZpaq + 288311 * compressionLrzip + 191662 * compressionBzip2 + 34718 * compressionGzip + 11946 * encryption + 6676 * compression + 3433850 * compressionZpaq * level9 + 836940 * compressionLrzip * level8 + 720098 * compressionLrzip * level7 + 3415670 * compressionZpaq * level8 + 485719 * compressionLrzip * level9 + -1597534 * compressionZpaq * level1 + -1597084 * compressionZpaq * level3 + -1596575 * compressionZpaq * level2 + 111344 * compressionGzip * level9 + 102375 * compressionGzip * level8 + 59973 * compressionGzip * level7 + -129840 * compressionLrzip * level2 + -128920 * compressionLrzip * level1 + 42831 * compressionGzip * level6 + 21313 * compressionGzip * level5 + -55078 * compressionLrzip * level3 + 43656 * compressionLrzip * level6 + -37020 * compressionBzip2 * level1 + 3,5 * Num1 * Num2 + 4 * Num3 + 5 * Num4 * Num4	DL	DL
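To read the synthetic models in the table, it can help to evaluate one for a concrete configuration. The following sketch parses the AJStats model as printed above (coefficients use a decimal comma) and computes its predicted value. The helper names `parse_model` and `predict` are hypothetical, the value `Num1 = 4` is an arbitrary choice for illustration (the valid ranges are defined in the variability models, not on this page), and treating unlisted options as deselected is an assumption.

```python
def parse_model(text):
    """Parse 'coef * opt * opt + coef * opt ...' into a list of (coef, [options])."""
    terms = []
    for part in text.split(" + "):
        factors = [f.strip() for f in part.split("*")]
        # Coefficients are written with a decimal comma, e.g. "-1,5".
        coefficient = float(factors[0].replace(",", "."))
        terms.append((coefficient, factors[1:]))
    return terms


def predict(terms, config):
    """Sum all terms; each term is its coefficient times the product of its option values."""
    total = 0.0
    for coefficient, options in terms:
        value = coefficient
        for option in options:
            value *= config.get(option, 0)  # assumption: unlisted options are off
        total += value
    return total


ajstats = ("11899,3227983062 * root + 2983,87559778446 * CodeFormatter"
           " + 10 * Num1 + -1,5 * Num1 * InterTypeDeclare")
terms = parse_model(ajstats)
config = {"root": 1, "CodeFormatter": 1, "InterTypeDeclare": 1, "Num1": 4}
result = predict(terms, config)
print(result)
```

In this configuration the interaction term contributes -1,5 * 4 * 1 = -6, which partly offsets the 10 * 4 = 40 contributed by Num1 alone.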

Binary Sampling Heuristics: Option-Wise: OW; Pair-Wise: PW; Negative Option-Wise: nOW
Numeric Sampling Designs: Plackett-Burman Design: PBD; Hyper Sampling: HS; Central-Composite Design: CCD; Box-Behnken Design: BB
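As a rough illustration of the three binary sampling heuristics abbreviated above, the sketch below generates configurations for a small set of options. It is a simplification under stated assumptions: it ignores all constraints from the variability model (a real sampler has to produce valid configurations only), the function names are hypothetical, and the three Apache option names are used only as example input.

```python
from itertools import combinations


def option_wise(options):
    """OW: one configuration per option, with only that option enabled."""
    return [{o: int(o == enabled) for o in options} for enabled in options]


def negative_option_wise(options):
    """nOW: one configuration per option, with all options enabled except that one."""
    return [{o: int(o != disabled) for o in options} for disabled in options]


def pair_wise(options):
    """PW: one configuration per pair of options, with only that pair enabled."""
    return [{o: int(o in pair) for o in options} for pair in combinations(options, 2)]


options = ["KeepAlive", "AccessLog", "InMemory"]
ow = option_wise(options)
now = negative_option_wise(options)
pw = pair_wise(options)
for name, sample in (("OW", ow), ("nOW", now), ("PW", pw)):
    print(name, sample)
```

Option-wise sampling exposes the influence of individual options, pair-wise sampling targets two-way interactions, and negative option-wise sampling shows what changes when a single option is removed from an otherwise full configuration.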

Second Study: Real-World Systems

In the second experiment, we use six real-world configurable systems to evaluate measurement effort and prediction error. Again, we provide the log file for learning, the variability model, and also the raw measurement data.

System	Log-File	Variability Models	Raw Measurements
Dune	DL	DL	DL
JavaGC	DL	DL	DL
Hipacc	DL	DL	DL
HSMGP	DL	DL	DL
SaC	DL	DL	DL
x264	DL	DL	DL
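When comparing a learned model against the raw measurements, one common way to express prediction error is the mean relative error. The sketch below shows that computation; whether the log files use exactly this definition is an assumption, and the function name and the sample numbers are made up for illustration.

```python
def mean_relative_error(measured, predicted):
    """Mean of |measured - predicted| / |measured|, in percent (measured values must be non-zero)."""
    errors = [abs(m - p) / abs(m) for m, p in zip(measured, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Made-up numbers: two configurations, off by 10% and 5% respectively.
error_percent = mean_relative_error([100.0, 200.0], [110.0, 190.0])
print(error_percent)
```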