Supplemental Material for ESEC/FSE 2015 paper: Learning Performance-Influence Models for Highly Configurable Systems



This page provides supplemental material for our ESEC/FSE 2015 submission. In the paper, we propose an approach that, for the first time, combines binary and numeric configuration options in performance analysis. We use adapted machine-learning and sampling techniques to learn a performance-influence model, which consists of terms that correspond to the performance influences of individual configuration options and of their interactions. In two experiments, we evaluated learning accuracy (in terms of prediction error), correctness, and measurement effort.

First Study: Synthetic Models


In the first experiment, we use synthetic models derived from real-world performance models. Because the real-world models contain only binary options, we enriched them with numeric options. Below, we show these synthetic models together with the learning data. The learning data records each iteration of our learning approach, in which we begin with a simple model and extend it with more complex terms (i.e., interactions and more complex functions). The log file shows the error at each step and the model learned. We also provide the variability model for use with our tool SPL Conqueror (https://github.com/nsiegmun/SPLConqueror), which implements our approach and is open source.
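The iterative idea described above (start with a simple model, then extend it with more complex terms) can be illustrated with a simplified forward-selection sketch. This is not the SPL Conqueror implementation: the function names, the stopping rule (`max_terms`, `min_gain`), the use of plain least squares, and the toy data are all assumptions made for illustration, and any backward or pruning step is left out.

```python
def term_value(term, config):
    """A term is a tuple of option names; its value is the product of their settings."""
    value = 1.0
    for option in term:
        value *= config[option]
    return value

def least_squares(X, y):
    """Solve the normal equations by Gauss-Jordan elimination; None if singular."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-9:
            return None
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def fit(terms, configs, y):
    """Return (coefficients, mean squared error) for a fixed list of terms."""
    X = [[term_value(t, c) for t in terms] for c in configs]
    coefs = least_squares(X, y)
    if coefs is None:
        return None, float("inf")
    mse = sum((sum(b * x for b, x in zip(coefs, row)) - t) ** 2
              for row, t in zip(X, y)) / len(y)
    return coefs, mse

def learn(options, configs, y, max_terms=6, min_gain=1e-6):
    """Greedy forward selection: add the candidate term that lowers the error most."""
    model = [()]  # the empty tuple is the constant (root) term
    coefs, error = fit(model, configs, y)
    candidates = {(o,) for o in options}
    while len(model) < max_terms:
        best = None
        for cand in sorted(candidates - set(model)):
            c, e = fit(model + [cand], configs, y)
            if best is None or e < best[2]:
                best = (cand, c, e)
        if best is None or error - best[2] < min_gain:
            break  # no candidate improves the fit noticeably
        model.append(best[0])
        coefs, error = best[1], best[2]
        # Combine the new term with every option to propose interactions
        # and higher-order terms for the next iteration.
        candidates |= {tuple(sorted(best[0] + (o,))) for o in options}
    return model, coefs, error

# Toy data (hypothetical): one binary option A, one numeric option N.
configs = [{"A": a, "N": n} for a in (0, 1) for n in (0, 1, 2, 3)]
measured = [5 + 3 * c["A"] + 2 * c["A"] * c["N"] for c in configs]
model, coefs, error = learn(["A", "N"], configs, measured)
```

On this toy data the sketch first adds the individual option `A` and then the interaction of `A` with `N`, which mirrors the progression from simple to more complex terms recorded in the log files.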
System | Synthetic Model | Log-File | Variability Models
AJStats | 11899,3227983062 * root + 2983,87559778446 * CodeFormatter + 10 * Num1 + -1,5 * Num1 * InterTypeDeclare | DL | DL
Apache | 1206,25 * root + 935,625000000001 * KeepAlive + -173,125 * HostnameLookups + -186,875000000001 * AccessLog + 188,90625 * InMemory + -104,375000000001 * ExtendedStatus + 42,499999999999 * FollowSymLinks + 51,2499999999999 * EnableSendfile + 2 * Num1 * Num1 + 3 * Num2 + -2,5 * Num1 * Num2 | DL | DL
BDB C | 0,05 * PAGESIZE + -0,12 * CACHESIZE + 2,4 * HAVE_HASH + 29,88 * HAVE_CRYPTO * HAVE_HASH + 0,35 * DIAGNOSTIC + 0,12 * HAVE_STATISTICS + 7,2 * HAVE_CRYPTO + 0,5 * PAGESIZE * HAVE_CRYPTO + 5,5 * HAVE_CRYPTO * HAVE_VERIFY + 0,12 * HAVE_HASH * HAVE_REPLICATION + 5,63 * HAVE_CRYPTO * HAVE_STATISTICS + 0,26 * HAVE_CRYPTO * HAVE_HASH * HAVE_SEQUENCE + 8,04 * HAVE_HASH * HAVE_VERIFY * HAVE_STATISTICS + -2,37 * HAVE_CRYPTO * HAVE_HASH * HAVE_REPLICATION * HAVE_VERIFY + -0,59 * HAVE_CRYPTO * HAVE_HASH * HAVE_REPLICATION + -0.15 * CACHESIZE * HAVE_CRYPTO + 1 * root | DL | DL
BDB J | 98599,5916666667 * root + 237630,811666667 * Finest + -46072,028333333 * S100MiB + -189466,115 * S100MiB * Finest + 2 * Num1 + 4 * Num2 + 3,5 * Num3 + 1,5 * Num4 | DL | DL
Clasp | 173554,681081086 * root + 318523,818532818 * heuristicUnit + -103411,870761673 * eq + -24600,5000000002 * heuristicVsids + -11816,7857142856 * heuristicVmtf + -33557,8961038976 * heuristic + -95375,3513513509 * heuristicUnit * satPreproYes + 3990,79729729646 * transExt * satPreproYes + -136928,416666666 * eq * heuristicUnit + 12309,4990990994 * eq * satPreproYes + 33925,0833333346 * eq * heuristic + -643,428571428088 * backprop * heuristicVsids + -11876,2857142853 * backprop * heuristicUnit + 1620,24242424222 * eq * backprop + -7205,2500000002 * eq * heuristicBerkmin + -2 * Num1 * Num2 + 10 * Num3 * Num4 | DL | DL
LLVM | 207 * time_passes + 16 * gvn + 16 * licm + 12 * instcombine + 14 * inline + 3,5 * time_passes * Num1 * Num2 + 5,5 * gvn * licm * Num1 * Num1 + -3,7 * instcombine * inline * Num2 | DL | DL
lrzip | 43838 * level + 2218747 * compressionZpaq + 288311 * compressionLrzip + 191662 * compressionBzip2 + 34718 * compressionGzip + 11946 * encryption + 6676 * compression + 3433850 * compressionZpaq * level9 + 836940 * compressionLrzip * level8 + 720098 * compressionLrzip * level7 + 3415670 * compressionZpaq * level8 + 485719 * compressionLrzip * level9 + -1597534 * compressionZpaq * level1 + -1597084 * compressionZpaq * level3 + -1596575 * compressionZpaq * level2 + 111344 * compressionGzip * level9 + 102375 * compressionGzip * level8 + 59973 * compressionGzip * level7 + -129840 * compressionLrzip * level2 + -128920 * compressionLrzip * level1 + 42831 * compressionGzip * level6 + 21313 * compressionGzip * level5 + -55078 * compressionLrzip * level3 + 43656 * compressionLrzip * level6 + -37020 * compressionBzip2 * level1 + 3,5 * Num1 * Num2 + 4 * Num3 + 5 * Num4 * Num4 | DL | DL
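The synthetic models in the table above are sums of terms, each a coefficient multiplied by one or more options. As a minimal sketch of how such a model string could be read and evaluated, the following Python code parses the AJStats model and predicts a value for one configuration. The helper names `parse_model` and `predict` are hypothetical, and the sketch assumes that coefficients use a decimal comma, that `root` is always 1, that binary options are encoded as 0 or 1, and that `Num1 = 4` is a valid setting (the value ranges of the numeric options are not stated on this page).

```python
def parse_model(text):
    """Parse 'c * A * B + d * C' into a list of (coefficient, [option names])."""
    terms = []
    for raw in text.split(" + "):
        factors = [f.strip() for f in raw.split("*")]
        # The table writes coefficients with a decimal comma, e.g. '-1,5'.
        coefficient = float(factors[0].replace(",", "."))
        terms.append((coefficient, factors[1:]))
    return terms

def predict(terms, config):
    """Sum coefficient * product(option values); options absent from config count as 0."""
    total = 0.0
    for coefficient, options in terms:
        value = coefficient
        for name in options:
            value *= config.get(name, 0)
        total += value
    return total

# AJStats model string copied from the table above.
AJSTATS = ("11899,3227983062 * root + 2983,87559778446 * CodeFormatter"
           " + 10 * Num1 + -1,5 * Num1 * InterTypeDeclare")

terms = parse_model(AJSTATS)
# Hypothetical configuration: both binary options enabled, Num1 set to 4.
config = {"root": 1, "CodeFormatter": 1, "InterTypeDeclare": 1, "Num1": 4}
prediction = predict(terms, config)
```

The last term shows how an interaction contributes only when all of its options are active: with `InterTypeDeclare` disabled, the product is 0 and the term drops out of the prediction.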

Binary Sampling Heuristics: Option-Wise: OW; Pair-Wise: PW; Negative Option-Wise: nOW
Numeric Sampling Designs: Plackett-Burman Design: PBD; Hyper Sampling: HS; Central-Composite Design: CCD; Box-Behnken Design: BB
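To make the binary sampling abbreviations concrete, here is a rough sketch of the three heuristics, assuming the usual reading of their names: option-wise enables one option per configuration, pair-wise enables one pair of options per configuration, and negative option-wise enables everything except one option. The function names are hypothetical, and the sketch ignores the constraints of the variability model (mandatory options, exclusions), which a real sampler has to respect. The numeric designs are not covered here.

```python
from itertools import combinations

def option_wise(options):
    """OW: one configuration per option, with only that option enabled."""
    return [frozenset([o]) for o in options]

def pair_wise(options):
    """PW: one configuration per pair of options, with only that pair enabled."""
    return [frozenset(pair) for pair in combinations(options, 2)]

def negative_option_wise(options):
    """nOW: one configuration per option, with every option except that one enabled."""
    return [frozenset(options) - {o} for o in options]

# Three binary options taken from the Apache model above, treated as unconstrained.
options = ["KeepAlive", "HostnameLookups", "AccessLog"]
ow = option_wise(options)
pw = pair_wise(options)
now = negative_option_wise(options)
```

Each configuration is represented as the set of enabled options. For n unconstrained options, OW and nOW yield n configurations each, while PW yields n * (n - 1) / 2.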

Second Study: Real-World Systems


In the second experiment, we use six real-world configurable systems to evaluate measurement effort and prediction error. Again, we provide the log file for learning, the variability model, and also the raw measurement data.

System | Log-File | Variability Models | Raw Measurements
Dune | DL | DL | DL
JavaGC | DL | DL | DL
Hipacc | DL | DL | DL
HSMGP | DL | DL | DL
SaC | DL | DL | DL
x264 | DL | DL | DL
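The prediction error used in the second study can be computed from the raw measurements and the model's predictions. The sketch below assumes that the error is the mean relative error over all measured configurations; this definition, the function name, and the numbers are illustrative assumptions and are not taken from the log files.

```python
def mean_relative_error(measured, predicted):
    """Average of |predicted - measured| / |measured| over all configurations."""
    if not measured or len(measured) != len(predicted):
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - m) / abs(m)
               for m, p in zip(measured, predicted)) / len(measured)

# Hypothetical values for three configurations (not real measurements).
measured = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
error = mean_relative_error(measured, predicted)
```

Multiplying the result by 100 gives the error as a percentage. Measured values of 0 are not handled by this sketch and would need to be filtered out first.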