Rule-based specification mining leveraging learning to rank
Software systems are often released without formal specifications. To deal with the problem of lack of and outdated specifications, rule-based specification mining approaches have been proposed. These approaches analyze execution traces of a system to infer the rules that characterize the protocols, typically of a library, that its clients must obey. Rule-based specification mining approaches work by exploring the search space of all possible rules and use interestingness measures to differentiate specifications from false positives. Previous rule-based specification mining approaches often rely on one or two interestingness measures, while the potential benefit of combining multiple available interestingness measures is not yet investigated. In this work, we propose a learning to rank based approach that automatically learns a good combination of 38 interestingness measures. Our experiments show that the learning to rank based approach outperforms the best performing approach leveraging single interestingness measure by up to 66%.