Genetic programming for stacked generalization

Bakurov, I.; Castelli, M.; Gau, O.; Fontanella, F.; Vanneschi, L.

doi:10.1016/j.swevo.2021.100913

In machine learning, ensemble techniques are widely used to improve the performance of both classification and regression systems. They combine the models generated by different learning algorithms, typically trained on different data subsets or with different parameters, to obtain more accurate models. Ensemble strategies range from simple voting rules to more complex and effective stacked approaches. They are based on adopting a meta-learner, i.e. a further learning algorithm, and are trained on the predictions provided by the single algorithms making up the ensemble. The paper aims at exploiting some of the most recent genetic programming advances in the context of stacked generalization. In particular, we investigate how the evolutionary demes despeciation initialization technique, ϵ-lexicase selection, geometric-semantic operators, and semantic stopping criterion, can be effectively used to improve GP-based systems’ performance for stacked generalization (a.k.a. stacking). The experiments, performed on a broad set of synthetic and real-world regression problems, confirm the effectiveness of the proposed approach.