In machine learning, ensemble techniques are widely used to improve the performance of both classification and regression systems. They combine the models generated by different learning algorithms, typically trained on different data subsets or with different parameters, to obtain more accurate models. Ensemble strategies range from simple voting rules to more complex and effective stacked approaches. They are based on adopting a meta-learner, i.e. a further learning algorithm, and are trained on the predictions provided by the single algorithms making up the ensemble. The paper aims at exploiting some of the most recent genetic programming advances in the context of stacked generalization. In particular, we investigate how the evolutionary demes despeciation initialization technique, ϵ-lexicase selection, geometric-semantic operators, and semantic stopping criterion, can be effectively used to improve GP-based systems’ performance for stacked generalization (a.k.a. stacking). The experiments, performed on a broad set of synthetic and real-world regression problems, confirm the effectiveness of the proposed approach.

Genetic programming for stacked generalization

Fontanella F.;
2021-01-01

Abstract

In machine learning, ensemble techniques are widely used to improve the performance of both classification and regression systems. They combine the models generated by different learning algorithms, typically trained on different data subsets or with different parameters, to obtain more accurate models. Ensemble strategies range from simple voting rules to more complex and effective stacked approaches. They are based on adopting a meta-learner, i.e. a further learning algorithm, and are trained on the predictions provided by the single algorithms making up the ensemble. The paper aims at exploiting some of the most recent genetic programming advances in the context of stacked generalization. In particular, we investigate how the evolutionary demes despeciation initialization technique, ϵ-lexicase selection, geometric-semantic operators, and semantic stopping criterion, can be effectively used to improve GP-based systems’ performance for stacked generalization (a.k.a. stacking). The experiments, performed on a broad set of synthetic and real-world regression problems, confirm the effectiveness of the proposed approach.
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11580/88351
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 12
social impact