Improved genomic prediction performance with ensembles of diverse models

Tomura S, Wilkinson MJ, Cooper M and Powell O

G3 Genes|Genomes|Genetics
https://doi.org/10.1093/g3journal/jkaf048

Abstract

The improvement of selection accuracy of genomic prediction is a key factor in accelerating genetic gain for crop breeding. Traditionally, efforts have focused on developing superior individual genomic prediction models. However, this approach has limitations due to the absence of a consistently “best” individual genomic prediction model, as suggested by the No Free Lunch Theorem. The No Free Lunch Theorem states that the performance of an individual prediction model is expected to be equivalent to the others when averaged across all prediction scenarios. To address this, we explored an alternative method: combining multiple genomic prediction models into an ensemble. The investigation of ensembles of prediction models is motivated by the Diversity Prediction Theorem, which indicates the prediction error of the many-model ensemble should be less than the average error of the individual models due to the diversity of predictions among the individual models. To investigate the implications of the No Free Lunch and Diversity Prediction Theorems, we developed a naïve ensemble-average model, which equally weights the predicted phenotypes of individual models. We evaluated this model using 2 traits influencing crop yield—days to anthesis and tiller number per plant—in the teosinte nested association mapping dataset. The results show that the ensemble approach increased prediction accuracies and reduced prediction errors over individual genomic prediction models. The advantage of the ensemble was derived from the diverse predictions among the individual models, suggesting the ensemble captures a more comprehensive view of the genomic architecture of these complex traits. These results are in accordance with the expectations of the Diversity Prediction Theorem and suggest that ensemble approaches can enhance genomic prediction performance and accelerate genetic gain in crop breeding programs.

TOP