Ensemble AnalySis with Interpretable Genomic Prediction (EasiGP): Computational tool for interpreting ensembles of genomic prediction models
Abstract
Ensembles of multiple genomic prediction models have grown in popularity because they consistently improve prediction performance in crop breeding. However, tools that analyze the predictive behavior of such ensembles at the genome level are lacking. Here, we develop a computational tool called Ensemble AnalySis with Interpretable Genomic Prediction (EasiGP) that uses circos plots to visualize how different genomic prediction models quantify the contributions of marker effects to trait phenotypes. As a demonstration of EasiGP, multiple genomic prediction models, spanning conventional statistical and machine learning algorithms, were used to infer the genetic architecture of days to anthesis (DTA) in a maize mapping population. The results indicate that genomic prediction models can capture different views of trait genetic architecture, even when their overall prediction accuracies are similar. Combining these diverse views of the genetic architecture of DTA in the teosinte nested association mapping study might explain the improved prediction performance achieved by ensembles, consistent with the Diversity Prediction Theorem. In addition to identifying well-known genomic regions contributing to the genetic architecture of DTA in maize, the ensemble of genomic prediction models highlighted several genomic regions that have not previously been reported for DTA. Finally, different views of trait genetic architecture were observed across subpopulations, highlighting challenges for between-population genomic prediction. The enhanced interpretability of genomic prediction models provided by EasiGP can reveal critical genome-level findings from the inferred genetic architecture and provide insights for improving genomic prediction in crop breeding programs.
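The Diversity Prediction Theorem referred to above states that, for an equally weighted ensemble, the squared error of the ensemble mean equals the average squared error of the individual models minus the variance of their predictions around the ensemble mean. The following is a minimal numerical sketch of that identity for a single observation. It is not part of EasiGP; the function name `diversity_decomposition` and the three DTA predictions are hypothetical values chosen only for illustration.

```python
from statistics import fmean


def diversity_decomposition(predictions, truth):
    """Return (ensemble_error, mean_model_error, diversity) for one observation.

    Assumes an equally weighted ensemble, i.e. the ensemble prediction is the
    plain mean of the individual model predictions.
    """
    ensemble = fmean(predictions)
    # Squared error of the ensemble-mean prediction
    ensemble_error = (ensemble - truth) ** 2
    # Average squared error of the individual models
    mean_model_error = fmean([(p - truth) ** 2 for p in predictions])
    # Spread of the individual predictions around the ensemble mean
    diversity = fmean([(p - ensemble) ** 2 for p in predictions])
    return ensemble_error, mean_model_error, diversity


# Hypothetical DTA predictions (in days) from three models for one line,
# and a hypothetical observed value; none of these come from the study.
preds = [70.0, 74.0, 78.0]
truth = 73.0

ens_err, avg_err, div = diversity_decomposition(preds, truth)

# The identity: ensemble error = average model error - prediction diversity
print(ens_err, avg_err - div)
```

Because the diversity term is never negative, the ensemble error cannot exceed the average error of its members, and the gap widens as the models disagree more. Under this reading, models that weight genomic regions differently would be expected to contribute more to an ensemble than models that agree closely.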

