241-3 Genomic Selection Using Data From Long-Term International Wheat Trials: Dealing with GxE.

Poster Number 402

See more from this Division: C01 Crop Breeding & Genetics
See more from this Session: Use of Molecular Tools to Enhance Breeding Efforts
Tuesday, October 23, 2012
Duke Energy Convention Center, Exhibit Hall AB, Level 1

Julie C. Dawson1, Jeffrey Endelman2, Jose Crossa3, Jesse Poland4, Susanne Dreisigacker3, Yann Manes3, Jean-Luc Jannink2 and Mark E. Sorrells1
(1)Dept of Plant Breeding, Cornell University, Ithaca, NY
(2)USDA-ARS, Robert W. Holley Center for Agriculture and Health, Ithaca, NY
(3)International Maize and Wheat Improvement Center (CIMMYT), Mexico DF, Mexico
(4)Hard Winter Wheat Genetics Research Unit, USDA-ARS, Manhattan, KS
Genomic selection offers breeders the possibility of using historic data and unbalanced breeding trials to form training populations for predicting the performance of new lines. However, in using datasets that are unbalanced over time and space, there is increasing exposure to particular genotype-environment combinations and interactions that may make predictions less accurate. It is unclear if there is a limit to how unbalanced a dataset can be, and this question is related to how much specific genotype-by-environment interactions affect predictions. Global cross-validated genomic estimated breeding value (GEBV) accuracies may be high when using large historic datasets, but accuracies for individual years using a forward-prediction approach, or for individual environments, are often much lower. Using CIMMYT's international semi-arid wheat yield trials (SAWYT), we examined the accuracy of predictions of breeding values of new lines in international nurseries over time, and the potential to subset these nurseries using CIMMYT's concept of megaenvironments adapted to a genomic selection context. The SAWYT has 17 years (cycles) of data from approximately 50 locations per year, with 50 genotypes per cycle and very little overlap of genotypes across years. A total of 622 genotypes were included in the complete 17-cycle dataset. To assess the accuracy of forward prediction, models were trained using phenotypic data for grain yield from all environments in the first three cycles and used to predict grain yield in the fourth cycle, then trained on the first four cycles and used to predict the fifth cycle, continuing until the training dataset included the first 16 cycles to predict the 17th. We found the accuracy of forward prediction to be highly variable across cycles, ranging from 0 to 0.65.
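The forward-prediction scheme described above amounts to an expanding training window over cycles. The following is a minimal sketch of how the training and target cycles could be enumerated; the function name `forward_prediction_splits` and its parameters are hypothetical and are not taken from the authors' analysis, and the sketch only lists the splits without fitting any prediction model.

```python
def forward_prediction_splits(n_cycles=17, first_train_size=3):
    """Enumerate (training cycles, target cycle) pairs for forward prediction.

    Cycles are numbered 1..n_cycles. The first split trains on cycles
    1..first_train_size and predicts the next cycle; each later split
    adds one more cycle to the training set.
    """
    splits = []
    for target in range(first_train_size + 1, n_cycles + 1):
        train = list(range(1, target))  # all cycles before the target
        splits.append((train, target))
    return splits


splits = forward_prediction_splits()
for train, target in splits:
    # A model would be trained on `train` and evaluated on `target` here.
    print(f"train on cycles {train[0]}-{train[-1]}, predict cycle {target}")
```

With 17 cycles and an initial training set of three cycles, this gives 14 train/predict pairs, one for each of cycles 4 through 17.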
In part because of the unpredictable effects of the year on the accuracy of predictions, we investigated whether it was possible to subdivide the international trial data into megaenvironments where GEBV predictions for each megaenvironment would be more accurate than a global GEBV prediction. The last cycle of data was used for validation. Three models were tested using ASReml: a global model using data from all megaenvironments that produced one global GEBV per genotype; a megaenvironment-specific model using only data from the target megaenvironment to generate predictions for each genotype in that megaenvironment; and a factor analytic model that used all data and the estimated correlations among megaenvironments to generate predictions for each genotype in each megaenvironment.
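The three models differ mainly in which records they use and whether megaenvironment labels are retained. The sketch below illustrates only that data-selection step, under assumptions: the `Record` type, the `training_records` function, the model names, and the toy values are all hypothetical, and no mixed model is fitted (the abstract's models were fitted in ASReml).

```python
from typing import NamedTuple


class Record(NamedTuple):
    genotype: str
    cycle: int
    megaenv: str
    grain_yield: float


def training_records(records, model, target_me, validation_cycle):
    """Select the records one model would train on for a target megaenvironment."""
    # The validation cycle is held out of training for every model.
    pool = [r for r in records if r.cycle != validation_cycle]
    if model == "global":
        # All data pooled; megaenvironment labels are ignored.
        return [r._replace(megaenv="ALL") for r in pool]
    if model == "me_specific":
        # Only data from the target megaenvironment.
        return [r for r in pool if r.megaenv == target_me]
    if model == "factor_analytic":
        # All data, with labels kept so that correlations among
        # megaenvironments can be estimated by the model.
        return pool
    raise ValueError(f"unknown model: {model}")


# Toy records with made-up values, for illustration only.
toy = [
    Record("G1", 1, "ME1", 3.2),
    Record("G1", 1, "ME4", 2.1),
    Record("G2", 2, "ME1", 3.5),
    Record("G3", 3, "ME4", 2.4),
]
global_set = training_records(toy, "global", "ME4", validation_cycle=3)
specific_set = training_records(toy, "me_specific", "ME4", validation_cycle=3)
fa_set = training_records(toy, "factor_analytic", "ME4", validation_cycle=3)
```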