Maize is one of the most valuable global crops and is used for human consumption, for livestock feed, and in many industrial and chemical products. For years, researchers have tried using molecular markers for maize improvement but have fallen short of this goal, mainly due to maize’s high genomic complexity and the genetic complexity of most traits.
Maize geneticists were studying one gene at a time, but when these single genes were implemented in a breeding program, the expected phenotypic improvements rarely materialized outside of isolated cases. About 20 years ago, however, plant researchers embraced the whole-genome approach to marker development that was first developed and deployed in animal breeding. Because it relies on the entire genome, genomic selection became a mainstay of maize and other crop breeding programs; this is especially valuable for maize, whose genome is characterized by its large size, abundant repetitive sequences, numerous duplicated chromosomal regions, and high structural variation.
Genomic selection is the practice of genotyping many individual plants across their genomes while measuring their phenotypic performance. These data are used to build a statistical model that predicts the performance of new individuals from markers spanning the whole genome, rather than from any single gene. When using genomic selection, researchers do not even need to know the specific genes or their functions, yet genomic predictions have proven more successful for improving complex traits than any previous approach.
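The idea of predicting performance from markers across the whole genome can be sketched with a simple whole-genome regression. The example below is a minimal illustration using simulated genotypes and ridge regression (a common shrinkage approach related to GBLUP); all data, sizes, and parameters are hypothetical and not specific to any actual breeding program or to NRGene’s software.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_new, n_markers = 200, 20, 1000

# Simulated training population: biallelic markers coded as 0/1/2 allele dosages
X_train = rng.integers(0, 3, size=(n_train, n_markers)).astype(float)

# Complex trait: many small effects spread across the genome, plus noise
true_effects = rng.normal(0.0, 0.05, n_markers)
y_train = X_train @ true_effects + rng.normal(0.0, 1.0, n_train)

# Ridge regression: estimate all marker effects jointly, shrinking each toward zero
lam = 10.0
beta = np.linalg.solve(
    X_train.T @ X_train + lam * np.eye(n_markers),
    X_train.T @ y_train,
)

# Predict the performance of new, untested individuals from genotypes alone
X_new = rng.integers(0, 3, size=(n_new, n_markers)).astype(float)
predicted = X_new @ beta
```

No single marker is decisive here: the prediction sums thousands of small estimated effects, which is what lets genomic selection handle complex traits without knowing the underlying genes.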
The figure illustrates the significant growth in maize production following advances in research and biotechnology. Source: https://vitalbypoet.com/stories/reliving-the-80s-ag-crisis
Maize’s popularity and high demand have led to large R&D infrastructures for breeding pipelines and strong motivation to apply the most cutting-edge technologies, including genotyping. However, even for large companies and institutions, genotyping maize at the scale required for genomic prediction can be costly. Other factors driving the need for efficient breeding programs include limited experimental field space and unpredictable environmental conditions, which can leave collected field data skewed or missing. Accordingly, the shift toward computational tools is becoming increasingly important.
To save costs and time, a company could test fewer progenies or reduce the size of its marker set. Neither option is ideal: both can damage breeding pipelines, either by shrinking the breeding program or by overlooking important data. Maize researchers and companies want to reduce their costs without compromising the size of their progenies or the amount of data and knowledge generated.
Imputation answers this problem: it optimizes a genotyping strategy by maximizing the information generated from a minimal marker set. Imputation can assist any genotyping application because it relies on a specific breeding program’s genetic background to define a small, accurate subset of markers that serves as a skeleton for imputing a much larger marker dataset. NRGene’s genotyping solution, SNPer™, does just that: it reduces genotyping costs by minimizing the number of markers directly genotyped in the lab while delivering maximum marker data through computational algorithms. Because genotyping costs scale with the number of markers and imputation costs do not, substituting imputation for direct genotyping can greatly reduce the cost of acquiring genotypic data.
It is not unusual to accurately impute ten times as many markers as were genotyped. For example, without marker optimization and data imputation, one might need to genotype 5,000 or more loci in a breeding population to obtain 5,000 loci that are non-redundant and informative for genomic selection. With imputation, however, one might genotype 500 loci and accurately impute the remaining 4,500. The resulting dataset is nearly identical to one obtained by directly genotyping all 5,000 loci, but the lab costs are greatly reduced.
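The 500-to-5,000 example above can be illustrated with a toy imputation scheme: genotype only a skeleton of loci, match each new progeny against a reference panel of fully genotyped lines from the program, and fill in the remaining loci from the closest match. This is a deliberately simplified sketch with simulated data; real imputation algorithms (and SNPer™ specifically) are more sophisticated, and every name and number here is illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n_ref, n_loci, n_skeleton = 50, 5000, 500

# Reference panel: fully genotyped lines from the breeding program (simulated)
reference = rng.integers(0, 3, size=(n_ref, n_loci))

# Skeleton: the small subset of loci actually genotyped in the lab
skeleton = rng.choice(n_loci, size=n_skeleton, replace=False)

# For simplicity, the new progeny equals reference line 7; we observe only
# its skeleton loci, with a handful of simulated genotyping errors.
truth = reference[7].copy()
observed = truth[skeleton].copy()
err_idx = rng.choice(n_skeleton, size=5, replace=False)
observed[err_idx] = (observed[err_idx] + 1) % 3

# Impute: score every reference line by skeleton agreement, then fill the
# remaining 4,500 loci from the best-matching line.
matches = (reference[:, skeleton] == observed).mean(axis=1)
best = int(np.argmax(matches))
imputed = reference[best].copy()
imputed[skeleton] = observed

accuracy = (imputed == truth).mean()
```

Even with a few errors in the skeleton data, the closest reference line is identified unambiguously, and nearly all 5,000 loci are recovered from just 500 genotyped ones.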
Quality control of genotypic data is an important step in the imputation process. Erroneous data can be detected and corrected, and progenies whose genotypic data are inconsistent with pedigree records can be detected and discarded. The imputed data can then be used to make genomic predictions on new progenies just as if the data came from direct genotyping in the lab. These predictions are made prior to field testing, allowing breeders to select for field testing only those individuals expected to exhibit superior performance. Taken together, marker optimization and data imputation allow breeding programs to reduce the costs of genotyping, make accurate genomic predictions, use field resources efficiently, and improve year-to-year gains.
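One common pedigree-consistency check counts Mendelian errors: loci where a progeny’s genotype is impossible given its recorded parents. Below is a minimal sketch of such a check using 0/1/2 allele-dosage coding; the function name, data, and discard threshold are all hypothetical, not a description of any particular QC pipeline.

```python
import numpy as np

def mendelian_error_rate(parent1, parent2, progeny):
    """Fraction of loci where the progeny genotype is impossible given
    the recorded parents (genotypes coded as 0/1/2 allele dosages)."""
    impossible = (
        # Opposite-homozygote rule: a parent homozygous for one allele
        # must transmit it, so the progeny cannot be homozygous for the other.
        ((progeny == 0) & ((parent1 == 2) | (parent2 == 2)))
        | ((progeny == 2) & ((parent1 == 0) | (parent2 == 0)))
        # Two identical homozygous parents cannot produce a heterozygote.
        | ((progeny == 1) & (parent1 == 0) & (parent2 == 0))
        | ((progeny == 1) & (parent1 == 2) & (parent2 == 2))
    )
    return impossible.mean()

# Toy example: five loci, one consistent progeny and one mislabeled sample
p1 = np.array([0, 0, 2, 2, 1])
p2 = np.array([0, 2, 2, 0, 1])
good_child = np.array([0, 1, 2, 1, 0])
bad_child = np.array([2, 0, 0, 1, 0])

good_rate = mendelian_error_rate(p1, p2, good_child)
bad_rate = mendelian_error_rate(p1, p2, bad_child)
```

A progeny whose error rate far exceeds the platform’s expected genotyping error rate (say, above a few percent) is a candidate for discarding as a pedigree or sample-handling mistake.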
Understanding the full picture is important, but doing so at the lowest possible cost is essential. Whether to catch mistakes in advance or to develop markers for accurate breeding predictions, companies have adopted marker optimization with low-cost genomic imputation to support their genotyping programs.