The Genomic Ideotype: Using pan-genomics to create more vigorous crops

Project Summary

Though “re-sequencing” hundreds of genomes now seems routine, these sequences are always characterized relative to a single individual, the “reference” genome. Unfortunately, any one reference genome is missing many genes (~10,000 in rice, for example; Zhao et al., 2018). This missing information substantially limits our ability to evaluate gene-content variation, and thus the deleterious load, of an individual. True pan-genomic data, not limited by a single reference, are now on the horizon thanks to innovations in long-read sequencing technologies. However, the bioinformatic tools needed to fully explore these pan-genomes have not kept pace with sequencing capacity. Scalable pan-genomic computational frameworks are being developed in the human genetics field. Though these frameworks are powerful, it is still unclear how to integrate phylogenetic and functional information into their data structures, particularly for crops. Such advancements would allow breeders and geneticists to apply pan-genomic information toward tangible gains in crop performance.

Relevant publications

Zhao Q, Feng Q, Lu H, Li Y, Wang A, Tian Q, et al. Pan-genome analysis highlights the extent of genomic variation in cultivated and wild rice. Nature Genetics. 2018;50: 278.

Collaborators

Brian Scheffler, USDA-ARS
