The Genomic Ideotype: Using pan-genomics to create more vigorous crops

Project Summary

Though “re-sequencing” hundreds of genomes now seems routine, these sequences are always characterized relative to a single individual, or “reference,” genome. Unfortunately, a single reference genome is missing many genes: ~10,000 in rice, for example (Zhao et al., 2018). This missing information substantially limits how well we can evaluate gene-content variation, and thus the deleterious load, of an individual. True pan-genomic data, not limited by a single reference, are now on the horizon thanks to innovations in long-read sequencing technologies. However, the bioinformatic tools needed to fully explore these pan-genomes have not kept pace with sequencing capacity. Scalable pan-genomic computational frameworks are being developed in the human genetics field. Though these frameworks are powerful, it is still unclear how to integrate phylogenetic and functional information into their data structures, particularly for crops. Solving this integration problem will allow breeders and geneticists to apply pan-genomic information toward tangible gains in crop performance.

Relevant publications

Vaughn, J. N. et al. Graph-based pangenomics maximizes genotyping density and reveals structural impacts on fungal resistance in melon. Nat Commun 13, 7897 (2022).

Vaughn, J. N. et al. Gene disruption by structural mutations drives selection in US rice breeding over the last century. PLOS Genetics 17, e1009389 (2021).

Collaborators

Brian Scheffler, USDA-ARS
