The Genomic Ideotype: Using pan-genomics to create more vigorous crops
Project Summary
Though “re-sequencing” hundreds of genomes now seems routine, these sequences are always characterized relative to a single individual, the “reference” genome. Unfortunately, any single reference genome is missing many genes: ~10,000 in rice, for example (Zhao et al., 2018). This missing information substantially limits our ability to evaluate gene-content variation, and thus the deleterious load, of an individual. True pan-genomic data, not limited by a single reference, are now on the horizon thanks to innovations in long-read sequencing technologies. However, the bioinformatic tools needed to fully explore these pan-genomes have not kept pace with sequencing capacity. Scalable pan-genomic computational frameworks are being developed in the human genetics field. Though these frameworks are powerful, it is still unclear how to integrate phylogenetic and functional information into their data structures, particularly for crops. Advances on these fronts will allow breeders and geneticists to apply pan-genomic information toward tangible gains in crop performance.
Relevant publications
Collaborators
Brian Scheffler, USDA-ARS