Sequencing the USDA core soybean collection reveals gene loss during domestication and breeding
Bayer, P., Hu, R., Valliyodan, B., Marsh, J., Yuan, A., Vuong, T., Patil, G., Song, Q., Batley, J., Varshney, R., Lam, H. M., Edwards, D., Nguyen, H. Sequencing the USDA core soybean collection reveals gene loss during domestication and breeding. Plant Genome. 2021 June; e20109. doi: 10.1002/tpg2.20109. Epub ahead of print. PMID: 34169673.
The objective of this project was to assemble a soybean pangenome representing more than 1,000 soybean accessions derived from the USDA Soybean Germplasm Collection, including both wild and cultivated lineages, to assess genome-wide changes in gene and allele frequency during domestication and breeding. We identified 3,765 genes that are absent from the Lee reference genome assembly and assessed the presence/absence of all genes across this population. The pangenome is presented here together with the presence/absence state of each gene in this population and SNPs between individuals.
Files available for download include the variant data in VCF (Variant Call Format) for the more than 1,000 accessions from the USDA Soybean Germplasm Collection, a presence/absence matrix for gene models present in or absent from the Lee assembly, and protein sequences unique to each accession.
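As a rough illustration of how a gene presence/absence matrix might be summarized, the Python sketch below computes the fraction of accessions that carry each gene and lists the genes missing from at least one accession. The file layout is an assumption, not a description of the downloadable file: the sketch expects a tab-separated table with a header row, gene IDs in the first column, one column per accession, and calls coded as 1 (present) or 0 (absent). The gene and accession names are made up, and the function name presence_frequency is hypothetical.

```python
import csv
import io


def presence_frequency(handle):
    """Return {gene_id: fraction of accessions in which the gene is present}.

    Assumed layout (not confirmed by the data set): tab-separated, header row,
    gene ID in column 1, one column per accession, calls coded "1" or "0".
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)            # first field is the gene column label
    n_accessions = len(header) - 1   # remaining fields are accession names
    freqs = {}
    for row in reader:
        if not row:                  # skip blank lines
            continue
        gene, calls = row[0], row[1:]
        present = sum(1 for call in calls if call == "1")
        freqs[gene] = present / n_accessions
    return freqs


# Toy matrix with invented gene and accession names, for illustration only.
toy = (
    "gene\tacc_A\tacc_B\tacc_C\tacc_D\n"
    "geneX\t1\t1\t1\t1\n"
    "geneY\t1\t0\t0\t1\n"
    "geneZ\t0\t0\t0\t0\n"
)

freqs = presence_frequency(io.StringIO(toy))

# Genes absent from at least one accession (the "variable" part of a pangenome).
variable = sorted(gene for gene, freq in freqs.items() if freq < 1.0)

for gene, freq in sorted(freqs.items()):
    print(f"{gene}\t{freq:.2f}")
```

To run this against a real file, replace the io.StringIO object with an open file handle, and adjust the delimiter and the presence coding if the actual matrix uses a different layout.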
Funded by the USDA-ARS. Developed by the USDA-ARS SoyBase and Legume Clade Database group at Iowa State University, Ames, IA.