SoyBase
SoyBase transitions to NEW site on 10/1/2024
Integrating Genetics and Genomics to Advance Soybean Research

Download Data

SoyBase Data

Sequence Downloads

Pan Glycine Sequences

FASTA files of genomic, gene model and protein sequences from Glycine cultivars assembled at the Data Store.
Glycine max Downloads
Wm82.a1 Williams 82 Download Genome

Download CDS

Download Protein

Download GFF file

Wm82.a2 Williams 82 Download Genome

Download CDS

Download Protein

Download GFF file

Wm82.a4 Williams 82 Download Genome

Download CDS

Download Protein

Download GFF file

Wm82.a5 Williams 82 Download Genome

Download CDS

Download Protein

Download GFF file

Wm82.a6 (ISU01) Williams 82 Download Genome

Download CDS

Download Protein

Download GFF file

Wm82_IGA1008 Williams 82 Download Genome

Download CDS

Download Protein

Download GFF file

Wm82_NJAU Williams 82 Download Genome

Download CDS

Download Protein

Download GFF file

Lee.a1 Lee Download Genome

Download CDS

Download Protein

Download GFF file

Lee.a2 Lee Download Genome

Download CDS

Download Protein

Download GFF file

Lee.a3 Lee Download Genome

Download CDS

Download Protein

Download GFF file

FiskebyIII.a1 FiskebyIII Download Genome

Download CDS

Download Protein

Download GFF file

Amsoy.a1 Amsoy Download Genome

Download CDS

Download Protein

Download GFF file

PI_398296.a1 PI 398296, KAS 173-3 Download Genome

Download CDS

Download Protein

Download GFF file

PI_548362.a1 PI 548362, Lincoln Download Genome

Download CDS

Download Protein

Download GFF file

ZH13.a1 Zhonghuang 13 Download Genome

Download CDS

Download Protein

Download GFF file

ZH13.a2 Zhonghuang 13 Download Genome

Download CDS

Download Protein

Download GFF file

ZH13_IGA1005.a1 Zhonghuang 13 Download Genome

Download CDS

Download Protein

Download GFF file

ZH35_IGA1004.a1 Zhonghuang 35 Download Genome

Download CDS

Download Protein

Download GFF file

Glycine soja Downloads
PI483463.a1 PI 483463 Download Genome

Download CDS

Download Protein

Download GFF File

W05.a1 W05 Download Genome

Download CDS

Download Protein

Download GFF File

PI549046.a1 PI 549046, ZYD 3728 Download Genome

Download CDS

Download Protein

Download GFF File

PI562565.a1 PI 562565, KF13 Download Genome

Download CDS

Download Protein

Download GFF File

PI578357.a1 PI 578357, KZ-6352/91 Download Genome

Download CDS

Download Protein

Download GFF File
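
Once one of the FASTA files listed above has been downloaded, its records can be read with a few lines of Python. The following is a minimal sketch, assuming plain (uncompressed) FASTA text; the function name `read_fasta` and the two example records are hypothetical and do not come from any SoyBase file.

```python
from typing import Dict, Iterable, List, Optional


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA text into {record_id: sequence}.

    The record ID is taken to be the first whitespace-delimited
    token after '>' on each header line.
    """
    records: Dict[str, str] = {}
    current_id: Optional[str] = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if current_id is not None:
                records[current_id] = "".join(chunks)
            header = line[1:].split()
            current_id = header[0] if header else ""
            chunks = []
        elif current_id is not None:
            chunks.append(line)
    if current_id is not None:
        records[current_id] = "".join(chunks)
    return records


# Hypothetical two-record example; a real file would be read with
# open(path) instead of an in-memory string.
example = """>seq1 example description
ATGGCC
TTAA
>seq2
MKV
""".splitlines()

sequences = read_fasta(example)
print({name: len(seq) for name, seq in sequences.items()})
```

If a downloaded file turns out to be gzip-compressed, passing `gzip.open(path, "rt")` to the same function should work, since it also yields text lines.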

Genetic Map

Download genetic map coordinates for selected features

Data are available for the Composite Genetic Map or the Consensus Genetic Map.
First choose the Linkage Group. Then choose a genetic map feature type.
Enter the range of interest in cM. Use a range of 0-9999 to retrieve all features for a linkage group.
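
The same range selection can be reproduced offline on a downloaded coordinate table. The sketch below is only an illustration: the `MapFeature` record, the feature names, the linkage group labels, and the assumption that both range ends are inclusive are hypothetical and are not taken from the SoyBase tool.

```python
from typing import List, NamedTuple


class MapFeature(NamedTuple):
    name: str            # hypothetical feature name
    linkage_group: str   # hypothetical linkage group label
    position_cm: float   # genetic map position in cM


def features_in_range(
    features: List[MapFeature],
    linkage_group: str,
    start_cm: float = 0.0,
    end_cm: float = 9999.0,
) -> List[MapFeature]:
    """Return features on one linkage group with start_cm <= position <= end_cm.

    The defaults mirror the 0-9999 range that retrieves every feature
    on a linkage group.
    """
    return [
        f
        for f in features
        if f.linkage_group == linkage_group
        and start_cm <= f.position_cm <= end_cm
    ]


# Made-up data for illustration only.
features = [
    MapFeature("marker_A", "LG_1", 12.5),
    MapFeature("marker_B", "LG_1", 48.0),
    MapFeature("marker_C", "LG_2", 30.1),
]

subset = features_in_range(features, "LG_1", 10, 20)
everything = features_in_range(features, "LG_1")
print([f.name for f in subset], [f.name for f in everything])
```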

Download sequences for genetic loci

Genome Sequence

Download sequences from SoyBase BLAST target databases

SoyBase provides a number of specialized databases used with BLAST sequence similarity searching. Their sequences can be downloaded in FASTA format here.

Glyma 1.1 to Glyma2.0 Correspondence Lookup

Gene Model Version Glyma 1.1 to Glyma2.0 Correspondence Lookup

Download genome sequence coordinates for selected features

Retrieve Gene Model positions by Gene Model Name

Use this tool to retrieve the sequence coordinates for a list of soybean gene calls. A table containing the chromosome and the beginning and end positions (in bp on the chromosome) for each gene call will be returned along with a text file that you can download to your computer from the "Download Results" button.
First upload a file of gene call names as seen on the genome sequence.


Download genome sequence coordinates for selected features by chromosome

Use this tool to retrieve the sequence coordinates for all of the markers or gene calls on a single chromosome or for the whole genome. Currently only gene calls or molecular markers are available. A table containing the chromosome and the beginning and end positions (in bp on the chromosome) will be returned.
First choose the assembly you are interested in. Next choose the chromosome you are interested in. Then choose the type of feature you want to retrieve. The result of the search will be a text file that can be downloaded when the "Download Results" button appears.

The "Retrieve All" button will collect the information for the chosen feature type for all chromosomes.

Download a list of names and sequence coordinates for gene models or markers in a chromosomal region

Retrieve a list of names and sequence coordinates for gene models or markers in a chromosomal region.

Download longest transcript or predicted protein sequence for gene calls

Use this tool to submit a list of one or more gene models. A table containing the chromosome and the beginning and end positions (in bp on the chromosome) will be returned.
First upload a file of gene calls as seen on the genome sequence. Then choose the sequence type desired. The result of the search will be a text file available for download to your computer.


Nucleic Acid Sequence: Download list of sequences as a FASTA file of nucleic acid sequences
Protein Sequence: Download list of sequences as a FASTA file of protein sequences

The "Retrieve All" button will retrieve the gene call sequences for all chromosomes. (Please select a Sequence Type first).

Download annotations for selected gene calls

Gene Annotation Download
This tool accepts a list of Wm82 gene model names (e.g., Glyma12g10780) and returns functional, biological process, cellular compartment and other annotations for each gene in the list.
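
Before submitting a long gene list, it can help to screen it for malformed names. The sketch below assumes names follow the pattern of the example above ("Glyma", a two-digit chromosome number, "g", and a five-digit number); that pattern is inferred from the single example, other assemblies may use different naming, and the helper `check_gene_names` is hypothetical.

```python
import re
from typing import Iterable, List, Tuple

# Assumed pattern, inferred from the example name Glyma12g10780.
GENE_NAME = re.compile(r"Glyma(\d{2})g(\d{5})")


def check_gene_names(names: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split a list of names into (valid, rejected).

    Surrounding whitespace is stripped and empty entries are dropped.
    """
    valid: List[str] = []
    rejected: List[str] = []
    for raw in names:
        name = raw.strip()
        if not name:
            continue
        if GENE_NAME.fullmatch(name):
            valid.append(name)
        else:
            rejected.append(name)
    return valid, rejected


valid, rejected = check_gene_names(
    ["Glyma12g10780", " Glyma12g10780 ", "glyma12g1078", ""]
)
print(valid, rejected)
```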

Download Gene Model Flanking Sequence

Flanking Sequence Download
This tool accepts a list of Wm82 gene model names (e.g., Glyma12g10780) and returns the 5' and/or 3' flanking sequence.
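
For readers who want to see what a flanking-sequence extraction involves, here is a minimal sketch. It assumes 1-based inclusive plus-strand coordinates, reports flanks in the orientation of the gene, and truncates them at the chromosome ends; these conventions, the function names, and the toy sequence are all assumptions for illustration and are not a description of how the SoyBase tool is implemented.

```python
from typing import Tuple

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def flanking(
    chrom_seq: str, start: int, end: int, strand: str, flank_len: int
) -> Tuple[str, str]:
    """Return (five_prime_flank, three_prime_flank) for a gene.

    start and end are 1-based inclusive plus-strand coordinates.
    For a minus-strand gene the 5' flank lies after `end` on the
    plus strand, so both flanks are reverse-complemented and swapped.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    if flank_len < 0:
        raise ValueError("flank_len must be non-negative")
    if not 1 <= start <= end <= len(chrom_seq):
        raise ValueError("coordinates outside the sequence")

    # Plus-strand slices; max() truncates at the chromosome start,
    # and slicing past the end truncates at the chromosome end.
    before = chrom_seq[max(0, start - 1 - flank_len): start - 1]
    after = chrom_seq[end: end + flank_len]

    if strand == "+":
        return before, after
    return reverse_complement(after), reverse_complement(before)


toy = "GATTACAGGCCT"  # made-up 12 bp "chromosome"
plus_flanks = flanking(toy, 5, 8, "+", 3)
minus_flanks = flanking(toy, 5, 8, "-", 3)
print(plus_flanks, minus_flanks)
```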

Download Gene Model 3' And 5' UTR Sequences

Gene Model 3' And 5' UTR Sequence Download
This tool accepts a list of Wm82 gene model names (e.g., Glyma12g10780) and returns the sequences of the 5' and 3' UTRs of all mRNAs for that gene.

Download SoySNP50K Data

The SoySNP50K iSelect BeadChip has been used to genotype the USDA Soybean Germplasm Collection (Song, Qijian, David L. Hyten, Gaofeng Jia, Charles V. Quigley, Edward W. Fickus, Randall L. Nelson, and Perry B. Cregan. 2015. Fingerprinting soybean germplasm and its utility in genomic research. G3: Genes|Genomes|Genetics 5(10):1999-2006), and the data were generously provided by the authors.

The complete data set for 20,087 G. max and G. soja accessions genotyped with 42,509 SNPs can be downloaded here in either vcf or bcf format for Wm82.a1 and for Wm82.a2.
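
A downloaded vcf file is tab-delimited text and can be inspected without special tools. The sketch below pulls the genotype (GT) call for each sample out of VCF text using the standard column order (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, FORMAT, then one column per sample). The function name `read_genotypes`, the sample names, and the variant records are hypothetical. The bcf files are the binary counterpart of vcf and are not handled by this sketch.

```python
from typing import Dict, Iterable, List


def read_genotypes(lines: Iterable[str]) -> Dict[str, Dict[str, str]]:
    """Return {variant_id: {sample_name: GT string}} from VCF text.

    Variants whose ID column is '.' are keyed as 'CHROM:POS' instead.
    """
    samples: List[str] = []
    genotypes: Dict[str, Dict[str, str]] = {}
    for raw in lines:
        line = raw.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        chrom, pos, variant_id = fields[0], fields[1], fields[2]
        key = variant_id if variant_id != "." else f"{chrom}:{pos}"
        gt_index = fields[8].split(":").index("GT")
        genotypes[key] = {
            sample: value.split(":")[gt_index]
            for sample, value in zip(samples, fields[9:])
        }
    return genotypes


# Made-up VCF content for illustration only.
example_vcf = [
    "##fileformat=VCFv4.2",
    "\t".join(["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER",
               "INFO", "FORMAT", "acc_1", "acc_2"]),
    "\t".join(["Chr01", "1000", "snp_a", "A", "G", ".", "PASS",
               ".", "GT", "0/0", "1/1"]),
    "\t".join(["Chr01", "2000", ".", "C", "T", ".", "PASS",
               ".", "GT", "0/1", "./."]),
]

calls = read_genotypes(example_vcf)
print(calls)
```

With more than 20,000 accessions per line, a real file is better processed line by line as shown here than loaded into memory at once.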

SoySNP50K haplotypes for a user-selected subset of the genotyped cultivars can be downloaded from this page.

Download SNP Position Data

SNP position data can be retrieved in either the Wm82.a1.v1.1 or Wm82.a2.v1 coordinate system. Not all SNPs called in the Wm82.a1.v1.1 coordinate system could be positioned on the Wm82.a2.v1 coordinate system because the flanking sequence was not present in the assembly. Not all published SNPs are available here; only those referred to on SoyBase pages are included.

Get SNP positions based on SNP name

This is a SNP-name-based search service. The SNP name must be the same as that used at SoyBase. The output will be printed to your browser separated by tabs (TSV).
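
Tab-separated output copied from the browser can be loaded with Python's standard `csv` module. The sketch below is a rough illustration: the three-column layout (SNP name, chromosome, position) and the SNP names are assumptions made for the example, since the actual columns returned by the service are not described here.

```python
import csv
import io
from typing import Dict, List, Tuple


def parse_tsv(text: str) -> List[List[str]]:
    """Split tab-separated text into rows of fields, skipping empty rows."""
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    return [row for row in reader if row]


# Made-up output with an assumed layout: name, chromosome, position.
pasted = "snp_a\tChr01\t1000\nsnp_b\tChr02\t2500\n"

rows = parse_tsv(pasted)
positions: Dict[str, Tuple[str, int]] = {
    name: (chrom, int(pos)) for name, chrom, pos in rows
}
print(positions)
```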

Paste a list of SNP names here:


Choose assembly

Wm82.a1 Wm82.a4

Download GWAS QTL Position Data

Download GWAS QTL position data as a text file to your computer.
NCBI G. max sequences
NCBI G. soja sequences
PlantGDB transcript assemblies
Google Scholar soybean literature
PubMed soybean literature

Funded by the USDA-ARS. Developed by the USDA-ARS SoyBase and Legume Clade Database group at Iowa State University, Ames, IA.