Multi-Population Selective Genotyping to Identify Soybean (Glycine max (L.) Merr.) Seed Protein and Oil QTLs
Piyaporn Phansak, Watcharin Soonsuwon, David L. Hyten, Qijian Song, Perry B. Cregan, George L. Graef, and James E. Specht
BioProject ID: PRJNA314872
doi:10.1534/g3.116.027656
Abstract:
Plant breeders continually generate ever-higher yielding cultivars, but also want to improve seed constituent value, which is mainly protein and oil in soybean [Glycine max (L.) Merr.]. Identification of genetic loci governing those two traits would facilitate that effort. Though genome-wide association offers one such approach, selective genotyping of multiple bi-parental populations offers a complementary alternative, and was evaluated here, using 48 F2:3 populations (n = ca. 224 plants) created by mating 48 high protein germplasm accessions to cultivars of similar maturity, but with normal seed protein content. All F2:3 progeny were phenotyped for seed protein and oil, but only 22 high and 22 low extreme progeny in each F2:3 phenotypic distribution were genotyped with a 1536-SNP chip (ca. 450 bi-morphic SNPs detected per mating). A significant QTL on one or more chromosomes was detected for protein in 35 (73%) and for oil in 25 (52%) of the 48 matings, and these QTLs exhibited additive effects of ≥ 4 g kg⁻¹ and R² values of 0.07 or more. These results demonstrated that a multiple-population selective genotyping strategy, when focused on matings between parental phenotype extremes, can be successfully used to identify germplasm accessions possessing large-effect QTL alleles. Such accessions would be of interest to breeders to serve as parental donors of those alleles in cultivar development programs, though 17 of the 48 accessions were not unique in terms of SNP genotype, indicating that diversity amongst high protein accessions in the germplasm collection is less than what might ordinarily be assumed.
Phansak et al. (2016) mated 48 high-protein soybean accessions, spanning seven maturity groups (MGs 000 to IV), each to a high-yielding cultivar of the same MG with ordinary protein content. These matings produced 48 populations of about 220 F2 plants each.
Table relating the Mating Number, Parent Code and GRIN accession number [Download]
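The selective genotyping step described in the abstract (phenotype every progeny, then genotype only the 22 highest and 22 lowest) can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `select_extremes`, the plant labels, and the simulated protein values are all hypothetical and serve only to show the ranking-and-tail-selection idea.

```python
import random


def select_extremes(phenotypes, n_tail=22):
    """Return (low_ids, high_ids): the n_tail lowest and n_tail highest
    individuals, ranked by phenotype value.

    phenotypes: dict mapping an individual's ID to its trait value.
    """
    if len(phenotypes) < 2 * n_tail:
        raise ValueError("population too small for two non-overlapping tails")
    # Sort IDs from the lowest to the highest phenotype value.
    ranked = sorted(phenotypes, key=phenotypes.get)
    return ranked[:n_tail], ranked[-n_tail:]


# Simulated population of 224 progeny (hypothetical IDs and values,
# NOT data from the study).
random.seed(1)
protein = {f"plant{i:03d}": random.gauss(420.0, 15.0) for i in range(1, 225)}

low, high = select_extremes(protein)
print(len(low), len(high))
```

Only the individuals in `low` and `high` would then be sent for genotyping; the remaining progeny keep their phenotype records but have no marker data.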
R/qtl was used for the QTL analysis of selectively genotyped populations. Phansak et al. (2016) have provided the phenotype/genotype data and R/qtl program code for each of the 48 individual populations and the three composite populations (Cmm04, Cm005, C0008). The links below will download these data for one of the individual or multipopulation samples.
Instructions for installing R and R/qtl [Download]
To accommodate the requirements of R/qtl, the 1536 SNPs originally described by Hyten et al. were renamed to a shortened version. As described in Supplemental Table S1 of Phansak et al. (2016), the shortened names consist of an "s" followed by the last 5 digits of the dbSNP ID (i.e., ss#).
File showing SNP name correspondences [Download]
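As a rough illustration of the renaming rule (an "s" followed by the last 5 digits of the dbSNP ss# ID), here is a small Python sketch. The function name `short_snp_name` and the example ID are hypothetical; the correspondence file above remains the authoritative mapping.

```python
import re


def short_snp_name(dbsnp_id):
    """Build a shortened marker name: 's' plus the last five digits
    of a dbSNP submitted-SNP ID such as 'ss<digits>'."""
    match = re.fullmatch(r"(?:ss)?(\d+)", dbsnp_id.strip())
    if match is None:
        raise ValueError(f"not a recognizable ss# ID: {dbsnp_id!r}")
    digits = match.group(1)
    if len(digits) < 5:
        raise ValueError(f"ID has fewer than five digits: {dbsnp_id!r}")
    return "s" + digits[-5:]


# Hypothetical ID used purely for illustration.
print(short_snp_name("ss107912345"))
```

Because only the last five digits are kept, two different IDs could in principle map to the same short name, so any mapping generated this way should be checked against the correspondence file.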
The links below will download a compressed ZIP file containing the *.csv file (phenotype and genotype data) and the matching *.txt file (R/qtl program code) for each of the 48 individual populations or the three composite populations (Cmm04, Cm005, C0008).
Funded by the USDA-ARS. Developed by the USDA-ARS SoyBase and Legume Clade Database group at the Iowa State University, Ames, IA