CottonMD

A Multiomics Database for cotton biological study


About Population

  
  • Sample information

  • The collected accessions show rich genetic diversity and a wide geographical distribution. The 4,180 accessions comprise 3,743 G. hirsutum, 393 G. barbadense, 7 G. tomentosum, 6 G. darwinii, 6 G. mustelinum and 25 other accessions [1-10]. Of these, 2,723 came from Asia (2,619 from China and 104 from other parts of Asia), 726 from North America, 204 from South America, 141 from Europe, 82 from Africa and 36 from Oceania; the origin of the remaining 268 is unknown.
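As a quick consistency check, the species and origin breakdowns quoted above can be tallied against the stated total. This is only an illustrative sketch: the dictionary names are hypothetical and the numbers are copied from the text, not read from the database.

```python
# Accession counts as quoted in the sample information text.
TOTAL = 4180

by_species = {
    "G. hirsutum": 3743,
    "G. barbadense": 393,
    "G. tomentosum": 7,
    "G. darwinii": 6,
    "G. mustelinum": 6,
    "others": 25,
}

by_origin = {
    "China": 2619,
    "Asia (other)": 104,
    "North America": 726,
    "South America": 204,
    "Europe": 141,
    "Africa": 82,
    "Oceania": 36,
    "unknown": 268,
}

# Both breakdowns should account for every accession exactly once.
for name, table in (("species", by_species), ("origin", by_origin)):
    subtotal = sum(table.values())
    print(f"{name}: {subtotal} accessions, matches total: {subtotal == TOTAL}")

# Asia is reported as China plus the rest of Asia.
asia = by_origin["China"] + by_origin["Asia (other)"]
print(f"Asia: {asia} ({asia / TOTAL:.1%} of all accessions)")
```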

      
  • Population structure

  • In this database, the 4,180 cotton accessions are divided into eight groups, named G0-G7, according to their SNP genotypes and origins. G7 (n=400) contains most of the G. barbadense accessions. G0 (n=39) consists of wild G. hirsutum accessions from America. G1 (n=243) consists of G. hirsutum landraces from Central America. G2 (n=317) consists mainly of G. hirsutum landraces from southern China. G3-G6 comprise the cultivated G. hirsutum accessions: most accessions from Northwest China (NWC) and North China (NC) fall into G3 (n=538); G4 (n=795) contains accessions from three historical Chinese cotton planting areas; G5 (n=728) contains accessions from the Yangtze River region (YZR); and G6 (n=1,120) contains accessions from the Yangtze River region (YZR) of China and from the United States.
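The group sizes above can be summarised in a few lines. The sketch below uses only the n values quoted in the text; the variable names are hypothetical, and treating G3-G6 as the cultivated set follows the description above.

```python
# Group sizes (n) as quoted in the population structure text.
group_sizes = {
    "G0": 39,    # wild G. hirsutum
    "G1": 243,   # G. hirsutum landraces
    "G2": 317,   # mainly landraces of southern China
    "G3": 538,   # cultivated
    "G4": 795,   # cultivated
    "G5": 728,   # cultivated
    "G6": 1120,  # cultivated
    "G7": 400,   # most G. barbadense accessions
}
cultivated = ("G3", "G4", "G5", "G6")

total = sum(group_sizes.values())
n_cultivated = sum(group_sizes[g] for g in cultivated)

# Share of the whole collection held by each group.
for group, n in group_sizes.items():
    print(f"{group}: {n:>5} ({n / total:.1%})")

print(f"total: {total}")
print(f"cultivated G3-G6: {n_cultivated} ({n_cultivated / total:.1%})")
```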

      
  • Selective signals

  • To identify genomic regions affected by domestication and selection, genetic diversity (π), Tajima's D, the pairwise fixation statistic (FST) and XP-CLR values were calculated. Average pairwise FST values among subgroups showed that genetic divergence among the cultivated subgroups (G3-G6) was low (0.007-0.028) compared with that between cultivated accessions and G1 (0.229-0.263) and that between the landraces (0.189), while divergence between cultivated accessions and G2 was intermediate (0.036-0.047).
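To make two of these statistics concrete, here is a minimal Python sketch of per-site nucleotide diversity (π) and a pairwise FST for biallelic SNPs. It is an assumption-laden illustration, not the pipeline behind the values above: the source does not say which software or estimator was used, so the sketch uses Hudson's FST estimator (ratio of averages) as one common choice, and the function names and toy allele counts are hypothetical.

```python
def site_pi(alt_count, n):
    """Per-site nucleotide diversity for a biallelic SNP.

    Probability that two of the n sampled chromosomes, drawn without
    replacement, carry different alleles.
    """
    return 2 * alt_count * (n - alt_count) / (n * (n - 1))


def hudson_fst(sites):
    """Hudson's FST between two populations, as a ratio of averages.

    sites: iterable of (alt1, n1, alt2, n2), where alt is the alternate
    allele count and n the number of sampled chromosomes per population.
    """
    num = 0.0
    den = 0.0
    for alt1, n1, alt2, n2 in sites:
        p1, p2 = alt1 / n1, alt2 / n2
        # Squared frequency difference, corrected for finite sample size.
        num += (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
        # Probability that one allele from each population differs.
        den += p1 * (1 - p2) + p2 * (1 - p1)
    return num / den


# Toy allele counts (hypothetical, not taken from the database).
similar = [(5, 20, 6, 20), (10, 20, 9, 20), (2, 20, 3, 20)]
diverged = [(1, 20, 18, 20), (19, 20, 3, 20), (0, 20, 15, 20)]

print("pi at one site:", site_pi(5, 20))
print("FST, similar populations:", hudson_fst(similar))
print("FST, diverged populations:", hudson_fst(diverged))
```

Populations with nearly identical allele frequencies give an FST near zero (the estimator can be slightly negative), while strongly diverged populations give values approaching 1. This is the sense in which the cultivated subgroups are described as weakly differentiated from one another relative to G1.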

      References


    [1] Fang L, Wang Q, Hu Y, et al. Genomic analyses in cotton identify signatures of selection and loci associated with fiber quality and yield traits[J]. Nature Genetics, 2017.
    [2] Wang M, Tu L, Lin M, et al. Asymmetric subgenome selection and cis-regulatory divergence during cotton domestication[J]. Nature Genetics, 2017, 49 (4): 579.
    [3] Ma Z, He S, Wang X, et al. Resequencing a core collection of upland cotton identifies genomic variation and loci influencing fiber quality and yield[J]. Nature Genetics, 2018.
    [4] Fang L, Gong H, Hu Y, et al. Genomic insights into divergence and dual domestication of cultivated allotetraploid cottons[J]. Genome Biology, 2017, 18 (1): 33.
    [5] Nie X, Wen T, Shao P, et al. High‐density genetic variation maps reveal the correlation between asymmetric interspecific introgressions and improvement of agronomic traits in Upland and Pima cotton varieties developed in Xinjiang, China[J]. The Plant Journal, 2020, 103 (2): 677-689.
    [6] He S, Sun G, Geng X, et al. The genomic basis of geographic differentiation and fiber improvement in cultivated cotton[J]. Nature Genetics, 2021, 53 (6): 916-924.
    [7] Li J, Yuan D, Wang P, et al. Cotton pan-genome retrieves the lost sequences and genes during domestication and selection[J]. Genome Biology, 2021, 22 (1): 1-26.
    [8] Li B, Chen L, Sun W, et al. Phenomics‐based GWAS analysis reveals the genetic architecture for drought resistance in cotton[J]. Plant Biotechnology Journal, 2020, 18 (12): 2533-2544.
    [9] Yuan D, Grover C E, Hu G, et al. Parallel and intertwining threads of domestication in allopolyploid cotton[J]. Advanced Science, 2021: 2003634.
    [10] Huang C, et al. Population structure and genetic basis of the agronomic traits of upland cotton in China revealed by a genome-wide association study using high-density SNPs[J]. Plant Biotechnology Journal, 2017, 15: 1374-1386.