CottonMD

A Multiomics Database for cotton biological study


About Multi-omics

  Multi-omics


Multi-omics is a functional module that integrates multiple omics datasets. It is built on association analyses, such as genome-wide association study (GWAS), expression quantitative trait locus (eQTL) mapping and transcriptome-wide association study (TWAS), and on fine-mapping methods for causal variants, such as colocalization analysis and SMR (Summary data–based Mendelian randomization analysis).

  
  • GWAS

  • Genome-wide association study (GWAS) is an approach for identifying the genes that underlie common diseases and related quantitative traits. This strategy combines a comprehensive and unbiased survey of the genome with the power to detect common alleles with modest phenotypic effects. In CottonMD, we collected 9 traits from 4180 cotton accessions, including Fiber elongation rate (FE), Fiber length (FL), Fiber strength (FS), Fiber uniformity (FU), Flowering day (FD), Short fiber rate (FR), Leaf pubescence amount (LPA), Micronaire value (MV), Verticillium wilt disease index (DI), Effective boll number (EBN), Fiber upper half mean length (FUHML), First fruit branch position (FFBP), First fruit spur height (FFSH), Flowering period (FP), Fruit spur branch number (FSBN), Lint percentage (LP), Lint weight (LW), Plant height (PH), Seed cotton weight (SCW) and Whole growth period (WGP)[1-11], and performed GWAS on these phenotypes together with the accessions' genotypes using GEMMA. The most significant SNP in each 500-kb window was retained. Users can browse the GWAS results and search for significant SNPs and genes for 6 fiber-associated traits.

      
  • eQTL

  • Expression quantitative trait loci (eQTL) are loci in the genome that contribute to variation in the expression levels of messenger RNAs in an organism. They link the static information of DNA sequence variation with the dynamic information of gene expression variation. In CottonMD, we collected RNA-seq data from fibers at 15 DPA (days post anthesis)[12] from 251 Upland cotton accessions and performed eQTL mapping on these expression data together with the accessions' genotypes using GEMMA. Users can browse the links between SNPs and genes by entering a region or genes of interest.

      
  • TWAS

  • Transcriptome-wide association studies (TWAS) are widely used to integrate gene expression and genetic data to identify gene–trait associations. In CottonMD, we collected RNA-seq data from fibers at 15 DPA (days post anthesis)[12] and 6 fiber-associated traits from 251 Upland cotton accessions and performed TWAS using a mixed linear model (MLM). Users can browse the links between phenotypes and genes by entering genes of interest.

      
  • COLOC

  • Colocalization analyses assess the degree to which independent association signals, including eQTL and GWAS signals, share the same causal variant. In CottonMD, users can upload their own phenotype datasets and combine them with the RNA-seq data in CottonMD to perform colocalization analyses. Colocalization analyses in CottonMD are performed using the coloc R package.

      
  • SMR

  • SMR (Summary data–based Mendelian randomization analysis) is a multi-omics method that integrates summary-level data from GWAS with data from expression quantitative trait locus (eQTL) studies to identify genes whose expression levels are associated with a complex trait because of pleiotropy.
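
    To make the SMR idea more concrete, the sketch below computes the SMR effect estimate and test statistic for one gene from summary-level effects at its top cis-eQTL SNP. It assumes the commonly cited SMR formulation (b_xy = b_gwas / b_eqtl, and T_SMR = z_gwas^2 * z_eqtl^2 / (z_gwas^2 + z_eqtl^2), treated as approximately chi-square with 1 degree of freedom). The function name and the input values are hypothetical, the HEIDI heterogeneity test is omitted, and this is not CottonMD's own implementation.

```python
import math


def smr_test(b_gwas, se_gwas, b_eqtl, se_eqtl):
    """Approximate SMR test at a single top cis-eQTL SNP (illustrative only).

    b_gwas, se_gwas: SNP effect on the trait and its standard error (GWAS).
    b_eqtl, se_eqtl: SNP effect on gene expression and its standard error (eQTL).
    """
    z_gwas = b_gwas / se_gwas
    z_eqtl = b_eqtl / se_eqtl
    # Estimated effect of gene expression on the trait.
    b_xy = b_gwas / b_eqtl
    # SMR test statistic, approximately chi-square with 1 degree of freedom.
    t_smr = (z_gwas ** 2 * z_eqtl ** 2) / (z_gwas ** 2 + z_eqtl ** 2)
    # Upper-tail probability of a 1-df chi-square, via the normal distribution.
    p_value = math.erfc(math.sqrt(t_smr / 2.0))
    return b_xy, t_smr, p_value


# Hypothetical summary statistics for one SNP-gene-trait combination.
b_xy, t_smr, p_smr = smr_test(b_gwas=0.10, se_gwas=0.02, b_eqtl=0.50, se_eqtl=0.05)
print(f"b_xy={b_xy:.3f}, T_SMR={t_smr:.2f}, p={p_smr:.2e}")
```

    Because the statistic is bounded by the smaller of the two squared z-scores, a gene is only flagged when both the GWAS and the eQTL signal at the SNP are strong.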

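    The GWAS filtering step described above (retaining the most significant SNP in each 500-kb window) could be sketched as follows. This is a minimal illustration that assumes fixed, non-overlapping windows counted from the start of each chromosome; the source does not state whether fixed or sliding windows were used. The function name, chromosome labels and p-values are hypothetical.

```python
def top_snp_per_window(snps, window=500_000):
    """Keep the SNP with the smallest p-value in each fixed window.

    snps: iterable of (chromosome, position, p_value) tuples.
    window: window size in base pairs (500 kb by default).
    """
    best = {}
    for chrom, pos, pval in snps:
        key = (chrom, pos // window)  # window index on this chromosome
        if key not in best or pval < best[key][2]:
            best[key] = (chrom, pos, pval)
    # Return the retained SNPs ordered by chromosome and position.
    return sorted(best.values())


# Hypothetical GWAS hits: two fall in the same 500-kb window on A01.
hits = [
    ("A01", 120_000, 1e-8),
    ("A01", 480_000, 3e-10),
    ("A01", 510_000, 2e-7),
    ("D05", 40_000, 5e-9),
]
retained = top_snp_per_window(hits)
for chrom, pos, pval in retained:
    print(chrom, pos, pval)
```
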
      References


    [1] Wang M, Tu L, Lin M, et al. Asymmetric subgenome selection and cis-regulatory divergence during cotton domestication[J]. Nature Genetics, 2017, 49 (4): 579.
    [2] Fang L, Wang Q, Hu Y, et al. Genomic analyses in cotton identify signatures of selection and loci associated with fiber quality and yield traits[J]. Nature Genetics, 2017, 49 (7): 1089.
    [3] Ma Z, He S, Wang X, et al. Resequencing a core collection of upland cotton identifies genomic variation and loci influencing fiber quality and yield[J]. Nature Genetics, 2018.
    [4] Fang L, Gong H, Hu Y, et al. Genomic insights into divergence and dual domestication of cultivated allotetraploid cottons[J]. Genome Biology, 2017, 18 (1): 1-13.
    [5] Nie X, Wen T, Shao P, et al. High‐density genetic variation maps reveal the correlation between asymmetric interspecific introgressions and improvement of agronomic traits in Upland and Pima cotton varieties developed in Xinjiang, China[J]. The Plant Journal, 2020, 103 (2): 677-689.
    [6] He S, Sun G, Geng X, et al. The genomic basis of geographic differentiation and fiber improvement in cultivated cotton[J]. Nature Genetics, 2021, 53 (6): 916-924.
    [7] Li J, Yuan D, Wang P, et al. Cotton pan-genome retrieves the lost sequences and genes during domestication and selection[J]. Genome Biology, 2021, 22 (1): 1-26.
    [8] Li B, Chen L, Sun W, et al. Phenomics‐based GWAS analysis reveals the genetic architecture for drought resistance in cotton[J]. Plant Biotechnology Journal, 2020, 18 (12): 2533-2544.
    [9] Yuan D, Grover C E, Hu G, et al. Parallel and intertwining threads of domestication in allopolyploid cotton[J]. Advanced Science, 2021: 2003634.
    [10] Guo C, Pan Z, You C, et al. Association mapping and domestication analysis to dissect genetic improvement process of upland cotton yield-related traits in China[J]. Journal of Cotton Research, 2021, 4 (2): 12.
    [11] Nie X, Huang C, You C, et al. Genome-wide SSR-based association mapping for fiber quality in nation-wide upland cotton inbreed cultivars in China[J]. BMC Genomics, 2016, 17 (1).
    [12] Wang M, Tu L, Yuan D, et al. Reference genome sequences of two cultivated allotetraploid cottons, Gossypium hirsutum and Gossypium barbadense[J]. Nature Genetics, 2019, 51 (2): 224-229.