Author: Riquier, Sébastien; Bessiere, Chloé; Guibert, Benoit; Bouge, Anne-Laure; Boureux, Anthony; Ruffle, Florence; Audoux, Jérôme; Gilbert, Nicolas; Xue, Haoliang; Gautheret, Daniel; Commes, Thérèse
Title: Kmerator Suite: design of specific k-mer signatures and automatic metadata discovery in large RNA-seq datasets.
  • Cord-id: 6uqi1avs
  • Document date: 2021_9_1
  • ID: 6uqi1avs
    Document: The huge body of publicly available RNA-sequencing (RNA-seq) libraries is a trove of functional information that makes it possible to quantify the expression of known or novel transcripts in tissues. However, transcript quantification commonly relies on alignment methods that require substantial computational resources and processing time and therefore do not scale easily to large datasets. K-mer decomposition offers a new way to process RNA-seq data for the identification of transcriptional signatures, as k-mers can be used to quantify gene expression accurately with fewer resources. We present the Kmerator Suite, a set of three tools designed to extract specific k-mer signatures, quantify these k-mers in RNA-seq datasets and quickly visualize the characteristics of large datasets. The core tool, Kmerator, produces specific k-mers for 97% of human genes, enabling gene expression to be measured with high accuracy in simulated datasets. KmerExploR, a direct application of Kmerator, uses a set of predictor gene-specific k-mers to infer metadata, including library protocol, sample features or contamination, from RNA-seq datasets. KmerExploR results are visualized through a user-friendly interface. Moreover, we demonstrate that the Kmerator Suite can be used for advanced queries targeting known or new biomarkers such as mutations, gene fusions or long non-coding RNAs for human health applications.
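
The idea of a specific k-mer signature described in the abstract can be illustrated with a minimal Python sketch. This is a toy illustration only and is not the Kmerator implementation: the function names (`kmers`, `specific_kmers`, `count_signature`), the sequences and the value of k are all assumptions chosen for the example, and strand handling (reverse complements) is deliberately ignored.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every overlapping substring of length k in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def specific_kmers(target, others, k):
    """Return the k-mers of `target` that occur in none of the `others`."""
    background = set()
    for seq in others:
        background.update(kmers(seq, k))
    return {km for km in kmers(target, k) if km not in background}


def count_signature(reads, signature, k):
    """Count how often each signature k-mer appears across the reads."""
    counts = Counter()
    for read in reads:
        for km in kmers(read, k):
            if km in signature:
                counts[km] += 1
    return counts


# Hypothetical toy sequences; real transcripts and reads are far longer.
gene_a = "ACGTACGTTAGC"
gene_b = "ACGTACGGGCTA"
k = 5

# k-mers found in gene_a but not in gene_b form the toy "signature".
sig = specific_kmers(gene_a, [gene_b], k)

# Only reads overlapping the gene_a-specific region contribute counts.
reads = ["CGTTAGC", "ACGTACG", "GTTAG"]
counts = count_signature(reads, sig, k)
print(sorted(sig))
print(dict(counts))
```

In this sketch the read `ACGTACG` contributes nothing because all of its k-mers are shared between the two toy genes, which is the reason a signature is restricted to k-mers unique to the target. Counting by set lookup, with no alignment step, is what keeps this kind of approach cheap on large read collections.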
