Selected article for: "differential expression and RNA seq"

Author: Nelly Mostajo Berrospi; Marie Lataretu; Sebastian Krautwurst; Florian Mock; Daniel Desirò; Kevin Lamkiewicz; Maximilian Collatz; Andreas Schoen; Friedemann Weber; Manja Marz; Martin Hölzer
Title: A comprehensive annotation and differential expression analysis of short and long non-coding RNAs in 16 bat genomes
  • Document date: 2019_8_19
  • ID: ihqvcxv6_59
    Snippet: Current genome annotations, mostly generated by automatic annotation pipelines provided by databases such as the NCBI [60] or Ensembl [45], focus predominantly on protein-coding genes and well-studied ncRNAs such as tRNAs and rRNAs. Accordingly, the available bat genome annotations vary widely in quality, ranging from more comprehensive annotations for long-standing bat genomes such as M. lucifugus or P. vampyrus to annotations on.....
    Document: Current genome annotations, mostly generated by automatic annotation pipelines provided by databases such as the NCBI [60] or Ensembl [45], focus predominantly on protein-coding genes and well-studied ncRNAs such as tRNAs and rRNAs. Accordingly, the available bat genome annotations vary widely in quality, ranging from more comprehensive annotations for long-standing bat genomes such as M. lucifugus or P. vampyrus to annotations at the region level that completely lack coding or non-coding gene annotations in the current NCBI version (Fig. 3 and Tab. 4). Furthermore, using strand-specific RNA-Seq data, we could show that some genes (e.g. IFNA5/IFNW2 in the Ensembl annotation of M. lucifugus [36]) are annotated on the wrong strand and are therefore entirely missed by differential expression studies that rely on strand-specific read quantification. For all publicly available bat genomes, ncRNA annotations are sparse and highly incomplete, mostly comprising only some tRNAs, rRNAs, snRNAs, snoRNAs, and lncRNAs (Fig. 3 and Tab. 4). Therefore, many ncRNAs, especially miRNAs, are simply overlooked by current molecular studies, for example by RNA-Seq studies that aim to call differentially expressed genes based on such incomplete genome annotation files. Studies that have made additional efforts to annotate ncRNAs in bats [8, 33, 55, 61, 62] do not report their results in a form that can be used directly for further computational assessment (e.g. as direct input for RNA-Seq abundance estimation). Currently, in the NCBI database, five bat assemblies entirely lack coding/non-coding annotations, and miRNAs are not annotated at all (Tab. 4). The Rfam database [47] lists 336 ncRNA families, mainly for M. lucifugus and P. vampyrus. Other ncRNAs are currently unknown from bat genomes or not well documented. (bioRxiv preprint, not peer-reviewed; https://doi.org/10.1101/738526)
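
The effect of a wrong-strand annotation on strand-specific read quantification can be illustrated with a minimal Python sketch. Everything in it is a hypothetical toy setup and not taken from the study: the gene name `geneX`, the coordinates, the reads, and the `count_reads` helper are assumptions for illustration. Real quantification tools additionally handle library-type conventions, spliced alignments, and multi-mapping reads.

```python
from collections import Counter
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    start: int   # 1-based, inclusive
    end: int     # inclusive
    strand: str  # "+" or "-", as given in the annotation


class Read(NamedTuple):
    start: int
    end: int
    strand: str  # strand of the transcript the read originates from


def count_reads(genes, reads, stranded=True):
    """Count reads overlapping each gene (hypothetical helper).

    With stranded=True a read is counted only if its strand matches
    the annotated strand of the gene.
    """
    counts = Counter({g.name: 0 for g in genes})
    for r in reads:
        for g in genes:
            overlaps = r.start <= g.end and r.end >= g.start
            if overlaps and (not stranded or r.strand == g.strand):
                counts[g.name] += 1
    return counts


# Toy data (assumed): all reads derive from a minus-strand transcript.
reads = [Read(120, 170, "-"), Read(150, 200, "-"), Read(300, 350, "-")]

correct = [Gene("geneX", 100, 400, "-")]  # annotation on the true strand
flipped = [Gene("geneX", 100, 400, "+")]  # same locus, wrong strand

print(count_reads(correct, reads)["geneX"])                  # 3
print(count_reads(flipped, reads)["geneX"])                  # 0
print(count_reads(flipped, reads, stranded=False)["geneX"])  # 3
```

With the flipped annotation, the stranded count drops to zero although the locus is expressed, so such a gene could not be called as differentially expressed. The unstranded count recovers the reads, but it gives up the ability to separate overlapping genes on opposite strands.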
