Selected article for: "NCBI database and reference database"

Author: Wilson, Michael R.; Suan, Dan; Duggins, Andrew; Schubert, Ryan D.; Khan, Lillian M.; Sample, Hannah A.; Zorn, Kelsey C.; Rodrigues Hoffman, Aline; Blick, Anna; Shingde, Meena; DeRisi, Joseph L.
Title: A novel cause of chronic viral meningoencephalitis: Cache Valley virus
  • Document date: 2017_7_25
  • ID: 5mddyv0n_11
    Document: Sequences were analyzed using a rapid computational pipeline developed by the DeRisi Laboratory to classify mNGS reads and identify potential pathogens by comparison to the entire National Center for Biotechnology Information (NCBI) nucleotide (nt) reference database; the pipeline has previously been described in detail [11, 13]. Briefly, all paired-end reads were aligned to the human reference genome 38 (hg38) and the Pan troglodytes genome (panTro4; 2011, University of California, Santa Cruz) [14]. Unaligned (ie, nonhuman) reads were quality filtered using PriceSeqFilter [15]. Quality-filtered reads were then compressed by cd-hit-dup (v4.6.1), and low-complexity reads were removed via the Lempel-Ziv-Welch algorithm [16, 17]. Next, human removal was repeated using Bowtie2 (v2.2.4) with the same hg38 and panTro4 reference genomes as described above [18]. GSNAPL (v2015-12-31) [19] was used to align the remaining nonhuman read pairs to the NCBI nt database, which had been preprocessed to remove known repetitive sequences with RepeatMasker (vOpen-4.0; www.repeatmasker.org). The same reads were also aligned to the NCBI nonredundant protein (nr) database using the Rapsearch2 algorithm [20]. The resulting sequence hits identified at both the nt and
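
    The low-complexity filtering step mentioned above can be illustrated with a minimal sketch. The excerpt names the Lempel-Ziv-Welch algorithm but gives no implementation details or cutoff, so everything here is an assumption for illustration: the function names (lzw_code_count, lzw_ratio, drop_low_complexity), the use of codes-per-base as the complexity score, and the 0.45 threshold are all hypothetical and are not taken from the DeRisi Laboratory pipeline.

```python
def lzw_code_count(seq: str) -> int:
    """Return how many codes a basic LZW encoder emits for seq."""
    if not seq:
        return 0
    # Seed the dictionary with the single characters present in the read.
    dictionary = {ch: i for i, ch in enumerate(sorted(set(seq)))}
    current = ""
    codes = 0
    for ch in seq:
        candidate = current + ch
        if candidate in dictionary:
            # Keep extending the current phrase while it is already known.
            current = candidate
        else:
            # Emit the code for the known phrase and learn the new one.
            codes += 1
            dictionary[candidate] = len(dictionary)
            current = ch
    # Emit the code for the final pending phrase.
    return codes + 1


def lzw_ratio(seq: str) -> float:
    """Codes emitted per base; repetitive reads compress well and score low."""
    return lzw_code_count(seq) / len(seq) if seq else 0.0


def drop_low_complexity(reads, min_ratio=0.45):
    """Keep reads whose LZW ratio is at least min_ratio.

    min_ratio is a hypothetical cutoff chosen for this example only.
    """
    return [read for read in reads if lzw_ratio(read) >= min_ratio]


if __name__ == "__main__":
    homopolymer = "A" * 40
    mixed = "ACGTTGCAAGCTTAGGCATCGATCCGTAAGCTTGACCTGA"
    for read in (homopolymer, mixed):
        print(read, round(lzw_ratio(read), 3))
    print(drop_low_complexity([homopolymer, mixed]))
```

    A homopolymer run compresses into very few codes, so it falls below the cutoff and is discarded, while a read with varied sequence content is kept. In practice any cutoff would need to be tuned to read length, since the ratio depends on how long the input is.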
