Author: Shifman, Ohad; Cohen-Gihon, Inbar; Beth-Din, Adi; Zvi, Anat; Laskar, Orly; Paran, Nir; Epstein, Eyal; Stein, Dana; Dorozko, Marina; Wolf, Dana; Yitzhaki, Shmuel; Shapira, Shmuel C.; Melamed, Sharon; Israeli, Ofir
Title: Identification and genetic characterization of a novel Orthobunyavirus species by a straightforward high-throughput sequencing-based approach
Document date: 2019_3_4
ID: 15cxc32n_8
Document: Primary analysis of sequencing results. To identify the virus in the sample, we first used two rapid bioinformatic tools, MetaPhlAn2 and Pathoscope, which profile reads by comparing them to databases of microbial genomic sequences. MetaPhlAn2 maps reads against a database of predefined clade-specific genetic markers originating from bacterial, fungal and viral genomes, whereas Pathoscope uses databases of whole-genome sequences containing over 10,000 complete bacterial, fungal and viral genomes. Neither tool returned significant viral hits, indicating that the viral sequence was not present in the databases used for the analysis. Four percent of the reads could be attributed to bacterial and fungal sequences or to controls added to the sample (PhiX174 and carrier RNA). Approximately 78% of the reads mapped to the green monkey (C. sabaeus) genome, compared with over 90% of the reads in the negative control sample, implying that these reads originated from the Vero cells. Notably, a substantial portion of the reads (18%) could not be assigned to any sequence in the databases. We hypothesized that the as-yet-unidentified viral sequences were among these reads. To determine the origin of these unassigned reads, we applied de novo assembly to obtain long continuous sequences (contigs) and reconstruct the viral sequence. Such contigs can then be searched for sequence similarity against the entire NCBI nr/nt nucleotide collection and used to characterize the genome sequence of the virus.
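
The read-partitioning logic described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline and does not call MetaPhlAn2, Pathoscope or any aligner. It assumes that the profiling and host-mapping steps have already produced a per-read label, and it only shows how the reads left without a label would be set aside for de novo assembly. The function name `partition_reads`, the labels `host` and `microbial_or_control`, and the 100-read toy input are hypothetical. The toy input simply mirrors the reported proportions (78%, 4%, 18%) and is not real data.

```python
from collections import Counter


def partition_reads(read_ids, assignments):
    """Tally reads per assigned category and collect the unassigned ones.

    read_ids    : iterable of read identifiers (hypothetical input)
    assignments : dict mapping a read id to a category label; reads that
                  are absent from the dict are treated as unassigned
    Returns (fractions, unassigned), where fractions maps each category
    (including "unassigned") to its share of all reads.
    """
    read_ids = list(read_ids)
    counts = Counter()
    unassigned = []
    for read_id in read_ids:
        label = assignments.get(read_id)
        if label is None:
            # No database hit: candidate input for de novo assembly.
            unassigned.append(read_id)
            counts["unassigned"] += 1
        else:
            counts[label] += 1
    total = len(read_ids)
    fractions = {label: n / total for label, n in counts.items()} if total else {}
    return fractions, unassigned


# Toy input: 100 reads labelled to mirror the proportions in the text.
read_ids = [f"read{i}" for i in range(100)]
assignments = {}
for i in range(78):
    assignments[f"read{i}"] = "host"  # mapped to the host genome
for i in range(78, 82):
    assignments[f"read{i}"] = "microbial_or_control"  # bacteria, fungi, PhiX174, carrier RNA

fractions, unassigned = partition_reads(read_ids, assignments)
print(fractions)
print(len(unassigned), "reads would be passed on to de novo assembly")
```

In practice the unassigned read identifiers would be used to extract the corresponding sequences from the FASTQ files before assembly; that step depends on the tools chosen and is left out here.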