Author: Stenglein, Mark D.; Jacobson, Elliott R.; Wozniak, Edward J.; Wellehan, James F. X.; Kincaid, Anne; Gordon, Marcus; Porter, Brian F.; Baumgartner, Wes; Stahl, Scott; Kelley, Karen; Towner, Jonathan S.; DeRisi, Joseph L.
Title: Ball Python Nidovirus: a Candidate Etiologic Agent for Severe Respiratory Disease in Python regius
Document date: 2014_9_9
ID: rb3qdunj_50
Snippet: Sequence analysis. The sequence analysis pipeline was similar to that employed previously (59, 76). Low-quality sequences and low-complexity sequences (defined as having a ratio of the length of the Lempel-Ziv-Welch compressed sequence to the uncompressed length of less than 0.46) were removed from further analysis. The first 5 bases of each sequence were trimmed, as was the last base. The CD-HIT sequence clustering tool was then used to collaps.....
Document: Sequence analysis. The sequence analysis pipeline was similar to that employed previously (59, 76). Low-quality sequences and low-complexity sequences (defined as having a ratio of the length of the Lempel-Ziv-Welch compressed sequence to the uncompressed length of less than 0.46) were removed from further analysis. The first 5 bases of each sequence were trimmed, as was the last base. The CD-HIT sequence clustering tool was then used to collapse reads with >98% global pairwise identity (77). Host-derived sequences were then filtered, first by using the BLASTn alignment tool (version 2.2.25+ [30]) to query a database of snake ribosomal and mitochondrial sequences and then by using the Bowtie2 alignment tool (version 2.0.0-beta7 [26]) to query databases composed of draft assemblies of the Burmese python and boa constrictor genomes (20, 21). Sequences aligning with an expect value of less than 10^-12 (BLASTn) or with --local mode alignment scores greater than 86 (Bowtie2) were filtered. Similarly, sequences that aligned to the Illumina adapter sequences (see Table S1 in the supplemental material) or to PhiX-174 control sequence were removed. This filtering removed on average 98% of sequences (see Table S2). The remaining sequences were searched against custom databases of viral protein sequences using the BLASTx alignment tool. Sequences aligning to any viral protein sequence with an expect value of less than 0.25 were further examined. False positives were reduced by using BLAST to align putative viral sequences to the NCBI nonredundant nucleotide (nt) and protein (nr) databases. Only sequences whose best hit and whose pair's best hit were still to viral sequences were retained. The PRICE de novo targeted genome assembler was used to generate initial contiguous virus sequences (22). The reference assembly (NCBI accession no. KJ541759) was assembled using reads from a single snake (no. 2); other animals lacked sufficient coverage to assemble the complete genome. To validate this assembly and generate coverage and pairwise identity data, reads were aligned to Sanger-validated assemblies using the BLASTn algorithm. Sequencing data have been deposited in the UCSF Integrated Data Repository.
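The early read-filtering steps described above (the Lempel-Ziv-Welch complexity ratio, the 5'/3' trimming, and the host-hit thresholds) can be illustrated with a minimal sketch. This is not the authors' code. The text does not say how the compressed length was measured, so the sketch assumes it is the number of codes emitted by a textbook LZW encoder seeded with the alphabet A, C, G, T, N; under a different measure (for example, compressed bytes) the 0.46 cutoff would behave differently. All function names are hypothetical.

```python
# Illustrative sketch only; names and the LZW length measure are assumptions.
ALPHABET = "ACGTN"  # assumed initial LZW dictionary (not stated in the text)


def lzw_code_count(seq: str) -> int:
    """Return how many codes a textbook LZW encoder emits for seq."""
    if not seq:
        return 0
    table = {ch: i for i, ch in enumerate(ALPHABET)}
    current = ""
    codes = 0
    for ch in seq:
        if ch not in table:
            raise ValueError(f"unexpected base: {ch!r}")
        candidate = current + ch
        if candidate in table:
            current = candidate          # keep extending the known phrase
        else:
            codes += 1                   # emit the code for `current`
            table[candidate] = len(table)
            current = ch
    return codes + 1                     # flush the final phrase


def is_low_complexity(seq: str, threshold: float = 0.46) -> bool:
    """True if compressed/uncompressed length falls below the threshold."""
    return lzw_code_count(seq) / len(seq) < threshold


def trim_read(seq: str, head: int = 5, tail: int = 1) -> str:
    """Drop the first `head` bases and the last `tail` bases."""
    return seq[head:len(seq) - tail]


def is_host_derived(blastn_evalue=None, bowtie2_local_score=None) -> bool:
    """Apply the stated cutoffs: E < 1e-12 (BLASTn) or score > 86 (Bowtie2)."""
    by_blast = blastn_evalue is not None and blastn_evalue < 1e-12
    by_bowtie = bowtie2_local_score is not None and bowtie2_local_score > 86
    return by_blast or by_bowtie


if __name__ == "__main__":
    homopolymer = "A" * 100
    print(lzw_code_count(homopolymer), is_low_complexity(homopolymer))
    print(trim_read("ACGTACGTAC"))
```

Under this measure very short reads compress poorly and are never flagged, so the ratio is only meaningful for reads of typical sequencing length. The alignment scores and expect values would in practice be parsed from the BLASTn and Bowtie2 output rather than passed in by hand.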