Author: Hayden C. Metsky; Katherine J. Siddle; Adrianne Gladden-Young; James Qu; David K. Yang; Patrick Brehio; Andrew Goldfarb; Anne Piantadosi; Shirlee Wohl; Amber Carter; Aaron E. Lin; Kayla G. Barnes; Damien C. Tully; Björn Corleis; Scott Hennigan; Giselle Barbosa-Lima; Yasmine R. Vieira; Lauren M. Paul; Amanda L. Tan; Kimberly F. Garcia; Leda A. Parham; Ikponmwonsa Odia; Philomena Eromon; Onikepe A. Folarin; Augustine Goba; Etienne Simon-Lorière; Lisa Hensley; Angel Balmaseda; Eva Harris; Douglas Kwon; Todd M. Allen; Jonathan A. Runstadler; Sandra Smole; Fernando A. Bozza; Thiago M. L. Souza; Sharon Isern; Scott F. Michael; Ivette Lorenzana; Lee Gehrke; Irene Bosch; Gregory Ebel; Donald Grant; Christian Happi; Daniel J. Park; Andreas Gnirke; Pardis C. Sabeti; Christian B. Matranga
Title: Capturing diverse microbial sequence with comprehensive and scalable probe design
Document date: 2018_3_12
ID: a9lkhayg_94
Document: For each biological sample, we first subsampled raw reads to 200,000 reads using SAMtools (ref. 85); for samples with fewer than 200,000 reads, we used all available reads. We then removed highly similar reads (likely PCR duplicates) from the unaligned reads with the mvicuna tool through viral-ngs. We ran kraken through viral-ngs and separately ran kraken-filter with a threshold of 0.1 for classification. For samples where two independent libraries had been prepared and used for V_ALL and V_WAFR, or where the same pre-capture library had been sequenced more than once, we merged the raw sequence files prior to downsampling. To account for laboratory contaminants, we also ran kraken on water controls: we first merged all water controls together and then classified reads as described above. We evaluated the presence and enrichment of viral and other taxa using the cumulative species-level counts, as above. To do so, we calculated two measures: abundance, obtained by dividing pre-capture read counts for each species by counts in the pooled water controls, and enrichment, obtained by dividing post-capture read counts for each species by pre-capture read counts in the same sample. For our uncharacterized mosquito pools and human plasma samples from Nigeria and Sierra Leone, after capture with V_ALL we searched for viral species with more than 10 matched reads and a read count more than 2-fold higher than in the pooled water control after capture with V_ALL. For each virus identified, we assembled viral genomes and calculated per-base read depth as described above (Supplementary Fig. 11, Supplementary Table 8). When producing coverage plots, we calculated per-base read depth as described above for known samples, except that we removed supplementary alignments before calculating depth to remove artificial chimeras.
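The two ratios and the candidate filter described above can be illustrated with a minimal Python sketch. It is not the authors' code. The function names, the dictionary layout (species name mapped to cumulative species-level read count), and the choice to report a ratio as undefined (`None`) when the denominator is zero are all assumptions made here for illustration. The filter compares raw read counts and reads "2-fold higher" as more than twice the water-control count, which is also an assumption.

```python
from typing import Dict, List, Optional

# Hypothetical layout: species name -> cumulative species-level read count.
Counts = Dict[str, int]


def _ratio(numer: Counts, denom: Counts) -> Dict[str, Optional[float]]:
    """Per-species numer/denom; None where the denominator is zero or absent
    (an assumption: the text does not say how zero counts were handled)."""
    result: Dict[str, Optional[float]] = {}
    for species, n in numer.items():
        d = denom.get(species, 0)
        result[species] = n / d if d > 0 else None
    return result


def abundance(pre_capture: Counts, pooled_water: Counts) -> Dict[str, Optional[float]]:
    """Pre-capture read count divided by the count in pooled water controls."""
    return _ratio(pre_capture, pooled_water)


def enrichment(post_capture: Counts, pre_capture: Counts) -> Dict[str, Optional[float]]:
    """Post-capture read count divided by pre-capture count in the same sample."""
    return _ratio(post_capture, pre_capture)


def candidate_species(
    post_capture: Counts,
    pooled_water_post_capture: Counts,
    min_reads: int = 10,
    min_fold: float = 2.0,
) -> List[str]:
    """Species with more than min_reads matched reads and a read count more than
    min_fold times the count in the pooled post-capture water control."""
    hits = []
    for species, n in post_capture.items():
        w = pooled_water_post_capture.get(species, 0)
        if n > min_reads and n > min_fold * w:
            hits.append(species)
    return sorted(hits)
```

For example, with hypothetical post-capture counts `{"A": 500, "B": 8, "C": 30}` and pooled water-control counts `{"A": 300, "C": 10}`, only species `C` passes both thresholds: `A` is not more than twice its water-control count, and `B` has too few reads.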