Author: Hayden C. Metsky; Katherine J. Siddle; Adrianne Gladden-Young; James Qu; David K. Yang; Patrick Brehio; Andrew Goldfarb; Anne Piantadosi; Shirlee Wohl; Amber Carter; Aaron E. Lin; Kayla G. Barnes; Damien C. Tully; Björn Corleis; Scott Hennigan; Giselle Barbosa-Lima; Yasmine R. Vieira; Lauren M. Paul; Amanda L. Tan; Kimberly F. Garcia; Leda A. Parham; Ikponmwonsa Odia; Philomena Eromon; Onikepe A. Folarin; Augustine Goba; Etienne Simon-Lorière; Lisa Hensley; Angel Balmaseda; Eva Harris; Douglas Kwon; Todd M. Allen; Jonathan A. Runstadler; Sandra Smole; Fernando A. Bozza; Thiago M. L. Souza; Sharon Isern; Scott F. Michael; Ivette Lorenzana; Lee Gehrke; Irene Bosch; Gregory Ebel; Donald Grant; Christian Happi; Daniel J. Park; Andreas Gnirke; Pardis C. Sabeti; Christian B. Matranga
Title: Capturing diverse microbial sequence with comprehensive and scalable probe design
Document date: 2018_3_12
ID: a9lkhayg_84
Document: We performed demultiplexing and data analysis of all sequencing runs using viral-ngs v1.17.0 [83, 84] with default settings, except where described below. To enable comparisons between pre- and post-capture results, we downsampled all raw reads to 200,000 reads using SAMtools [85]. We performed all analyses on downsampled data sets unless otherwise stated. We chose this number because 90% of all samples sequenced on the MiSeq (among the 30 patient and environmental samples used for validation) were sequenced to a depth of at least 200,000 reads. For the few low-coverage samples for which we did not obtain more than 200,000 reads, we performed all analyses using all available reads unless otherwise noted (Supplementary Table 3). Downsampling normalizes sequencing depth across runs and lets us evaluate the effect of capture on genome assembly (i.e., the fraction of the genome we can assemble) more readily than an approach such as comparing viral reads per million. It also allows us to more readily compare unique content (see below). A statistic such as unique viral reads per million unique reads can be distorted by sequencing depth when a high fraction of viral reads are PCR duplicates: sequencing to a lower depth can inflate the value of this statistic relative to sequencing to a higher depth.
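The downsampling rule and the depth sensitivity described above can be illustrated with a minimal Python sketch. It is not the viral-ngs or SAMtools implementation. The function names, the seed, and the toy model are assumptions made for illustration: viral reads are drawn uniformly with replacement from a fixed pool of distinct viral molecules (so duplicates accumulate), and every background read is treated as unique.

```python
import random


def downsample(read_ids, target=200_000, seed=0):
    """Return at most `target` reads, chosen uniformly at random.

    Samples with no more than `target` reads are returned unchanged,
    mirroring the rule of using all available reads for shallow samples.
    (Hypothetical helper; the paper used SAMtools for this step.)
    """
    reads = list(read_ids)
    if len(reads) <= target:
        return reads
    return random.Random(seed).sample(reads, target)


def expected_unique(n_reads, n_molecules):
    """Expected number of distinct molecules observed after `n_reads`
    uniform draws with replacement from `n_molecules` molecules."""
    return n_molecules * (1.0 - (1.0 - 1.0 / n_molecules) ** n_reads)


def unique_viral_per_million_unique(depth, viral_fraction, viral_molecules):
    """Unique viral reads per million unique reads under the toy model.

    Assumption: background reads are never duplicated, so the number of
    unique background reads equals the number of background reads.
    """
    viral_reads = depth * viral_fraction
    background_reads = depth - viral_reads
    unique_viral = expected_unique(viral_reads, viral_molecules)
    unique_total = unique_viral + background_reads
    return 1e6 * unique_viral / unique_total


if __name__ == "__main__":
    # Same hypothetical library sequenced at two depths.
    for depth in (200_000, 2_000_000):
        value = unique_viral_per_million_unique(depth, 0.5, 1_000)
        print(f"depth={depth:>9,}  unique viral per million unique = {value:,.0f}")
```

In this model the count of unique viral reads saturates once the viral molecules are exhausted, while unique background reads keep growing with depth, so the statistic is larger at the shallower depth. Fixing every sample at the same depth removes that dependence.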