Selected article for: "data analysis and PCR duplicate"

Author: Hayden C. Metsky; Katherine J. Siddle; Adrianne Gladden-Young; James Qu; David K. Yang; Patrick Brehio; Andrew Goldfarb; Anne Piantadosi; Shirlee Wohl; Amber Carter; Aaron E. Lin; Kayla G. Barnes; Damien C. Tully; Björn Corleis; Scott Hennigan; Giselle Barbosa-Lima; Yasmine R. Vieira; Lauren M. Paul; Amanda L. Tan; Kimberly F. Garcia; Leda A. Parham; Ikponmwonsa Odia; Philomena Eromon; Onikepe A. Folarin; Augustine Goba; Etienne Simon-Lorière; Lisa Hensley; Angel Balmaseda; Eva Harris; Douglas Kwon; Todd M. Allen; Jonathan A. Runstadler; Sandra Smole; Fernando A. Bozza; Thiago M. L. Souza; Sharon Isern; Scott F. Michael; Ivette Lorenzana; Lee Gehrke; Irene Bosch; Gregory Ebel; Donald Grant; Christian Happi; Daniel J. Park; Andreas Gnirke; Pardis C. Sabeti; Christian B. Matranga
Title: Capturing diverse microbial sequence with comprehensive and scalable probe design
  • Document date: 2018_3_12
  • ID: a9lkhayg_84
    Document: We performed demultiplexing and data analysis of all sequencing runs using viral-ngs v1.17.0 [83, 84] with default settings, except where described below. To enable comparisons between pre- and post-capture results, we downsampled all raw reads to 200,000 reads using SAMtools [85]. We performed all analyses on downsampled data sets unless otherwise stated. We chose this number because 90% of all samples sequenced on the MiSeq (among the 30 patient and environmental samples used for validation) were sequenced to a depth of at least 200,000 reads. For the few low-coverage samples for which we did not obtain > 200,000 reads, we performed all analyses using all available reads unless otherwise noted (Supplementary Table 3). Downsampling normalizes sequencing depth across runs and allows us to evaluate the effectiveness of capture on genome assembly (i.e., the fraction of the genome we can assemble) more readily than an approach such as comparing viral reads per million. It also allows us to more readily compare unique content (see below). A statistic such as unique viral reads per unique million reads can be distorted by sequencing depth when a high fraction of viral reads are PCR duplicates: sequencing to a lower depth can inflate the value of this statistic compared with sequencing to a higher depth.
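
    The depth effect described above can be illustrated with a minimal sketch. It is not the authors' pipeline (the text states that downsampling was done with SAMtools within a viral-ngs workflow). The function names, the uniform-sampling model of PCR duplicates, and the example library sizes (5,000 distinct viral molecules, 10^8 distinct background molecules, 50% viral reads) are all assumptions chosen for illustration. Only the 200,000-read target and the rule of keeping all reads for low-coverage samples come from the text.

```python
import random

TARGET_DEPTH = 200_000  # downsampling depth stated in the text


def downsample(reads, target=TARGET_DEPTH, seed=0):
    """Return `target` reads drawn without replacement, or every read
    when no more than `target` are available (the low-coverage case)."""
    if len(reads) <= target:
        return list(reads)
    return random.Random(seed).sample(reads, target)


def expected_unique(n_reads, n_molecules):
    """Expected number of distinct molecules observed after drawing n_reads
    uniformly, with replacement, from a library of n_molecules.
    Assumption: every molecule is equally likely to be sequenced."""
    return n_molecules * (1.0 - (1.0 - 1.0 / n_molecules) ** n_reads)


def unique_viral_per_unique_million(depth, viral_fraction,
                                    viral_molecules, background_molecules):
    """Expected unique viral reads per million unique reads at a given depth
    (hypothetical model: a low-complexity viral library, so viral reads are
    mostly PCR duplicates, mixed with a high-complexity background)."""
    viral_reads = depth * viral_fraction
    background_reads = depth - viral_reads
    unique_viral = expected_unique(viral_reads, viral_molecules)
    unique_background = expected_unique(background_reads, background_molecules)
    return 1e6 * unique_viral / (unique_viral + unique_background)


if __name__ == "__main__":
    # Same library, two sequencing depths: the statistic is larger at the
    # lower depth because unique viral reads saturate while unique
    # background reads keep growing with depth.
    for depth in (200_000, 2_000_000):
        value = unique_viral_per_unique_million(depth, 0.5, 5_000, 10**8)
        print(f"depth={depth:>9,}  unique viral per unique million = {value:,.0f}")
```

    Under these assumed parameters the statistic is several times larger at the lower depth even though the underlying library is identical, which is why comparing samples at a common downsampled depth gives a fairer comparison.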
