Author: Christina J. Castro; Rachel L. Marine; Edward Ramos; Terry Fei Fan Ng
Title: The effect of variant interference on de novo assembly for viral deep sequencing
Document date: 2019_10_22
ID: d5ghy39g_6_1
Document: formance. [15] Finally, Geneious and CLC were the least affected by VI in the simulated datasets tested, returning only [...] PID, but tens to thousands of contigs were generated at a slightly lower PID of 99.21%. This PID threshold, 99.21%, marked the drastic transition from VS to VI, whereas the transition from VI to VD (i.e., the VD threshold) occurred at 98.99% PID [Figure 4b]. A correlation was observed between genome length and the number of contigs produced during VI: longer genomes returned proportionally more contigs, as expected, since total VI occurrence should increase with length [r² = 0.967; p < 0.0001; Figure 4b and 4c].

The read length of a given NGS dataset will vary depending on the sequencing platform and kits utilized to generate the data. Since read length is an important factor for de novo assembly success, [16] we hypothesized that it may also influence the ability to distinguish viral variants. For Experiment 3, using SPAdes [...]

For clinical samples, assembly of viral genomes is affected by multiple factors other than the presence of variants, including sequencing error rate, host background reads, depth of genome coverage, and the distribution (i.e., pattern) of genome coverage. We next utilized viral NGS data generated from four picornavirus-positive clinical samples (one coxsackievirus B5, one enterovirus A71, and two parechovirus A3) to explore VI in datasets representative of data that may be encountered during routine NGS. The NGS data for each sample were partitioned into four bins of read data: (1) total reads after quality control (T); (2) major variants only (M); (3) major and minor variants only (Mm); and (4) major variants and background non-viral reads only (MB) [Figure 5]. These binned datasets were then assembled separately using three assembly programs: SPAdes, Cap3, and Geneious. By comparing these manipulations, we aimed to test the hypothesis that minor variants directly affect the performance of assembly through VI in real clinical NGS data.

All rights reserved. No reuse allowed without permission.
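The two PID thresholds reported for the simulated data (99.21% for the VS-to-VI transition and 98.99% for the VI-to-VD transition) can be illustrated with a minimal Python sketch. This is not the authors' code. The function name `classify_outcome`, the constant names, and the treatment of the boundary values (99.21% counted as VI, 98.99% counted as VD) are assumptions made for illustration. The thresholds themselves apply only to the simulated dataset described in the text and may differ for other assemblers, genomes, or read lengths.

```python
# Hypothetical illustration of the PID regimes described in the text.
# Thresholds are the values reported for the simulated data (Figure 4b);
# boundary handling is an assumption, not something the text specifies.
VI_UPPER_PID = 99.21      # PID at which tens to thousands of contigs were reported
VD_THRESHOLD_PID = 98.99  # PID at which the VI-to-VD transition was reported


def classify_outcome(pid: float) -> str:
    """Map a pairwise percent identity (PID) to 'VS', 'VI', or 'VD'."""
    if not 0.0 <= pid <= 100.0:
        raise ValueError(f"PID must be between 0 and 100, got {pid}")
    if pid > VI_UPPER_PID:
        return "VS"   # above the VS-to-VI transition
    if pid > VD_THRESHOLD_PID:
        return "VI"   # between the two reported thresholds
    return "VD"       # at or below the VD threshold


if __name__ == "__main__":
    # Print the assumed regime for a few example PID values.
    for pid in (99.50, 99.21, 99.00, 98.99, 95.00):
        print(f"{pid:.2f}% PID -> {classify_outcome(pid)}")
```

Under these assumptions the VI window is narrow (roughly 0.2 percentage points of PID), which is consistent with the text's description of the VS-to-VI transition as drastic.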