
Author: Ye, Fuqiang; Han, Yifang; Zhu, Juanjuan; Li, Peng; Zhang, Qi; Lin, Yanfeng; Wang, Taiwu; Lv, Heng; Wang, Changjun; Wang, Chunhui; Zhang, Jinhai
Title: First Identification of Human Adenovirus Subtype 21a in China With MinION and Illumina Sequencers
  • Document date: 2020_4_7
  • ID: 18b2foud_36
    Document: A total of 5,887 reads could be aligned to the genomes of the above three AdVs and were then de novo assembled into a draft sequence containing 35,180 bp. BLASTN analysis revealed that this sequence had 99.53% identity to HAdV21 strain CDC V2148A (GenBank accession no. KJ364588.1, length = 35,371 bp) with 100% query coverage. Moreover, among the reads longer than 10 kb (n = 637), 20 kb (n = 115), and 30 kb (n = 21), 94.82, 94.78, and 95.24% of reads could be aligned to this strain with a genome coverage of ≥73.10, 75.93, and 82.27%, respectively. We then extracted 5,999 reads aligned to this specific strain and obtained a draft genome containing 35,364 bp, with the read depth across the draft genome ranging from 65X to 737X (Supplementary Figure S1). We further investigated the minimal number of reads sufficient to identify this isolate via a downsampling procedure. With respect to reference genome coverage (Figure 1A), a sequencing depth of 5 would generate an average genome coverage of 7.83% (standard deviation = 13.05%). When the depth gradually increased from 10 to 400, the minimal genome coverage ranged from 0 to 92.43%. At sequencing depths of 500 reads or more, at least 96.49% of the reference genome was covered. Notably, if more than 5,000 reads were randomly selected each time, the genome coverage was always 100% (Supplementary Figure S2). When genome coverage was examined in terms of aligned read length, downsampled sets whose longest aligned read was ≥25 kb always had a genome coverage of ≥69.10%, regardless of the sampling depth (Figure 1B). When the sequencing depth was set to 500, sets whose longest aligned read was ≥25 kb or <25 kb obtained a genome coverage of ≥99.31% or ≥96.49%, respectively. Moreover, on average 16.93% of reads (ranging from 16.69 to 17.32% across sequencing depths) could be aligned to the reference.
Regardless of whether the maximal aligned read length was above or below 25 kb, the hit ratios (number of aligned reads divided by the corresponding sequencing depth) fluctuated around, and finally converged to, 16.93% (Figure 1C), which is close to the actual hit ratio of 16.92% (6,002/35,466).
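The downsampling measurements described above (genome coverage, hit ratio, and maximal aligned read length at a given sampling depth) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the representation of each read as either `None` (unaligned) or a half-open `(start, end)` interval on the reference, and the toy data are all assumptions made for illustration. Only the reference length of 35,371 bp and the 6,002/35,466 hit ratio come from the text.

```python
import random
from typing import Dict, List, Optional, Tuple

# Hypothetical representation: a read is None if it did not align,
# otherwise a half-open (start, end) interval on the reference genome.
Alignment = Optional[Tuple[int, int]]


def covered_fraction(intervals: List[Tuple[int, int]], genome_length: int) -> float:
    """Fraction of reference positions covered by at least one interval."""
    covered = 0
    current_end = 0  # rightmost position already counted
    for start, end in sorted(intervals):
        start = max(start, current_end)   # skip the part already counted
        end = min(end, genome_length)     # clip to the reference length
        if end > start:
            covered += end - start
            current_end = end
    return covered / genome_length


def downsample(reads: List[Alignment], depth: int, genome_length: int,
               rng: random.Random) -> Dict[str, float]:
    """Draw `depth` reads without replacement and summarise the sample."""
    sample = rng.sample(reads, depth)
    aligned = [a for a in sample if a is not None]
    return {
        "coverage": covered_fraction(aligned, genome_length),
        "hit_ratio": len(aligned) / depth,  # aligned reads / sampling depth
        "max_aligned_length": max((e - s for s, e in aligned), default=0),
    }


if __name__ == "__main__":
    GENOME_LENGTH = 35_371  # length of KJ364588.1 as stated in the text
    rng = random.Random(0)

    # Toy read set (invented): roughly 17% of reads align at random positions.
    toy_reads: List[Alignment] = []
    for _ in range(10_000):
        if rng.random() < 0.17:
            start = rng.randrange(0, GENOME_LENGTH - 1_000)
            toy_reads.append((start, start + rng.randrange(500, 1_000)))
        else:
            toy_reads.append(None)

    for depth in (5, 50, 500, 5_000):
        print(depth, downsample(toy_reads, depth, GENOME_LENGTH, rng))

    # Actual hit ratio reported in the text: 6,002 aligned of 35,466 reads.
    print(round(100 * 6002 / 35466, 2))  # 16.92
```

Repeating `downsample` many times per depth and recording the minimum, mean, and standard deviation of `coverage` would give summaries of the kind reported for Figure 1A, again under the assumptions stated above.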
