Author: Marie Hoffmann; Michael T. Monaghan; Knut Reinert
Title: PriSeT: Efficient De Novo Primer Discovery
Document date: 2020_4_7
ID: 3b3hv53b_60
Document: As a reference library, we sampled from NCBI GenBank's nt dataset (Benson et al., 2012), which contains non-human sequences from various sources. Most sequences are between 400 and 2500 bases long. We picked 19 clades that include Eukarya typically found in freshwater plankton samples, ranging from phyto- to zooplankton and fungi. For each taxon within a clade that had at least one accession assigned to it, we sampled at most three accessions to remove the sequence bias introduced by highly populated taxa. Table 4 lists the clades, the number of taxa, the taxa with at least one accession (Covered), the total number of accessions, and the library size in megabytes (MB). Between 1.38 (Rotifera) and 2.5 (Charophyceae) accessions were sampled from covered nodes, which indicates the sparse population of the reference library.

bioRxiv preprint (not peer-reviewed), doi: https://doi.org/10.1101/2020.04.06.027961

Table 4. Data set used for the primer verification test, the de novo search, and the runtime analysis. Taxonomic identifiers in the Clade column follow the NCBI nomenclature; Taxa is the total number of nodes (including virtual ancestors), Covered is the number of taxa with at least one accession assigned, Accs is the total number of collected accessions, and Lib Size is the size of the FASTA file containing all accessions.

[...] trained experts. Their role in ecosystem function and for environmental monitoring and water quality assessment should not be underestimated: they contribute approximately 20 % of global oxygen production and represent nearly half of the organic material in the oceans. The DIV4 primer pair (see Table 5) was specifically designed for Bacillariophyta by Visco et al. (2015) with an expected amplicon length of ∼280 nt. Hadziavdic et al. (2014) [...] Stoeck et al. (2010) for the V9 region of marine Eukaryota, EUKAF/R by Moreno et al. (2018) for the 18S region of Protozoa (EA hereafter), G18S4/22R by Blaxter et al. (1998) for Nematoda (nSSU hereafter), and SSU556F/SSU911R by Kirsty et al. (2017) for Dinoflagellata (SSU hereafter). Some of these primers were designed for a specific organism group; however, they are expected to be effective in other clades as well.

Table 5 lists the 10 selected primer pairs targeting 18S that we searched for in the reference library. It is remarkable that not a single pair has chemically optimal properties. At least one sequence of a pair shows a self-annealing pattern, seven pairs differ significantly in melting temperature (independent of the computation method, i.e. Wallace rule or nearest-neighbour method), three sequences have CG clamps at their 3' ends, eight sequences have excessive CG content, 23S forward contains a run of five adenine bases (R substitutes A or G), and SSU911R has an (A|T)_3 tail. We therefore relaxed the chemical constraints for the verification experiment by allowing a larger melting temperature range, a larger melting temperature difference (∆Tm), and a larger CG content range, and by deactivating the self-annealing filter (see column 'Verification' in Table 6).
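The chemical properties named above (Wallace-rule melting temperature, CG content, base runs, and an A/T tail at the 3' end) can be illustrated with a minimal Python sketch. This is not PriSeT's implementation: the function names, the toy sequences, and the choice to handle only unambiguous A/C/G/T bases are assumptions made for illustration. Degenerate IUPAC codes such as R would need extra handling, and the nearest-neighbour method and the self-annealing filter are not covered.

```python
def wallace_tm(primer: str) -> int:
    """Wallace rule: Tm = 2 * (#A + #T) + 4 * (#C + #G), in degrees Celsius."""
    s = primer.upper()
    at = s.count("A") + s.count("T")
    cg = s.count("C") + s.count("G")
    return 2 * at + 4 * cg


def cg_content(primer: str) -> float:
    """Fraction of C and G bases in the primer (0.0 to 1.0)."""
    s = primer.upper()
    return (s.count("C") + s.count("G")) / len(s)


def longest_run(primer: str) -> int:
    """Length of the longest stretch of one repeated base."""
    best = run = 0
    prev = ""
    for base in primer.upper():
        run = run + 1 if base == prev else 1
        best = max(best, run)
        prev = base
    return best


def has_at_tail(primer: str, n: int = 3) -> bool:
    """True if the last n bases (3' end) are all A or T."""
    tail = primer.upper()[-n:]
    return len(tail) == n and all(base in "AT" for base in tail)


def delta_tm(forward: str, reverse: str) -> int:
    """Absolute melting temperature difference of a primer pair (Wallace rule)."""
    return abs(wallace_tm(forward) - wallace_tm(reverse))


# Hypothetical toy sequences, not taken from Table 5.
fwd = "ACGTACGTACGTACGTAAAT"
rev = "GGCCGGCCAGGCCGGCCA"
for name, seq in (("forward", fwd), ("reverse", rev)):
    print(name, wallace_tm(seq), round(cg_content(seq), 2),
          longest_run(seq), has_at_tail(seq))
print("delta Tm:", delta_tm(fwd, rev))
```

A filter in this spirit would compare each value against a configurable range. Relaxing the constraints, as done for the verification experiment, then amounts to widening those ranges.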
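The reference library construction described earlier (at most three accessions per covered taxon) can be sketched as follows. This is a minimal sketch under stated assumptions: the source does not say how the accessions were chosen, so the seeded uniform random draw, the function names, and the accession-to-taxon mapping are all hypothetical. The second function computes the ratio of sampled accessions to covered taxa, which is one plausible reading of the 1.38 to 2.5 figures reported for Table 4.

```python
import random
from collections import defaultdict


def sample_accessions(acc_to_taxon: dict, max_per_taxon: int = 3, seed: int = 0) -> dict:
    """Return {taxon: sampled accessions}, with at most max_per_taxon per taxon.

    Only taxa that have at least one accession ("covered" taxa) appear in
    the result. Random selection is an assumption of this sketch.
    """
    by_taxon = defaultdict(list)
    for acc, taxon in acc_to_taxon.items():
        by_taxon[taxon].append(acc)

    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    sampled = {}
    for taxon, accs in by_taxon.items():
        accs = sorted(accs)  # stable order before sampling
        k = min(max_per_taxon, len(accs))
        sampled[taxon] = rng.sample(accs, k)
    return sampled


def mean_accessions_per_covered_taxon(sampled: dict) -> float:
    """Total sampled accessions divided by the number of covered taxa."""
    return sum(len(accs) for accs in sampled.values()) / len(sampled)


# Hypothetical toy mapping: taxon "t1" is highly populated, "t2" is sparse.
toy = {"A1": "t1", "A2": "t1", "A3": "t1", "A4": "t1", "A5": "t1", "B1": "t2"}
library = sample_accessions(toy)
print(library)
print(mean_accessions_per_covered_taxon(library))
```

Capping the draw per taxon keeps a densely sequenced taxon such as "t1" from dominating the library, which is the bias the sampling step is meant to remove.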