
Author: Hayden C. Metsky; Katherine J. Siddle; Adrianne Gladden-Young; James Qu; David K. Yang; Patrick Brehio; Andrew Goldfarb; Anne Piantadosi; Shirlee Wohl; Amber Carter; Aaron E. Lin; Kayla G. Barnes; Damien C. Tully; Björn Corleis; Scott Hennigan; Giselle Barbosa-Lima; Yasmine R. Vieira; Lauren M. Paul; Amanda L. Tan; Kimberly F. Garcia; Leda A. Parham; Ikponmwonsa Odia; Philomena Eromon; Onikepe A. Folarin; Augustine Goba; Etienne Simon-Lorière; Lisa Hensley; Angel Balmaseda; Eva Harris; Douglas Kwon; Todd M. Allen; Jonathan A. Runstadler; Sandra Smole; Fernando A. Bozza; Thiago M. L. Souza; Sharon Isern; Scott F. Michael; Ivette Lorenzana; Lee Gehrke; Irene Bosch; Gregory Ebel; Donald Grant; Christian Happi; Daniel J. Park; Andreas Gnirke; Pardis C. Sabeti; Christian B. Matranga
Title: Capturing diverse microbial sequence with comprehensive and scalable probe design
  • Document date: 2018_3_12
  • ID: a9lkhayg_49
    Document: Many problems related to probe design map well to generalizations of the set cover problem. The relevant generalizations are the weighted and partial cover problems [31, 78, 79]. Using the weighted cover problem, CATCH allows a user to perform differential identification of taxa and also to blacklist sequences from the probe design. For these purposes, we introduce the concept of a "rank" to our implementation of the set cover solution. The rank of a set is analogous to a weight and makes it straightforward to assign levels of penalties to sets. For two sets S and T, if rank(S) < rank(T) then S is always considered before T; i.e., if coverage is needed and S provides that coverage, then the greedy algorithm always chooses S before T, even if T provides more coverage. Ranks can be emulated entirely using weights (i.e., costs) by assigning a sufficiently high weight to each set. To perform differential identification, CATCH accepts groupings of sequences as input (for example, each grouping might encompass the available genomes of a species). Then, CATCH finds the number of groupings that each candidate probe p "hits". (p hits a grouping if it covers a part of at least one sequence in that grouping.) A probe that hits only one grouping is suitable for differential identification, whereas probes that hit more than one grouping are poor choices. Thus, CATCH assigns a rank to each p equal to the number of groupings hit by p. CATCH can also accept a collection of sequences to blacklist from the probe design. It determines the number of nucleotides of blacklisted sequence that each p covers and assigns to p a rank equal to this value; therefore, candidate probes that cover blacklisted sequence are highly penalized in the design. (When a user opts to perform differential identification while also blacklisting sequences, the ranks are assigned such that a candidate probe that covers a part of a blacklisted sequence always receives a higher rank than one that does not.)
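The rank assignment described above can be illustrated with a minimal Python sketch. It is not CATCH's implementation: the function name `assign_ranks`, the input dictionaries, and the choice of a tuple as the rank are all assumptions made for illustration. The tuple puts the blacklisted nucleotide count first, so any probe that covers blacklisted sequence compares higher than any probe that does not; how two blacklisted probes are ordered against each other is an assumption of this sketch, since the text does not specify it.

```python
from typing import Dict, Set, Tuple

def assign_ranks(groupings_hit: Dict[str, Set[str]],
                 blacklisted_nt: Dict[str, int]) -> Dict[str, Tuple[int, int]]:
    """Assign each candidate probe a sortable rank (hypothetical helper).

    groupings_hit:  probe name -> set of groupings the probe hits
    blacklisted_nt: probe name -> number of blacklisted nucleotides it covers
    Rank is (blacklisted nucleotides, number of groupings hit); tuples compare
    lexicographically, so blacklist coverage dominates the comparison.
    """
    ranks = {}
    for probe, groups in groupings_hit.items():
        ranks[probe] = (blacklisted_nt.get(probe, 0), len(groups))
    return ranks

# Illustrative inputs (made up): p1 is specific to one grouping, p2 hits two,
# and p3 hits one grouping but overlaps 12 nt of blacklisted sequence.
example_ranks = assign_ranks(
    {"p1": {"zika"}, "p2": {"zika", "dengue"}, "p3": {"zika"}},
    {"p3": 12},
)
preferred_order = sorted(example_ranks, key=example_ranks.get)
```

Here `preferred_order` lists the probes from most to least preferred under these assumed ranks: the probe specific to one grouping first, and the probe covering blacklisted sequence last.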
For the purposes of determining whether p hits an identification grouping or blacklisted sequence, CATCH accepts three additional parameters, holding more tolerant values for m, lcf, and i as defined above, that f_map uses to evaluate probe-target hybridization. We note as well that weights can have other applications in probe design, e.g., if there is a reason to prefer some candidate probes over others due to base composition. Finally, CATCH solves an instance of the weighted cover problem by assigning the rank of each set to be the rank of the candidate probe it represents.
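To make the role of ranks in the greedy selection concrete, the following is a small Python sketch of a rank-aware greedy set cover. It assumes a plain (non-partial) cover over an explicit universe, integer ranks, and string set names; the function `greedy_cover_with_ranks` and its tie-breaking rules are hypothetical and are not taken from CATCH. At each step it considers only sets that still add coverage, takes the lowest rank first, and uses the amount of new coverage only to break ties within a rank.

```python
from typing import Dict, Hashable, Iterable, List, Set

def greedy_cover_with_ranks(universe: Iterable[Hashable],
                            sets: Dict[str, Set[Hashable]],
                            ranks: Dict[str, int]) -> List[str]:
    """Greedy set cover in which a lower-rank set is always chosen before a
    higher-rank one, as long as it contributes some needed coverage."""
    uncovered = set(universe)
    chosen: List[str] = []
    while uncovered:
        # New coverage each set would add; drop sets that add nothing.
        useful = {name: s & uncovered for name, s in sets.items() if s & uncovered}
        if not useful:
            raise ValueError("the given sets cannot cover the universe")
        # Lowest rank wins; within a rank, prefer the most new coverage,
        # then the name, so the result is deterministic.
        best = min(useful, key=lambda n: (ranks[n], -len(useful[n]), n))
        chosen.append(best)
        uncovered -= useful[best]
    return chosen

# S and U have rank 1, T has rank 2. T covers more than S, but S is chosen
# first because its rank is lower and it provides needed coverage.
selection = greedy_cover_with_ranks(
    universe={1, 2, 3, 4, 5},
    sets={"S": {1, 2}, "T": {1, 2, 3, 4}, "U": {5}},
    ranks={"S": 1, "T": 2, "U": 1},
)
```

With equal ranks for all sets, the same function reduces to the ordinary greedy heuristic and would pick T first. The same ordering could be obtained with weights alone by making each rank's weight large enough to dominate any coverage gain, which is the emulation the text mentions.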
