Author: Chan, Joseph M.; Rabadan, Raul
Title: Quantifying Pathogen Surveillance Using Temporal Genomic Data
Document date: 2013_1_29
ID: u2t1x89m_25
Document: Comparison to clustering methods. Another possible surveillance measurement characterizes the cluster structure of isolates. In an ideal situation, a well-sampled population of sequences separated by genetic distance would be represented by points densely and homogeneously spread across a continuum. Therefore, clustering techniques such as hierarchical, k-means, or expectation-maximization clustering can be used to ascertain how poorly sampled a pathogen is on the basis of the number of clusters in a data set. Bar coding is an alternative strategy based on the field of persistent homology that identifies topologically invariant clusters in cloud data; in particular, it calculates the Betti number b_0, the number of connected components in a set of simplicial complexes constructed from sequences at different filtration Hamming distances (see Materials and Methods) (40). A lower b_0 would indicate better sampling.
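As an illustration of the bar-coding idea, the sketch below counts connected components at several Hamming-distance thresholds. It is a minimal sketch under stated assumptions, not the authors' implementation: it assumes aligned sequences of equal length and uses the fact that, for a Vietoris-Rips-style complex, b_0 at a given filtration distance equals the number of connected components of the graph that links every pair of sequences within that distance. The function names (`hamming`, `betti0`, `barcode_b0`) and the toy sequences are hypothetical.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b))


def betti0(seqs, threshold):
    """Count connected components when sequences within `threshold`
    Hamming distance of each other are joined (union-find)."""
    parent = list(range(len(seqs)))

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(seqs)), 2):
        if hamming(seqs[i], seqs[j]) <= threshold:
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[ri] = rj  # merge the two components

    return len({find(i) for i in range(len(seqs))})


def barcode_b0(seqs, thresholds):
    """Map each filtration distance to its b_0 value."""
    return {t: betti0(seqs, t) for t in thresholds}


# Toy data (hypothetical): three closely related sequences and one outlier.
example = ["AAAA", "AAAT", "AATT", "GGGG"]
profile = barcode_b0(example, [0, 1, 2, 4])
```

In this toy profile, b_0 falls as the filtration distance grows; under the reading in the text, a data set whose b_0 is already low at small distances would count as better sampled than one that stays fragmented until large distances.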