Author: Katz, Kenneth S.; Shutov, Oleg; Lapoint, Richard; Kimelman, Michael; Brister, J. Rodney; O’Sullivan, Christopher
Title: STAT: a fast, scalable, MinHash-based k-mer tool to assess Sequence Read Archive next-generation sequence submissions
Cord-id: zb8dxpv2
Document date: 2021_9_20
ID: zb8dxpv2
                    
Document: Sequence Read Archive submissions to the National Center for Biotechnology Information often lack useful metadata, which limits the utility of these submissions. We describe the Sequence Taxonomic Analysis Tool (STAT), a scalable k-mer-based tool for fast assessment of taxonomic diversity intrinsic to submissions, independent of metadata. We show that our MinHash-based k-mer tool is accurate and scalable, offering reliable criteria for efficient selection of data for further analysis by the scientific community, validating submissions while also augmenting sample metadata with reliable, searchable taxonomic terms. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13059-021-02490-0.
 