Selected article for: "dimensionality reduction and feature extraction"

Author: Phillip Davis; John Bagnoli; David Yarmosh; Alan Shteyman; Lance Presser; Sharon Altmann; Shelton Bradrick; Joseph A. Russell
Title: Vorpal: A novel RNA virus feature-extraction algorithm demonstrated through interpretable genotype-to-phenotype linear models
  • Document date: 2020_3_2
  • ID: 48mtdwuv_62
    Snippet: This clustering of K-mers, and subsequent representation as degenerate motifs, is another layer of dimensionality reduction similar to lemmatization of words in a Natural Language Processing (NLP) feature extraction technique [40]. Much of this approach could be described as modifications of equivalent NLP feature extraction and modeling strategies. It should be noted however, that data preparation techniques such as term frequency.....
    Document: This clustering of K-mers, and their subsequent representation as degenerate motifs, is another layer of dimensionality reduction, similar to the lemmatization of words in a Natural Language Processing (NLP) feature extraction technique [40]. Much of this approach could be described as modifications of equivalent NLP feature extraction and modeling strategies. It should be noted, however, that data preparation techniques such as term frequency-inverse document frequency (tf-idf) were considered inappropriate in this circumstance for multiple reasons. First, "document" length was invariant in the sense that complete assemblies were the only instances allowed in the training data, and differences in genome size within the taxonomies considered were deemed irrelevant. Second, document terms, in this case K-mer motifs, that follow a frequency pattern similar to that of the word "the" in the English language are not present. For this reason, the data were also not normalized, although normalization could be a future improvement to speed up convergence.
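
    The idea of collapsing a cluster of K-mers into a single degenerate motif, and then using raw (unweighted) counts as features, can be illustrated with a minimal Python sketch. This is not the Vorpal implementation: the function names (count_kmers, degenerate_motif, motif_count) are hypothetical, the cluster of K-mers is assumed to be given rather than computed, and sequences are assumed to contain only A, C, G, and T. Degeneracy is expressed with the standard IUPAC nucleotide codes.

```python
from collections import Counter

# IUPAC nucleotide codes, keyed by the set of bases observed at one position.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}
# Reverse lookup: IUPAC code -> set of bases it stands for.
EXPAND = {code: bases for bases, code in IUPAC.items()}


def count_kmers(sequence, k):
    """Count every overlapping K-mer in one sequence (raw counts, no tf-idf)."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def degenerate_motif(kmers):
    """Collapse a cluster of equal-length K-mers into one IUPAC motif."""
    kmers = list(kmers)
    if not kmers or len({len(kmer) for kmer in kmers}) != 1:
        raise ValueError("need a non-empty cluster of equal-length K-mers")
    # zip(*kmers) yields one column of bases per motif position.
    return "".join(IUPAC[frozenset(column)] for column in zip(*kmers))


def motif_count(motif, kmer_counts):
    """Sum the raw counts of all K-mers that the degenerate motif matches."""
    return sum(
        n for kmer, n in kmer_counts.items()
        if len(kmer) == len(motif)
        and all(base in EXPAND[code] for base, code in zip(kmer, motif))
    )


if __name__ == "__main__":
    cluster = ["ACGT", "ACAT", "ATGT"]        # assumed pre-computed cluster
    motif = degenerate_motif(cluster)          # "AYRT"
    counts = count_kmers("ACGTACATATGT", 4)    # toy "genome"
    print(motif, motif_count(motif, counts))
```

    Note that a degenerate motif can match K-mers outside the cluster it was built from: in the toy sequence above, ATAT also matches AYRT. This is part of the dimensionality reduction, since many distinct K-mers map onto one feature column.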
