Author: Taguchi, Y-h.; Turki, Turki
Title: Projection in genomic analysis: A theoretical basis to rationalize tensor decomposition and principal component analysis as feature selection tools Cord-id: 9kkj5j5e Document date: 2020_10_4
ID: 9kkj5j5e
Snippet: Identifying differentially expressed genes is difficult because of the small number of available samples compared with the large number of genes. Conventional gene selection methods employing statistical tests have the critical problem of heavy dependence of P-values on sample size. Although the recently proposed principal component analysis (PCA) and tensor decomposition (TD)-based unsupervised feature extraction (FE) has often outperformed these statistical test-based methods, the reason why t
Document: Identifying differentially expressed genes is difficult because of the small number of available samples compared with the large number of genes. Conventional gene selection methods employing statistical tests have the critical problem of heavy dependence of P-values on sample size. Although the recently proposed principal component analysis (PCA) and tensor decomposition (TD)-based unsupervised feature extraction (FE) has often outperformed these statistical test-based methods, the reason why they worked so well is unclear. In this study, we aim to understand this reason in the context of projection pursuit that was proposed a long time ago to solve the problem of dimensions; we can relate the space spanned by singular value vectors with that spanned by the optimal cluster centroids obtained from K-means. Thus, the success of PCA- and TD-based unsupervised FE can be understood by this equivalence. In addition to this, empirical threshold adjusted P-values of 0.01 assuming the null hypothesis that singular value vectors attributed to genes obey the Gaussian distribution empirically corresponds to threshold-adjusted P-values of 0.1 when the null distribution is generated by gene order shuffling. These findings thus rationalize the success of PCA- and TD-based unsupervised FE for the first time.
Search related documents:
Co phrase search for related documents- Try single phrases listed below for: 1
Co phrase search for related documents, hyperlinks ordered by date