Author: Tanujit Chakraborty; Indrajit Ghosh

Title: Real-time forecasts and risk assessment of novel coronavirus (COVID-19) cases: A data-driven analysis

Document date: 2020_4_14

ID: ba6mdgq3_43

Document: For the risk assessment with the CFR dataset for 50 countries, we apply the regression tree (RT) [7], which has a built-in feature selection mechanism and is easy to interpret and visualize. RT, a simple and widely used machine learning algorithm, can model arbitrary decision boundaries. The methodology outlined in [7] can be summarized in three stages. The first stage grows the tree using a recursive partitioning technique, which selects essential variables from a set of possible causal variables and chooses split points according to a splitting criterion. The standard splitting criterion for RT is the mean squared error (MSE). After a large tree has been grown, the second stage applies a pruning procedure that produces a nested sequence of subtrees, starting from the largest tree and continuing until only one node remains. Cross-validation is commonly used to estimate the future prediction error of each subtree. The last stage selects the optimal tree, that is, the one with the lowest cross-validated or test-set error rate. To avoid instability at this stage, a smaller tree with comparable accuracy is chosen as an alternative. This process can be tuned to obtain trees of varying size and complexity. A measure of variable importance can be obtained by observing the drop in the error rate when another variable is used instead of the primary split. In general, the more frequently a variable appears as a primary split, the higher the importance score it is assigned. A detailed description of the tree-building process is available in [17].
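The first and last stages described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`best_split`, `pick_subtree`), the toy data, and the 5% relative tolerance used to decide when a smaller tree is "comparable" are all assumptions made for illustration, since the excerpt does not state a specific rule. The pruning stage is not shown.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y):
    """Find the single threshold on x that minimises the MSE of y.

    Candidate thresholds are midpoints between consecutive distinct x values.
    Returns (threshold, mse); threshold is None if no split improves the MSE.
    """
    n = len(y)
    pairs = sorted(zip(x, y))
    best_threshold, best_mse = None, sse(y) / n  # baseline: no split
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # cannot split between identical x values
        left = [v for _, v in pairs[:i]]
        right = [v for _, v in pairs[i:]]
        mse = (sse(left) + sse(right)) / n
        if mse < best_mse:
            best_threshold, best_mse = (lo + hi) / 2, mse
    return best_threshold, best_mse


def pick_subtree(cv_errors, tolerance=0.05):
    """Pick the smallest tree whose CV error is within a relative
    tolerance of the lowest CV error (the tolerance is an assumed rule).

    cv_errors maps number of leaves -> cross-validated error.
    """
    lowest = min(cv_errors.values())
    eligible = [size for size, err in cv_errors.items()
                if err <= lowest * (1 + tolerance)]
    return min(eligible)


# Toy data (hypothetical): the response jumps between x = 3 and x = 10.
x = [1, 2, 3, 10, 11, 12]
y = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
threshold, mse = best_split(x, y)

# Hypothetical cross-validated errors for subtrees with 1, 3, 5, 9 leaves.
chosen_size = pick_subtree({1: 0.90, 3: 0.41, 5: 0.40, 9: 0.405})
```

Growing a full tree amounts to applying `best_split` recursively to each resulting subset and over every candidate variable. In `pick_subtree`, the 3-leaf tree is preferred over the 5-leaf tree with the lowest error because its error falls within the assumed tolerance.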
