Author: Galasso, J.; Cao, D. M.; Hochberg, R.
Title: A random forest model for forecasting regional COVID-19 cases utilizing reproduction number estimates and demographic data
Cord-id: 3szz34i3
Document date: 2021_5_25
ID: 3szz34i3
                    
Document: During the COVID-19 pandemic, predicting case spikes at the local level is important for a precise, targeted public health response and is generally done with compartmental models. The performance of compartmental models depends heavily on the accuracy of their assumptions about disease dynamics within a population; such models are therefore susceptible to human error, unexpected events, or unknown characteristics of a novel infectious agent like COVID-19. We present a relatively non-parametric random forest model that forecasts the number of COVID-19 cases at the U.S. county level. Its most prioritized training features are derived from easily accessible, standard epidemiological data (i.e., regional test positivity rate) and the effective reproduction number R(t) from compartmental models. A novel input training feature is a set of case projections generated by aligning the estimated effective reproduction number from a compartmental model with real-time testing data until the two are maximally correlated, which helps our model fit the epidemic trajectory ascertained by traditional models. Any poor reliability of R(t) due to flaws in the compartmental model is mitigated by features describing dynamic population mobility and the prevalence and mortality of non-COVID-19 diseases, which gauge population disease susceptibility. The model was used to generate forecasts 1, 2, 3, and 4 weeks into the future for each reference week from 11/01/2020 to 01/10/2021 for 3068 counties. Over this period, it maintained a mean absolute error (MAE) of less than 300 weekly cases per 100,000 and consistently outperformed or performed comparably with gold-standard compartmental models. Furthermore, it holds great potential in ensemble modeling because it can accommodate a more expansive training feature set while maintaining good performance and limited resource utilization.
 