Forecastability of infectious disease time series: are some seasons and pathogens intrinsically more difficult to forecast?

doi:10.1101/2025.04.29.25326677


2025 · doi:10.1101/2025.04.29.25326677

preprint OA: closed


Abstract

For infectious disease forecasting challenges, individual model performance typically varies across space and time. This phenomenon raises the question: are there properties of the target time series that contribute to a particular season, location, or disease being more difficult to forecast? Here we characterize a time series’ future predictability using a forecastability metric that calculates the spectral density of the time series. Forecastability of syndromic influenza hospital admissions for the state of California varied widely across seasons and was positively correlated with peak burden. Next, using archived U.S. state and national forecasts targeting laboratory-confirmed COVID-19 and influenza hospital admissions, we investigated the relationship between forecastability and: (i) population size of the forecasting target, and (ii) forecast performance as measured by mean absolute error, weighted interval score (WIS), and scaled relative WIS. Forecastability increased with increasing population size of the forecasting target, and forecasting performance generally improved with higher forecastability when controlling for population size across scales. These preliminary results support the idea that some targets and respiratory virus seasons may be inherently more difficult to forecast and could help explain inter-seasonal variation in model performance.

Author summary

Could intrinsic properties of an epidemiological time series help explain why a particular season, location, or disease is more difficult to predict in the future? To answer this question, this analysis uses a measure of a time series’ future predictability called “forecastability,” which describes the inherent uncertainty or surprise in the signal. Influenza and COVID-19 hospital admissions had higher forecastability scores for locations with larger population sizes, possibly due to larger counts leading to smoother time series.
At the same time, forecasting performance generally improved for time series with higher forecastability scores when controlling for population size, suggesting that this metric is helpful for understanding ease of forecasting. These preliminary results support the idea that some epidemiological targets and respiratory virus seasons may be inherently more difficult to forecast and could help explain why forecasting model performance changes across different respiratory virus seasons.

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

This work was supported by the California Department of Public Health. The findings and conclusions in this article are those of the author(s) and do not necessarily represent the views or opinions of the California Department of Public Health or the California Health and Human Services Agency. This study used the California Patient Discharge Dataset. The interpretation and reporting of these data are the sole responsibility of the authors. The authors acknowledge the California Department of Healthcare Access and Information for compilation of these data. This work was funded by Centers for Disease Control and Prevention, Epidemiology and Laboratory Capacity for Infectious Diseases, Cooperative Agreement Number 6 NU50CK000539.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below: The California Health and Human Services Agency Committee for the Protection of Human Subjects (CPHS) has determined that this research (project number 2024-210) is classified as exempt under the federal Common Rule. This decision is issued under the California Health and Human Services Agency's Federalwide Assurance #00000681 with the Office of Human Research Protections (OHRP).

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes

Footnotes

Added author summary; added additional background literature into the introduction; added additional context for the calculation of the forecastability score in the methods; added Figure 5 which explores the relationship between the amount of signal smoothing and the forecastability score.

Data availability

Code and underlying data to reproduce the analyses are available at: https://github.com/cdphmodeling/forecastability.
Laboratory-confirmed hospital admissions data are publicly available through HHS/NHSN as state-level time series (https://healthdata.gov/Hospital/COVID-19-Reported-Patient-Impact-and-Hospital-Capa/g62h-syeh/about_data) and facility-level time series (https://healthdata.gov/Hospital/COVID-19-Reported-Patient-Impact-and-Hospital-Capa/anag-cw7u/about_data).

Syndromic influenza hospitalization data derived from the California Department of Healthcare Access and Information (HCAI) contain individual-level patient data and so are only available upon a data request: https://hcai.ca.gov/data/request-data/
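
The abstract describes a forecastability metric derived from the spectral density of a time series, and the author summary frames it as a measure of the "uncertainty or surprise" in the signal. This page does not give the exact estimator, so the following is only a minimal sketch of one common way to build such a score: one minus the normalized entropy of the periodogram. The function name `forecastability`, the use of a raw periodogram, and the normalization by the number of frequency bins are all assumptions made for illustration. The authors' actual calculation is in their repository and may differ.

```python
import numpy as np


def forecastability(series):
    """Spectral-entropy forecastability in [0, 1] (illustrative definition).

    Values near 1 mean the variance sits at very few frequencies;
    values near 0 mean a flat spectrum, as for white noise.
    """
    x = np.asarray(series, dtype=float)
    if x.size < 4:
        raise ValueError("need at least 4 observations")
    x = x - x.mean()                      # remove the mean (zero-frequency term)
    power = np.abs(np.fft.rfft(x)) ** 2   # raw periodogram (assumed estimator)
    power = power[1:]                     # drop the zero-frequency bin
    total = power.sum()
    if total == 0:
        raise ValueError("series is constant; spectrum is undefined")
    p = power / total                     # normalize so the bins sum to 1
    nonzero = p[p > 0]                    # treat 0 * log(0) as 0
    entropy = -np.sum(nonzero * np.log(nonzero))
    return 1.0 - entropy / np.log(p.size)  # divide by the maximum possible entropy


# Synthetic inputs only; these are not data from the paper.
weeks = np.arange(512)
seasonal = np.sin(2 * np.pi * 4 * weeks / 512)       # four clean cycles
noise = np.random.default_rng(0).normal(size=512)    # white noise
print(forecastability(seasonal), forecastability(noise))
```

Under this definition a clean periodic signal scores close to 1 and white noise scores close to 0, which matches the intuition in the author summary that smoother series are easier to forecast.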

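The abstract evaluates forecasts with the weighted interval score (WIS) but does not define it on this page. The sketch below follows the standard definition from the forecast evaluation literature: a weighted sum of the absolute error of the median and the interval scores of the central prediction intervals. The function names and the dictionary layout for the intervals are hypothetical, and whether the authors used exactly this weighting is an assumption.

```python
def interval_score(y, lower, upper, alpha):
    """Interval score of a central (1 - alpha) prediction interval."""
    width = upper - lower
    under = (2.0 / alpha) * max(lower - y, 0.0)   # penalty when y falls below the interval
    over = (2.0 / alpha) * max(y - upper, 0.0)    # penalty when y falls above the interval
    return width + under + over


def weighted_interval_score(y, median, intervals):
    """WIS for one observation.

    intervals maps alpha to (lower, upper), e.g. {0.5: (90, 110)} for a 50% interval.
    """
    total = 0.5 * abs(y - median)
    for alpha, (lower, upper) in intervals.items():
        total += (alpha / 2.0) * interval_score(y, lower, upper, alpha)
    return total / (len(intervals) + 0.5)


# Made-up numbers for illustration: a forecast with one 50% interval.
print(weighted_interval_score(120.0, 100.0, {0.5: (90.0, 110.0)}))
```

With no intervals supplied the score reduces to the absolute error of the median, which is why WIS is often read as a generalization of the mean absolute error also reported in the abstract.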
