AI-Enhanced Subseasonal Forecasting of Extreme Temperature Risks | Research Square

Jia Xing, Siwei Li, Shuxin Zheng, Ge Song, Jiaxin Dong, Gonzalo Ferrada, and 2 more

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-7314380/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted. Version 1.

Abstract

Sub-seasonal weather prediction remains a significant scientific challenge due to the chaotic nature of the atmosphere, with current numerical and AI-driven models exhibiting limited skill, particularly at the fine spatial scales relevant to human exposure, agriculture, and infrastructure. Here, we introduce DeepMet, a high-resolution, AI-driven sub-seasonal forecasting system designed to improve the prediction of temperature extremes and their associated health risks, demonstrated over the continental United States.
Specifically, DeepMet substantially outperforms the European Centre for Medium-Range Weather Forecasts (ECMWF) benchmark, reducing the root mean square error (RMSE) by 20–60% for key surface variables, including daily maximum and minimum 2-meter temperature, specific humidity, and 10-meter wind speed. The model also improves the detection of extreme heat and cold events by over 40% across all evaluation metrics. By enhancing early warning capabilities, DeepMet enables more accurate identification of extreme weather conditions, potentially improving risk communication to prevent additional extreme-weather-related deaths in the United States. Remarkably, this performance is achieved using only a single GPU for training, making the method highly accessible for local agencies seeking to enhance early warning systems and protect public health. This underscores its strong potential to transform long-range forecasting and significantly enhance public health preparedness in a changing climate.

Subject areas: Earth and environmental sciences/Climate sciences; Physical sciences/Engineering; Earth and environmental sciences/Environmental sciences; Physical sciences/Mathematics and computing

Introduction

In the context of climate change, extreme weather events are becoming more frequent and increasingly threaten human health and living conditions [1–2]. Among all weather-related hazards, extreme temperatures associated with heatwaves and cold spells are the leading cause of mortality, contributing to over five million deaths globally each year [3–6]. Early warning systems, especially those extending to the sub-seasonal timescale, are essential for improving preparedness [7]. Numerous previous studies have attempted to predict extreme temperatures at sub-seasonal scales [8–10].
In recognition of the importance of early warnings, the United Nations launched the Early Warnings for All initiative [11], aiming to ensure that every person on Earth is protected from hazardous weather, water, or climate events through life-saving early warning systems. Accurate and timely forecasts enable proactive healthcare planning, effective risk communication, and efficient resource allocation, particularly for vulnerable populations such as the elderly, children, and individuals with chronic illnesses.

Traditional numerical models face significant challenges in sub-seasonal forecasting due to error propagation across time and space, stemming from the inherently chaotic nature of the atmosphere [12]. While AI-driven approaches have shown promise, they are mostly constrained to short-term forecasts with limited skill beyond two weeks [13–16], due to the challenge of effectively balancing focus across the multi-dimensional atmospheric system, even with substantial computational resources. Moreover, many of these models do not focus on key surface-level variables, and their forecasts are typically global in scale with coarse spatial resolution [17], making it difficult to incorporate accurate ground-based observations given the high spatial heterogeneity of surface conditions. To better support public health applications that must run at minimal computational cost for local agencies, there is a growing need for high-resolution, regionally focused weather forecasting models that emphasize surface variables relevant to human thermal stress over extended temporal horizons.

To address these limitations, we extend AI-based forecasting to the high-resolution sub-seasonal scale (denoted DeepMet; see Fig. 1), with a focus on surface variables that are critical for assessing and managing the increasing risks associated with temperature extremes.
First, we leverage multi-year dynamical downscaling using a numerical weather model, incorporating abundant ground-based and upper-atmosphere historical observations through Four-Dimensional Data Assimilation [18]. This approach produces regional forecasts at a 12 km × 12 km resolution, ten times finer than the widely used ERA5 dataset (ECMWF Reanalysis v5) from the European Centre for Medium-Range Weather Forecasts [19], making it significantly more suitable for assessing human exposure. It also enables fine-tuning of the model with high-quality ground observations, leading to improved forecasts of surface variables compared with traditional AI models that rely solely on coarse-resolution global datasets like ERA5.

Second, we developed a deep learning architecture for meteorological forecasting that is both streamlined and computationally efficient, running on a single NVIDIA A100 GPU and reducing hardware demands by up to 60-fold compared with traditional multi-GPU systems. By avoiding unnecessary global-scale predictions of numerous factors unrelated to localized applications, DeepMet concentrates on key variables relevant to temperature extremes. Specifically, it targets daily maximum 2-meter temperature (T2max) and specific humidity (Q2) for heatwaves, and daily minimum 2-meter temperature (T2min) combined with wind speed for cold events. These variables are core components of widely used public health indices such as the Heat Index [20] and Wind Chill Index [21]. This low-cost design allows local agencies to develop improved localized forecasting systems with limited resources.
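To illustrate how minimum temperature and wind speed combine into a cold-stress index, here is a minimal sketch of a wind chill calculation. It assumes the U.S. National Weather Service (2001) wind chill formula, which the text does not name as the formulation used; the function name, the unit conversions, and the choice to return the air temperature outside the formula's validity range are all illustrative assumptions.

```python
def wind_chill_celsius(t2min_c: float, wspd10_ms: float) -> float:
    """Wind chill from air temperature (deg C) and 10 m wind speed (m/s).

    Assumption: NWS (2001) formula, defined for T <= 50 deg F and
    wind speed > 3 mph. Outside that range the air temperature is returned.
    """
    t_f = t2min_c * 9.0 / 5.0 + 32.0   # deg C -> deg F
    v_mph = wspd10_ms * 2.236936       # m/s -> mph
    if t_f > 50.0 or v_mph <= 3.0:
        return t2min_c
    v_pow = v_mph ** 0.16
    wc_f = 35.74 + 0.6215 * t_f - 35.75 * v_pow + 0.4275 * t_f * v_pow
    return (wc_f - 32.0) * 5.0 / 9.0   # deg F -> deg C


# Example: about -17.8 deg C (0 deg F) with a 6.7 m/s (15 mph) wind
print(round(wind_chill_celsius(-17.7778, 6.7056), 1))
```

The example shows why the two variables are forecast together for cold events: at the same air temperature, a moderate wind lowers the felt temperature by roughly ten degrees Celsius.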
Third, building on our previous findings regarding the importance of memory structures for long-term forecasting [22], we implement a ConvLSTM-based model [23] (Figure S1) that uses the past 24 hours of multi-variable inputs to predict daily variations up to 45 days ahead, aligning with the global sub-seasonal to seasonal (S2S) forecasting range used by systems such as ECMWF. This architecture mitigates the effects of atmospheric chaos by leveraging temporal memory and optimizing the overall evolution of weather patterns through a statistical machine learning process. This helps constrain error propagation over time, an essential feature for reliable long-term predictions in S2S systems.

Together, these innovations enable our approach to deliver high-resolution sub-seasonal forecasts with enhanced accuracy in predicting temperature extremes, all while operating at minimal computational cost, as detailed in the following section.

Results

Improved sub-seasonal prediction of crucial ground meteorological variables

DeepMet demonstrates significant improvements in sub-seasonal forecasting of key surface meteorological variables, as illustrated in Fig. 2. Using dynamically downscaled reanalysis fields as ground truth, DeepMet outperforms a typical physics-based S2S forecast system (the ECMWF benchmark) across the critical forecast range of 14–42 days. It not only captures the magnitude of meteorological variables more accurately but also reproduces their temporal evolution and spatial structure more effectively at high resolution. Specifically, compared with the ECMWF benchmark, DeepMet increases the temporal anomaly correlation coefficient (ACC) by 4–11% for T2max and T2min, 20–32% for Q2, and up to 138% for 10-meter wind speed (WSPD10), reflecting stronger agreement with observed temporal variability, independent of mean bias.
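The statement that ACC is independent of mean bias can be made concrete with a small sketch. It assumes the centered form of the anomaly correlation, computed over time at a single grid cell against a climatological reference; the text does not state which ACC variant was used, and the function names and sample values are hypothetical.

```python
import math


def centered_acc(forecast, observed, climatology):
    """Temporal anomaly correlation at one grid cell (centered form, assumed)."""
    f_anom = [f - c for f, c in zip(forecast, climatology)]
    o_anom = [o - c for o, c in zip(observed, climatology)]
    f_mean = sum(f_anom) / len(f_anom)
    o_mean = sum(o_anom) / len(o_anom)
    f_dev = [a - f_mean for a in f_anom]
    o_dev = [a - o_mean for a in o_anom]
    cov = sum(a * b for a, b in zip(f_dev, o_dev))
    norm = math.sqrt(sum(a * a for a in f_dev) * sum(b * b for b in o_dev))
    return cov / norm if norm > 0 else float("nan")


def rmse(forecast, observed):
    """Root mean square error between two equally long series."""
    n = len(forecast)
    return math.sqrt(sum((f - o) ** 2 for f, o in zip(forecast, observed)) / n)


# Hypothetical daily T2max values (deg C); the forecast is the truth plus a 2-degree bias
clim = [20.0, 21.0, 22.0, 23.0]
obs = [22.0, 20.0, 25.0, 21.0]
fcst = [o + 2.0 for o in obs]
print(centered_acc(fcst, obs, clim), rmse(fcst, obs))
```

A forecast with a constant warm bias keeps a perfect ACC while its RMSE equals the bias, which is why the two metrics are reported side by side.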
Notably, these improvements become more pronounced with increasing lead time, underscoring DeepMet’s effectiveness in mitigating error propagation within the inherently chaotic S2S forecasting regime. In terms of spatial accuracy, DeepMet improves the Structural Similarity Index (SSIM) by 80–250% for T2max, T2min, and WSPD10, demonstrating enhanced spatial fidelity in structure, luminance, and contrast, largely attributable to its finer spatial resolution compared with global systems such as ECMWF. Additionally, DeepMet reduces RMSE by approximately 20–60% across all four variables, indicating a substantial decrease in average prediction error and closer alignment with observed data.

We further validated DeepMet’s performance using local ground-based observations from the NOAA National Climatic Data Center (NCDC) network. In addition to substantially reducing RMSE compared with ECMWF forecasts, DeepMet achieved significantly higher Ranked Probability Skill Scores (RPSS), with improvements of up to 0.5 for all four variables, demonstrating improved probabilistic forecasting skill relative to climatology (i.e., the historical average distribution). These results highlight DeepMet’s effectiveness in delivering skillful and reliable S2S forecasts.

Enhanced predictive capability for heatwaves and cold events

With enhanced predictability of key meteorological variables, DeepMet demonstrates superior capability in forecasting heatwaves and cold events, as shown in Fig. 3. Using NCDC observations as ground truth, DeepMet significantly outperforms ECMWF in long-range extreme temperature prediction. Specifically, DeepMet achieves F1 scores that are 40% higher for heatwaves and 90% higher for cold events, indicating a better balance between high recall (capturing true events) and high precision (reducing false alarms).
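The event-detection scores used in this subsection (F1, CSI, POD, FAR) all derive from the same contingency counts of hits, false alarms, and misses. The sketch below uses the standard definitions of these scores; the function name and the sample flags are hypothetical, and undefined ratios are returned as NaN by choice.

```python
def detection_scores(predicted, observed):
    """POD, FAR, CSI and F1 from paired boolean event flags."""
    pairs = list(zip(predicted, observed))
    hits = sum(1 for p, o in pairs if p and o)
    false_alarms = sum(1 for p, o in pairs if p and not o)
    misses = sum(1 for p, o in pairs if not p and o)

    def ratio(num, den):
        # Undefined when there are no relevant cases
        return num / den if den else float("nan")

    return {
        "POD": ratio(hits, hits + misses),                # fraction of observed events detected
        "FAR": ratio(false_alarms, hits + false_alarms),  # fraction of warnings that were false
        "CSI": ratio(hits, hits + false_alarms + misses),
        "F1": ratio(2 * hits, 2 * hits + false_alarms + misses),
    }


# Hypothetical flags for six station-days
scores = detection_scores([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0])
print(scores)
```

Correct negatives do not enter any of the four scores, which makes them suitable for rare events such as heatwaves and cold spells.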
The Critical Success Index (CSI) also shows marked improvement, increasing by 53% for heatwaves and 123% for cold events, confirming DeepMet’s superior ability to detect extreme events accurately while minimizing both misses and false positives. Moreover, the Probability of Detection (POD) increases by 50% for heatwaves and 185% for cold events, demonstrating that DeepMet captures significantly more true extreme events than the ECMWF benchmark. Simultaneously, the False Alarm Ratio (FAR) decreases by 29% and 1.4% for heatwaves and cold events, respectively, resulting in fewer false alerts and more trustworthy forecasts.

We also compared the predictions of the two models for representative single-day heatwaves and cold events, using forecast lead times ranging from 14 to 42 days. The results show that DeepMet significantly outperforms ECMWF, consistently capturing extreme temperature events across the spatial domain, with 30–50% more successful detections.

Potential health benefits of early warning systems

By accurately forecasting extreme heatwaves and cold events, DeepMet can significantly strengthen early warning systems, enabling communities, especially vulnerable populations, to prepare more effectively. We assessed the added value of DeepMet over ECMWF for both types of events throughout 2023. As shown in Fig. 4, substantial improvements in cold event prediction were observed across northern states, which are more frequently affected by cold extremes. In contrast, the larger improvements in heatwave prediction occurred in southern states, where extreme heat events are more common. Compared with ECMWF forecasts, DeepMet enables more accurate identification of extreme weather events, potentially improving early warnings for an additional 600 million people-days during cold events and 3,300 million people-days during heatwaves. This represents a 30% increase in population coverage during cold spells and a 60% increase during heatwaves.
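The people-day figures above can be read as a population-weighted count of correctly warned grid cells, accumulated over days. The sketch below is one plausible way to compute such an added value, assuming a "hit" means the forecast flagged an event that was also observed; the exact accounting used in the study is not specified here, and all names and numbers are hypothetical.

```python
def added_people_days(population, model_hits, baseline_hits):
    """Extra people-days correctly warned by a model relative to a baseline.

    population:    people per grid cell
    model_hits:    per day, per cell booleans (event forecast and observed)
    baseline_hits: same layout for the baseline forecast
    """
    total = 0
    for model_day, baseline_day in zip(model_hits, baseline_hits):
        for people, m_hit, b_hit in zip(population, model_day, baseline_day):
            total += people * (int(m_hit) - int(b_hit))
    return total


# Hypothetical three-cell domain over two days
population = [1000, 500, 200]
model = [[True, True, False], [True, False, True]]
baseline = [[True, False, False], [False, False, True]]
print(added_people_days(population, model, baseline))
```

Cells where the baseline scores a hit and the model does not contribute negatively, so the result is a net gain rather than a gross count.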
Given the annual estimates of 700–1,300 heat-related deaths and 1,200–2,000 cold-related deaths in the U.S. [24–26], enhanced preparedness enabled by DeepMet, such as timely healthcare interventions or emergency services, could help prevent a considerable number of deaths annually. These benefits become even more substantial as extreme heat and cold events intensify under a business-as-usual climate change scenario [26], potentially preventing additional premature deaths and avoiding associated economic losses each year, not to mention the additional risks posed to agriculture, infrastructure, and broader societal systems. These findings highlight the tangible value of more accurate and timely sub-seasonal extreme temperature forecasts.

Discussion

This study demonstrates a successful application of AI in extending the temporal scale of weather forecasts, an area that has traditionally posed significant challenges for numerical models. The key to DeepMet’s success lies in its ability to incorporate time-series memory into the prediction process, enabling the model to constrain error propagation over time. Unlike traditional numerical methods, which rely heavily on step-by-step progression and are prone to cumulative errors, DeepMet leverages information from multiple previous time steps, thereby reducing the risk of runaway inaccuracies. This finding is particularly important for forecasting highly turbulent systems such as the atmosphere, and it suggests that overcoming the limitations of long-range weather prediction may require moving beyond purely mathematical approaches to models that incorporate historical memory.

In the current version of DeepMet, we use the previous 24 hours of data as input. While we initially expected that incorporating longer historical records (e.g., extending to 10 days) might improve model performance, our experiments indicated otherwise.
In fact, extending the input window to 10 days often reduced accuracy (Figure S2), likely because the most recent 24 hours of data contain the most relevant predictive signals for S2S forecasting, while older data contribute diminishing value. Additionally, we did not include more than 24 hours of high-resolution data because of computational constraints (mostly limited RAM), which would significantly reduce the effective size of the training dataset and compromise training efficiency. Nevertheless, future work may explore methods to incorporate longer historical time series into the model, particularly when more computational resources become available.

The success of DeepMet also challenges traditional perspectives in S2S forecasting. Conventional wisdom holds that long-term predictions require global-scale models, such as ECMWF, to capture the influence of large-scale atmospheric circulation across regions. However, our results demonstrate that a regional-scale S2S model can outperform global models even without explicitly representing global influences. This is because error propagation occurs not only temporally but also spatially. Errors originating outside the target domain can amplify and be transported into the region of interest, potentially degrading forecast accuracy. While global models are capable of capturing cross-boundary flows, they may also introduce additional uncertainty from distant regions, which can limit their benefit. Moreover, global models face a trade-off between resolution and spatial coverage, often allocating computational resources to areas that are not directly relevant to the target region. That said, this does not imply that boundary conditions are unimportant. In fact, our experiments indicate that incorporating simplified boundary condition information improves forecast skill during the first two weeks (Figure S3).
This finding suggests that while global context is useful for short-term predictions, it may be less critical or even counterproductive for longer-term S2S forecasting.

Another challenge to traditional thinking lies in the assumption that meteorological variables are highly interdependent and should be predicted simultaneously. Our findings suggest that this approach does not always yield better performance. Due to error propagation, inaccuracies in one variable can adversely affect the prediction of others. In contrast to other AI-based weather models that attempt to forecast multiple variables concurrently, DeepMet employs a single-variable prediction strategy. This allows the model to focus on learning the dynamics of each variable independently, without the need to balance competing priorities during optimization. It also enables the selection of more relevant 3D input features tailored to each specific variable (Table S1). Moreover, this modular approach enables parallel training across multiple GPUs, improving both efficiency and scalability while preserving task-specific accuracy.

We scaled the time series input to a daily resolution for S2S forecasting to reduce error propagation across time steps. Additionally, using hourly data would require significantly more memory, which limits the amount of training data that can be processed and ultimately hinders model performance. In our experiments, increasing the temporal resolution of predictions from daily to 6-hourly or 3-hourly intervals did not yield meaningful improvements (Figure S4). Therefore, daily resolution remains a practical and effective choice, especially for S2S forecasts where long lead times are the primary focus.

While large GPU resources can enhance model performance, our study demonstrates the feasibility of generating fine-scale weather forecasts using limited computational resources. This makes the approach more realistic and accessible.
Importantly, the method is well aligned with localized policy needs for protecting public health, agriculture, and infrastructure. Looking ahead, there is substantial potential to further enhance forecasting skill, both in accuracy and in scope, by expanding applications to other critical variables such as precipitation, wildfires, and floods, ultimately contributing to the protection of more lives. Incorporating additional future-relevant information, such as slowly varying or accurately predictable variables, can further improve forecasting performance. For example, incorporating improved time-series representations of slowly varying geophysical inputs, such as leaf area index, vegetation cover, albedo, and downward shortwave radiation, into DeepMet may lead to noticeable improvements in predictive accuracy (Figure S5). There is no doubt that AI will play an increasingly vital role in extreme weather prediction, especially at the S2S scale, which continues to pose significant challenges for traditional numerical models.

Method

Dynamical downscaling with numerical model

To better represent human exposure to extreme temperatures, we use the Weather Research and Forecasting (WRF) model [27], a mesoscale numerical weather prediction system, to conduct six full-year simulations for 2008, 2012, 2014, 2019, 2021, and 2023. These simulations are dynamically downscaled to a regional scale with a 12 km resolution, which is approximately 10 times finer than the original 1.5° × 1.5° resolution (about 110–160 km) of the global ECMWF S2S dataset.
The WRF model was configured following the setup used in our previous study [28], including the Morrison two-moment microphysics scheme; the Rapid Radiative Transfer Model for Global Climate Models (RRTMG) for both longwave and shortwave radiation; the Yonsei University (YSU) planetary boundary layer (PBL) scheme; the Pleim–Xiu land surface model; the revised MM5 (Jimenez) surface layer scheme; and the Grell–Freitas (GF) cumulus parameterization with the radiative feedback option. The WRF simulations were driven by the North American Mesoscale (NAM) model analyses from the National Centers for Environmental Prediction (NCEP), incorporating four-dimensional data assimilation (FDDA) with both surface and upper-air observations. Observation nudging was applied using NCEP’s Automated Data Processing (ADP) Global Surface and Upper-Air Observational Weather Data.

For comparison, we obtained ECMWF sub-seasonal forecast data spanning six years from the open-access repository. The dataset includes daily control run forecasts of T2, T2max, T2min, U10, V10, and pressure-level specific humidity (q) with a lead time of up to 46 days. The original data, provided at a 1.5° × 1.5° resolution, were re-gridded to the target domain using bilinear interpolation to facilitate comparison.

Ground-based observational data were obtained from the NOAA NCDC ISD-Lite archive, which provides hourly records of T2, Q2, and 10-meter wind speed (WSPD10) from approximately 6,775 sites across the U.S. domain. The downscaled WRF model shows better agreement with the ground-based measurements from NCDC (Figure S6), providing a reliable foundation for training DeepMet and enabling it to deliver substantial improvement over global forecasting models such as ECMWF.
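The bilinear re-gridding step mentioned above can be sketched for a single target point on a regular latitude/longitude grid. This is a minimal illustration, not the authors' processing code: the function names, the ascending-axis assumption, and the sample values are hypothetical, and points outside the source grid would be linearly extrapolated from the nearest cell.

```python
from bisect import bisect_right


def _bracket(axis, x):
    """Index i such that axis[i] <= x <= axis[i + 1], clamped to the grid."""
    i = bisect_right(axis, x) - 1
    return max(0, min(i, len(axis) - 2))


def bilinear(field, lats, lons, lat, lon):
    """Interpolate field[i][j], given at (lats[i], lons[j]), to (lat, lon).

    Assumes both axes are sorted in ascending order.
    """
    i = _bracket(lats, lat)
    j = _bracket(lons, lon)
    ty = (lat - lats[i]) / (lats[i + 1] - lats[i])
    tx = (lon - lons[j]) / (lons[j + 1] - lons[j])
    south = field[i][j] * (1 - tx) + field[i][j + 1] * tx
    north = field[i + 1][j] * (1 - tx) + field[i + 1][j + 1] * tx
    return south * (1 - ty) + north * ty


# One coarse 1.5-degree cell with hypothetical T2 values at its corners
lats = [30.0, 31.5]
lons = [-100.0, -98.5]
t2 = [[10.0, 20.0], [30.0, 40.0]]
print(bilinear(t2, lats, lons, 30.75, -99.25))
```

Because the interpolated value is a weighted average of the four surrounding coarse values, re-gridding makes the fields comparable on the 12 km grid but cannot add fine-scale detail to the coarse forecast.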
DeepMet model structure

Consistent with our previous applications in atmospheric chemistry forecasting, the DeepMet model architecture integrates two ConvLSTM modules, each consisting of three layers with varying channel sizes (256, 128, and 64) and a 3×3 kernel (Figure S1). The first module processes historical records from the past 24 hours, incorporating multiple variables, such as 2D and 3D inputs from model reanalysis and ground-based measurements, to extract key information relevant to forecasting future variations of the target variable. This processed historical information, combined with time-invariant or slowly varying geographical and meteorological features, is then passed to a second ConvLSTM module to generate the forecast. During prediction, the hidden and cell states (h, c) are dynamically updated, enabling the model to retain and leverage long-term historical dependencies. This design is consistent with our prior work in atmospheric chemistry, where predictions at earlier time steps are recursively used as inputs for future steps. The methodology captures the dual role of meteorological factors: their gradual modulation of baseline conditions over time (Role 1), and their direct interaction with other variables (Role 2), both of which are critical in shaping future atmospheric states.

The geographical features used in this study consist of eight time-invariant variables: dominant land use category (DLUSE), terrain elevation (HT), land–water mask (LWMASK), map-scale factor squared (MSFX2), land use fraction (LUFRAC), percentage of urban area (PURB), latitude (LAT), and longitude (LON). These static features are applied consistently throughout the entire prediction period. Four 2D surface characteristics, Leaf Area Index (LAI), vegetation cover (VEG), albedo, and clear-sky shortwave downwelling radiation (SWDNBC), are prescribed from climatological datasets, given their relatively stable annual cycles (Figure S7).
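For readers unfamiliar with the hidden and cell states (h, c) mentioned in the model description, the update rule of a single ConvLSTM layer, as formulated in the original ConvLSTM paper [23], is reproduced below. Here $*$ denotes convolution (a 3×3 kernel in DeepMet), $\circ$ the element-wise product, $X_t$ the input at step $t$, and $H_t$, $C_t$ the hidden and cell states. Whether DeepMet retains the peephole terms $W_{c\cdot} \circ C$ is not stated in the text, so this should be read as the generic formulation rather than the exact layer used.

```latex
\begin{aligned}
i_t &= \sigma\left(W_{xi} * X_t + W_{hi} * H_{t-1} + W_{ci} \circ C_{t-1} + b_i\right) \\
f_t &= \sigma\left(W_{xf} * X_t + W_{hf} * H_{t-1} + W_{cf} \circ C_{t-1} + b_f\right) \\
C_t &= f_t \circ C_{t-1} + i_t \circ \tanh\left(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c\right) \\
o_t &= \sigma\left(W_{xo} * X_t + W_{ho} * H_{t-1} + W_{co} \circ C_t + b_o\right) \\
H_t &= o_t \circ \tanh\left(C_t\right)
\end{aligned}
```

The forget gate $f_t$ controls how much of the previous cell state is carried forward, which is the mechanism behind the temporal memory that the text credits with constraining error growth.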
These features are incorporated as inputs in both the historical and future prediction modules.

In addition, 18 time-dependent 2D meteorological variables are incorporated into the historical module. These include: U- and V-component winds (UV-wind), cloud fraction (CFRAC), planetary boundary layer height (PBL), precipitation (PREP), 2-meter specific humidity (Q2), shortwave radiation at the ground surface (RGRND), 2-meter temperature (T2), 10-meter wind speed (WSPD10), convective velocity scale (WSTAR), sensible heat flux (HFX), latent heat flux (LH), cell-averaged friction velocity (USTAR), surface roughness length (ZRUF), surface pressure (PRSFC), average liquid water content of clouds (WBAR), canopy moisture content (WR), and snow cover (SNOCOV). Furthermore, six 3D meteorological variables are used: air temperature (TA), cloud water mixing ratio (QC), water vapor mixing ratio (QV), three-dimensional cloud fraction (CFRAC_3D), and the U and V components of wind (UW, VW). These variables are resolved across 35 vertical levels, from the surface up to 100 mb, and incorporated into the historical module to enhance the model’s ability to capture vertical atmospheric structure.

DeepMet training and testing

We leverage multiple WRF model simulations for pre-training and incorporate ground-based measurements from the NCDC dataset to enhance model performance at the surface level. The model is initially trained using data from five historical years (2008, 2012, 2014, 2019, and 2021), followed by fine-tuning with NCDC ground-based measurements. This strategy helps mitigate sample imbalance in the observational data by avoiding direct training solely on the ground measurements [29]. Model performance is evaluated using data from 2023, emulating the strategy of leveraging historical data to forecast recent conditions.

Given the large dataset size and limited computational resources, particularly RAM constraints, we adopted a subset training strategy.
Specifically, the model was trained on batches of 20 subsets of the approximately 1,500 total samples, over a total of 4,000 epochs, avoiding the need to load the entire dataset into memory simultaneously. For data augmentation, random cropping was applied to the feature maps, resizing them to 120 × 120 grid cells. The model was trained using the Mean Squared Error loss function. The learning rate was initialized at 0.0001 and linearly decayed to zero over the course of training. The Adam optimizer was employed to improve model convergence and stability [30].

We evaluate the performance of the DeepMet model using several metrics: the Anomaly Correlation Coefficient (ACC), Structural Similarity Index Measure (SSIM), Root Mean Square Error (RMSE), and the Ranked Probability Skill Score (RPSS). Spatially averaged statistics are computed by comparing DeepMet predictions with both ECMWF forecasts and WRF downscaled meteorological reanalysis data. In addition, RPSS is used to evaluate forecast skill against surface-level observations from the NCDC ground-based measurement network.

To evaluate heatwave and cold event prediction performance, we use four key metrics: F1 Score, Critical Success Index (CSI), Probability of Detection (POD), and False Alarm Ratio (FAR). These metrics assess the models’ ability to accurately detect impactful extreme temperature events, which are defined using T2max and Q2 for heatwaves, and T2min and 10-meter wind speed (WSPD10) for cold spells. A heatwave event is defined as T2max > 32 ℃ and Q2 > 0.014, and a cold event is defined as T2min < … and WSPD10 > 3 m/s.

Health impact assessment

To quantify the potential public health benefits of the improved early warning provided by DeepMet, we estimated the number of individuals accurately identified as exposed to heatwaves by DeepMet compared with ECMWF.
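The threshold-based event definitions given in the evaluation paragraph above can be expressed as simple per-cell flags. In this sketch the heatwave thresholds are the ones stated in the text, with Q2 assumed to be in kg/kg since no unit is given. The cold-event temperature threshold is left as a required argument because its value is not legible in the text, and the value used in the example is purely hypothetical.

```python
def is_heatwave(t2max_c: float, q2: float,
                t_thresh_c: float = 32.0, q_thresh: float = 0.014) -> bool:
    """Heatwave flag: hot and humid at the same time.

    Thresholds follow the text; the kg/kg unit for q2 is an assumption.
    """
    return t2max_c > t_thresh_c and q2 > q_thresh


def is_cold_event(t2min_c: float, wspd10_ms: float,
                  t_thresh_c: float, w_thresh_ms: float = 3.0) -> bool:
    """Cold-event flag: cold and windy at the same time.

    t_thresh_c has no default because the source value is not recoverable.
    """
    return t2min_c < t_thresh_c and wspd10_ms > w_thresh_ms


print(is_heatwave(35.0, 0.016))                    # hot and humid
print(is_heatwave(35.0, 0.010))                    # hot but dry
print(is_cold_event(-12.0, 5.0, t_thresh_c=-10.0)) # hypothetical threshold
```

Requiring both conditions at once ties each event type to the pair of variables behind the corresponding health index: temperature with humidity for heat stress, and temperature with wind for cold stress.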
Population data were obtained from the Gridded Population of the World, Version 4 (GPWv4), for the most recent year available (2020) [31], at a spatial resolution of 2.5 arc-minutes, and regridded to match the target domain.

Declarations

Data availability

The WRF simulation data can be obtained from the corresponding authors. The ECMWF data were obtained from https://iridl.ldeo.columbia.edu/SOURCES/.ECMWF/.S2S/.ECMF/. The NCDC data were downloaded from ftp://ftp.ncdc.noaa.gov/pub/data/noaa/isd-lite. The Gridded Population of the World was obtained from https://www.earthdata.nasa.gov/data/projects/gpw.

Code availability

The DeepMet code will be available at Zenodo.

Acknowledgements

This work was supported by U.S. National Science Foundation [NO: 2100582], National Natural Science Foundation of China [NO: 42375131], UT AI Tennessee Initiative Seed Funds, and MSRA collaborative research project. The author would like to acknowledge the support of the Bellagio Center Residency Program, funded by the Rockefeller Foundation. The author also acknowledges Dr. Kristen Foley and Dr. Christian Hogrefe from US EPA, and Dr. Daniel Tong and Dr. Bok H. Baek from GMU, for supporting the meteorological dataset.

Author contributions

J.X.: Conceptualization, Methodology, Model Development, Formal analysis, Investigation, Writing - Original draft preparation, Writing - Reviewing and Editing; S.L.: Conceptualization, Methodology, Model Development, Formal analysis, Writing - Original draft preparation, Writing - Reviewing and Editing; S.Z.: Methodology, Model Development, Writing - Reviewing and Editing; G.S.: Formal analysis; J.D.: Formal analysis; G.A.F.: Formal analysis; T.Y.L.: Conceptualization, Supervision, Writing - Reviewing and Editing; J.S.F.: Conceptualization, Supervision, Writing - Reviewing and Editing.

Competing interests

The authors declare no competing financial interests.

Inclusion & Ethics

The study does not involve human participants or sensitive data.
References

1. Grant, L., Vanderkelen, I., Gudmundsson, L., Fischer, E., Seneviratne, S.I. and Thiery, W., 2025. Global emergence of unprecedented lifetime exposure to climate extremes. Nature, 641(8062), pp.374-379.
2. Patz, J., Campbell-Lendrum, D., Holloway, T. et al. Impact of regional climate change on human health. Nature 438, 310–317 (2005). https://doi.org/10.1038/nature04188
3. Zhao, Q., Guo, Y., Ye, T., Gasparrini, A., Tong, S., Overcenco, A., Urban, A., Schneider, A., Entezari, A., Vicedo-Cabrera, A.M. and Zanobetti, A., 2021. Global, regional, and national burden of mortality associated with non-optimal ambient temperatures from 2000 to 2019: a three-stage modelling study. The Lancet Planetary Health, 5(7), pp.e415-e425.
4. Berko, J., Ingram, D.D., Saha, S. and Parker, J.D., 2014. Deaths Attributed to Heat, Cold, and other Weather Events in the United States, 2006–2010, National Health Statistics Reports. US Department of Health and Human Services.
5. Masselot, P., Mistry, M.N., Rao, S. et al. Estimating future heat-related and cold-related mortality under climate change, demographic and adaptation scenarios in 854 European cities. Nat Med 31, 1294–1302 (2025). https://doi.org/10.1038/s41591-024-03452-2
6. Yang, J., Zhou, M., Ren, Z., Li, M., Wang, B., Liu, D.L., Ou, C.Q., Yin, P., Sun, J., Tong, S. and Wang, H., 2021. Projecting heat-related excess mortality under climate change scenarios in China. Nature Communications, 12(1), p.1039.
7. McGregor, G., 2024. Heatwave Responses: Early Warning Systems. In Heatwaves: Causes, Consequences and Responses (pp. 549-599). Cham: Springer International Publishing.
8. Vitart, F. and Robertson, A.W., 2018. The sub-seasonal to seasonal prediction project (S2S) and the prediction of extreme events. npj Climate and Atmospheric Science, 1(1), p.3.
9. Lin, H., Mo, R. and Vitart, F., 2022. The 2021 western North American heatwave and its subseasonal predictions. Geophysical Research Letters, 49(6), p.e2021GL097036.
10. Xie, J., Hsu, P.C., Hu, Y., Zhang, H. and Ye, M., 2024. Advancing subseasonal surface air temperature and heat wave prediction skill in China by incorporating scale interaction in a deep learning model. Geophysical Research Letters, 51(20), p.e2024GL111076.
11. WMO, Early Warnings for All, The UN Global Early Warning Initiative for the Implementation of Climate Adaptation, Executive Action Plan 2023-2027, 2022.
12. Bauer, P., Thorpe, A., & Brunet, G. (2015). The quiet revolution of numerical weather prediction. Nature, 525(7567), 47-55.
13. Bi, K., Xie, L., Zhang, H., Chen, X., Gu, X., & Tian, Q. (2023). Accurate medium-range global weather forecasting with 3D neural networks. Nature, 619(7970), 533-538.
14. Bodnar, C., Bruinsma, W.P., Lucic, A., Stanley, M., Brandstetter, J., Garvan, P., Riechert, M., Weyn, J., Dong, H., Vaughan, A. and Gupta, J.K., 2024. Aurora: A foundation model of the atmosphere. arXiv preprint arXiv:2405.13063.
15. Kochkov, D., Yuval, J., Langmore, I., Norgaard, P., Smith, J., Mooers, G., Klöwer, M., Lottes, J., Rasp, S., Düben, P. and Hatfield, S., 2024. Neural general circulation models for weather and climate. Nature, 632(8027), pp.1060-1066.
16. Price, I., Sanchez-Gonzalez, A., Alet, F., Andersson, T.R., El-Kadi, A., Masters, D., Ewalds, T., Stott, J., Mohamed, S., Battaglia, P. and Lam, R., 2025. Probabilistic weather forecasting with machine learning. Nature, 637(8044), pp.84-90.
17. Chen, L., Zhong, X., Li, H., Wu, J., Lu, B., Chen, D., ... & Qi, Y. (2024). A machine learning model that outperforms conventional global subseasonal forecast models. Nature Communications, 15(1), 6425.
18. Stauffer, D.R. and Seaman, N.L., 1990. Use of four-dimensional data assimilation in a limited-area mesoscale model. Part I: Experiments with synoptic-scale data. Monthly Weather Review, 118(6), pp.1250-1277.
19. Hersbach, H., Bell, B., Berrisford, P., Hirahara, S., Horányi, A., Muñoz‐Sabater, J., Nicolas, J., Peubey, C., Radu, R., Schepers, D. and Simmons, A., 2020. The ERA5 global reanalysis. Quarterly Journal of the Royal Meteorological Society, 146(730), pp.1999-2049.
20. Anderson, G.B., Bell, M.L. and Peng, R.D., 2013. Methods to calculate the heat index as an exposure metric in environmental health research. Environmental Health Perspectives, 121(10), pp.1111-1119.
21. Osczevski, R.J., 1995. The basis of wind chill. Arctic, pp.372-382.
22. Xing, J., Zheng, S., Li, S., Huang, L., Wang, X., Kelly, J.T., Wang, S., Liu, C., Jang, C., Zhu, Y. and Zhang, J., 2022. Mimicking atmospheric photochemical modeling with a deep neural network. Atmospheric Research, 265, p.105919.
23. Shi, X., Chen, Z., Wang, H., Yeung, D. Y., Wong, W. K., & Woo, W. C. (2015). Convolutional LSTM network: A machine learning approach for precipitation nowcasting. Advances in Neural Information Processing Systems, 28.
24. Howard, J. T., Androne, N., Alcover, K. C., & Santos-Lozada, A. R. (2024). Trends of heat-related deaths in the US, 1999-2023. JAMA, 332(14), 1203-1204.
25. Berko, J. (2014). Deaths attributed to heat, cold, and other weather events in the United States, 2006-2010 (No. 76). US Department of Health and Human Services, Centers for Disease Control and Prevention, National Center for Health Statistics.
26. USGCRP, 2018: Impacts, Risks, and Adaptation in the United States: Fourth National Climate Assessment, Volume II [Reidmiller, D.R., C.W. Avery, D.R. Easterling, K.E. Kunkel, K.L.M. Lewis, T.K. Maycock, and B.C. Stewart (eds.)]. U.S. Global Change Research Program, Washington, DC, USA, 1515 pp. doi: 10.7930/NCA4.2018.
27. Skamarock, W. C., J. B. Klemp, J. Dudhia, D. O. Gill, D. M. Barker, M. G Duda, X.-Y. Huang, W. Wang, and J. G. Powers, A Description of the Advanced Research WRF Version 3. NCAR Tech. Note NCAR/TN-475+STR, 113 pp. 2008.
28. Baek, B. H., Coats, C., Ma, S., Wang, C.-T., Li, Y., Xing, J., Tong, D., Kim, S., and Woo, J.-H.: Dynamic Meteorology-induced Emissions Coupler (MetEmis) development in the Community Multiscale Air Quality (CMAQ): CMAQ-MetEmis, Geosci. Model Dev., 16, 4659–4676, https://doi.org/10.5194/gmd-16-4659-2023, 2023.
29. Li, S., Ding, Y., Xing, J., and Fu, J. S.: Retrieving ground-level PM2.5 concentrations in China (2013–2021) with a numerical-model-informed testbed to mitigate sample-imbalance-induced biases, Earth Syst. Sci. Data, 16, 3781–3793, https://doi.org/10.5194/essd-16-3781-2024, 2024.
30. Kingma, D. P.; Ba, J. Adam: A method for stochastic optimization. arXiv preprint, 2014, arXiv:1412.6980.
31. Gridded Population of the World, Version 4 (GPWv4): Population Count Adjusted to Match 2015 Revision of UN WPP Country Totals, Revision 11. Center for International Earth Science Information Network - CIESIN - Columbia University. (2018). NASA Socioeconomic Data and Applications Center (SEDAC). DOI: https://doi.org/10.7927/H4PN93PB

Additional Declarations

No competing interests reported.

Supplementary Files

DeepMetSI.pdf
Authors and affiliations
Jia Xing, University of Tennessee at Knoxville
Siwei Li (corresponding author), Wuhan University
Shuxin Zheng, Zhongguancun Institute of Artificial Intelligence
Ge Song, Wuhan University
Jiaxin Dong, Wuhan University
Gonzalo Ferrada, University of Tennessee at Knoxville
Tie-Yan Liu, Zhongguancun Institute of Artificial Intelligence
Joshua Fu, University of Tennessee at Knoxville

Figure legends
Figure 1: Framework and Advantages of a New Forecasting Paradigm
Figure 2: Comparison of DeepMet and ECMWF predictions on crucial meteorological variables related to extreme temperature at S2S scale
Figure 3: Comparison of DeepMet and ECMWF predictions on extreme temperature events
Figure 4: Estimated benefits of DeepMet in identifying extreme heatwaves and cold events during 2023

Record created: 2025-08-07
© Research Square 2026 | ISSN 2693-5015 (online)
These include: U- and V-component winds (UV-wind), cloud fraction (CFRAC), planetary boundary layer height (PBL), precipitation (PREP), 2-meter specific humidity (Q2), shortwave radiation at the ground surface (RGRND), 2-meter temperature (T2), 10-meter wind speed (WSPD10), convective velocity scale (WSTAR), sensible heat flux (HFX), latent heat flux (LH), cell-averaged friction velocity (USTAR), surface roughness length (ZRUF), surface pressure (PRSFC), average liquid water content of clouds (WBAR), canopy moisture content (WR), and snow cover (SNOCOV).\u003c/p\u003e\u003cp\u003eFurthermore, six 3D meteorological variables are used, including air temperature (TA), cloud water mixing ratio (QC), water vapor mixing ratio (QV), three-dimensional cloud fraction (CFRAC_3D), and the U and V components of wind (UW, VW). These variables are resolved across 35 vertical levels, from the surface up to 100 mb, and incorporated into the historical module to enhance the model\u0026rsquo;s ability to capture vertical atmospheric structure.\u003c/p\u003e\n\u003ch3\u003eDeepMet training and testing\u003c/h3\u003e\n\u003cp\u003eWe leverage multiple WRF model simulations for pre-training and incorporate ground-based measurements from the NCDC dataset to enhance model performance at the surface level. The model is initially trained using data from five historical years (2008, 2012, 2014, 2019, and 2021), followed by fine-tuning with NCDC ground-based measurements. This strategy helps mitigate sample imbalance in the observational data by avoiding direct training solely on the ground measurements\u003csup\u003e\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e\u003c/sup\u003e. 
Model performance is evaluated using data from 2023, emulating the strategy of leveraging historical data to forecast recent conditions.\u003c/p\u003e\u003cp\u003eGiven the large dataset size and limited computational resources, particularly RAM constraints, we adopted a subset training strategy. Specifically, the model was trained on batches of 20 subsets of the approximately 1,500 samples, over a total of 4,000 epochs, avoiding the need to load the entire dataset into memory at once. For data augmentation, random cropping was applied to the feature maps, reducing them to 120 \u0026times; 120 grid cells. The model was trained using the Mean Squared Error loss function. The learning rate was initialized at 0.0001 and linearly decayed to zero over the course of training. The Adam optimizer was employed to improve model convergence and stability \u003csup\u003e\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\u003cp\u003eWe evaluate the performance of the DeepMet model using several metrics: the Anomaly Correlation Coefficient (ACC), Structural Similarity Index Measure (SSIM), Root Mean Square Error (RMSE), and the Ranked Probability Skill Score (RPSS). Spatially averaged statistics are computed by comparing DeepMet predictions with both ECMWF forecasts and WRF downscaled meteorological reanalysis data. In addition, RPSS is used to evaluate forecast skill against surface-level observations from the NCDC ground-based measurement network.\u003c/p\u003e\u003cp\u003eTo evaluate heatwave and cold event prediction performance, we use four key metrics: F1 Score, Critical Success Index (CSI), Probability of Detection (POD), and False Alarm Rate (FAR). These metrics assess the models\u0026rsquo; ability to accurately detect impactful extreme temperature events, which are defined using T2max and Q2 for heatwaves, and T2min and 10-meter wind speed (WSPD10) for cold spells. 
A heatwave event is defined as T2max\u0026thinsp;\u0026gt;\u0026thinsp;32 ℃ and Q2\u0026thinsp;\u0026gt;\u0026thinsp;0.014, and a cold event is defined as T2min\u0026thinsp;\u0026lt;\u0026thinsp;0 ℃ with WSPD10\u0026thinsp;\u0026gt;\u0026thinsp;3 m/s.\u003c/p\u003e\u003cdiv id=\"Sec11\" class=\"Section2\"\u003e\u003ch2\u003eHealth impact assessment\u003c/h2\u003e\u003cp\u003eTo quantify the potential public health benefits of the improved early warning provided by DeepMet, we estimated the number of individuals accurately identified as exposed to heatwaves by DeepMet compared to ECMWF. Population data were obtained from the Gridded Population of the World, Version 4 (GPWv4), for the most recent year available (2020) \u003csup\u003e\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e\u003c/sup\u003e, at a spatial resolution of 2.5 arc-minutes, and regridded to match the target domain.\u003c/p\u003e\u003c/div\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eData availability\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe WRF simulation data can be obtained from the corresponding authors; the ECMWF data were obtained from https://iridl.ldeo.columbia.edu/SOURCES/.ECMWF/.S2S/.ECMF/; the NCDC data were downloaded from ftp://ftp.ncdc.noaa.gov/pub/data/noaa/isd-lite; the Gridded Population of the World data were obtained from https://www.earthdata.nasa.gov/data/projects/gpw.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCode availability\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe DeepMet code will be available at Zenodo.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAcknowledgements\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis work was supported by the U.S. National Science Foundation [NO: 2100582], the National Natural Science Foundation of China [NO: 42375131], the UT AI Tennessee Initiative Seed Funds, and the MSRA collaborative research project. 
The author would like to acknowledge the support of the Bellagio Center Residency Program, funded by the Rockefeller Foundation. The author also acknowledges Dr. Kristen Foley and Dr. Christian Hogrefe from US EPA, Dr. Daniel Tong and Dr. Bok H. Baek from GMU for supporting the meteorological dataset.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthor contributions.\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eJ.X.: Conceptualization, Methodology, Model Development, Formal analysis, Investigation, Writing- Original draft preparation, Writing- Reviewing and Editing; S.L.: Conceptualization, Methodology, Model Development, Formal analysis, Writing- Original draft preparation, Writing- Reviewing and Editing; S.Z.: Methodology, Model Development, Writing- Reviewing and Editing; G.S.: Formal analysis; J.D.: Formal analysis; G.A.F: Formal analysis; T.Y.L.: Conceptualization, Supervision, Writing- Reviewing and Editing; J.S.F.: Conceptualization, Supervision, Writing- Reviewing and Editing.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting interests\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare no competing financial interests.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eInclusion \u0026amp; Ethics\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe study does not involve human participants or sensitive data.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eGrant, L., Vanderkelen, I., Gudmundsson, L., Fischer, E., Seneviratne, S.I. and Thiery, W., 2025. Global emergence of unprecedented lifetime exposure to climate extremes. Nature, 641(8062), pp.374-379.\u003c/li\u003e\n\u003cli\u003ePatz, J., Campbell-Lendrum, D., Holloway, T. et al. Impact of regional climate change on human health. Nature 438, 310\u0026ndash;317 (2005). 
https://doi.org/10.1038/nature04188\u003c/li\u003e\n\u003cli\u003eZhao, Q., Guo, Y., Ye, T., Gasparrini, A., Tong, S., Overcenco, A., Urban, A., Schneider, A., Entezari, A., Vicedo-Cabrera, A.M. and Zanobetti, A., 2021. Global, regional, and national burden of mortality associated with non-optimal ambient temperatures from 2000 to 2019: a three-stage modelling study. The Lancet Planetary Health, 5(7), pp.e415-e425.\u003c/li\u003e\n\u003cli\u003eBerko, J., Ingram, D.D., Saha, S. and Parker, J.D., 2014. Deaths Attributed to Heat, Cold, and other Weather Events in the United States, 2006\u0026ndash;2010, National Health Statistics Reports. US Department of Health and Human Services.\u003c/li\u003e\n\u003cli\u003eMasselot, P., Mistry, M.N., Rao, S. et al. Estimating future heat-related and cold-related mortality under climate change, demographic and adaptation scenarios in 854 European cities. Nat Med 31, 1294\u0026ndash;1302 (2025). https://doi.org/10.1038/s41591-024-03452-2\u003c/li\u003e\n\u003cli\u003eYang, J., Zhou, M., Ren, Z., Li, M., Wang, B., Liu, D.L., Ou, C.Q., Yin, P., Sun, J., Tong, S. and Wang, H., 2021. Projecting heat-related excess mortality under climate change scenarios in China. Nature communications, 12(1), p.1039.\u003c/li\u003e\n\u003cli\u003eMcGregor, G., 2024. Heatwave Responses: Early Warning Systems. In Heatwaves: Causes, Consequences and Responses (pp. 549-599). Cham: Springer International Publishing.\u003c/li\u003e\n\u003cli\u003eVitart, F. and Robertson, A.W., 2018. The sub-seasonal to seasonal prediction project (S2S) and the prediction of extreme events. npj climate and atmospheric science, 1(1), p.3.\u003c/li\u003e\n\u003cli\u003eLin, H., Mo, R. and Vitart, F., 2022. The 2021 western North American heatwave and its subseasonal predictions. Geophysical Research Letters, 49(6), p.e2021GL097036.\u003c/li\u003e\n\u003cli\u003eXie, J., Hsu, P.C., Hu, Y., Zhang, H. and Ye, M., 2024. 
Advancing subseasonal surface air temperature and heat wave prediction skill in China by incorporating scale interaction in a deep learning model. Geophysical Research Letters, 51(20), p.e2024GL111076.\u003c/li\u003e\n\u003cli\u003eWMO, EARLY WARNINGS FOR ALL, The UN Global Early Warning Initiative for the Implementation of Climate Adaptation, Executive Action Plan 2023-2027, 2022\u003c/li\u003e\n\u003cli\u003eBauer, P., Thorpe, A., \u0026amp; Brunet, G. (2015). The quiet revolution of numerical weather prediction. Nature, 525(7567), 47-55.\u003c/li\u003e\n\u003cli\u003eBi, K., Xie, L., Zhang, H., Chen, X., Gu, X., \u0026amp; Tian, Q. (2023). Accurate medium-range global weather forecasting with 3D neural networks. Nature, 619(7970), 533-538.\u003c/li\u003e\n\u003cli\u003eBodnar, C., Bruinsma, W.P., Lucic, A., Stanley, M., Brandstetter, J., Garvan, P., Riechert, M., Weyn, J., Dong, H., Vaughan, A. and Gupta, J.K., 2024. Aurora: A foundation model of the atmosphere. arXiv preprint arXiv:2405.13063.\u003c/li\u003e\n\u003cli\u003eKochkov, D., Yuval, J., Langmore, I., Norgaard, P., Smith, J., Mooers, G., Kl\u0026ouml;wer, M., Lottes, J., Rasp, S., D\u0026uuml;ben, P. and Hatfield, S., 2024. Neural general circulation models for weather and climate. Nature, 632(8027), pp.1060-1066.\u003c/li\u003e\n\u003cli\u003ePrice, I., Sanchez-Gonzalez, A., Alet, F., Andersson, T.R., El-Kadi, A., Masters, D., Ewalds, T., Stott, J., Mohamed, S., Battaglia, P. and Lam, R., 2025. Probabilistic weather forecasting with machine learning. Nature, 637(8044), pp.84-90.\u003c/li\u003e\n\u003cli\u003eChen, L., Zhong, X., Li, H., Wu, J., Lu, B., Chen, D., ... \u0026amp; Qi, Y. (2024). A machine learning model that outperforms conventional global subseasonal forecast models. Nature Communications, 15(1), 6425.\u003c/li\u003e\n\u003cli\u003eStauffer, D.R. and Seaman, N.L., 1990. Use of four-dimensional data assimilation in a limited-area mesoscale model. 
Part I: Experiments with synoptic-scale data. Monthly Weather Review, 118(6), pp.1250-1277. \u003c/li\u003e\n\u003cli\u003eHersbach, H., Bell, B., Berrisford, P., Hirahara, S., Hor\u0026aacute;nyi, A., Mu\u0026ntilde;oz‐Sabater, J., Nicolas, J., Peubey, C., Radu, R., Schepers, D. and Simmons, A., 2020. The ERA5 global reanalysis. Quarterly journal of the royal meteorological society, 146(730), pp.1999-2049.\u003c/li\u003e\n\u003cli\u003eAnderson, G.B., Bell, M.L. and Peng, R.D., 2013. Methods to calculate the heat index as an exposure metric in environmental health research. Environmental health perspectives, 121(10), pp.1111-1119.\u003c/li\u003e\n\u003cli\u003eOsczevski, R.J., 1995. The basis of wind chill. Arctic, pp.372-382.\u003c/li\u003e\n\u003cli\u003eXing, J., Zheng, S., Li, S., Huang, L., Wang, X., Kelly, J.T., Wang, S., Liu, C., Jang, C., Zhu, Y. and Zhang, J., 2022. Mimicking atmospheric photochemical modeling with a deep neural network. Atmospheric research, 265, p.105919.\u003c/li\u003e\n\u003cli\u003eShi, X., Chen, Z., Wang, H., Yeung, D. Y., Wong, W. K., \u0026amp; Woo, W. C. (2015). Convolutional LSTM network: A machine learning approach for precipitation nowcasting. Advances in neural information processing systems, 28.\u003c/li\u003e\n\u003cli\u003eHoward, J. T., Androne, N., Alcover, K. C., \u0026amp; Santos-Lozada, A. R. (2024). Trends of heat-related deaths in the US, 1999-2023. JAMA, 332(14), 1203-1204.\u003c/li\u003e\n\u003cli\u003eBerko, J. (2014). Deaths attributed to heat, cold, and other weather events in the United States, 2006-2010 (No. 76). US Department of Health and Human Services, Centers for Disease Control and Prevention, National Center for Health Statistics.\u003c/li\u003e\n\u003cli\u003eUSGCRP, 2018: Impacts, Risks, and Adaptation in the United States: Fourth National Climate Assessment, Volume II [Reidmiller, D.R., C.W. Avery, D.R. Easterling, K.E. Kunkel, K.L.M. Lewis, T.K. Maycock, and B.C. Stewart (eds.)]. U.S. 
Global Change Research Program, Washington, DC, USA, 1515 pp. doi: 10.7930/NCA4.2018.\u003c/li\u003e\n\u003cli\u003eSkamarock, W. C., J. B. Klemp, J. Dudhia, D. O. Gill, D. M. Barker, M. G Duda, X.-Y. Huang, W. Wang, and J. G. Powers, A Description of the Advanced Research WRF Version 3. NCAR Tech. Note NCAR/TN-475+STR, 113 pp. 2008.\u003c/li\u003e\n\u003cli\u003eBaek, B. H., Coats, C., Ma, S., Wang, C.-T., Li, Y., Xing, J., Tong, D., Kim, S., and Woo, J.-H.: Dynamic Meteorology-induced Emissions Coupler (MetEmis) development in the Community Multiscale Air Quality (CMAQ): CMAQ-MetEmis, Geosci. Model Dev., 16, 4659\u0026ndash;4676, https://doi.org/10.5194/gmd-16-4659-2023, 2023.\u003c/li\u003e\n\u003cli\u003eLi, S., Ding, Y., Xing, J., and Fu, J. S.: Retrieving ground-level PM2.5 concentrations in China (2013\u0026ndash;2021) with a numerical-model-informed testbed to mitigate sample-imbalance-induced biases, Earth Syst. Sci. Data, 16, 3781\u0026ndash;3793, https://doi.org/10.5194/essd-16-3781-2024, 2024.\u003c/li\u003e\n\u003cli\u003eKingma, D. P.; Ba, J. Adam: A method for stochastic optimization. arXiv preprint, 2014, arXiv:1412.6980\u003c/li\u003e\n\u003cli\u003eGridded Population of the World, Version 4 (GPWv4): Population Count Adjusted to Match 2015 Revision of UN WPP Country Totals, Revision 11. Center for International Earth Science Information Network - CIESIN - Columbia University. (2018). NASA Socioeconomic Data and Applications Center (SEDAC). 
DOI: https://doi.org/10.7927/H4PN93PB\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"","lastPublishedDoi":"10.21203/rs.3.rs-7314380/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-7314380/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eSub-seasonal weather prediction remains a significant scientific challenge due to the chaotic nature of the atmosphere, with current numerical and AI-driven models exhibiting limited skill, particularly at the fine spatial scales for human exposure, agriculture, and infrastructure. Here, we introduce DeepMet, a high-resolution, AI-driven sub-seasonal forecasting system designed to improve the prediction of temperature extremes and their associated health risks, demonstrated successfully over the continental United States. Specifically, DeepMet substantially outperforms the benchmark of European Centre for Medium-Range Weather Forecasts, reducing the root mean square error by 20\u0026ndash;60% for key surface variables, including daily maximum and minimum 2-meter temperature, specific humidity, and 10-meter wind speed. The model also improves the detection of extreme heat and cold events by over 40% across all evaluation metrics. By enhancing early warning capabilities, DeepMet enables more accurate identification of extreme weather conditions, potentially improving risk communication to prevent additional extreme-weather related deaths in the United States. 
Remarkably, such performance is achieved using only a single GPU for training, making the method highly accessible for local agencies to enhance early warning systems and protect public health. This underscores its strong potential to transform long-range forecasting and significantly enhance public health preparedness in a changing climate.\u003c/p\u003e","manuscriptTitle":"AI-Enhanced Subseasonal Forecasting of Extreme Temperature Risks","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-08-21 09:40:46","doi":"10.21203/rs.3.rs-7314380/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"99b0969a-afe6-46f3-b0c6-501ec4c764b4","owner":[],"postedDate":"August 21st, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[{"id":53310470,"name":"Earth and environmental sciences/Climate sciences"},{"id":53310471,"name":"Physical sciences/Engineering"},{"id":53310472,"name":"Earth and environmental sciences/Environmental sciences"},{"id":53310473,"name":"Physical sciences/Mathematics and computing"}],"tags":[],"updatedAt":"2025-09-06T15:38:37+00:00","versionOfRecord":[],"versionCreatedAt":"2025-08-21 09:40:46","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-7314380","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-7314380","identity":"rs-7314380","version":["v1"]},"buildId":"XKTyCvWXoU3ODBz1xrDgd","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
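Illustrative sketch (not the authors' code). The Method section defines a heatwave event as T2max > 32 ℃ with Q2 > 0.014 and a cold event as T2min < 0 ℃ with WSPD10 > 3 m/s, and scores event detection with POD, FAR, CSI, and the F1 Score. The Python sketch below shows one way those definitions could be applied to gridded fields and scored against a reference. Several points are assumptions rather than details from the paper: the function names are hypothetical, Q2 is taken to be in kg/kg (the text gives no unit), and FAR is computed in the false-alarm-ratio form, false alarms / (hits + false alarms), which is a common convention in forecast verification but is not confirmed by the text.

```python
import numpy as np

# Thresholds as quoted in the Method section. The Q2 unit (kg/kg) is an
# assumption of this sketch; the text gives only the number 0.014.
HEAT_T2MAX_C = 32.0
HEAT_Q2 = 0.014
COLD_T2MIN_C = 0.0
COLD_WSPD10_MS = 3.0


def heatwave_mask(t2max_c, q2):
    """Boolean mask: True where T2max > 32 C and Q2 > 0.014."""
    return (np.asarray(t2max_c) > HEAT_T2MAX_C) & (np.asarray(q2) > HEAT_Q2)


def cold_event_mask(t2min_c, wspd10):
    """Boolean mask: True where T2min < 0 C and WSPD10 > 3 m/s."""
    return (np.asarray(t2min_c) < COLD_T2MIN_C) & (np.asarray(wspd10) > COLD_WSPD10_MS)


def detection_scores(pred_mask, ref_mask):
    """Contingency-table scores for predicted vs. reference event masks.

    FAR is the false alarm ratio (an assumption, see the lead-in).
    Scores with a zero denominator are returned as NaN.
    """
    pred = np.asarray(pred_mask, dtype=bool)
    ref = np.asarray(ref_mask, dtype=bool)
    hits = int(np.sum(pred & ref))
    misses = int(np.sum(~pred & ref))
    false_alarms = int(np.sum(pred & ~ref))

    def ratio(num, den):
        return num / den if den > 0 else float("nan")

    return {
        "POD": ratio(hits, hits + misses),
        "FAR": ratio(false_alarms, hits + false_alarms),
        "CSI": ratio(hits, hits + misses + false_alarms),
        "F1": ratio(2 * hits, 2 * hits + misses + false_alarms),
    }


if __name__ == "__main__":
    # Tiny made-up example: four grid cells, reference vs. forecast.
    ref = heatwave_mask([35.0, 33.0, 30.0, 34.0], [0.015, 0.016, 0.020, 0.010])
    pred = heatwave_mask([34.0, 31.0, 33.0, 36.0], [0.015, 0.015, 0.015, 0.015])
    print(detection_scores(pred, ref))
```

The masks work on arrays of any shape, so the same functions could be applied to a full (time, y, x) forecast cube before the counts are summed; whether the paper aggregates over space, time, or both is not stated in this section.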