Research on Fine-Scale Risk Zoning of Hail Disasters in Yunnan Flue-Cured Tobacco Based on Machine Learning | Research Square

Research Article

Yingmo Zhu, Yonggang Liang, Xueqiong Hu, Lizhang Fan, Pengwu Yang

This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-8599617/v1 This work is licensed under a CC BY 4.0 License. Status: Posted (Version 1).

Abstract

This study explores the thermal, dynamic and environmental conditions for hail formation, and proposes a machine learning method based on random forest models. The methodology incorporates strong convective ingredients such as CAPE, LI, 0°C level height, and -20°C level height, along with disaster-prone environmental factors such as altitude, slope, and aspect, to objectively analyze hail days.
After training set construction, factor selection, regional modeling, and model training, a hail day discrimination model for Yunnan was developed. By integrating DEM elevation data and ERA5 reanalysis-derived convective ingredients, a distribution map of hail days during the climate baseline year over the 1 km × 1 km grid of Yunnan flue-cured tobacco fields was generated. After verification and correction, the natural breakpoint method was used to produce a refined hail hazard zoning for Yunnan flue-cured tobacco. Hail hazard tends to be higher in the north than in the south: northern high-altitude areas are predominantly at high or moderately high hazard, while the low-altitude central and southern mountainous regions display moderate to low hazard. In addition, climatic factors (average temperature in July, sunshine duration from July to August, and rainfall from April to September) were combined with geographical and topographical elements to produce a fine-scale climatic suitability zoning for Yunnan flue-cured tobacco. This suitability zoning serves as a combined proxy for exposure and vulnerability. Combining it with the hail hazard zoning yielded the final fine-scale risk zoning of hail disasters for Yunnan flue-cured tobacco. The results indicate an overall east-high, west-low risk pattern: areas east of the Ailao Mountains show higher risk, while those to the west show moderate to lower risk. Notably, the high-cold zones in the north-central region and the rainy margins of the south and west show no risk. This study aims to provide scientific guidance for rational layout and hail prevention work during the field period of Yunnan flue-cured tobacco.
Introduction

Hail disasters, severe weather phenomena triggered by intense convective systems [1], pose significant threats to agriculture, causing yield reductions, quality degradation, and increased pest risks [2]. Agricultural hail risk assessments and zoning studies are therefore crucial for minimizing losses and promoting sustainable agricultural development. McMaster [3] found frequent hail occurrences in the plateau regions of New South Wales, Australia, through agricultural hail risk assessments. Similarly, Istrate et al. [4] conducted a hail risk assessment on crops in Moldova, Romania, and showed that the high-risk hail areas in the region largely overlapped with the areas where cash crops are concentrated. Wang et al. [5] observed a generally low hail disaster risk in China’s cotton regions, although a gradual increasing trend was noted from 1950 to 2009. Located on a low-latitude plateau, Yunnan is affected by both the South Asian and East Asian monsoons, which results in frequent convective activity and makes it one of the areas most severely affected by hail in China [6-10]. Yunnan is an important flue-cured tobacco base in China, and its tobacco is world-renowned for its superior quality, rich flavor, and aroma [11,12]. Its planting area and output rank first in the country [13]. The value of flue-cured tobacco, a cash crop, is determined directly by leaf quality. Hail can scar, tear, and even break tobacco leaves, severely reducing both quality and yield. Research on the hail risk of flue-cured tobacco in Yunnan is therefore of great practical significance.
Current hail risk research mainly focuses on several aspects, including hail disaster hazard, disaster-prone environmental sensitivity, exposure of disaster-bearing bodies, vulnerability of disaster-bearing bodies, and disaster prevention and mitigation capabilities, and then establishes corresponding risk assessment models [14-18] . Key hazard risks are typically described by hail frequency and intensity. The greater the frequency and intensity of hail, the higher its hazard. However, actual hail intensity data is often scarce, prompting this study to use hail days as a risk indicator. According to statistics, the annual average hail days across Yunnan’s 125 national meteorological stations were less than one day from 1978 to 2020. In the same period, the direct economic losses caused by hail reached nearly 700 million yuan, and the affected agricultural areas exceeded 100,000 hectares. This indicates that relying solely on the number of hail days observed by meteorological stations cannot fully reflect the spatial distribution of the number of hail days in Yunnan. Therefore, a more refined spatial distribution study is needed. In geoscience research, interpolation methods such as inverse distance weighting and kriging are often used for fine spatial distribution analysis. However, due to the high nonlinearity and suddenness of hail events, these traditional interpolation methods often prove suboptimal [19-21] . With the rapid advancement of artificial intelligence technology, some new machine learning methods have been introduced to provide better solutions for dealing with nonlinear problems. Consequently, more researchers have begun to apply machine learning to the study of hail weather. Pulukool et al. [22] developed two deep learning models (autoencoder and convolutional neural network) and a machine learning model (random forest) for hail prediction. 
The random forest model demonstrated the best predictive performance using only four factors: convective available potential energy, convective inhibition, 1-3 km wind shear, and warm cloud depth. Pullman et al. [23] applied deep machine learning to identify hail weather and found that deep learning based on multi-source data can achieve high recognition accuracy. Yao et al. [24] utilized the random forest algorithm to establish a 0-6 hour hail forecast model for Shandong Peninsula, with excellent test results. Czernecki et al. [25] created a machine learning model for predicting large hail, and found that combining thermodynamic and kinematic parameters from numerical weather prediction models with real-time remote sensing data significantly enhances the forecasting ability of large hail. Terrible et al. [26] constructed a hail prediction model based on genetic algorithms, relying on ERA5 large-scale meteorological variables and convection indices to describe Italy’s seasonal and long-term hail event changes. Through the classification and verification of hail probability in Friuli-Venezia Giulia, it is shown that the hail model can effectively estimate the probability of hail in specific areas of Italy. The “ingredient-based” forecasting is a forecasting method based on model output, which is particularly suitable for forecasting heavy rain and severe convective weather [27-30] . This method adopts the basic component point of view and evaluates the possibility and intensity of heavy rain or severe convective events by analyzing and predicting the basic physical quantities or processes (“ingredients”) that have a direct impact on the development and intensity of heavy rain or severe convective events. Raupach et al. [31] studied Australia’s hail disaster changes using convective instability energy, convective inhibition energy, and other “ingredients”, combined with ERA5 reanalysis data and severe storm archives. 
They found that under the background of climate warming, the frequency of severe thunderstorm environments in northern and eastern Australia may increase significantly. Kahraman et al. [32] proposed a novel hail identification method based on the “ingredient-based” method, which can identify severe hail in climate models that allow convection. Verification shows that the results of this model are highly consistent with the existing hail climatology constructed based on observations, including fine-scale spatial variations. Based on the above analysis, this study proposes an objective hail discrimination method that combines machine learning, strong convective ingredients, and disaster-prone environmental factors. This method yielded a fine-scale spatial distribution of hail days in Yunnan during the flue-cured tobacco field period (May-September) of the climate baseline year (1991-2020). Then, by applying the natural breakpoint method, a refined risk zoning of flue-cured tobacco disasters affected by hail was carried out. These results were then integrated with tobacco suitability zoning, ultimately achieving fine-scale risk zoning for Yunnan flue-cured tobacco hail disasters. This research aims to provide theoretical basis and scientific guidance for the rational layout of Yunnan flue-cured tobacco, in order to reduce or avoid economic losses caused by hail disasters. 1 Data and Methods 1.1 Data The data used in this study includes: ① Daily hail records from 125 national meteorological stations in Yunnan for May to September from 1991 to 2020; ② Monthly average temperature, sunshine duration, and rainfall data from these 125 meteorological stations over the same period; ③ Daily reanalysis grid data at 14:00 from ERA5 for 100 hPa to 950 hPa pressure levels (at 25 hPa intervals) with a resolution of 0.25° × 0.25° for the Yunnan region over May to September from 1991 to 2020; ④ 1 km × 1 km DEM digital elevation data for the Yunnan region. 
The daily hail records from the National Meteorological Station are used to construct the model training set. ERA5 reanalysis data is employed to calculate the strong convective ingredients at 14:00 for both the 125 meteorological stations and the entire province on a 1 km × 1 km grid for May to September from 1991 to 2020. Due to the coarse spatial resolution of ERA5 grids, bilinear interpolation was utilized for spatial downscaling. The DEM elevation data assists in calculating disaster-prone environmental factors such as slope and aspect for both the meteorological stations and the whole province. Hail records and calculated ingredients from ERA5, as well as DEM-derived environmental factors, are employed in machine learning modeling. Additionally, monthly average temperature, sunshine duration, and rainfall data during the baseline climate year are used to estimate the fine-scale distribution of July average temperature, sunshine hours from July to August, and rainfall from April to September, thus facilitating the fine distribution mapping of flue-cured tobacco suitability in Yunnan. 1.2 Methods 1.2.1 Preliminary Selection of Strong Convective “Ingredients” Hail often occurs in the afternoon, driven by thermal and dynamic conditions necessary for its formation. This study initially selects convective available potential energy (CAPE), lifting index (LI), 0°C level height, and − 20°C level height at 14:00 as the primary strong convective physical “ingredients” [ 33 – 38 ] . CAPE is a crucial parameter for measuring atmospheric instability. Sufficient instability energy is imperative for hail genesis, with higher CAPE values corresponding to greater instability. LI reflects accumulated instability of air masses during the ascent process. The larger the LI value, the higher the possibility of explosive convection. 
The height of the 0℃ layer determines the upper limit of the development of convective clouds and provides favorable conditions for the generation and growth of hail. When the height of the 0°C layer is suitable, it is conducive to the development of convective clouds to higher altitudes, thereby providing sufficient water vapor environment for the formation of hail. At the same time, the moderate 0°C layer height can ensure that hail particles maintain an appropriate size during falling and are not melted by the warm layer. In addition, the − 20°C layer height is the key temperature for large water droplets to freeze naturally. If the thickness between the 0°C layer and the − 20°C layer is appropriate and the supercooled water content is abundant, the growth rate of hail particles will be significantly accelerated. 1.2.2 Preliminary Selection of Disaster-Prone Environmental Factors Altitude, slope, and aspect significantly influence hail formation [ 39 – 44 ] . High-altitude areas, characterized by lower atmospheric pressure, promote air ascent, aiding hail formation. Slope affects hail mainly through the lifting effect of terrain on airflow. When airflow passes through mountains, slope will enhance the upward movement of airflow, thereby promoting the formation and growth of hailstones. Slope aspect affects the distribution and intensity of solar radiation, thereby changing surface temperature and humidity, which in turn affects hail formation and development. 1.2.3 Machine Learning Model This study adopted random forest as a machine learning model, which is mainly used to deal with classification or regression tasks [ 45 – 47 ] . The core of random forest is decision tree. By integrating the results of multiple decision trees, the accuracy and stability of the model are enhanced. 
In this study, the 14:00 strong convective ingredients and disaster-prone environmental factors at the 125 national meteorological stations serve as independent variables, with daily hail records as the dependent variable. A random forest is trained on these data to establish a hail day discrimination model. The model is then applied to each 1 km × 1 km grid cell for the tobacco field period of the climate baseline year in Yunnan, yielding the spatial distribution of hail days and the division of flue-cured tobacco hail hazard zones.

1.2.4 Climate Suitability Zoning for Flue-Cured Tobacco

Yunnan is located on the Yunnan-Guizhou Plateau, and its average summer temperature is significantly lower than that of provinces in central and eastern China. Low temperatures, prolonged rainy periods, and insufficient sunlight during the mid-to-late field period pose challenges for tobacco production. Based on studies by Huang Zhongyan and Hu Xueqiong et al. [48–50], this research selects July average temperature, sunshine hours from July to August, and rainfall from April to September during the climate baseline year as the factors for evaluating and zoning climate suitability for tobacco. Table 1 lists the zoning indicators.

Table 1 Climate Suitability Zoning Indicators for Yunnan Flue-Cured Tobacco

Indicator                        Most suitable   Suitable    Moderately suitable   Unsuitable
July Avg Temperature (°C)        20.0–22.0       19.0–23.5   18.5–26.5             <18.5 or >26.5
July-August Sunshine Hours (h)   ≥250            ≥220        ≥180                  <180
April-September Rainfall (mm)    550–1250        450–1250    450–1400              <450 or >1400

1.2.5 Risk Assessment Zoning for Tobacco Hail Disasters

The climatic suitability of flue-cured tobacco not only reflects how well the climate supports the crop’s survival and growth, but also captures its exposure and vulnerability.
Higher suitability often indicates greater planting feasibility, yet it also implies greater potential loss if hail strikes. Suitability zoning can therefore be translated into a combined exposure and vulnerability zoning: most suitable, suitable, moderately suitable, and unsuitable correspond to high, medium, low, and no exposure and vulnerability, with values of 0.8, 0.6, 0.4, and 0, respectively. Based on natural disaster risk assessment methodologies [51–53], a risk assessment model for Yunnan flue-cured tobacco hail disasters was constructed that accounts for the combined effects of hail hazard, exposure, and vulnerability:

$$R = D^{0.8} \times E^{0.2} \qquad (1)$$

where R represents the flue-cured tobacco hail disaster risk index, D denotes the hail hazard zoning value, and E stands for the combined exposure and vulnerability zoning value.

2 Spatial Distribution of Hail Days at Stations

Figure 1 illustrates the spatial distribution of hail days observed during Yunnan’s flue-cured tobacco field period in the baseline climate year. The distribution is clearly divided by the Ailao Mountains, with fewer hail days in the west and more in the east. The number of hail days west of the Ailao Mountains typically does not exceed 5, with places like Lincang and Dehong experiencing fewer than 2 days on average. In contrast, areas east of the Ailao Mountains generally record 6 or more hail days, and certain stations in Zhaotong, Qujing, and Wenshan report more than 10 days. The high-altitude regions of northwestern Yunnan are particularly noteworthy for their hail frequency: Diqing and northern Lijiang commonly see over 10 hail days, with some locations experiencing more than 20.
However, there are specific sites within the valleys of the Jinsha River, Yuanjiang River, and Nujiang River that have no recorded hail observations. The disparity in hail occurrences between the east and west of the Ailao Mountains is likely attributed to the fact that over 90% of the cold air affecting Yunnan originates from northeastern and eastern trajectories. During the spring and summer seasons, this cold air frequently converges with warm, moist air from the southwest east of the Ailao Mountains, triggering severe convective weather conditions, thereby leading to a higher incidence of hail. West of the Ailao Mountains, the massive mountain ridges act as a barrier, preventing cold air intrusions and resulting in more stable atmospheric conditions, thus reducing the likelihood of hail. The increased hail occurrence in northwestern Yunnan might be associated with its high altitude. As warm, moist airflows from the Bay of Bengal pass through, they are forced to rise sharply, causing rapid cooling and condensation of moisture at high altitudes, which provides ample energy and moisture for hail formation. Furthermore, in high-altitude areas, the thin air allows more solar radiation to reach and warm the ground surface, heightening surface temperatures and facilitating intense convective movements that can lead to hail. The absence of hail in certain valley regions may be due to topographical influences and the Föhn effect, as these valleys often experience dry, warm winds, resulting in poor moisture conditions that are less conducive to hail formation. 3 Hazard Zoning 3.1 Construction of Training Set As can be seen from Fig. 1 , the number of hail days at most observation stations is less than 20 days, while the total number of days in the Yunnan flue-cured tobacco field period in the climate base year is 4590 days. This shows that the number of hail days is very small compared to the number of non-hail days. 
If such a dataset is used directly for training, it will lead to the problem of unbalanced datasets, which will cause the model to tend to identify more non-hail days, which is contrary to our goal. To deal with unbalanced datasets, commonly used methods include oversampling and undersampling [ 54 , 55 ] . Oversampling is to increase the number of samples in the minority category to approximate that of the majority class. Undersampling is to reduce the number of samples in the majority category to align closer with the minority category. Due to the substantial discrepancy in the number of hail days and non-hail days, it is difficult to construct an ideal training set regardless of whether random oversampling or synthetic sample oversampling is used [ 56 , 57 ] . Thus, this study employs the undersampling method of randomly deleting non-hail day samples to construct a relatively balanced hail training set, resulting in approximately 2800 samples. 3.2 Factor Selection Not all the preliminarily selected severe convection physical quantity “ingredients” and disaster-prone environmental factors can effectively identify hail days, so factor screening is needed. In machine learning, impurity measures the degree of disorder or complexity and helps determine data division strategies for efficient model building. When a factor can significantly reduce impurity when splitting a node, it indicates that the factor contributes more to the model. This study used the Gini impurity analysis in random forests [ 58 , 59 ] , ultimately selecting CAPE, LI, and 0°C level height as strong convective “ingredients”, and elevation as the disaster-prone environmental factor. 3.3 Regional Modeling Figure 2 a displays the average distribution of hail days across different elevations during Yunnan tobacco field period, showing a clear correlation between hail days and elevation. As elevation rises, hail days increase significantly, especially above 1000m. 
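The balancing step of Section 3.1 and the impurity criterion of Section 3.2 can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions:

- It uses a strict 1:1 class ratio, whereas the text only says "relatively balanced".
- It evaluates one hand-picked split rather than averaging impurity decreases over all trees of a forest.
- The CAPE values are invented.
- The function names `undersample`, `gini`, and `gini_decrease` are illustrative.

```python
import random
from collections import Counter


def undersample(samples, labels, seed=0):
    """Randomly drop samples so every class keeps as many as the smallest class."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    n_min = min(len(xs) for xs in by_class.values())
    balanced = []
    for y, xs in by_class.items():
        for x in rng.sample(xs, n_min):
            balanced.append((x, y))
    rng.shuffle(balanced)
    return balanced


def gini(labels):
    """Gini impurity 1 - sum(p_k^2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def gini_decrease(values, labels, threshold):
    """Impurity reduction from splitting the samples at values <= threshold."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted


# Invented toy data: CAPE (J/kg) and a hail flag (1 = hail day, 0 = non-hail day).
cape = [100, 150, 200, 250, 300, 900, 1200, 1500, 180, 220]
hail = [0, 0, 0, 0, 0, 1, 1, 1, 0, 0]

balanced = undersample(cape, hail, seed=42)
bal_cape = [x for x, _ in balanced]
bal_hail = [y for _, y in balanced]
print(Counter(bal_hail))
print(gini_decrease(bal_cape, bal_hail, threshold=500))
```

A factor whose splits produce large impurity decreases, as CAPE does in this toy set, is the kind of factor that such screening would retain.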
For every 250 m increase in altitude, the average number of hail days increases by about 0.3 days. In order to build a more accurate hail day determination model, this study divides the province into low-altitude areas ( 2000m) based on a 1 km × 1 km grid. Figure 2b illustrates these divisions: low-altitude zones primarily encompass southern low mountains, valleys, and plains; high-altitude zones concentrate in the northwest and northeast mountain ranges; the remaining areas fall into the medium-altitude zone. Accordingly, the 125 national meteorological stations are categorized by altitude for separate modeling: 14 stations in the high-altitude zone, 55 in the medium-altitude zone, and 56 in the low-altitude zone.

3.4 Model Training

First, elevations for the 125 national meteorological stations were taken from the DEM data. Next, ERA5 reanalysis data were used to compute CAPE, LI, and 0°C level height at 14:00 on the day of each record in the training set. The parameters of the random forest model (number of trees, maximum tree depth, minimum sample count for node splits, and minimum samples in leaf nodes) were then tuned repeatedly through 4-fold cross-validation to optimize the hail day discrimination models for the low-, medium-, and high-altitude zones. All three final models achieved test accuracy above 88%. The high-altitude model showed slightly lower accuracy, under 90%, while the medium- and low-altitude models approached 95%. The overall average accuracy reached 93%.

3.5 Discrimination and Validation of Hail Days

Using the three trained hail day discrimination models, together with 1 km × 1 km DEM grid elevation data and 1 km × 1 km grid CAPE, LI, and 0°C level height calculated via bilinear interpolation from ERA5, daily grid-based hail discrimination was conducted for Yunnan’s tobacco field period in the baseline climate year.
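The bilinear interpolation used to bring 0.25° ERA5 fields onto the fine grid can be sketched as follows. This is a minimal pure-Python illustration; the function name `bilinear`, the grid coordinates, and the CAPE values are hypothetical, and a production workflow would normally rely on a gridded-data library instead.

```python
def bilinear(field, lats, lons, lat, lon):
    """Bilinearly interpolate field[i][j], defined at (lats[i], lons[j]), to (lat, lon).

    lats and lons must be strictly increasing, and the target point must lie
    inside the coarse grid.
    """
    def bracket(axis, v):
        # Index k of the coarse cell edge such that axis[k] <= v <= axis[k + 1].
        if not axis[0] <= v <= axis[-1]:
            raise ValueError("point lies outside the coarse grid")
        for k in range(len(axis) - 1):
            if axis[k] <= v <= axis[k + 1]:
                return k

    i = bracket(lats, lat)
    j = bracket(lons, lon)
    # Fractional position of the target inside the enclosing coarse cell.
    ty = (lat - lats[i]) / (lats[i + 1] - lats[i])
    tx = (lon - lons[j]) / (lons[j + 1] - lons[j])
    # Interpolate along longitude on both latitude rows, then along latitude.
    south = field[i][j] * (1 - tx) + field[i][j + 1] * tx
    north = field[i + 1][j] * (1 - tx) + field[i + 1][j + 1] * tx
    return south * (1 - ty) + north * ty


# Hypothetical CAPE values (J/kg) at the four corners of one 0.25-degree cell.
lats = [25.0, 25.25]
lons = [102.0, 102.25]
cape = [[800.0, 1000.0],
        [1200.0, 1400.0]]

center_value = bilinear(cape, lats, lons, 25.125, 102.125)
corner_value = bilinear(cape, lats, lons, 25.0, 102.0)
print(center_value, corner_value)
```

Because the interpolated value is always a weighted average of the four surrounding coarse-grid values, it cannot add detail finer than the 0.25° input.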
The cumulative results produced a spatial distribution map of hail days (Fig. 3a), which highlights the relationship between hail days and large-scale topography: over 20 days in the northwestern high-altitude areas and generally under 10 days in the southern low-altitude zones, with sparse occurrences in valleys such as those of the Jinsha, Yuanjiang, and Nujiang Rivers, in good agreement with station observations. The discrimination also reflected local terrain influences on hail, such as the contrast between the Yuanmou Basin and Raying Mountain to its east, where the mountain areas recorded over 20 days while the basin averaged below 2 days. Compared with the number of hail days observed at the stations (Fig. 3b), 49% of the stations had errors within ±3 days and 24% within ±1 day. The mean absolute error over all stations was 5.6 days, indicating satisfactory overall discrimination performance. The discrimination model nevertheless has some shortcomings. First, there is an apparent overestimation in the northwest, where predicted hail days exceed 40 (over 10 days more than station observations). A possible reason is that the terrain in the high-altitude northwest is extremely complex, and the relatively coarse ERA5 grid (0.25° × 0.25°) cannot accurately reflect the strong convection conditions over micro-topography. Second, hail days in the eastern marginal areas are generally underestimated, falling below 10 days in some areas. A possible reason is that the strong convective ingredients used in this study are computed at 14:00, whereas hail in parts of the eastern margin occurred earlier or later. For example, there were 3 hail events before 14:00 in Luoping, and 2 hail events after 20:00 in Guangnan.
The 14:00 ingredients cannot represent conditions several hours earlier or later, which causes deviations in the predicted hail days.

3.6 Correction and Zoning

An analysis of the station residuals (observed minus predicted hail days) shows that 64% of the stations in the province have negative values, almost twice the number with positive values, which suggests an overall overestimation of hail days in Yunnan. This overestimation is concentrated mainly in the central and western areas. Kriging was applied to interpolate the residuals spatially, and the interpolated residuals were added to the prediction map to obtain a corrected spatial distribution of hail days (Fig. 3a). The corrected map aligns better with actual observations and accentuates the variation of hail days with topography. Specifically, hail days increased markedly along the eastern edge, notably in eastern Zhaotong and northeastern Wenshan, consistent with the frequent cold air intrusions in eastern and northeastern Yunnan, which foster strong convection and more hail. Meanwhile, hail days dropped in the northwest, especially in valleys and their margins, reflecting geographic barriers that hinder the airflow ascent required for hail formation. Significant reductions also occurred in Dehong, Baoshan, and Lincang. Although water vapor is sufficient in the west in spring and summer, less cold air arrives from the east or from the plateau, so strong convective activity is sparse and hail does not form easily. The natural breakpoint method was used to classify the corrected number of hail days into a fine-scale hazard zoning for flue-cured tobacco (Fig. 3b). The zoning has four levels: high, medium-high, medium, and low hazard, with respective values of 0.8, 0.6, 0.4, and 0.2. The map reveals a north-high, south-low hazard distribution.
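The classification step can be sketched in Python under the assumption that the "natural breakpoint method" refers to Jenks (Fisher) natural breaks, which choose class boundaries that minimise the within-class squared deviation; the source does not say which variant or software was used. The hail-day values below are invented, `jenks_breaks` and `hazard_value` are illustrative names, and only the four hazard values (0.2 to 0.8) come from the text.

```python
def jenks_breaks(values, k):
    """Return the upper bound of each of k classes that minimise the total
    within-class sum of squared deviations (exact dynamic programming)."""
    xs = sorted(values)
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of values")
    # Prefix sums give the squared deviation of any slice xs[i:j] in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for idx, x in enumerate(xs):
        s1[idx + 1] = s1[idx] + x
        s2[idx + 1] = s2[idx] + x * x

    def ssd(i, j):
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[c][j]: best cost of splitting the first j sorted values into c classes.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                candidate = cost[c - 1][i] + ssd(i, j)
                if candidate < cost[c][j]:
                    cost[c][j] = candidate
                    cut[c][j] = i
    # Walk back through the stored cut points to recover the class upper bounds.
    bounds = []
    j = n
    for c in range(k, 0, -1):
        bounds.append(xs[j - 1])
        j = cut[c][j]
    return bounds[::-1]


def hazard_value(days, bounds, scores=(0.2, 0.4, 0.6, 0.8)):
    """Map a hail-day count to the hazard value of the class it falls in."""
    for upper, score in zip(bounds, scores):
        if days <= upper:
            return score
    return scores[-1]


# Invented corrected hail-day counts for ten grid cells.
hail_days = [1, 2, 3, 10, 11, 12, 20, 21, 30, 31]
bounds = jenks_breaks(hail_days, k=4)
print(bounds, hazard_value(11, bounds), hazard_value(30, bounds))
```

The exact dynamic programme is cubic in the number of values, so for a province-wide 1 km grid one would typically classify a sample or the distinct values rather than every cell.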
Northern high-altitude areas exhibit high to medium-high hazard, while the low-altitude mountains and basins of the central south show medium to low hazard. Some tall mountains in the central south nevertheless show higher hazard, and northern valleys and basins relatively lower hazard. Distinct differences also exist across the Ailao Mountains, with eastern areas generally more hazardous than the west, particularly on the eastern fringe. Most areas east of the Ailao Mountains are medium-hazard or above, while most areas to the west are low-hazard.

4 Suitability Zoning

Based on the climate baseline data of average monthly temperature, sunshine duration, and rainfall from 125 national meteorological stations, combined with geographical and topographical factors such as longitude, latitude, elevation, slope, and aspect, stepwise regression was used to estimate 1 km × 1 km maps of July average temperature, July-August sunshine hours, and April-September precipitation (detailed modeling not discussed here). The following patterns emerge. July average temperature is higher in the south and lower in the north. Regions in central and northern Yunnan above 2100 m have average temperatures below 18.5°C, while the low-altitude areas in the Yuanjiang Valley and the southeast exceed 26.5°C (Fig. 5a). July-August sunshine hours are higher in the east and lower in the west. Most areas east of the Ailao Mountains have more than 250 hours of sunshine, while the Nujiang area in the northwest receives the least, between 180 and 200 hours (Fig. 5b). April-September precipitation is high on the periphery and low in the center, with the southern and western edges exceeding 1400 mm, while central-western regions often fall below 800 mm, with some areas under 450 mm (Fig. 5c).
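A grid cell's suitability class can be derived from these three fields with the Table 1 thresholds, as in the Python sketch below. The combination rule is an assumption: the source does not state how the three factors are merged, so the sketch takes the most limiting factor. It also reads the unsuitable bounds as above 26.5°C and above 1400 mm. The exposure and vulnerability values come from Section 1.2.5, and all function names are illustrative.

```python
# Classes ordered from best (index 0) to worst (index 3).
CLASSES = ["most suitable", "suitable", "moderately suitable", "unsuitable"]

# Combined exposure and vulnerability value per class (Section 1.2.5).
EXPOSURE = {
    "most suitable": 0.8,
    "suitable": 0.6,
    "moderately suitable": 0.4,
    "unsuitable": 0.0,
}


def temperature_class(t_july):
    """Class index from July average temperature in degrees Celsius."""
    if 20.0 <= t_july <= 22.0:
        return 0
    if 19.0 <= t_july <= 23.5:
        return 1
    if 18.5 <= t_july <= 26.5:
        return 2
    return 3


def sunshine_class(hours):
    """Class index from July-August sunshine hours."""
    if hours >= 250:
        return 0
    if hours >= 220:
        return 1
    if hours >= 180:
        return 2
    return 3


def rainfall_class(mm):
    """Class index from April-September rainfall in millimetres."""
    if 550 <= mm <= 1250:
        return 0
    if 450 <= mm <= 1250:
        return 1
    if 450 <= mm <= 1400:
        return 2
    return 3


def suitability(t_july, sunshine_hours, rainfall_mm):
    """Assumed rule: the worst of the three factor classes decides the cell."""
    worst = max(
        temperature_class(t_july),
        sunshine_class(sunshine_hours),
        rainfall_class(rainfall_mm),
    )
    return CLASSES[worst]


print(suitability(21.0, 260, 900))   # all three factors in the best range
print(suitability(21.0, 230, 900))   # sunshine is the limiting factor
print(suitability(17.0, 260, 900))   # too cold
```

Under this rule a single unfavourable factor, such as rainfall above 1400 mm, is enough to mark a cell unsuitable and set its exposure value to 0.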
Applying the indicators in Table 1 yields the climate suitability zoning of flue-cured tobacco in Yunnan (Fig. 5d). Except for the high-cold areas in the central and northern parts and the rainy areas on the southern and western edges, most areas in Yunnan are suitable for growing flue-cured tobacco. Suitability is generally better in the east than in the west. Most areas east of the Ailao Mountains have moderate temperatures, sufficient sunshine, and suitable precipitation during the flue-cured tobacco growing period, so they are mainly classified as suitable or most suitable areas, of which the most suitable areas account for about half of the total area. Conversely, most areas west of the Ailao Mountains, despite suitable temperatures, experience slightly insufficient sunshine and excessive rainfall, and fall mainly into the moderately suitable and suitable zones. In general, in terms of both plantable area and suitability, Yunnan is highly suitable for flue-cured tobacco cultivation.

5 Risk Zoning

According to the method described in Section 1.2.5, the climate suitability zoning of tobacco was transformed into a combined zoning of tobacco exposure and vulnerability, and values were assigned. Formula (1) was then used to calculate a hail disaster risk index for each 1 km × 1 km grid cell across the province. Using the natural breakpoint method, this index was divided into five risk levels: high, medium-high, medium, low, and no risk, giving a fine-scale regional zoning of flue-cured tobacco hail disasters (Fig. 6). The figure shows that the flue-cured tobacco hail disaster risk in Yunnan is generally higher in the east and lower in the west, which differs markedly from the north-high, south-low pattern of the hazard zoning. In the north-central regions, due to high altitudes and low temperatures, large areas are not suitable for flue-cured tobacco cultivation.
Therefore, these zones are considered no-risk zones due to the lack of exposure. Similarly, the rainy southern and western margins are not conducive to flue-cured tobacco cultivation and are also classified as no-risk zones. In contrast, most of the medium and higher hazard areas east of the Ailao Mountains coincide with suitable or most suitable tobacco zones. Because of their large cultivation areas, losses will be relatively large once they are hit by hail. Thus, they are classified as medium-high or high-risk zones. West of the Ailao Mountains, which consists mainly of suitable and sub-suitable zones with limited tobacco planting, risk is generally lower, and most areas are classified as medium or low risk. Across the province, the area proportions of each risk level are as follows: high-risk zones occupy 0.9%, medium-high-risk zones cover 9.4%, medium-risk zones make up 26.1%, low-risk zones account for 30.4%, and no-risk zones constitute 33.3%. High and medium-high-risk zones thus cover only small areas, while medium, low, and no-risk zones have broadly similar shares.

6 Discussion

(1) By applying the random forest machine learning model, selecting appropriate physical and environmental factors, and relying on a relatively balanced training set, a hail day discrimination model with a high recognition rate was successfully trained. This model enabled a refined depiction of the spatial distribution of hail days during the tobacco field period in Yunnan for the climate baseline year. This distribution reflects not only the impact of large-scale terrain on hail distribution but also the effect of local landforms on hail, offering practical insights and potential applications for identifying other weather events.

(2) Due to the limitations of hail observation data and the coarse grid resolution of the ERA5 data used for calculating strong convective ingredients, biases are inevitable in the results computed by the hail day discrimination model.
Therefore, the residuals of these estimates need to be corrected. The correction not only brings out local extremes more clearly but also makes the spatial distribution more objective and reasonable. As meteorological departments establish more anti-hail sites, hail damage can be mitigated more effectively and more observational samples can be collected. These samples will provide crucial data support for future in-depth studies of hail disaster risk.

Because of crop rotation and delays in disaster data collection, it is difficult to obtain accurately either the exposure of flue-cured tobacco (i.e., its spatial distribution) or its vulnerability to hail (i.e., disaster loss). Therefore, this study uses tobacco climate suitability as a composite index representing both exposure and vulnerability. The rationale is that areas that are unsuitable or only moderately suitable for tobacco have a low probability of being planted (low exposure) and, consequently, suffer minor losses when hail strikes (low vulnerability). By contrast, areas with higher climate suitability are more likely to grow flue-cured tobacco (high exposure) and thus suffer greater losses from hail (high vulnerability).

7 Conclusion

In the northern high-altitude regions, hail hazard is generally medium-high to high. However, due to low temperatures and insufficient heat, these areas are unsuitable for flue-cured tobacco cultivation and are thus categorized as no-risk zones due to lack of exposure. Similarly, although the southern and western margins exhibit lower hail hazard, excessive rainfall during the tobacco field period can cause abnormal root and leaf growth along with pest problems, so they are classified as unsuitable and therefore no-risk zones. Most other areas are cultivable, with suitability better in the east than in the west, while hail hazard is likewise higher in the east and lower in the west.
Consequently, areas west of the Ailao Mountains generally have lower risk, predominantly medium or low. Meanwhile, risk is somewhat elevated east of the Ailao Mountains, particularly where high hazard and high suitability overlap, forming medium-high or high-risk zones. High and medium-high-risk zones together cover only about 10% of the total provincial area. Medium, low, and no-risk zones each account for about 30% of the area. Given the overall low hail risk west of the Ailao Mountains, expanding the planting area of suitable tobacco varieties there may be considered. For eastern areas with high suitability but relatively high risk, protective measures such as artificial hail suppression or hail nets should be implemented to minimize economic losses while maintaining cultivation scale.

Declarations

Funding
This work was supported by the Yunnan Science and Technology Planning Project (2018BC007).

Author Contribution
YZ participated in sample collection, optimization of analysis and manuscript writing. YL participated in sample collection, optimization of analysis and manuscript writing. XH provided field data. LZ participated in development of ideas and provided field data. PY participated in development of ideas and optimization of analysis. All the authors read and approved the final manuscript.

Data availability
All data generated or analysed during this study are included in this published article and its supplementary information file.

Ethical approval
This study did not involve any protected or endangered species, and no specific permissions were required for sampling at the collection localities.

Consent to participate
Not applicable.

Consent for publication
Not applicable.

Competing interests
The authors declare no competing interests.

References

Kim M, Lee J, Lee S. Hail: Mechanisms, Monitoring, Forecasting, Damages, Financial Compensation Systems, and Prevention[J]. Atmosphere, 2023, 14(11): 1642.
https://doi.org/10.3390/atmos14111642. Botzen W,Bouwer L,Bergh J.Climate change and hailstorm damage:Empirical evidence and implications for agriculture and insurance[J],Resource and Energy Economics, 2010,32(3):341-362. https://doi.org/10.1016/j.reseneeco.2009.10.004. McMaster H.Hailstorm risk assessment in rural New South Wales[J].Natural Hazards,2001,24:187-196. https://doi.org/10.1023/A:1011820206279. Istrate V,Jitariu V,Ichim P,et al.Hailstorm risk assessment for crop areas in Moldova Region (Romania)[J].Present Environment and Sustainable Development,2021,15(2):55-67. Wang L,Hu G,Yue,Y,et al.GIS-Based Risk Assessment of Hail Disasters Affecting Cotton and Its Spatiotemporal Evolution in China[J].Sustainability,2016,8,218. https://doi.org/10.3390/su8030218. Zhang S,Liu S,Zhang T.Analysis on the Evolution and Microphysical Characteristics of Two Consecutive Hailstorms in Spring in Yunnan, China[J].Atmosphere.2021,12(1): 63.https://doi.org/10.3390/atmos12010063. Zhang C,Zhang Q,Wang Y.Climatology of Hail in China:1961–2005[J].Journal of Applied Meteorology and Climatology,2008,47:795–804.https://doi.org/10.1175/2007JAMC1603.1. Dong Y,Hua S,Chen B,et al.Numerical simulation of a pulse hailstorm in the plateau region in southwestern China[J].Atmospheric Research,2024,299:107218.https://doi.org/10.1016/j.atmosres.2023.107218. Tao, Y., Duan, X., Duan, C., et al. Characteristics of Hail Changes in Yunnan [J]. Plateau Meteorology, 2011, 30(04): 1108-1118. Yin, L.Y., Mei, H., Zhang, T.F., et al. Analysis of Fine-scale Hail Disaster Risk Zoning in Yunnan Province [J]. Journal of Catastrophology, 2022, 37(03): 99-105. Tie J, Li S,He W,et al.Study of metabolite differences of flue-cured tobacco from Canada (CT157) and Yunnan (Yunyan 87)[J]. Heliyon,2024,10(11). https://doi.org/10.1016/j.heliyon.2024.e32417. Eng I.Agglomeration and the Local State: the Tobacco Economy of Yunnan,China[J].Transactions of the Institute of British Geographers,1999,24(3):315-329. 
https://doi.org/10.1111/j.0020-2754.1999.00315.x. Huang, Z.Y., Kong, G.H., Ni, X., et al. Climatic Demonstration for the Substitution of Imported Tobacco with Flue-Cured Tobacco in Some Tobacco Areas of Yunnan [J]. Journal of Natural Resources, 2011, 26(09): 1592-1602. Yin Y,Wang J,Zhao J,et al.Risk Assessment of Hail Disaster on Cotton — A Case Study in Anhui Province[J].Agricultural Science & Technology,2012,13(8):1744-1748. Leigh R,Kuhnel I.Hailstorm Loss Modelling and Risk Assessment in the Sydney Region[J], Australia.Natural Hazards,2001,24, 171–185. https://doi.org/10.1023/A:1011855801345. Daradur M,Leah T,Pandey R,et al.Variability and Risk Assessment of Hail in the Republic of Moldova[J].Present Environment & Sustainable Development,2016,10(2):141-152. https://doi.org/10.1515/pesd-2016-0032. Li, M., Zhu, Y., Ji, W.J. Risk Assessment of Hail Disasters in Yunnan Tobacco Areas Based on GIS [J]. Chinese Agricultural Meteorology, 2012, 33(01): 129-133. Zhang, H., Liu, X.L., Fang, P. Risk Assessment of Hail Disasters in the Main Production Areas of Flue-Cured Tobacco in Sichuan [J]. Meteorological Science and Technology, 2016, 44(3): 468-473. Prein A F,Holland G J.Global estimates of damaging hail hazard[J].Weather and Climate Extremes,2018,22:10-23.https://doi.org/ 10.1016/j.wace.2018.10.004. Sama B,Uma K N, Das S K.A comprehensive assessment of the temporal and spatial variation of hail producing convective storms over eastern India using weather radar[J].Climate Dynamics,2024,62:4749-4773. https://doi.org/ 10.1007/s00382-023-07059-0. Zhang, W.H., Li, L. Preliminary Application of Artificial Intelligence in Hail Identification and Nowcasting [J]. Acta Meteorologica Sinica, 2019, 77(02): 282-291. Pulukool F,Li L,Liu C.Using Deep Learning and Machine Learning Methods to Diagnose Hailstorms in Large-Scale Thermodynamic Environments[J].Sustainability,2020,12(24). https://doi.org/10.3390/su122410499. 
Pullman M,Gurung I,Maskey M,et al.Applying Deep Learning to Hail Detection: A Case Study[J].IEEE Transactions on Geoscience and Remote Sensing,2019,57(12):10218-10225. https://doi.org/10.1109/TGRS.2019.2931944. Yao H, Li X, Pang H,et al.Application of random forest algorithm in hail forecasting over Shandong Peninsula[J].Atmospheric Research,2020,244. https://doi.org/10.1016/j.atmosres.2020.105093. Czernecki B,Taszarek M,Marosz M,et al.Application of machine learning to large hail prediction - The importance of radar reflectivity, lightning occurrence and convective parameters derived from ERA5[J].Atmospheric Research,2019,227:249-262. https://doi.org/10.1016/j.atmosres.2019.05.010. Torralba V,Hénin R,Cantelli A,et al.Modelling hail hazard over Italy with ERA5 large-scale variables[J].Weather and Climate Extremes,2023,39. https://doi.org/10.1016/j.wace.2022.100535. Doswell III C A,Brooks H E,Maddox R A.Flash Flood Forecasting: an Ingredients-Based Methodology[J].Weather and Forecasting,1996,11(4):560-581. https://doi.org/10.1175/1520-0434(1996)0112.0.CO;2. Evans M,Jurewicz M.Correlations Between Analyses and Forecasts of Banded Snow Ingredients and Observed Snowfall[J]. Weather and Forecasting,2009,24(1):337-350. https://doi.org/10.1175/2008WAF2007105.1. Allen J,Karoly D,Walsh K.Future Australian Severe Thunderstorm Environments.Part II: The Influence of a Strongly Warming Climate on Convective Environments[J].Journal of Climate,2014,27(10),3848–3868. https://doi.org/10.1175/JCLI-D-13-00426.1. Tian F,Xia K,Sun J,et al.Ingredients-based Methodology and Fuzzy Logic Combined Short-Duration Heavy Rainfall Short-Range Forecasting:An Improved Scheme[J].Journal of Tropical Meteorology,2024,30(3):241-256. https://doi.org/10.3724/j.1006-8775.2024.022. Raupach T H,Soderholm J S,Warren R A.Changes in Hail Hazard Across Australia: 1979-2021[J].npj climate and atmospheric science, 2023,6. https://doi.org/10.1038/s41612-023-00454-8.
Kahraman A,Kendon E J,Fowler H J.Climatology of Severe Hail Potential in Europe Based on A Convection-Permitting Simulation[J]. Climate Dynamics,2024,62:6625-6642. https://doi.org/10.1007/s00382-024-07227-w. Prein A F, Holland G J.Global estimates of damaging hail hazard[J].Weather and Climate Extremes,2018,22:10-23. https://doi.org/10.1016/j.wace.2018.10.004. López L, Marcos J L, Sánchez J L,et al.CAPE Values and Hailstorms on Northwestern Spain[J].Atmospheric Research, 2001, 56(1):147-160. https://doi.org/10.1016/S0169-8095(00)00095-8. Manzato A .Hail in Northeast Italy: Climatology and Bivariate Analysis with the Sounding-Derived Indices[J].Journal of Applied Meteorology and Climatology, 2012,51(3):449-467. https://doi.org/10.1175/JAMC-D-10-05012.1. Billet J, Delisi M , Smith B G,et al.Use of Regression Techniques to Predict Hail Size and the Probability of Large Hail[J].Weather and Forecasting,1997,12(1):154-164. https://doi.org/10.1175/1520-0434(1997)0122.0.CO;2. Liu, B., Zou, L.Y., Li, X.P., et al. Analysis of Environmental Conditions for Thunderstorm Winds in Yunnan [J]. Meteorology, 2022, 48(11): 1402-1417. Pu, W.Y., Li, H.B., Song, Y., et al. Analysis and Application of the Influence of 0°C Layer Height Changes on Hail Melting [J]. Meteorology, 2015, 41(8): 980-985. Vinet F.Climatology of Hail in France[J].Atmospheric Research,2001,56(1):309-323.https://doi.org/10.1016/S0169-8095(00)00082-X. Kunz M. High-Resolution Assessment of the Hail Hazard over Complex Terrain from Radar and Insurance Data[J]. Meteorologische Zeitschrift,2010,19(5):427-439. https://doi.org/10.1127/0941-2948/2010/0452. Jin H,Lee H,Lkhamjav J,et al.A hail climatology in South Korea[J].Atmospheric Research,2017,188:90-99. https://doi.org/10.1016/j.atmosres.2016.12.013. Gdeeb R T.Weather Classification Using Meta-Based Random Forest Fusion of Transfer Learning Models[J]. International Journal of Advances in Intelligent Informatics,2024,10(2):186-201. 
https://doi.org/10.26555/ijain.v10i2.1264. Wang, J., Liu, L.P. Analysis of the Relationship between Hail Distribution and Topographic Factors in Guizhou Province Based on GIS [J]. Journal of Applied Meteorology, 2008, 19(5): 627-634. Liu, X.L., Zhang, Y., Liu, J.X. Temporal and Spatial Characteristics of Hail Disasters in Southwestern Sichuan Mountains [J]. Arid Meteorology, 2016, 34(01): 75-81. Noi P T,Degener J,Kappas M.Comparison of Multiple Linear Regression, Cubist Regression, and Random Forest Algorithms to Estimate Daily Air Surface Temperature from Dynamic Combinations of MODIS LST Data[J].Remote Sensing,2017,9(5). https://doi.org/10.3390/rs9050398. Du X, Lin X.Conceptual Model on Regional Natural Disaster Risk Assessment[J].Procedia Engineering,2012,45:96-100. https://doi.org/10.1016/j.proeng.2012.08.127. Li, X.H. Application of Random Forest Model in Classification and Regression Analysis [J]. Chinese Journal of Applied Entomology, 2013, 50(04): 1190-1197. Huang, Z.Y., Zhu, Y., Wang, S.H., et al. The Relationship Between Intrinsic Quality of Yunnan Flue-Cured Tobacco and Climate [J]. Resources Science, 2007(02): 83-90. Huang, Z.Y., Zhu, Y., Deng, Y.L., et al. Influence of Field Period Climate on Tobacco Leaf Quality in Yunnan [J]. Chinese Agricultural Meteorology, 2008(04): 440-445+449. Hu, X.Q., Huang, Z.Y., Zhu, Y., et al. Study on Climate Types and Suitability of Yunnan Flue-Cured Tobacco [J]. Journal of Nanjing Institute of Meteorology, 2006(04): 563-568. Fuchs S,Birkmann J,Glade T.Vulnerability assessment in natural hazard and risk analysis: current approaches and future challenges[J].Natural Hazards, 2012, 64:1969–1975. https://doi.org/10.1007/s11069-012-0352-9. Ward P J, Blauhut V, Bloemendaal N,et al.Review article: Natural hazard risk assessments at the global scale[J]. Natural Hazards and Earth System Sciences,2020,20(4):1069-1096.https://doi.org/10.5194/nhess-20-1069-2020. Zhang, G.C. 
Principles and Methods of Natural Disaster Risk Assessment and Zoning [M]. Meteorological Press, 2014. Spelmen V S, Porkodi R.A Review on Handling Imbalanced Data[J].2018 International Conference on Current Trends towards Converging Technologies (ICCTCT),2018:1-11.DOI:10.1109/ICCTCT.2018.8551020. Yap B W,Rani K A, Rahman H A A,et al. An Application of Oversampling, Undersampling, Bagging and Boosting in Handling Imbalanced Datasets[C] Herawan T,Deris M M,Abawajy J.Proceedings of the First International Conference on Advanced Data and Information Engineering (DaEng-2013).Singapore:Springer,2014:13-22.https://doi.org/10.1007/978-981-4585-18-7_2. Bej S,Davtyan N,Wolfien M,et al.LoRAS: An oversampling approach for imbalanced datasets[J].Machine Learning,2021,110:279-301.https://doi.org/10.1007/s10994-020-05913-4. Chawla N V,Bowyer K W, Hall L O,et al.SMOTE: Synthetic Minority Over-sampling Technique[J].Journal of Artificial Intelligence Research,2002,16:321-357. https://doi.org/10.1613/jair.953. Yuan Y, Wu L, Zhang X, Gini-Impurity Index Analysis[J]. IEEE Transactions on Information Forensics and Security, 2021,16:3154-3169. https://doi.org/10.1109/TIFS.2021.3076932. Strobl C, Boulesteix A,Augustin T.Unbiased split selection for classification trees based on the Gini Index[J].Computational Statistics & Data Analysis,2007,52(1):483-501.https://doi.org/10.1016/j.csda.2006.12.030. Additional Declarations No competing interests reported.
The greater the frequency and intensity of hail, the higher its hazard. However, actual hail intensity data is often scarce, prompting this study to use hail days as a risk indicator. According to statistics, the annual average hail days across Yunnan\u0026rsquo;s 125 national meteorological stations were less than one day from 1978 to 2020. In the same period, the direct economic losses caused by hail reached nearly 700 million yuan, and the affected agricultural areas exceeded 100,000 hectares. This indicates that relying solely on the number of hail days observed by meteorological stations cannot fully reflect the spatial distribution of the number of hail days in Yunnan. Therefore, a more refined spatial distribution study is needed.\u003c/p\u003e\n\u003cp\u003eIn geoscience research, interpolation methods such as inverse distance weighting and kriging are often used for fine spatial distribution analysis. However, due to the high nonlinearity and suddenness of hail events, these traditional interpolation methods often prove suboptimal \u003csup\u003e[19-21]\u003c/sup\u003e. With the rapid advancement of artificial intelligence technology, some new machine learning methods have been introduced to provide better solutions for dealing with nonlinear problems. Consequently, more researchers have begun to apply machine learning to the study of hail weather. Pulukool et al. \u003csup\u003e[22]\u003c/sup\u003e developed two deep learning models (autoencoder and convolutional neural network) and a machine learning model (random forest) for hail prediction. The random forest model demonstrated the best predictive performance using only four factors: convective available potential energy, convective inhibition, 1-3 km wind shear, and warm cloud depth. Pullman et al. \u003csup\u003e[23]\u003c/sup\u003e applied deep machine learning to identify hail weather and found that deep learning based on multi-source data can achieve high recognition accuracy. Yao et al. 
\u003csup\u003e[24]\u003c/sup\u003e utilized the random forest algorithm to establish a 0-6 hour hail forecast model for Shandong Peninsula, with excellent test results. Czernecki et al. \u003csup\u003e[25]\u003c/sup\u003e created a machine learning model for predicting large hail, and found that combining thermodynamic and kinematic parameters from numerical weather prediction models with real-time remote sensing data significantly enhances the forecasting ability of large hail. Terrible et al. \u003csup\u003e[26]\u003c/sup\u003e constructed a hail prediction model based on genetic algorithms, relying on ERA5 large-scale meteorological variables and convection indices to describe Italy\u0026rsquo;s seasonal and long-term hail event changes. Through the classification and verification of hail probability in Friuli-Venezia Giulia, it is shown that the hail model can effectively estimate the probability of hail in specific areas of Italy.\u003c/p\u003e\n\u003cp\u003eThe \u0026ldquo;ingredient-based\u0026rdquo; forecasting is a forecasting method based on model output, which is particularly suitable for forecasting heavy rain and severe convective weather \u003csup\u003e[27-30]\u003c/sup\u003e. This method adopts the basic component point of view and evaluates the possibility and intensity of heavy rain or severe convective events by analyzing and predicting the basic physical quantities or processes (\u0026ldquo;ingredients\u0026rdquo;) that have a direct impact on the development and intensity of heavy rain or severe convective events. Raupach et al. \u003csup\u003e[31]\u003c/sup\u003e studied Australia\u0026rsquo;s hail disaster changes using convective instability energy, convective inhibition energy, and other \u0026ldquo;ingredients\u0026rdquo;, combined with ERA5 reanalysis data and severe storm archives. 
They found that under the background of climate warming, the frequency of severe thunderstorm environments in northern and eastern Australia may increase significantly. Kahraman et al. \u003csup\u003e[32]\u003c/sup\u003e proposed a novel hail identification method based on the \u0026ldquo;ingredient-based\u0026rdquo; method, which can identify severe hail in climate models that allow convection. Verification shows that the results of this model are highly consistent with the existing hail climatology constructed based on observations, including fine-scale spatial variations.\u003c/p\u003e\n\u003cp\u003eBased on the above analysis, this study proposes an objective hail discrimination method that combines machine learning, strong convective ingredients, and disaster-prone environmental factors. This method yielded a fine-scale spatial distribution of hail days in Yunnan during the flue-cured tobacco field period (May-September) of the climate baseline year (1991-2020). Then, by applying the natural breakpoint method, a refined risk zoning of flue-cured tobacco disasters affected by hail was carried out. These results were then integrated with tobacco suitability zoning, ultimately achieving fine-scale risk zoning for Yunnan flue-cured tobacco hail disasters. 
This research aims to provide theoretical basis and scientific guidance for the rational layout of Yunnan flue-cured tobacco, in order to reduce or avoid economic losses caused by hail disasters.\u003c/p\u003e"},{"header":"1 Data and Methods","content":"\u003cdiv id=\"Sec2\" class=\"Section2\"\u003e \u003ch2\u003e1.1 Data\u003c/h2\u003e \u003cp\u003eThe data used in this study includes: ① Daily hail records from 125 national meteorological stations in Yunnan for May to September from 1991 to 2020; ② Monthly average temperature, sunshine duration, and rainfall data from these 125 meteorological stations over the same period; ③ Daily reanalysis grid data at 14:00 from ERA5 for 100 hPa to 950 hPa pressure levels (at 25 hPa intervals) with a resolution of 0.25\u0026deg; \u0026times; 0.25\u0026deg; for the Yunnan region over May to September from 1991 to 2020; ④ 1 km \u0026times; 1 km DEM digital elevation data for the Yunnan region. The daily hail records from the National Meteorological Station are used to construct the model training set. ERA5 reanalysis data is employed to calculate the strong convective ingredients at 14:00 for both the 125 meteorological stations and the entire province on a 1 km \u0026times; 1 km grid for May to September from 1991 to 2020. Due to the coarse spatial resolution of ERA5 grids, bilinear interpolation was utilized for spatial downscaling. The DEM elevation data assists in calculating disaster-prone environmental factors such as slope and aspect for both the meteorological stations and the whole province. Hail records and calculated ingredients from ERA5, as well as DEM-derived environmental factors, are employed in machine learning modeling. 
Additionally, monthly average temperature, sunshine duration, and rainfall data during the baseline climate year are used to estimate the fine-scale distribution of July average temperature, sunshine hours from July to August, and rainfall from April to September, thus facilitating the fine distribution mapping of flue-cured tobacco suitability in Yunnan.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003e1.2 Methods\u003c/h2\u003e \u003cdiv id=\"Sec4\" class=\"Section3\"\u003e \u003ch2\u003e1.2.1 Preliminary Selection of Strong Convective \u0026ldquo;Ingredients\u0026rdquo;\u003c/h2\u003e \u003cp\u003eHail often occurs in the afternoon, driven by thermal and dynamic conditions necessary for its formation. This study initially selects convective available potential energy (CAPE), lifting index (LI), 0\u0026deg;C level height, and \u0026minus;\u0026thinsp;20\u0026deg;C level height at 14:00 as the primary strong convective physical \u0026ldquo;ingredients\u0026rdquo; \u003csup\u003e[\u003cspan additionalcitationids=\"CR34 CR35 CR36 CR37\" citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e]\u003c/sup\u003e. CAPE is a crucial parameter for measuring atmospheric instability. Sufficient instability energy is imperative for hail genesis, with higher CAPE values corresponding to greater instability. LI reflects accumulated instability of air masses during the ascent process. The larger the LI value, the higher the possibility of explosive convection. The height of the 0℃ layer determines the upper limit of the development of convective clouds and provides favorable conditions for the generation and growth of hail. When the height of the 0\u0026deg;C layer is suitable, it is conducive to the development of convective clouds to higher altitudes, thereby providing sufficient water vapor environment for the formation of hail. 
At the same time, the moderate 0\u0026deg;C layer height can ensure that hail particles maintain an appropriate size during falling and are not melted by the warm layer. In addition, the \u0026minus;\u0026thinsp;20\u0026deg;C layer height is the key temperature for large water droplets to freeze naturally. If the thickness between the 0\u0026deg;C layer and the \u0026minus;\u0026thinsp;20\u0026deg;C layer is appropriate and the supercooled water content is abundant, the growth rate of hail particles will be significantly accelerated.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec5\" class=\"Section3\"\u003e \u003ch2\u003e1.2.2 Preliminary Selection of Disaster-Prone Environmental Factors\u003c/h2\u003e \u003cp\u003eAltitude, slope, and aspect significantly influence hail formation \u003csup\u003e[\u003cspan additionalcitationids=\"CR40 CR41 CR42 CR43\" citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e]\u003c/sup\u003e. High-altitude areas, characterized by lower atmospheric pressure, promote air ascent, aiding hail formation. Slope affects hail mainly through the lifting effect of terrain on airflow. When airflow passes through mountains, slope will enhance the upward movement of airflow, thereby promoting the formation and growth of hailstones. 
Slope aspect affects the distribution and intensity of solar radiation, thereby changing surface temperature and humidity, which in turn affects hail formation and development.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec6\" class=\"Section3\"\u003e \u003ch2\u003e1.2.3 Machine Learning Model\u003c/h2\u003e \u003cp\u003eThis study adopted random forest as a machine learning model, which is mainly used to deal with classification or regression tasks \u003csup\u003e[\u003cspan additionalcitationids=\"CR46\" citationid=\"CR45\" class=\"CitationRef\"\u003e45\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR47\" class=\"CitationRef\"\u003e47\u003c/span\u003e]\u003c/sup\u003e. The core of random forest is decision tree. By integrating the results of multiple decision trees, the accuracy and stability of the model are enhanced. In this study, 14:00 strong convective ingredients and disaster-prone environmental factors from 125 national meteorological stations serve as independent variables, with daily hail records as the dependent variable. Training is performed through random forest to establish a hail day discrimination model. Subsequently, the model is applied to predict hail days for each 1 km \u0026times; 1 km grid during the climate baseline year\u0026rsquo;s tobacco field period in Yunnan, thereby obtaining the spatial distribution of hail days and realizing the division of flue-cured tobacco hail disaster risk zones.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec7\" class=\"Section3\"\u003e \u003ch2\u003e1.2.4 Climate Suitability Zoning for Flue-Cured Tobacco\u003c/h2\u003e \u003cp\u003eYunnan is located on the Yunnan-Guizhou Plateau, and its average summer temperature is significantly lower than that of provinces in central and eastern China. Low temperatures, prolonged rainy periods, and insufficient sunlight during the mid-late field period pose challenges for tobacco production. Based on studies by Huang Zhongyan and Hu Xueqiong et al. 
\u003csup\u003e[\u003cspan additionalcitationids=\"CR49\" citationid=\"CR48\" class=\"CitationRef\"\u003e48\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR50\" class=\"CitationRef\"\u003e50\u003c/span\u003e]\u003c/sup\u003e, this research selects July average temperature, sunshine hours from July to August, and rainfall from April to September during the climate baseline year as factors for evaluating climate suitability for tobacco. These factors are used for suitability assessment zoning. See Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e for specific assessment zoning indicators.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eClimate Suitability Zoning Indicators for Yunnan Flue-Cured Tobacco\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"5\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eMost suitable\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eSuitable\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eModerately suitable\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e 
\u003cp\u003eUnsuitable\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eJuly Avg Temperature (\u0026deg;C)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e20.0~22.0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e19.0~23.5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e18.5~26.5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e\u0026lt;18.5 or \u0026gt;26.5\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eJuly-August Sunshine Hours (h)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u0026ge;\u0026thinsp;250\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u0026ge;\u0026thinsp;220\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e\u0026ge;\u0026thinsp;180\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e\u0026lt;180\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eApril-September Rainfall (mm)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e550~1250\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e450~1250\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e450~1400\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e\u0026gt;1400 or \u0026lt;450\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec8\" class=\"Section3\"\u003e \u003ch2\u003e1.2.5 Risk Assessment 
Zoning for Tobacco Hail Disasters\u003c/h2\u003e \u003cp\u003eThe climatic suitability of flue-cured tobacco not only reflects the adaptability of the climate environment to the crop\u0026rsquo;s survival and growth, but also considers its exposure and vulnerability. Higher suitability often indicates greater planting feasibility, yet it may also imply increased loss potential if hail strikes. Therefore, suitability zoning can be translated into comprehensive exposure and vulnerability zoning. The transformation method assigns values as follows: most suitable, suitable, moderately suitable, and unsuitable correspond to high, medium, low, and none in terms of exposure and vulnerability, with the values of 0.8, 0.6, 0.4, and 0 respectively.\u003c/p\u003e \u003cp\u003eBased on natural disaster risk assessment methodologies \u003csup\u003e[\u003cspan additionalcitationids=\"CR52\" citationid=\"CR51\" class=\"CitationRef\"\u003e51\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR53\" class=\"CitationRef\"\u003e53\u003c/span\u003e]\u003c/sup\u003e, a risk assessment model for Yunnan flue-cured tobacco hail disasters was constructed after comprehensively considering the combined effects of hail hazard, exposure, and vulnerability:\u003cdiv id=\"Equ1\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ1\" name=\"EquationSource\"\u003e\n$$\\:\\text{R}={\\text{D}}^{0.8}\\times\\:{\\text{E}}^{0.2}$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e1\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003ewhere R represents the flue-cured tobacco hail disaster risk index, D denotes the hail hazard zoning value, and E stands for the combined exposure and vulnerability zoning value.\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e"},{"header":"2 Spatial Distribution of Hail Days at Stations","content":"\u003cp\u003eFigure \u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e illustrates the spatial distribution 
of hail days observed during Yunnan\u0026rsquo;s flue-cured tobacco field period in the baseline climate year. The distribution is clearly divided by the Ailao Mountains, with fewer hail days to the west and more to the east. West of the Ailao Mountains, the number of hail days typically does not exceed 5, and places such as Lincang and Dehong experience fewer than 2 days on average. In contrast, areas east of the Ailao Mountains generally record 6 or more hail days, and certain stations in Zhaotong, Qujing, and Wenshan report more than 10 days. The high-altitude regions of northwestern Yunnan are particularly noteworthy for their hail frequency: Diqing and northern Lijiang commonly see over 10 hail days, with some locations experiencing more than 20. However, some sites within the valleys of the Jinsha, Yuanjiang, and Nujiang Rivers have no recorded hail. The disparity in hail occurrence between the east and west of the Ailao Mountains is likely because over 90% of the cold air affecting Yunnan arrives along northeastern and eastern trajectories. During spring and summer, this cold air frequently converges east of the Ailao Mountains with warm, moist air from the southwest, triggering severe convection and thereby a higher incidence of hail. West of the Ailao Mountains, the massive mountain ridges act as a barrier that blocks cold air intrusions, resulting in more stable atmospheric conditions and a lower likelihood of hail. The increased hail occurrence in northwestern Yunnan might be associated with its high altitude. As warm, moist airflows from the Bay of Bengal pass through, they are forced to rise sharply, causing rapid cooling and condensation of moisture at high altitudes, which provides ample energy and moisture for hail formation. 
Furthermore, in high-altitude areas, the thin air allows more solar radiation to reach and warm the ground surface, raising surface temperatures and facilitating intense convective motion that can lead to hail. The absence of hail in certain valley regions may be due to topographical influences and the F\u0026ouml;hn effect, as these valleys often experience dry, warm winds, resulting in poor moisture conditions that are less conducive to hail formation.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e"},{"header":"3 Hazard Zoning","content":"\u003cdiv id=\"Sec11\" class=\"Section2\"\u003e \u003ch2\u003e3.1 Construction of Training Set\u003c/h2\u003e \u003cp\u003eAs can be seen from Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e, most observation stations recorded fewer than 20 hail days, while the Yunnan flue-cured tobacco field period in the climate baseline year totals 4590 days (153 days per season \u0026times; 30 years). Hail days are therefore very few compared with non-hail days. If such a dataset were used directly for training, the class imbalance would bias the model toward identifying non-hail days, which is contrary to our goal. Commonly used methods for handling unbalanced datasets include oversampling and undersampling \u003csup\u003e[\u003cspan citationid=\"CR54\" class=\"CitationRef\"\u003e54\u003c/span\u003e, \u003cspan citationid=\"CR55\" class=\"CitationRef\"\u003e55\u003c/span\u003e]\u003c/sup\u003e. Oversampling increases the number of samples in the minority class to approximate that of the majority class. Undersampling reduces the number of samples in the majority class to bring it closer to that of the minority class. 
Due to the substantial discrepancy in the number of hail days and non-hail days, it is difficult to construct an ideal training set regardless of whether random oversampling or synthetic sample oversampling is used \u003csup\u003e[\u003cspan citationid=\"CR56\" class=\"CitationRef\"\u003e56\u003c/span\u003e, \u003cspan citationid=\"CR57\" class=\"CitationRef\"\u003e57\u003c/span\u003e]\u003c/sup\u003e. Thus, this study employs the undersampling method of randomly deleting non-hail day samples to construct a relatively balanced hail training set, resulting in approximately 2800 samples.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec12\" class=\"Section2\"\u003e \u003ch2\u003e3.2 Factor Selection\u003c/h2\u003e \u003cp\u003eNot all the preliminarily selected severe convection physical quantity \u0026ldquo;ingredients\u0026rdquo; and disaster-prone environmental factors can effectively identify hail days, so factor screening is needed. In machine learning, impurity measures the degree of disorder or complexity and helps determine data division strategies for efficient model building. When a factor can significantly reduce impurity when splitting a node, it indicates that the factor contributes more to the model. 
This study used the Gini impurity analysis in random forests \u003csup\u003e[\u003cspan citationid=\"CR58\" class=\"CitationRef\"\u003e58\u003c/span\u003e, \u003cspan citationid=\"CR59\" class=\"CitationRef\"\u003e59\u003c/span\u003e]\u003c/sup\u003e, ultimately selecting CAPE, LI, and 0\u0026deg;C level height as strong convective \u0026ldquo;ingredients\u0026rdquo;, and elevation as the disaster-prone environmental factor.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec13\" class=\"Section2\"\u003e \u003ch2\u003e3.3 Regional Modeling\u003c/h2\u003e \u003cp\u003eFigure \u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003ea displays the average distribution of hail days across different elevations during Yunnan tobacco field period, showing a clear correlation between hail days and elevation. As elevation rises, hail days increase significantly, especially above 1000m. For every 250m increase in altitude, the average number of hail days increases by about 0.3 days. In order to build a more accurate hail day determination model, this study divides the province into low-altitude areas (\u0026lt;\u0026thinsp;1500m), medium-altitude areas (1500m\u0026thinsp;~\u0026thinsp;2000m) and high-altitude areas (\u0026gt;\u0026thinsp;2000m) based on a 1 km \u0026times; 1 km grid. Figure\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eb illustrates these divisions: low-altitude zones primarily encompass southern low mountains, valleys, and plains; high-altitude zones concentrate in the northwest and northeast mountain ranges; the remaining areas fall into the medium-altitude zone. 
Accordingly, the 125 national meteorological stations are categorized by altitude for separate modeling: 14 stations in the high-altitude zone, 55 in the medium-altitude zone, and 56 in the low-altitude zone.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec14\" class=\"Section2\"\u003e \u003ch2\u003e3.4 Model Training\u003c/h2\u003e \u003cp\u003eFirst, elevations for the 125 national meteorological stations were taken from the DEM data. Subsequently, ERA5 reanalysis data were used to compute CAPE, LI, and 0\u0026deg;C level height at 14:00 on the same day for each record in the training set. Next, the parameters of the random forest model, including the number of trees, maximum tree depth, minimum sample count for node splits, and minimum samples in leaf nodes, were repeatedly adjusted through 4-fold cross-validation to optimize the hail day discrimination models for the low, medium, and high-altitude zones. All three final models exceeded 88% accuracy in testing. Among them, the high-altitude region model showed slightly lower accuracy, under 90%, while the models for the medium and low-altitude regions approached 95%. The overall average accuracy reached 93%.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec15\" class=\"Section2\"\u003e \u003ch2\u003e3.5 Discrimination and Validation of Hail Days\u003c/h2\u003e \u003cp\u003eUsing the three trained hail day discrimination models, together with 1 km \u0026times; 1 km DEM grid elevation data and 1 km \u0026times; 1 km grid CAPE, LI, and 0\u0026deg;C level height calculated via bilinear interpolation from ERA5, daily grid-based hail discrimination was conducted for Yunnan\u0026rsquo;s tobacco field period in the baseline climate year. 
The cumulative results produced a spatial distribution map of hail days (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003ea), highlighting the relationship between hail days and large-scale topography: over 20 days in northwest high-altitude areas and generally under 10 days in southern low-altitude zones, with sparse occurrences in valleys like those of the Jinsha, Yuanjiang, and Nujiang Rivers, aligning well with station observations. The discrimination also reflected local terrain influences on hail, such as contrasting hail days between the Yuanmou Basin and its eastern Raying Mountain, where mountain areas recorded over 20 days, while the basin averaged below 2 days. Compared with the number of hail days observed at the stations (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eb), 49% of the stations exhibited errors within \u0026plusmn;\u0026thinsp;3 days and 24% within \u0026plusmn;\u0026thinsp;1 day. Moreover, the mean absolute error for all stations was 5.6 days, indicating satisfactory overall discrimination performance. At the same time, the discrimination model still has some shortcomings. Firstly, there is an apparent overestimation in northwest regions, with hail days exceeding 40 (over 10 days more than station observations). A possible reason is that the terrain in the high-altitude areas of the northwest is extremely complex, and the ERA5 grid is relatively coarse (0.25\u0026deg;\u0026times;0.25\u0026deg;), so it cannot accurately reflect the strong convection conditions of the micro-topography. Secondly, hail days in the eastern marginal areas are generally underestimated, falling below 10 days in some areas. A possible reason is that the strong convective ingredients used in this study are calculated at 14:00, while hail in parts of the eastern marginal areas occurred earlier or later. 
For example, there were 3 hail events before 14:00 in Luoping, and 2 hail events after 20:00 in Guangnan. The strong convection factors at 14:00 cannot faithfully represent conditions well before or after that time, causing deviations in predicted hail days.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec16\" class=\"Section2\"\u003e \u003ch2\u003e3.6 Correction and Zoning\u003c/h2\u003e \u003cp\u003eAnalysis of the station residuals (observed minus predicted hail days) shows that 64% of the stations in the province have negative residuals, almost twice as many as have positive ones, indicating an overall overestimation of hail days in Yunnan. This overestimation is concentrated mainly in the central and western areas. The residuals were spatially interpolated by kriging and added to the prediction map, yielding a corrected spatial distribution of hail days (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003ea). The corrected map agrees better with observations and accentuates the variation of hail days with topography. Specifically, hail days increased markedly along the eastern edge, notably in eastern Zhaotong and northeastern Wenshan, consistent with the frequent cold air intrusions in eastern and northeastern Yunnan, which foster strong convection and more hail. Meanwhile, hail days decreased in the northwest, especially in valleys and their margins, reflecting geographic barriers that hinder the ascent of air required for hail formation. Significant reductions also occurred in Dehong, Baoshan, and Lincang. 
Although water vapor is sufficient in the west in spring and summer, little cold air arrives from the east or from the plateau, so strong convection is sparse and hail rarely forms.\u003c/p\u003e \u003cp\u003eThe natural breakpoint method was used to classify the corrected hail days into a fine-scale hail hazard zoning for flue-cured tobacco (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eb). The zoning has four levels: high, medium-high, medium, and low hazard, assigned values of 0.8, 0.6, 0.4, and 0.2, respectively. The map reveals a north-high, south-low hazard distribution. Northern high-altitude areas exhibit high to medium-high hazard, while the low-altitude mountains and basins of the central south show medium to low hazard. In addition, some high mountains in the central south have generally higher hazard, while northern valleys and basins have relatively lower hazard. Notably, there are distinct differences across the Ailao Mountains, with the east generally more hazardous than the west, particularly on the eastern fringe. Most areas east of the Ailao Mountains are medium-hazard or above, while most areas to the west are low-hazard.\u003c/p\u003e \u003c/div\u003e"},{"header":"4 Suitability Zoning","content":"\u003cp\u003eBased on climate baseline data of monthly mean temperature, sunshine duration, and rainfall from 125 national meteorological stations, combined with geographical and topographical factors such as longitude, latitude, elevation, slope, and aspect, stepwise regression was used to estimate 1 km \u0026times; 1 km maps of July mean temperature, sunshine hours from July to August, and precipitation from April to September (the detailed modeling is not discussed here). The following patterns emerge. According to the climate baseline data, July\u0026rsquo;s average temperature is higher in the south and lower in the north. 
Regions in central and northern Yunnan above 2100 m have average temperatures below 18.5\u0026deg;C, while the low-altitude areas in the Yuanjiang Valley and the southeast exceed 26.5\u0026deg;C (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003ea). From July to August, sunshine hours are higher in the east and lower in the west. Most areas east of the Ailao Mountains receive more than 250 hours of sunshine, while the Nujiang area in the northwest receives the least, between 180 and 200 hours (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eb). Precipitation from April to September is high on the periphery and low in the center, with the southern and western edges exceeding 1400 mm, while central-western regions often fall below 800 mm and some areas below 450 mm (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003ec). The climate suitability of flue-cured tobacco in Yunnan was classified according to Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003ed). Except for the high-cold areas in the central and northern parts and the rainy areas on the southern and western edges, most of Yunnan is suitable for growing flue-cured tobacco. Suitability is generally better in the east than in the west. Most areas east of the Ailao Mountains have moderate temperatures, sufficient sunshine, and suitable precipitation during the flue-cured tobacco growing period. They are therefore classified mainly as suitable or most suitable areas, with the most suitable areas accounting for about half of the total area. 
Conversely, most areas west of the Ailao Mountains, despite suitable temperatures, have slightly insufficient sunshine and excessive rainfall, and fall mainly into the sub-suitable and suitable zones. Overall, in terms of both plantable area and suitability, Yunnan is highly suitable for flue-cured tobacco cultivation.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e"},{"header":"5 Risk Zoning","content":"\u003cp\u003eAccording to the method described in Section \u003cspan refid=\"Sec8\" class=\"InternalRef\"\u003e1.2.5\u003c/span\u003e, the climate suitability zoning of tobacco was converted into a comprehensive zoning reflecting tobacco exposure and vulnerability, and values were assigned to each class. Formula (1) was then used to calculate a fine-scale hail disaster risk index for each square-kilometer grid cell across the province. Using the natural breakpoint method, this index was divided into five risk levels: high, medium-high, medium, low, and no risk, yielding a fine-scale risk zoning of flue-cured tobacco hail disasters (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e). The figure shows that hail disaster risk for flue-cured tobacco in Yunnan is generally higher in the east and lower in the west, which differs markedly from the north-high, south-low hazard zoning. In the north-central regions, high altitudes and low temperatures make large areas unsuitable for flue-cured tobacco cultivation, so these areas are classified as no-risk zones because there is no exposure. Similarly, the rainy southern and western margins are unsuitable for flue-cured tobacco cultivation and are also classified as no-risk zones. In contrast, most of the medium- and higher-hazard areas east of the Ailao Mountains coincide with suitable or most suitable tobacco zones. Because their cultivation areas are large, losses are relatively large once hail strikes. 
They are therefore classified as medium-high or high-risk zones. The western side of the Ailao Mountains generally has lower risk: it consists mainly of suitable and sub-suitable zones with limited tobacco planting, and is typically classified as medium or lower risk. Across the province, the area shares of the risk levels are as follows: high-risk zones occupy 0.9%, medium-high-risk zones 9.4%, medium-risk zones 26.1%, low-risk zones 30.4%, and no-risk zones 33.3%. High and medium-high-risk zones thus cover small areas, while medium, low, and no-risk zones are similar in extent.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e"},{"header":"6 Discussion","content":"\u003cp\u003e(1) By applying the random forest machine learning model, selecting appropriate physical and environmental factors, and relying on a relatively balanced training set, a hail day discrimination model with a high recognition rate was trained. The model enabled a fine-scale depiction of the spatial distribution of hail days during the tobacco field period in Yunnan for the climate baseline year. This distribution reflects the influence of large-scale terrain on hail and also reveals the effect of local landforms, and the approach may be applicable to identifying other weather events.\u003c/p\u003e \u003cp\u003e(2) Because of the limitations of hail observation data and the coarse grid resolution of the ERA5 data used to calculate the strong convective ingredients, biases in the hail day discrimination results are inevitable. It is therefore necessary to correct the residuals of these estimates. The correction makes local extremes more prominent and the spatial distribution more objective and reasonable. 
As meteorological departments establish more anti-hail sites, hail damage can be mitigated effectively and more observational samples can be collected. These samples will provide crucial data support for future in-depth studies of hail disaster risk.\u003c/p\u003e \u003cp\u003eBecause of crop rotation and delayed collection of disaster data, it is difficult to obtain accurately the exposure of flue-cured tobacco (i.e., its spatial distribution) and its vulnerability to hail (i.e., disaster loss). This study therefore uses tobacco climate suitability as a comprehensive index of exposure and vulnerability. The rationale is that areas unsuitable or only moderately suitable for tobacco have a low probability of being planted (low exposure) and consequently suffer minor losses when hail strikes (low vulnerability). By contrast, areas with higher climate suitability are more likely to grow flue-cured tobacco (high exposure) and thus suffer greater losses from hail (high vulnerability).\u003c/p\u003e"},{"header":"7 Conclusion","content":"\u003cp\u003eIn the northern high-altitude regions, hail hazard is generally medium-high to high. However, because of low temperatures and insufficient heat, these areas are unsuitable for flue-cured tobacco cultivation and are therefore categorized as no-risk zones owing to the lack of exposure. Similarly, although the southern and western margins have lower hail hazard, excessive rainfall during the tobacco field period can cause abnormal root and leaf growth along with pest problems, so they are classified as unsuitable and no-risk zones. Most other areas are cultivable; suitability is better in the east than in the west, and hail hazard is higher in the east than in the west. Consequently, regions west of the Ailao Mountains generally have lower risk, predominantly medium or low. 
Meanwhile, risk is somewhat elevated east of the Ailao Mountains, particularly where high hazard and high suitability overlap, forming medium-high or high-risk zones. High and medium-high-risk zones together cover about 10% of the total provincial area. Medium, low, and no-risk zones each account for roughly 30% of the area. Given the overall low hail risk west of the Ailao Mountains, expanding the planting area of suitable tobacco varieties there may be considered. For eastern areas with high suitability but relatively high risk, protective measures such as artificial anti-hail interventions or hail nets should be implemented to minimize economic losses while maintaining cultivation scale.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eFunding\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis work was supported by Yunnan Science and Technology Planning Project (2018BC007).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthor Contribution\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eYZ participated in sample collection, optimization of analysis and manuscript writing. YL participated in sample collection, optimization of analysis and manuscript writing. XH provided field data. LZ participated in development of ideas and provided field data. PY participated in development of ideas and optimization of analysis. 
All authors read and approved the final manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eData availability\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eAll data generated or analysed during this study are included in this published article and its supplementary information file.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eEthical approval\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis study did not involve any protected or endangered species, and no specific permissions were required for sampling at the collection localities.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConsent to participate\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable.\u003cstrong\u003e\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConsent for publication\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting interests\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare no competing interests.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n \u003cli\u003eKim M, Lee J, Lee S. Hail:Mechanisms, Monitoring, Forecasting, Damages, Financial Compensation Systems, and Prevention[J]. Atmosphere.2023,14(11):1642. https://doi.org/10.3390/atmos14111642.\u003c/li\u003e\n \u003cli\u003eBotzen W,Bouwer L,Bergh J.Climate change and hailstorm damage:Empirical evidence and implications for agriculture and insurance[J],Resource and Energy Economics, 2010,32(3):341-362. 
https://doi.org/10.1016/j.reseneeco.2009.10.004.\u003c/li\u003e\n \u003cli\u003eMcMaster H.Hailstorm risk assessment in rural New South Wales[J].Natural Hazards,2001,24:187-196.\u0026nbsp;https://doi.org/10.1023/A:1011820206279.\u003c/li\u003e\n \u003cli\u003eIstrate V,Jitariu V,Ichim P,et al.Hailstorm risk assessment for crop areas in Moldova Region (Romania)[J].Present Environment and Sustainable Development,2021,15(2):55-67.\u003c/li\u003e\n \u003cli\u003eWang L,Hu G,Yue,Y,et al.GIS-Based Risk Assessment of Hail Disasters Affecting Cotton and Its Spatiotemporal Evolution in China[J].Sustainability,2016,8,218.\u0026nbsp;https://doi.org/10.3390/su8030218.\u003c/li\u003e\n \u003cli\u003eZhang S,Liu S,Zhang T.Analysis on the Evolution and Microphysical Characteristics of Two Consecutive Hailstorms in Spring in Yunnan, China[J].Atmosphere.2021,12(1): 63.https://doi.org/10.3390/atmos12010063.\u003c/li\u003e\n \u003cli\u003eZhang C,Zhang Q,Wang Y.Climatology of Hail in China:1961\u0026ndash;2005[J].Journal of Applied Meteorology and Climatology,2008,47:795\u0026ndash;804.https://doi.org/10.1175/2007JAMC1603.1.\u003c/li\u003e\n \u003cli\u003eDong Y,Hua S,Chen B,et al.Numerical simulation of a pulse hailstorm in the plateau region in southwestern China[J].Atmospheric Research,2024,299:107218.https://doi.org/10.1016/j.atmosres.2023.107218.\u003c/li\u003e\n \u003cli\u003eTao, Y., Duan, X., Duan, C., et al. Characteristics of Hail Changes in Yunnan [J]. Plateau Meteorology, 2011, 30(04): 1108-1118.\u003c/li\u003e\n \u003cli\u003eYin, L.Y., Mei, H., Zhang, T.F., et al. Analysis of Fine-scale Hail Disaster Risk Zoning in Yunnan Province [J]. 
Journal of Catastrophology, 2022, 37(03): 99-105.\u003c/li\u003e\n \u003cli\u003eTie J, Li S,He W,et al.Study of metabolite differences of flue-cured tobacco from Canada (CT157) and Yunnan (Yunyan 87)[J].\u0026nbsp;Heliyon,2024,10(11).\u0026nbsp;https://doi.org/10.1016/j.heliyon.2024.e32417.\u003c/li\u003e\n \u003cli\u003eEng I.Agglomeration and the Local State: the Tobacco Economy of Yunnan,China[J].Transactions of the Institute of British Geographers,1999,24(3):315-329. https://doi.org/10.1111/j.0020-2754.1999.00315.x.\u003c/li\u003e\n \u003cli\u003eHuang, Z.Y., Kong, G.H., Ni, X., et al. Climatic Demonstration for the Substitution of Imported Tobacco with Flue-Cured Tobacco in Some Tobacco Areas of Yunnan [J]. Journal of Natural Resources, 2011, 26(09): 1592-1602.\u003c/li\u003e\n \u003cli\u003eYin Y,Wang J,Zhao J,et al.Risk Assessment of Hail Disaster on Cotton \u0026mdash; A Case Study in Anhui Province[J].Agricultural Science \u0026amp; Technology,2012,13(8):1744-1748.\u003c/li\u003e\n \u003cli\u003eLeigh R,Kuhnel I.Hailstorm Loss Modelling and Risk Assessment in the Sydney Region[J], Australia.Natural Hazards,2001,24, 171\u0026ndash;185. https://doi.org/10.1023/A:1011855801345.\u003c/li\u003e\n \u003cli\u003eDaradur M,Leah T,Pandey R,et al.Variability and Risk Assessment of Hail in the Republic of Moldova[J].Present Environment \u0026amp; Sustainable Development,2016,10(2):141-152. https://doi.org/10.1515/pesd-2016-0032.\u003c/li\u003e\n \u003cli\u003eLi, M., Zhu, Y., Ji, W.J. Risk Assessment of Hail Disasters in Yunnan Tobacco Areas Based on GIS [J]. Chinese Agricultural Meteorology, 2012, 33(01): 129-133.\u003c/li\u003e\n \u003cli\u003eZhang, H., Liu, X.L., Fang, P. Risk Assessment of Hail Disasters in the Main Production Areas of Flue-Cured Tobacco in Sichuan [J]. 
Meteorological Science and Technology, 2016, 44(3): 468-473.\u003c/li\u003e\n \u003cli\u003ePrein A F,Holland G J.Global estimates of damaging hail hazard[J].Weather and Climate Extremes,2018,22:10-23.https://doi.org/10.1016/j.wace.2018.10.004.\u003c/li\u003e\n \u003cli\u003eSama B,Uma K N, Das S K.A comprehensive assessment of the temporal and spatial variation of hail producing convective storms over eastern India using weather radar[J].Climate Dynamics,2024,62:4749-4773. https://doi.org/10.1007/s00382-023-07059-0.\u003c/li\u003e\n \u003cli\u003eZhang, W.H., Li, L. Preliminary Application of Artificial Intelligence in Hail Identification and Nowcasting [J]. Acta Meteorologica Sinica, 2019, 77(02): 282-291.\u003c/li\u003e\n \u003cli\u003ePulukool F,Li L,Liu C.Using Deep Learning and Machine Learning Methods to Diagnose Hailstorms in Large-Scale Thermodynamic Environments[J].Sustainability,2020,12(24).\u0026nbsp;https://doi.org/10.3390/su122410499.\u003c/li\u003e\n \u003cli\u003ePullman M,Gurung I,Maskey M,et al.Applying Deep Learning to Hail Detection: A Case Study[J].IEEE Transactions on Geoscience and Remote Sensing,2019,57(12):10218-10225.\u0026nbsp;https://doi.org/10.1109/TGRS.2019.2931944.\u003c/li\u003e\n \u003cli\u003eYao H, Li X, Pang H,et al.Application of random forest algorithm in hail forecasting over Shandong Peninsula[J].Atmospheric Research,2020,244.\u0026nbsp;https://doi.org/10.1016/j.atmosres.2020.105093.\u003c/li\u003e\n \u003cli\u003eCzernecki B,Taszarek M,Marosz M,et al.Application of machine learning to large hail prediction - The importance of radar reflectivity, lightning occurrence and convective parameters derived from ERA5[J].Atmospheric Research,2019,227:249-262. 
https://doi.org/10.1016/j.atmosres.2019.05.010.\u003c/li\u003e\n \u003cli\u003eTorralba V,H\u0026eacute;nin R,Cantelli A,et al.Modelling hail hazard over Italy with ERA5 large-scale variables[J].Weather and Climate Extremes,2023,39.\u0026nbsp;https://doi.org/10.1016/j.wace.2022.100535.\u003c/li\u003e\n \u003cli\u003eDoswell III C A,Brooks H E,Maddox R A.Flash Flood Forecasting: an Ingredients-Based Methodology[J].Weather and Forecasting,1996,11(4):560-581.\u0026nbsp;https://doi.org/10.1175/1520-0434(1996)011\u0026lt;0560:FFFAIB\u0026gt;2.0.CO;2.\u003c/li\u003e\n \u003cli\u003eEvans M,Jurewicz M.Correlations Between Analyses and Forecasts of Banded Snow Ingredients and Observed Snowfall[J]. Weather and Forecasting,2009,24(1):337-350.\u0026nbsp;https://doi.org/10.1175/2008WAF2007105.1.\u003c/li\u003e\n \u003cli\u003eAllen J,Karoly D,Walsh K.Future Australian Severe Thunderstorm Environments.Part II: The Influence of a Strongly Warming Climate on Convective Environments[J].Journal of Climate,2014,27(10),3848\u0026ndash;3868.\u0026nbsp;https://doi.org/10.1175/JCLI-D-13-00426.1.\u003c/li\u003e\n \u003cli\u003eTian F,Xia K,Sun J,et al.Ingredients-based Methodology and Fuzzy Logic Combined Short-Duration Heavy Rainfall Short-Range Forecasting:An Improved Scheme[J].Journal of Tropical Meteorology,2024,30(3):241-256.\u0026nbsp;https://doi.org/10.3724/j.1006-8775.2024.022.\u003c/li\u003e\n \u003cli\u003eRaupach T H,Soderholm J S,Warren R A.Changes in Hail Hazard Across Australia: 1979-2021[J].npj climate and atmospheric science, 2023,6.\u0026nbsp;https://doi.org/10.1038/s41612-023-00454-8.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eKahraman A,Kendon E J,Fowler H J.Climatology of Severe Hail Potential in Europe Based on A Convection-Permitting Simulation[J]. Climate Dynamics,2024,62:6625-6642. 
https://doi.org/10.1007/s00382-024-07227-w.\u003c/li\u003e\n \u003cli\u003ePrein A F, Holland G J.Global estimates of damaging hail hazard[J].Weather and Climate Extremes,2018,22:10-23.\u0026nbsp;https://doi.org/10.1016/j.wace.2018.10.004.\u003c/li\u003e\n \u003cli\u003eL\u0026oacute;pez L, Marcos J L, S\u0026aacute;nchez J L,et al.CAPE Values and Hailstorms on Northwestern Spain[J].Atmospheric Research, 2001, 56(1):147-160.\u0026nbsp;https://doi.org/10.1016/S0169-8095(00)00095-8.\u003c/li\u003e\n \u003cli\u003eManzato A .Hail in Northeast Italy: Climatology and Bivariate Analysis with the Sounding-Derived Indices[J].Journal of Applied Meteorology and Climatology, 2012,51(3):449-467.\u0026nbsp;https://doi.org/10.1175/JAMC-D-10-05012.1.\u003c/li\u003e\n \u003cli\u003eBillet J, Delisi M , Smith B G,et al.Use of Regression Techniques to Predict Hail Size and the Probability of Large Hail[J].Weather and Forecasting,1997,12(1):154-164.\u0026nbsp;https://doi.org/10.1175/1520-0434(1997)012\u0026lt;0154:UORTTP\u0026gt;2.0.CO;2.\u003c/li\u003e\n \u003cli\u003eLiu, B., Zou, L.Y., Li, X.P., et al. Analysis of Environmental Conditions for Thunderstorm Winds in Yunnan [J]. Meteorology, 2022, 48(11): 1402-1417.\u003c/li\u003e\n \u003cli\u003ePu, W.Y., Li, H.B., Song, Y., et al. Analysis and Application of the Influence of 0\u0026deg;C Layer Height Changes on Hail Melting [J]. 
Meteorology, 2015, 41(8): 980-985.\u003c/li\u003e\n \u003cli\u003eVinet F.Climatology of Hail in France[J].Atmospheric Research,2001,56(1):309-323.https://doi.org/10.1016/S0169-8095(00)00082-X.\u003c/li\u003e\n \u003cli\u003eKunz M.\u0026nbsp;High-Resolution Assessment of the Hail Hazard over Complex Terrain from Radar and Insurance Data[J].\u0026nbsp;Meteorologische Zeitschrift,2010,19(5):427-439.\u0026nbsp;https://doi.org/10.1127/0941-2948/2010/0452.\u003c/li\u003e\n \u003cli\u003eJin H,Lee H,Lkhamjav J,et al.A hail climatology in South Korea[J].Atmospheric Research,2017,188:90-99.\u0026nbsp;https://doi.org/10.1016/j.atmosres.2016.12.013.\u003c/li\u003e\n \u003cli\u003eGdeeb R T.Weather Classification Using Meta-Based Random Forest Fusion of Transfer Learning Models[J].\u0026nbsp;International Journal of Advances in Intelligent Informatics,2024,10(2):186-201. https://doi.org/10.26555/ijain.v10i2.1264.\u003c/li\u003e\n \u003cli\u003eWang, J., Liu, L.P. Analysis of the Relationship between Hail Distribution and Topographic Factors in Guizhou Province Based on GIS [J]. Journal of Applied Meteorology, 2008, 19(5): 627-634.\u003c/li\u003e\n \u003cli\u003eLiu, X.L., Zhang, Y., Liu, J.X. Temporal and Spatial Characteristics of Hail Disasters in Southwestern Sichuan Mountains [J]. Arid Meteorology, 2016, 34(01): 75-81.\u003c/li\u003e\n \u003cli\u003eNoi P T,Degener J,Kappas M.Comparison of Multiple Linear Regression, Cubist Regression, and Random Forest Algorithms to Estimate Daily Air Surface Temperature from Dynamic Combinations of MODIS LST Data[J].Remote Sensing,2017,9(5). https://doi.org/10.3390/rs9050398.\u003c/li\u003e\n \u003cli\u003eDu X, Lin X.Conceptual Model on Regional Natural Disaster Risk Assessment[J].Procedia Engineering,2012,45:96-100.\u0026nbsp;https://doi.org/10.1016/j.proeng.2012.08.127.\u003c/li\u003e\n \u003cli\u003eLi, X.H. Application of Random Forest Model in Classification and Regression Analysis [J]. 
Chinese Journal of Applied Entomology, 2013, 50(04): 1190-1197.\u003c/li\u003e\n \u003cli\u003eHuang, Z.Y., Zhu, Y., Wang, S.H., et al. The Relationship Between Intrinsic Quality of Yunnan Flue-Cured Tobacco and Climate [J]. Resources Science, 2007(02): 83-90.\u003c/li\u003e\n \u003cli\u003eHuang, Z.Y., Zhu, Y., Deng, Y.L., et al. Influence of Field Period Climate on Tobacco Leaf Quality in Yunnan [J]. Chinese Agricultural Meteorology, 2008(04): 440-445+449.\u003c/li\u003e\n \u003cli\u003eHu, X.Q., Huang, Z.Y., Zhu, Y., et al. Study on Climate Types and Suitability of Yunnan Flue-Cured Tobacco [J]. Journal of Nanjing Institute of Meteorology, 2006(04): 563-568.\u003c/li\u003e\n \u003cli\u003eFuchs S,Birkmann J,Glade T.Vulnerability assessment in natural hazard and risk analysis: current approaches and future challenges[J].Natural Hazards, 2012, 64:1969\u0026ndash;1975.\u0026nbsp;https://doi.org/10.1007/s11069-012-0352-9.\u003c/li\u003e\n \u003cli\u003eWard P J, Blauhut V, Bloemendaal N,et al.Review article: Natural hazard risk assessments at the global scale[J]. Natural Hazards and Earth System Sciences,2020,20(4):1069-1096.https://doi.org/10.5194/nhess-20-1069-2020.\u003c/li\u003e\n \u003cli\u003eZhang, G.C. Principles and Methods of Natural Disaster Risk Assessment and Zoning [M]. Meteorological Press, 2014.\u003c/li\u003e\n \u003cli\u003eSpelmen V S, Porkodi R.A Review on Handling Imbalanced Data[J].2018 International Conference on Current Trends towards Converging Technologies (ICCTCT),2018:1-11.DOI:10.1109/ICCTCT.2018.8551020.\u003c/li\u003e\n \u003cli\u003eYap B W,Rani K A, Rahman H A A,et al. 
An Application of Oversampling, Undersampling, Bagging and Boosting in Handling Imbalanced Datasets[C] Herawan T,Deris M M,Abawajy J.Proceedings of the First International Conference on Advanced Data and Information Engineering (DaEng-2013).Singapore:Springer,2014:13-22.https://doi.org/10.1007/978-981-4585-18-7_2.\u003c/li\u003e\n \u003cli\u003eBej S,Davtyan N,Wolfien M,et al.LoRAS: An oversampling approach for imbalanced datasets[J].Machine Learning,2021,110:279-301.https://doi.org/10.1007/s10994-020-05913-4.\u003c/li\u003e\n \u003cli\u003eChawla N V,Bowyer K W, Hall L O,et al.SMOTE: Synthetic Minority Over-sampling Technique[J].Journal of Artificial Intelligence Research,2002,16:321-357.\u0026nbsp;https://doi.org/10.1613/jair.953.\u003c/li\u003e\n \u003cli\u003eYuan Y, Wu L, Zhang X, Gini-Impurity Index Analysis[J]. IEEE Transactions on Information Forensics and Security, 2021,16:3154-3169.\u0026nbsp;https://doi.org/10.1109/TIFS.2021.3076932.\u003c/li\u003e\n \u003cli\u003eStrobl C, Boulesteix A,Augustin T.Unbiased split selection for classification trees based on the Gini Index[J].Computational Statistics \u0026amp; Data Analysis,2007,52(1):483-501.https://doi.org/10.1016/j.csda.2006.12.030.\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"","lastPublishedDoi":"10.21203/rs.3.rs-8599617/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-8599617/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eThis study explores the thermal, dynamic and environmental conditions for hail formation, and proposes a machine learning method based on random forest models. The methodology incorporates strong convective ingredients such as CAPE, LI, 0°C level height, -20°C level height, along with disaster-prone environmental factors like altitude, slope, and aspect, to objectively analyze hail days. After steps such as training set construction, factor selection, regional modeling, and model training, the Yunnan hail day discrimination model was developed. By integrating DEM elevation data and ERA5 reanalysis-derived convective ingredients, a distribution map of hail days during the climate baseline year over the 1 km × 1 km grid of Yunnan flue-cured tobacco fields was generated. After verification and correction, the natural breakpoint method was used to achieve the refined zoning for hail disaster risk in Yunnan flue-cured tobacco. It was found that the hail disaster risk presents a higher tendency in the north compared to the south, with northern high-altitude areas predominantly at high or moderately high risk, while the low-altitude central and southern mountainous regions display moderate to low risk. 
Additionally, by employing climatic factors such as average temperature in July, sunshine duration from July to August, and rainfall from April to September, alongside geographical and topographical elements, precise climatic suitability zoning for Yunnan flue-cured tobacco was accomplished. This zoning represents a comprehensive analysis of exposure and vulnerability. Combined with the refined risk zoning of flue-cured tobacco hail disasters, the refined risk zoning of hail disasters in Yunnan flue-cured tobacco was finally achieved. The results indicated an overall east-high, west-low risk pattern: areas east of the Ailao Mountains showed higher risk, while those west exhibited moderate to lower risk. Notably, the high-cold zones in the north-central region and the rainy margins of the south and west demonstrated no risk. This study aims to provide scientific guidance for rational layout and hail prevention work during the field period of Yunnan flue-cured tobacco.\u003c/p\u003e","manuscriptTitle":"Research on Fine-Scale Risk Zoning of Hail Disasters in Yunnan Flue-Cured Tobacco Based on Machine Learning","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2026-01-23 12:00:23","doi":"10.21203/rs.3.rs-8599617/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"e3fe6c73-f12f-4f66-9b77-eb6a05fe7fde","owner":[],"postedDate":"January 23rd, 2026","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[],"tags":[],"updatedAt":"2026-03-08T10:09:06+00:00","versionOfRecord":[],"versionCreatedAt":"2026-01-23 12:00:23","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-8599617","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-8599617","identity":"rs-8599617","version":["v1"]},"buildId":"XKTyCvWXoU3ODBz1xrDgd","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
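
The residual correction described in Section 3.6 (interpolating station residuals in space and adding them to the predicted hail-day map) can be illustrated with a minimal sketch. The paper interpolates residuals by kriging; the sketch below swaps in inverse-distance weighting (IDW) purely to stay dependency-free, so it is not the authors' method. The function names, the planar kilometre coordinates, the zero floor on corrected values, and the station values are all hypothetical and chosen only for illustration.

```python
import math


def idw_residual(x, y, stations, power=2.0):
    """Inverse-distance-weighted residual at point (x, y).

    stations: non-empty list of (sx, sy, observed_days, predicted_days).
    IDW is a simple stand-in here; the paper itself uses kriging.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, observed, predicted in stations:
        residual = observed - predicted  # negative: the model overestimated
        dist = math.hypot(x - sx, y - sy)
        if dist == 0.0:
            return residual  # exactly on a station: use its own residual
        weight = 1.0 / dist ** power
        weighted_sum += weight * residual
        weight_total += weight
    return weighted_sum / weight_total


def correct_hail_days(predicted, x, y, stations):
    """Add the interpolated residual to one grid-cell prediction.

    The floor at zero is an assumption of this sketch, not from the paper.
    """
    return max(0.0, predicted + idw_residual(x, y, stations))


# Hypothetical stations: (x_km, y_km, observed hail days, predicted hail days)
stations = [
    (0.0, 0.0, 30.0, 42.0),    # model overestimates by 12 days
    (100.0, 0.0, 12.0, 8.0),   # model underestimates by 4 days
]

# A cell located on the first station is pulled back to the observed value.
print(correct_hail_days(42.0, 0.0, 0.0, stations))

# A cell midway between the stations receives the average of both residuals.
print(correct_hail_days(20.0, 50.0, 0.0, stations))
```

With kriging in place of IDW, the weights would come from a fitted variogram rather than from distance alone, but the final step (prediction plus interpolated residual) keeps the same form.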