Assessing Air Quality in Tehran: Explainable Artificial Intelligence for Megacities

Research Article

Alireza Faghani Ghodrat, Somayeh Rezaei Sough

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-8291122/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted (Version 1)

Abstract

Urban air pollution poses a persistent threat to public health and environmental sustainability in megacities like Tehran, where complex emission sources and topographical constraints amplify exposure risks. This study presents a transparent, interpretable, and highly accurate machine learning framework for district-level air quality assessment, leveraging a decade of high-resolution hourly monitoring data (April 2015–April 2025) across six socio-spatially stratified districts.
Using an ensemble of tree-based models (XGBoost, CatBoost, and LightGBM), the framework forecasts concentrations of six key pollutants (PM₂.₅, PM₁₀, NO₂, CO, SO₂, O₃) with exceptional fidelity (MAE: 3.0–9.0 µg/m³; R²: 0.65–0.91). Through SHAP-based explainable Artificial Intelligence, the model identifies seasonally dynamic, district-specific pollution drivers, such as CO/NO₂ dominance in traffic corridors during winter and O₃-driven photochemical regimes in receptor basins during summer, revealing signatures that align with Tehran’s known emission geography. By converting pollutant forecasts into U.S. EPA Air Quality Index (AQI) categories, the framework provides spatially resolved estimates of population-level health risk exposure, from Moderate to Hazardous conditions, enabling targeted public health interventions. Designed for scalability and immediate policy utility, this work delivers a reproducible blueprint for data-driven air quality governance. By demonstrating that high-impact forecasting is achievable even in data-constrained settings, the study offers a globally transferable model for healthier, more resilient megacities, particularly in low- and middle-income regions where monitoring infrastructure is limited but governance needs are urgent.

Keywords: Artificial Intelligence and Machine Learning; Environmental Engineering; Air Pollution; Machine Learning; Explainable Artificial Intelligence (XAI); Urban Environment

1. Introduction

Tehran, the capital of Iran and a major Middle Eastern metropolis with over 15 million residents in its greater urban region, faces intensifying air quality challenges driven by rapid urbanization, industrial expansion, and escalating vehicular emissions.
Enclosed by the Alborz mountain range, the city’s topography acts as a natural barrier that traps pollutants near the surface, amplifying health risks from elevated concentrations of PM₂.₅, PM₁₀, NO₂, CO, SO₂, and O₃, substances strongly linked to respiratory and cardiovascular diseases (Lelieveld et al., 2015; Pope & Dockery, 2006). Globally, air pollution ranks among the leading causes of premature mortality, with low- and middle-income countries such as Iran bearing a disproportionate burden (Lelieveld et al., 2015; World Health Organization, 2021). In Tehran, deteriorating air quality has been correlated with increased hospital admissions for asthma and chronic obstructive pulmonary disease, underscoring the urgent need for robust, data-driven forecasting tools to protect public health and inform evidence-based environmental governance.

Long-term monitoring is essential to fully capture the complexity of urban air pollution, including seasonal variability, multiyear trends, and the effects of major disruptions such as regulatory measures or global events like the COVID-19 pandemic (Le et al., 2020; Shi & Brasseur, 2020). Tehran’s pollution dynamics are further shaped by recurring meteorological phenomena, particularly winter temperature inversions that confine pollutants near the surface (Nejad et al., 2023). This intricate interplay between anthropogenic emissions and atmospheric processes necessitates high-resolution, spatially disaggregated analysis to reveal localized exposure patterns that city-wide averages inevitably obscure.

This study presents a decade-long, district-level assessment of Tehran’s air pollution (April 2015–April 2025), one of the most comprehensive evaluations to date. Six districts (1, 2, 6, 14, 15, and 19) were strategically selected to represent the city’s spatial, socioeconomic, and emission heterogeneity.
District 1 corresponds to the northern residential/foothill zone, characterized by low local emissions but high sensitivity to topographically trapped and regionally transported pollutants. District 2 serves as a major high-traffic corridor in the northwest, intersected by multiple expressways and dominated by mobile-source emissions. District 6 encompasses central Tehran’s administrative and commercial core, marked by intense vehicular activity and dense urban morphology. District 14, in the southeast, typifies high-density residential neighborhoods influenced by pollutant transport from adjacent industrial areas. District 15, located in the far south, hosts Tehran’s primary industrial cluster, generating substantial stationary-source and heavy-vehicle emissions. Finally, District 19, situated in the low-lying south-southwest basin, functions as a receptor zone highly prone to pollutant accumulation. This spatial stratification provides a controlled analytical framework that uncovers district-specific pollution regimes otherwise masked in aggregated city-level statistics.

We develop a transparent, reproducible, and highly interpretable machine learning framework based on three state-of-the-art tree-based ensemble models: XGBoost, CatBoost, and LightGBM. These models were selected for their proven accuracy, computational efficiency, and native support for SHAP (SHapley Additive exPlanations)-based interpretability (Chen & Guestrin, 2016; Lundberg & Lee, 2017). Using only historical pollutant concentrations as input, our approach avoids reliance on external covariates while maintaining scientific rigor and real-world deployability. Critically, SHAP analysis transforms predictive outputs into actionable, physically interpretable insights, revealing the dominant pollution drivers in each district and season and thereby advancing recent developments in explainable AI (XAI) for environmental monitoring (Wu et al., 2022).
This work delivers five key contributions:

A Decade of High-Resolution Data Across Strategically Selected Districts: We analyze 10 years of hourly monitoring data from six socio-spatially diverse districts, enabling robust modeling of long-term trends and heterogeneous urban responses across Tehran’s complex fabric.

A Highly Accurate and Interpretable Modeling Framework: Our ensemble of tree-based models achieves precise short-term forecasting while ensuring full transparency through SHAP-based explainability, offering a reproducible alternative to complex deep learning or hybrid architectures that often require extensive external covariates (Lin et al., 2022; Peng et al., 2022; Zaini et al., 2022; Xu et al., 2022).

Context-Aware Pollution Regime Identification: SHAP analysis uncovers interpretable, spatiotemporally dynamic pollution signatures that reflect local emission sources, seasonal patterns, and urban form, providing evidence-based guidance for precision environmental policy.

Quantification of Real-World Policy Impacts: We capture significant air quality improvements during the 2020–2022 mobility restrictions and their post-pandemic rebound, demonstrating the model’s sensitivity to societal interventions, consistent with global observations of pandemic-era air quality shifts (Le et al., 2020; Taghizadeh et al., 2023).

A Scalable, Replicable Blueprint for Global Megacities: The end-to-end pipeline, from preprocessing to prediction and explanation, is designed for adaptation in other data-constrained urban environments, particularly in low- and middle-income regions where monitoring infrastructure is limited yet governance needs are urgent (Taghizadeh et al., 2023).
By advancing a transparent, data-driven, and policy-ready framework, this study directly informs environmental decision-making in Tehran and provides a transferable model for cities worldwide seeking to transform air quality monitoring data into actionable intelligence for healthier, more sustainable urban futures. Our work demonstrates how existing monitoring infrastructure can be leveraged to generate high-resolution, policy-relevant exposure intelligence through modern data science.

2. Methodology

2.1 Data Source

Hourly air quality data were obtained from the Tehran Air Quality Monitoring System, operated by the Tehran Environmental Monitoring Center. The dataset includes measurements of six key pollutants (PM₂.₅, PM₁₀, NO₂, CO, SO₂, and O₃) collected from fixed monitoring stations in six districts: 1, 2, 6, 14, 15, and 19. These districts were selected to represent Tehran’s socio-spatial and emission heterogeneity:

District 1: northern residential/foothill zone (low local emissions, high sensitivity to regional transport)
Districts 2 and 14: high-traffic corridors (dominated by mobile-source emissions)
District 6: central administrative and commercial core (intense vehicular and energy-related activity)
District 15: southern industrial cluster (stationary-source and heavy-vehicle emissions)
District 19: south-southwest receptor basin (prone to pollutant accumulation)

Covering April 2015 to April 2025, this decade-long dataset enables robust long-term trend analysis, seasonal decomposition, and rigorous out-of-sample forecasting. Data quality assessments confirmed high completeness, with missing values systematically addressed during preprocessing.

2.2 Preprocessing

Data preprocessing ensured temporal consistency and modeling suitability.
Missing values were handled using a time-series-aware imputation strategy: linear interpolation for gaps ≤ 12 hours, followed by a 24-hour centered rolling mean for longer gaps, with global mean imputation as a final fallback. This approach preserved temporal autocorrelation while minimizing distortion of extreme events. Feature engineering focused on lagged predictors (t − 1) for each pollutant to capture short-term autocorrelation, a critical property of urban air quality time series. All features were normalized to the [0, 1] range using MinMaxScaler to stabilize model training. The Air Quality Index (AQI) was computed following U.S. EPA (2020) guidelines:

$$\mathrm{AQI}=\max\left(\text{Sub-Index}_{\mathrm{PM}_{2.5}},\ \text{Sub-Index}_{\mathrm{PM}_{10}},\ \text{Sub-Index}_{\mathrm{NO}_{2}},\ \text{Sub-Index}_{\mathrm{SO}_{2}},\ \text{Sub-Index}_{\mathrm{CO}},\ \text{Sub-Index}_{\mathrm{O}_{3}}\right)$$

where sub-indices were derived from EPA breakpoints.

2.3 Concentration Ranking and Seasonal Analysis

To characterize spatial and temporal pollution patterns, we computed descriptive statistics for all six pollutants:

Annual mean concentrations (2015–2025) for each district.
Seasonal mean concentrations, stratified into warm (March–September) and cold (October–February) periods to reflect Tehran’s distinct meteorological regimes (e.g., winter inversions and summer photochemistry).

These analyses provide essential context for interpreting model predictions and SHAP-based driver identification.
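The gap-filling cascade described in Section 2.2 can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the authors' code: the function name `fill_gaps`, the use of `None` for missing hours, and the decision to leave gaps at the start or end of a series to the rolling-mean and global-mean steps are all hypothetical choices.

```python
from statistics import fmean


def fill_gaps(values, max_interp_gap=12, window=24):
    """Fill None entries in an hourly series (hypothetical helper).

    Step 1: linear interpolation for interior gaps of <= max_interp_gap hours.
    Step 2: centered rolling mean (window hours wide) for what is still missing.
    Step 3: global mean of the observed values as a final fallback.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("series has no observed values")
    global_mean = fmean(observed)
    n = len(values)
    filled = list(values)

    # Step 1: walk over each run of missing hours and interpolate short ones.
    i = 0
    while i < n:
        if values[i] is not None:
            i += 1
            continue
        start = i
        while i < n and values[i] is None:
            i += 1
        end = i  # index of the first observed value after the gap
        gap = end - start
        if gap <= max_interp_gap and start > 0 and end < n:
            left, right = values[start - 1], values[end]
            for k in range(start, end):
                frac = (k - start + 1) / (gap + 1)
                filled[k] = left + frac * (right - left)

    # Steps 2 and 3: average the neighbours inside the centered window,
    # falling back to the global mean when the window holds no data.
    half = window // 2
    result = list(filled)
    for k in range(n):
        if filled[k] is None:
            lo, hi = max(0, k - half), min(n, k + half + 1)
            neighbours = [v for v in filled[lo:hi] if v is not None]
            result[k] = fmean(neighbours) if neighbours else global_mean
    return result
```

Reading neighbours from the interpolated series rather than from the partially filled result keeps already-imputed rolling means from feeding into later ones, which is one simple way to limit smoothing of long gaps.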
2.4 Model Architecture

A transparent and reproducible machine learning framework was developed using XGBoost (Chen & Guestrin, 2016), CatBoost, and LightGBM, state-of-the-art tree-based ensembles selected for their accuracy, computational efficiency, and native support for SHAP-based interpretability. All models were implemented as multi-output regressors to simultaneously predict the six pollutant concentrations. Input features consisted of the lag-1 values of all six pollutants; the target was the current-hour concentration vector. This design enables short-term forecasting using only historical pollutant data, ensuring broad applicability in urban settings equipped with monitoring infrastructure.

2.5 Optimization, Model Selection, and AQI Derivation

Model hyperparameters were optimized using Bayesian optimization (20 trials per model) to maximize out-of-sample R² on a validation set. For each district, the best-performing model was selected based on R² and MAE on the holdout test period (January 2024–April 2025). Air Quality Index (AQI) categories (e.g., Good, Moderate, Unhealthy) were derived post-hoc by applying U.S. EPA breakpoints to the predicted pollutant concentrations. Because the categories are computed directly from the forecasts, the public health outputs (AQI categories) stay aligned with the scientific forecasts (µg/m³), and no separate classification model is required.

2.6 Training Protocol

The dataset was partitioned chronologically to preserve temporal dynamics:

Training: April 2015 – December 2023
Test: January 2024 – April 2025

This split enables rigorous out-of-sample forecasting and avoids data leakage. All models were trained on the lag-1 feature set and evaluated on their ability to predict the next hour’s pollutant concentrations.

2.7 Forecasting, Evaluation, and Explainability

The best-performing model for each district was used to generate forecasts for the test period (January 2024–April 2025).
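The lag-1 design (Section 2.4) and the chronological split (Section 2.6) can be sketched as follows. This is an illustrative sketch only: the record layout, the function names, and the rule of skipping pairs that straddle a missing hour are assumptions, and the model-fitting step itself is left out.

```python
from datetime import datetime, timedelta

POLLUTANTS = ("PM2.5", "PM10", "NO2", "CO", "SO2", "O3")
TEST_START = datetime(2024, 1, 1)  # first hour of the holdout period in the text


def make_lag1_pairs(records):
    """Build (timestamp, features, targets) triples from hourly records.

    records: list of (timestamp, {pollutant: value}) sorted by time.
    Features are the six concentrations one hour earlier; targets are the
    six concentrations at the current hour.
    """
    pairs = []
    for (t_prev, x_prev), (t_now, x_now) in zip(records, records[1:]):
        if t_now - t_prev != timedelta(hours=1):
            continue  # assumption: drop pairs that straddle a missing hour
        features = [x_prev[p] for p in POLLUTANTS]
        targets = [x_now[p] for p in POLLUTANTS]
        pairs.append((t_now, features, targets))
    return pairs


def chronological_split(pairs, test_start=TEST_START):
    """Split by target timestamp so no test hour precedes a training hour."""
    train = [p for p in pairs if p[0] < test_start]
    test = [p for p in pairs if p[0] >= test_start]
    return train, test
```

Splitting on the timestamp of the target hour, rather than shuffling rows, is what keeps later observations out of the training set.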
Evaluation was conducted at two levels:

Pollutant-level regression performance, assessed using Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and R².
AQI-level classification performance, evaluated using accuracy, precision, recall, and F1-score on AQI categories derived from predicted pollutant concentrations via U.S. EPA breakpoints.

Explainability was provided through SHAP (SHapley Additive exPlanations) analysis, which quantified the contribution of each lagged pollutant feature to model predictions. Annual SHAP rankings were computed using the full training set, while seasonal SHAP rankings were derived by stratifying the training data into warm (March–September) and cold (October–February) periods. The framework’s sensitivity to real-world events was demonstrated through an analysis of pre/during/post-COVID pollutant trends.

3. Results

3.1 Annual and Seasonal Pollutant Concentrations

Table 1 presents the annual mean concentrations of six key pollutants across Tehran’s six socio-spatially stratified districts over the 2015–2025 period. District 19 (south-southwest receptor basin) records the highest concentrations of PM₂.₅ (48.2 µg/m³) and PM₁₀ (109.7 µg/m³) and the second-highest CO (2287.3 µg/m³), marking it as Tehran’s most polluted district in terms of particulate matter. This is consistent with its dense residential/commercial land use, high vehicular density, and topographical vulnerability to pollutant accumulation in the low-lying basin. District 2 (northwest high-traffic corridor) exhibits the highest CO (2534.8 µg/m³) and the second-highest NO₂ (90.1 µg/m³, after District 1 at 108.5 µg/m³), reflecting its role as a major transportation artery intersected by multiple expressways.
Table 1 Annual Mean Pollutant Concentrations (2015–2025)

| District | PM₂.₅ (µg/m³) | PM₁₀ (µg/m³) | NO₂ (µg/m³) | CO (µg/m³) | SO₂ (µg/m³) | O₃ (µg/m³) |
|---|---|---|---|---|---|---|
| 1 | 22.8 | 61.3 | 108.5 | 2062.3 | 13.6 | 42.7 |
| 2 | 40.9 | 89.6 | 90.1 | 2534.8 | 20.8 | 37.6 |
| 6 | 33.3 | 82.0 | 77.2 | 2254.5 | 16.9 | 43.0 |
| 14 | 35.1 | 87.4 | 84.5 | 1918.5 | 19.8 | 42.3 |
| 15 | 24.4 | 70.3 | 86.1 | 1485.7 | 14.8 | 46.2 |
| 19 | 48.2 | 109.7 | 73.2 | 2287.3 | 20.1 | 33.7 |

SO₂ concentrations peak in District 2 (20.8 µg/m³) and District 19 (20.1 µg/m³), suggesting contributions from industrial or energy-related combustion sources beyond mobile emissions alone. In contrast, District 15 (southern industrial cluster) shows the lowest CO and the second-lowest PM₂.₅ and PM₁₀ (after District 1), aligning with its profile as a regulated industrial zone that likely implements emission controls despite its stationary-source activity. Critically, all districts exceed the WHO’s annual PM₂.₅ guideline (5 µg/m³) by a factor of roughly 4.6 to 9.6, underscoring the severity and persistence of Tehran’s air quality crisis.

Table 2 reveals pronounced seasonal patterns shaped by Tehran’s semi-arid climate and recurrent meteorological phenomena:

During the cold season (October–February), concentrations of PM₂.₅, NO₂, CO, and SO₂ are elevated in every district, and PM₁₀ is elevated in all districts except District 15 (72.4 µg/m³ warm versus 67.3 µg/m³ cold). For example, District 19’s PM₂.₅ increases from 44.2 µg/m³ (warm) to 53.3 µg/m³ (cold), a rise driven by wintertime temperature inversions that suppress vertical dispersion and by increased demand for residential and commercial heating.
In the warm season (March–September), O₃ concentrations rise sharply due to enhanced photochemical activity under intense solar radiation. For instance, District 1’s O₃ jumps from 25.9 µg/m³ (cold) to 54.9 µg/m³ (warm), reflecting the formation of secondary pollutants from NOₓ and VOC precursors.
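The warm/cold stratification and the guideline comparison above reduce to simple arithmetic. The sketch below assumes rows of (district, timestamp, value); the helper names are hypothetical, and only the month boundaries and the 5 µg/m³ guideline come from the text.

```python
from collections import defaultdict
from datetime import datetime
from statistics import fmean

WHO_PM25_ANNUAL_GUIDELINE = 5.0  # µg/m³, as quoted in the text


def season_of(ts):
    """Warm = March to September, cold = October to February."""
    return "warm" if 3 <= ts.month <= 9 else "cold"


def seasonal_means(rows):
    """rows: iterable of (district, timestamp, value) for one pollutant."""
    buckets = defaultdict(list)
    for district, ts, value in rows:
        buckets[(district, season_of(ts))].append(value)
    return {key: fmean(vals) for key, vals in buckets.items()}


def who_exceedance_ratio(annual_mean_pm25):
    """How many times the annual mean exceeds the WHO guideline."""
    return annual_mean_pm25 / WHO_PM25_ANNUAL_GUIDELINE
```

Applied to the Table 1 extremes, the ratio is 22.8 / 5 = 4.56 for District 1 and 48.2 / 5 = 9.64 for District 19.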
These seasonal shifts exhibit notable spatial heterogeneity:

Traffic-heavy districts (2 and 14) display marked cold-season increases in NO₂ and CO (e.g., District 2’s CO rises from 2403.4 µg/m³ in the warm season to 2717.9 µg/m³ in the cold season), highlighting the compounding effect of traffic emissions and meteorological trapping.
Industrial districts (6 and 15) show only modest winter increases in particulate matter and maintain relatively stable SO₂ levels between seasons, suggesting regulated or consistent industrial combustion processes.
District 19, despite its high precursor (NOₓ) concentrations, exhibits relatively low summer O₃ (43.2 µg/m³) compared with the other districts (48.0–56.2 µg/m³), likely due to NO titration, where fresh NO emissions chemically destroy O₃, a hallmark of traffic-influenced photochemical regimes in receptor zones.

These empirically observed concentration dynamics are consistent with the SHAP-based driver rankings in Table 6:

In the cold season, SHAP identifies CO and NO₂ as dominant predictive features across most districts, mirroring their elevated concentrations and primary-emission origins.
In the warm season, O₃ and NO₂ emerge as top predictors, particularly in photochemical hotspots like District 19, directly reflecting the shift toward secondary pollution chemistry.

Thus, the concentration trends in Table 2 provide the physical foundation for the model’s interpretable insights, reinforcing the mechanistic coherence of our AI-driven air quality assessment framework.
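A driver ranking of the kind referred to above can be produced by averaging the absolute attribution of each feature over a set of predictions and sorting. The sketch below assumes the per-sample attributions (for example, SHAP values exported from an explainer) are already available as rows of numbers; the function name and the toy numbers are hypothetical, and the text does not state that mean absolute SHAP is the exact aggregation used for Table 6.

```python
from statistics import fmean


def rank_by_mean_abs_attribution(attributions, feature_names):
    """Rank features by mean absolute attribution.

    attributions: list of rows, one signed attribution per feature and sample.
    Returns (feature, score) tuples sorted from most to least influential.
    """
    scores = {
        name: fmean(abs(row[j]) for row in attributions)
        for j, name in enumerate(feature_names)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy attributions for three lagged features over two predictions.
names = ["CO_ugm3", "NO2_ugm3", "O3_ugm3"]
toy = [[0.08, -0.02, 0.01], [-0.06, 0.04, -0.03]]
ranking = rank_by_mean_abs_attribution(toy, names)
```

Taking absolute values before averaging matters: a feature that pushes predictions up in some hours and down in others would otherwise look unimportant.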
Table 2 Seasonal Mean Pollutant Concentrations (2015–2025)

| District | Season | PM₂.₅ (µg/m³) | PM₁₀ (µg/m³) | NO₂ (µg/m³) | CO (µg/m³) | SO₂ (µg/m³) | O₃ (µg/m³) |
|---|---|---|---|---|---|---|---|
| 1 | Warm | 20.2 | 58.7 | 103.7 | 1695.3 | 12.7 | 54.9 |
| 1 | Cold | 26.3 | 64.8 | 115.2 | 2568.6 | 14.9 | 25.9 |
| 2 | Warm | 37.1 | 88.2 | 80.7 | 2403.4 | 18.0 | 48.0 |
| 2 | Cold | 46.2 | 91.6 | 103.1 | 2717.9 | 24.6 | 23.2 |
| 6 | Warm | 27.1 | 81.2 | 67.6 | 2027.9 | 14.3 | 56.2 |
| 6 | Cold | 41.8 | 83.0 | 90.4 | 2563.1 | 20.4 | 24.9 |
| 14 | Warm | 29.3 | 86.6 | 79.6 | 1636.0 | 16.8 | 54.1 |
| 14 | Cold | 43.1 | 88.6 | 91.2 | 2307.1 | 24.0 | 26.1 |
| 15 | Warm | 23.4 | 72.4 | 75.6 | 1358.5 | 13.4 | 54.0 |
| 15 | Cold | 26.0 | 67.3 | 101.0 | 1668.2 | 16.8 | 35.0 |
| 19 | Warm | 44.2 | 104.3 | 69.4 | 2011.0 | 16.6 | 43.2 |
| 19 | Cold | 53.3 | 116.5 | 78.0 | 2631.0 | 24.6 | 21.9 |

3.2 Regression and AQI Classification Performance

Our framework achieves exceptional predictive accuracy for PM₂.₅ across all six socio-spatially stratified districts of Tehran, with Mean Absolute Error (MAE) ranging from 3.03 to 9.02 µg/m³, Root Mean Squared Error (RMSE) from 4.18 to 13.91 µg/m³, and coefficient of determination (R²) from 0.65 to 0.91 (Table 3). These results, derived from a full decade of hourly monitoring data (April 2015–April 2025), demonstrate that interpretable tree-based ensemble models (CatBoost and LightGBM) can effectively capture the nonlinear, temporally dynamic nature of urban air pollution without relying on external meteorological covariates.

Performance varies systematically with district characteristics:

District 1 (northern residential/foothill zone) and District 14 (southeastern commercial/transit hub) exhibit the highest predictive performance (R² = 0.914 and 0.873, respectively), reflecting relatively stable and temporally consistent emission patterns dominated by regional transport (D1) and recurrent traffic congestion (D14).
District 15 (southern industrial cluster) shows the lowest R² (0.654), attributable to the intermittent nature of industrial operations and the high volatility of PM₁₀ and SO₂ emissions associated with unregulated fugitive dust and episodic stack releases.
District 19 (south-southwest receptor basin), despite being Tehran’s most polluted district in terms of mean PM₂.₅ (48.2 µg/m³), achieves a moderate R² (0.791) but the highest MAE (9.02 µg/m³), consistent with its role as a pollution accumulation zone where sharp, inversion-driven spikes challenge even high-performing models.

Table 3 Regression Models Performance by District

| District | Best Model | MAE (µg/m³) | RMSE (µg/m³) | R² |
|---|---|---|---|---|
| 1 | CatBoost | 3.030 | 4.175 | 0.914 |
| 2 | LightGBM | 6.762 | 9.503 | 0.796 |
| 6 | LightGBM | 5.290 | 11.274 | 0.741 |
| 14 | LightGBM | 5.011 | 7.620 | 0.873 |
| 15 | LightGBM | 4.527 | 10.658 | 0.654 |
| 19 | CatBoost | 9.015 | 13.914 | 0.791 |

Air Quality Index (AQI) categories, spanning Good to Hazardous, were derived post-hoc by applying U.S. EPA (2020) breakpoints directly to the continuous PM₂.₅ predictions generated by each district’s best-performing regression model. This unified, two-stage approach (regression → AQI conversion) eliminates misalignment between scientific forecasts and public health outputs, ensuring that classification results are fully traceable to physical concentration predictions.

Table 4 Classification Models Performance by District

| District | Classification Method | Accuracy | Precision | Recall | F1-Score |
|---|---|---|---|---|---|
| 1 | CatBoost → AQI | 0.899 | 0.900 | 0.899 | 0.894 |
| 2 | LightGBM → AQI | 0.915 | 0.919 | 0.915 | 0.913 |
| 6 | LightGBM → AQI | 0.919 | 0.920 | 0.919 | 0.912 |
| 14 | LightGBM → AQI | 0.939 | 0.939 | 0.939 | 0.936 |
| 15 | LightGBM → AQI | 0.940 | 0.940 | 0.940 | 0.937 |
| 19 | CatBoost → AQI | 0.920 | 0.928 | 0.920 | 0.919 |

The method delivers exceptional AQI classification performance (Table 4), with F1-scores ranging from 0.894 to 0.937 and accuracy between 89.9% and 94.0%, metrics that meet or exceed benchmarks in recent ML-based air quality studies (Wu et al., 2022; Peng et al., 2022). Notably:

District 15 achieves the highest F1-score (0.937), likely because its industrial emission profile produces more distinct, class-separable AQI levels (e.g., frequent Moderate vs. rare Hazardous).
District 1, despite high regression accuracy, shows the lowest F1-score (0.894), possibly due to mixed-source complexity (regional transport, photochemistry, and residential heating) that blurs AQI category boundaries during transitional periods.

Together, these results confirm that a transparent, pollutant-only ML pipeline can deliver both high-fidelity concentration forecasts and operationally reliable public health alerts, even in a data-constrained megacity like Tehran. This dual capability, scientific precision coupled with regulatory alignment, is essential for real-world deployment in environmental monitoring and public health response systems.

3.3 AQI Statistics and Health Risk

Over the 10-year study period (2015–2025), the Air Quality Index (AQI) across Tehran’s six districts spanned the entire U.S. EPA classification spectrum, from “Good” (0–50) during clean, post-rain conditions to “Hazardous” (> 300) during severe winter inversion episodes. District-level mean AQI values ranged from 77.9 (District 1) to 126.7 (District 19), with hourly extremes as low as 11.3 and as high as 313.7 (Table 5). This wide variability reflects Tehran’s bimodal pollution regime: in winter, temperature inversions trap primary pollutants such as PM₂.₅, NO₂, and CO near ground level, producing hazardous air quality; in summer, intense solar radiation fuels photochemical reactions that elevate ozone (O₃) concentrations.

District 19, a low-lying, densely populated receptor basin in the south-southwest, emerges as Tehran’s most polluted district, with the highest mean AQI (126.7) and the highest mean PM₂.₅ (48.2 µg/m³). Its topography acts as a natural sink, accumulating emissions transported from surrounding high-traffic and industrial zones under stable atmospheric conditions. In contrast:

Districts 2 and 14, major traffic corridors, show consistently elevated NO₂ and CO, reflecting their role as primary transit arteries.
District 6 (central commercial core) and District 15 (southern industrial cluster) exhibit moderate SO₂ but notable PM₁₀, largely driven by dust resuspension in arid peripheral areas and industrial activity.

Health Risk Interpretation Using U.S. EPA AQI Categories

The spatial distribution of AQI reveals stark disparities in population-level health exposure:

Districts 1 (northern residential/foothill zone) and 15 (industrial cluster) experience “Moderate” air quality on average (Mean AQI ≈ 78; EPA category: 51–100). While generally acceptable for the general public, both districts occasionally experience “Very Unhealthy” episodes (AQI > 200) during extreme winter inversions or regional dust storms, posing acute risks to sensitive individuals.
Districts 6 and 14 hover near the upper boundary of the Moderate range (Mean AQI = 96.0 and 98.3, respectively), frequently surpassing AQI 100 in winter. Though not classified as high-risk on average, they regularly expose children, the elderly, and those with cardiopulmonary conditions to unhealthy air.
District 2 (high-traffic corridor) crosses into the “Unhealthy for Sensitive Groups” category (Mean AQI = 110.1; EPA category: 101–150) and routinely endures “Very Unhealthy” spikes, underscoring the chronic burden of traffic-related pollution.
District 19 has the highest mean AQI of any district (126.7). Like District 2, its mean falls in the “Unhealthy for Sensitive Groups” range (101–150) rather than the “Unhealthy” range (151–200), but its median AQI of 132.1 and maximum of 313.7 point to recurrent excursions into the “Unhealthy” and “Very Unhealthy” categories and at least occasional “Hazardous” hours. Residents here face the worst chronic exposure in the city, with implications for long-term respiratory and cardiovascular health.
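The category labels used in this interpretation follow from the standard EPA piecewise-linear sub-index and the fixed AQI bands. The sketch below is a hedged illustration: it assumes the pre-2024 24-hour PM₂.₅ breakpoint table (the paper cites U.S. EPA 2020 but does not list its breakpoints), and the function names are hypothetical.

```python
from decimal import Decimal, ROUND_DOWN

# (C_low, C_high, I_low, I_high) for 24-hour PM2.5 in µg/m³.
# Assumption: pre-2024 U.S. EPA breakpoint table.
PM25_BREAKPOINTS = [
    (0.0, 12.0, 0, 50),
    (12.1, 35.4, 51, 100),
    (35.5, 55.4, 101, 150),
    (55.5, 150.4, 151, 200),
    (150.5, 250.4, 201, 300),
    (250.5, 350.4, 301, 400),
    (350.5, 500.4, 401, 500),
]

AQI_BANDS = [
    (50, "Good"),
    (100, "Moderate"),
    (150, "Unhealthy for Sensitive Groups"),
    (200, "Unhealthy"),
    (300, "Very Unhealthy"),
]


def sub_index(concentration, breakpoints=PM25_BREAKPOINTS):
    """Linear interpolation between the breakpoints that bracket the value."""
    # PM2.5 concentrations are truncated to one decimal before lookup.
    c = float(Decimal(str(concentration)).quantize(Decimal("0.1"), rounding=ROUND_DOWN))
    for c_lo, c_hi, i_lo, i_hi in breakpoints:
        if c_lo <= c <= c_hi:
            return round((i_hi - i_lo) / (c_hi - c_lo) * (c - c_lo) + i_lo)
    raise ValueError(f"concentration {concentration} is outside the breakpoint table")


def overall_aqi(sub_indices):
    """The reported AQI is the maximum of the pollutant sub-indices."""
    return max(sub_indices.values())


def aqi_category(aqi):
    for upper, label in AQI_BANDS:
        if aqi <= upper:
            return label
    return "Hazardous"
```

As a plausibility check under these assumptions, a PM₂.₅ concentration of 48.2 µg/m³ maps to a sub-index of 132, inside the “Unhealthy for Sensitive Groups” band, and a mean AQI of 126.7 falls in the same band.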
This spatial stratification highlights a pressing environmental justice issue: vulnerable populations in low-elevation, high-density districts like 19 bear a disproportionate cumulative exposure burden, despite contributing minimally to emissions. Conversely, District 15, despite hosting industrial activity, achieves relatively better air quality, likely due to lower residential density and possibly stricter emission controls. These findings argue compellingly for district-specific air quality management strategies that integrate not only emission inventories but also topographical vulnerability, land-use patterns, and demographic sensitivity. In a city where monitoring data are abundant but health outcomes are unequally distributed, such granular exposure intelligence is essential for equitable environmental governance.

Table 5 Air Quality Index (AQI) Statistics by District

| District | Mean AQI | Median AQI | Min AQI | Max AQI |
|---|---|---|---|---|
| 1 | 77.9 | 73.0 | 17.6 | 234.7 |
| 2 | 110.1 | 107.1 | 11.3 | 257.1 |
| 6 | 96.0 | 88.5 | 13.4 | 272.9 |
| 14 | 98.3 | 93.5 | 24.5 | 248.2 |
| 15 | 78.0 | 75.6 | 20.3 | 215.5 |
| 19 | 126.7 | 132.1 | 30.2 | 313.7 |

3.4 Seasonal SHAP Rankings: Uncovering District-Specific Pollution Drivers

Table 6 presents a comprehensive, data-driven quantification of pollutant feature importance across six socio-spatially distinct districts of Tehran, stratified by season (cold: October–February; warm: March–September). By leveraging SHAP (SHapley Additive exPlanations), this analysis moves beyond simple concentration metrics to reveal the predictive influence of each lagged pollutant in forecasting air quality outcomes. The results expose clear, seasonally dynamic emission regimes that align precisely with Tehran’s documented pollution sources, land-use patterns, and meteorological context.

Cold Season: Traffic and Heating Emissions Dominate

During the cold season, temperature inversions and increased heating demand trap emissions near the surface, creating conditions where primary pollutants from combustion processes dominate.
This is reflected in high SHAP values for CO and NO₂ across traffic-heavy and mixed-use districts:

Districts 1, 2, and 19 all show CO as the primary predictive driver (SHAP: 0.0755–0.0784), directly implicating vehicular traffic and residential heating as dominant sources.
District 14 (southeastern transit hub) shows NO₂ (0.0703) slightly edging out CO (0.0691) as the top driver, a distinction reflecting its role as a major commuter corridor with persistent idling and congestion.
District 6 (central commercial core) exhibits NO₂ as the top driver (SHAP = 0.0851), consistent with intense vehicular activity and commercial energy use in this dense urban hub rather than with industrial emissions.
District 15 (southern industrial cluster) uniquely identifies PM₁₀ as the leading predictor (SHAP = 0.0451), highlighting the role of resuspended dust and unpaved industrial surfaces in this arid, peripheral zone.

The minor but consistent presence of O₃ in cold-season rankings, even under reduced sunlight, reflects carryover precursor chemistry and NO titration dynamics, where residual NOₓ interacts with background ozone.

Warm Season: Photochemical Smog and Persistent Dust Signatures Emerge

The warm season is characterized by enhanced photochemical activity and reduced inversion trapping, leading to a distinct shift in predictive drivers:

District 19 (south-southwest receptor basin) exhibits a textbook photochemical regime, with O₃ as the dominant predictor (SHAP = 0.0999), the highest O₃ value in any district or season. This aligns with its low-lying basin topography, which accumulates NOₓ and VOC precursors from surrounding traffic corridors, enabling intense ozone production under strong solar radiation.
District 1 (northern residential/foothill zone) shows SO₂ as the primary driver (SHAP = 0.0817), likely due to regional transport of combustion plumes from southern industrial and power-generation sources, with O₃ as a strong secondary influence (SHAP = 0.0647).
Districts 2 (northwest traffic corridor) and 14 (southeast transit hub) show a transition toward O₃ dominance: in District 2, CO (0.0592) and O₃ (0.0574) are nearly co-dominant; in District 14, NO₂ (0.0326) and O₃ (0.0292) lead, reflecting the shift from primary to secondary pollution chemistry in summer.
District 15 remains PM₁₀-dominated (SHAP = 0.0464), with CO and NO₂ as secondary drivers, underscoring the year-round influence of dust resuspension alongside traffic in this southern industrial zone.
District 6 shows the strongest NO₂ signal citywide in summer (SHAP = 0.1281), reinforcing its identity as a traffic- and energy-intensive central core, where combustion emissions remain the dominant forecasting signal even during high-photochemistry periods.

Scientific and Policy Significance

The seasonal SHAP rankings (Table 6) reveal pollutant–district signatures that are physically interpretable and strongly consistent with Tehran’s known emission dynamics and atmospheric controls. These patterns are not statistical artifacts but empirically grounded insights:

The cold-season dominance of CO and NO₂, especially in Districts 2, 14, and 19, mirrors Tehran’s vehicle-driven pollution regime, where mobile sources contribute ~80–85% of primary pollutants (Taghizadeh et al., 2023). This aligns with inversion-enhanced accumulation (Nejad et al., 2023) and ML-based findings in global megacities (Wu et al., 2022).
The intense warm-season O₃ sensitivity in District 19 reflects a secondary pollution hotspot, consistent with Tehran’s municipal air quality reports. Despite the absence of VOC measurements, the O₃ SHAP dominance signals active NOₓ–VOC photochemistry under solar forcing, a mechanism well-documented in ML-O₃ studies (Peng et al., 2022; Xu et al., 2022).
The persistent PM₁₀ leadership in District 15 across both seasons corroborates satellite and ground evidence of dust entrainment in southeastern Tehran (Mamić et al., 2023), validating the model’s sensitivity to non-combustion particulate sources.

Table 6 Seasonal SHAP Rankings

| District | Season | Primary | Secondary | Tertiary | Quaternary | Quinary | Senary |
|---|---|---|---|---|---|---|---|
| 1 | cold | CO_ugm3 (0.0784) | O3_ugm3 (0.0203) | NO2_ugm3 (0.0178) | SO2_ugm3 (0.0004) | PM2.5_ugm3 (0.0003) | PM10_ugm3 (0.0001) |
| 1 | warm | SO2_ugm3 (0.0817) | O3_ugm3 (0.0647) | CO_ugm3 (0.0462) | NO2_ugm3 (0.0333) | PM2.5_ugm3 (0.0237) | PM10_ugm3 (0.0147) |
| 2 | cold | CO_ugm3 (0.0755) | NO2_ugm3 (0.048) | O3_ugm3 (0.0201) | PM10_ugm3 (0.0135) | PM2.5_ugm3 (0.009) | SO2_ugm3 (0.0074) |
| 2 | warm | CO_ugm3 (0.0592) | O3_ugm3 (0.0574) | NO2_ugm3 (0.0438) | PM10_ugm3 (0.0142) | PM2.5_ugm3 (0.0134) | SO2_ugm3 (0.0044) |
| 6 | cold | NO2_ugm3 (0.0851) | CO_ugm3 (0.0416) | O3_ugm3 (0.0294) | SO2_ugm3 (0.0176) | PM10_ugm3 (0.0123) | PM2.5_ugm3 (0.0021) |
| 6 | warm | NO2_ugm3 (0.1281) | CO_ugm3 (0.0392) | O3_ugm3 (0.0387) | PM10_ugm3 (0.0291) | SO2_ugm3 (0.0193) | PM2.5_ugm3 (0.0136) |
| 14 | cold | NO2_ugm3 (0.0703) | CO_ugm3 (0.0691) | SO2_ugm3 (0.0254) | PM10_ugm3 (0.0101) | PM2.5_ugm3 (0.0019) | O3_ugm3 (0.0019) |
| 14 | warm | NO2_ugm3 (0.0326) | O3_ugm3 (0.0292) | CO_ugm3 (0.0163) | SO2_ugm3 (0.0145) | PM10_ugm3 (0.0078) | PM2.5_ugm3 (0.0042) |
| 15 | cold | PM10_ugm3 (0.0451) | CO_ugm3 (0.0419) | NO2_ugm3 (0.0156) | SO2_ugm3 (0.0044) | PM2.5_ugm3 (0.0016) | O3_ugm3 (0.0016) |
| 15 | warm | PM10_ugm3 (0.0464) | CO_ugm3 (0.0389) | NO2_ugm3 (0.0379) | PM2.5_ugm3 (0.0348) | O3_ugm3 (0.025) | SO2_ugm3 (0.0121) |
| 19 | cold | CO_ugm3 (0.078) | NO2_ugm3 (0.0446) | PM10_ugm3 (0.0122) | SO2_ugm3 (0.0025) | PM2.5_ugm3 (0.0009) | O3_ugm3 (0.0003) |
| 19 | warm | O3_ugm3 (0.0999) | NO2_ugm3 (0.0705) | CO_ugm3 (0.0486) | PM2.5_ugm3 (0.0316) | PM10_ugm3 (0.0315) | SO2_ugm3 (0.0137) |

Policy implications are spatially precise: Traffic-intensive districts (2, 14, 19): Prioritize low-emission zones, public transit expansion, and fleet electrification to reduce CO/NO₂ (Taghizadeh et al., 2023).
Central commercial core (6): Target energy efficiency in commercial buildings and traffic flow optimization to curb NO₂. Industrial cluster (15): Enforce dust suppression, paved surfaces, and stack emission controls to address PM₁₀ and SO₂. Photochemical hotspot (19): Implement NOₓ/VOC precursor management, green infrastructure, and urban cooling strategies to mitigate ozone formation (Peng et al., 2022 ). By delivering decadal, district-level, seasonally resolved XAI, this framework transcends city-wide averages to provide a mechanistically coherent evidence base for precision environmental governance in complex megacities. 3.5 Visual Analysis Pollutant Correlation Matrices Figure 1 presents correlation matrices for six strategically selected districts of Tehran, illustrating pairwise relationships between key pollutants (PM₂.₅, PM₁₀, NO₂, CO, SO₂, O₃). The color gradient (red for strong positive correlations, blue for negative or weak correlations) reveals distinct spatially heterogeneous pollution dynamics, directly reflecting the socio-spatial and emission profiles of each district. The analysis reveals three primary patterns: 1. Industrial and Central Urban Zones (Districts 6 and 15) District 6 (Central Commercial/High-Traffic Core): This district exhibits the most complex correlation structure. It shows moderate positive correlations between SO₂ and other pollutants (e.g., SO₂-NO₂ = 0.16, SO₂-CO = 0.27), suggesting shared sources from combustion processes associated with dense urban activity rather than heavy industry. The strongest correlations are between PM₁₀ and PM₂.₅ (0.63) and between CO and NO₂ (0.47), reinforcing its identity as a high-traffic zone where exhaust emissions and road dust are dominant. District 15 (Industrial Cluster): This district displays weak to moderate positive correlations across most pollutant pairs. 
Notably, SO₂ is only weakly correlated with other pollutants (e.g., SO₂-NO₂ = 0.21, SO₂-CO = 0.19), which does not support treating SO₂ as a dominant, co-emitted industrial tracer here. Instead, the highest correlation is between PM₁₀ and PM₂.₅ (0.58), consistent with its profile as an area dominated by dust resuspension and particulate matter from industrial and heavy-vehicle activity. The relatively weak link between SO₂ and other pollutants suggests that its emissions may be more isolated or regulated compared to other sources.

2. Traffic-Dominated Districts (Districts 2 and 14)

District 2 (High-Traffic Corridor): This district shows a moderate positive correlation between CO and NO₂ (0.35), consistent with vehicular traffic as their shared primary source. Its strongest correlation is between PM₁₀ and PM₂.₅ (0.61), indicating co-emission from exhaust and road dust. The correlation between NO₂ and PM₂.₅ (0.29) is weak to moderate, supporting a link between traffic emissions and fine particulate matter.

District 14 (Commercial/Transit Hub): Similar to District 2, this district shows a moderate correlation between CO and NO₂ (0.46) and a stronger one between the particulate matter fractions (PM₁₀-PM₂.₅ = 0.62). However, it also shows a notable positive correlation between SO₂ and CO (0.53), which is higher than in other districts. This suggests that in this transit hub, there may be a contribution from diesel-powered commercial vehicles or nearby industrial activity influencing SO₂ levels alongside traffic emissions.

3. Residential and Receptor Zones (Districts 1 and 19)

District 1 (Northern Residential/Foothill Zone): This district displays weaker overall correlations than the traffic and central districts, reflecting its lower local emissions and sensitivity to regional transport. The strongest correlation is between PM₁₀ and PM₂.₅ (0.60), indicating shared sources like transported dust or secondary aerosols.
There is a moderate negative correlation between O₃ and CO (-0.38) and a weaker one between O₃ and NO₂ (-0.24), which is consistent with titration in a receptor zone where O₃ can be consumed by high concentrations of NOₓ.

District 19 (South-Southwest Receptor Basin): This district shows the weakest correlations overall, particularly for O₃, which has low to negative correlations with all other pollutants (e.g., O₃-CO = -0.28, O₃-NO₂ = -0.32). This is characteristic of a photochemical hotspot: O₃ is a secondary pollutant, formed from precursor gases (NOₓ, VOCs) rather than emitted directly, and it can be destroyed by fresh NO emissions, which produces an inverse relationship with primary pollutants. The only moderately strong correlation is between PM₁₀ and PM₂.₅ (0.51), again pointing to common sources of particulate matter.

SHAP-Based Feature Importance

Figure 2 presents the mean absolute SHAP values for the lag-1 features of each pollutant, ranked by their predictive importance for air quality in each of the six districts. This analysis reveals the dominant drivers of short-term pollution fluctuations within each district, providing a clear, interpretable signature of local emission sources and atmospheric dynamics. The findings from Fig. 2 are as follows:

District 1 (Northern residential/foothill zone): The most influential predictor is SO₂ (Mean |SHAP| = 0.083), followed by O₃ (0.066). This indicates that emissions linked to fossil fuel combustion (e.g., power generation or industrial activity) and photochemical processes are the primary drivers of short-term variability in this district, consistent with its exposure to regionally transported combustion plumes rather than strong local sources.

District 2 (High-traffic corridor): The leading contributor is CO (Mean |SHAP| = 0.060), followed closely by O₃ (0.059). This is consistent with vehicular traffic as the primary source of pollution, with CO being a direct tracer of combustion and O₃ indicating active photochemistry in this major corridor.
District 6 (Central commercial / high-traffic core): NO₂ is the most important factor (Mean |SHAP| = 0.125), followed by CO (0.038) and O₃ (0.035). This highlights the significant role of combustion from dense traffic and commercial energy use, which emits large quantities of NOₓ, and suggests that these emissions also contribute to secondary ozone formation.

District 14 (Commercial/transit hub): NO₂ dominates (Mean |SHAP| = 0.043), followed by O₃ (0.034) and CO (0.023). This pattern is consistent with the district's identity as a major transit hub, where traffic-related emissions (NO₂, CO) drive air quality, with O₃ reflecting secondary pollutant formation under sunny conditions.

District 15 (Industrial): PM₁₀ stands out as the primary driver (Mean |SHAP| = 0.045), followed by CO and NO₂ (both at 0.032). This is consistent with the district's profile as an industrial cluster, where dust resuspension (PM₁₀) and industrial/vehicular combustion (CO, NO₂) are key sources of pollution.

District 19 (Residential/commercial receptor): O₃ is the most influential pollutant (Mean |SHAP| = 0.101), followed by NO₂ (0.067) and CO (0.049). This identifies District 19 as a distinct photochemical hotspot, where high levels of precursor gases (NO₂, CO) from dense traffic lead to intense ozone production, a finding supported by its topographical position and summer meteorology.

These results provide a granular, data-driven basis for targeted environmental management and support the reliability of the AI/ML models. The utility of eXplainable Artificial Intelligence (XAI) is underscored, as it transforms complex model outputs into actionable, physically interpretable insights for environmental monitoring and decision-making.

Pandemic Impact on Pollutant Levels

Figure 3 presents a comparative analysis of key pollutants across six districts during three distinct timeframes: Pre-COVID (April 2015 – November 2019), During-COVID (December 2019 – April 2023), and Post-COVID (May 2023 – April 2025).
The data reveal a complex, district- and pollutant-specific response to the pandemic: CO fell in every district reported below, NO₂ responses were mixed, and the recovery in the post-pandemic era was heterogeneous.

Traffic-dominated districts showed contrasting NO₂ and CO responses:

District 2 (high-traffic corridor) saw its mean NO₂ levels increase from 80.4 µg/m³ (Pre) to 101.0 µg/m³ (During), a rise of approximately 25.7%. This counter-intuitive increase may reflect localized measurement effects or changes in traffic composition during restrictions. Its CO levels fell from 2719.8 µg/m³ (Pre) to 2409.7 µg/m³ (During), a reduction of 11.4%.

District 14 (commercial/transit hub) exhibited a decrease in NO₂ from 86.1 µg/m³ (Pre) to 81.9 µg/m³ (During), and a more substantial drop in CO from 2034.8 µg/m³ (Pre) to 1707.9 µg/m³ (During), a 16.0% reduction.

The central core and the industrial cluster showed varied responses:

District 6 (central commercial/high-traffic core) experienced a slight decrease in SO₂ from 18.4 µg/m³ (Pre) to 16.5 µg/m³ (During), while its PM₂.₅ levels increased from 32.3 µg/m³ (Pre) to 34.4 µg/m³ (During).

District 15 (industrial cluster) recorded a drop in SO₂ from 16.7 µg/m³ (Pre) to 13.7 µg/m³ (During), consistent with reduced industrial output. Its CO levels decreased sharply from 1768.5 µg/m³ (Pre) to 1182.2 µg/m³ (During), a 33.2% reduction.

The receptor basin showed the largest CO decline:

District 19 (receptor basin) saw its NO₂ fall from 79.5 µg/m³ (Pre) to 73.9 µg/m³ (During), while CO dropped from 2808.0 µg/m³ (Pre) to 1726.0 µg/m³ (During), a 38.5% reduction.

However, the post-pandemic recovery is highly variable.
In some cases, concentrations have returned to, or even exceeded, pre-pandemic levels, but the pattern is not uniform:

District 6 shows PM₂.₅ easing to 32.8 µg/m³ in the Post-COVID period, slightly above its pre-pandemic level of 32.3 µg/m³, indicating a near-complete return to baseline after a transient increase (34.4 µg/m³) during lockdowns.

District 2’s CO declined further to 2278.6 µg/m³ (Post), below both its During-COVID value of 2409.7 µg/m³ and its pre-pandemic value of 2719.8 µg/m³.

District 19’s NO₂ has fallen further to 61.4 µg/m³ (Post), well below its pre-pandemic level, while its PM₁₀ has surged to 128.4 µg/m³ (Post), far exceeding its pre-pandemic level of 89.1 µg/m³.

Notably, District 15 maintained lower SO₂ levels in the post-pandemic period (12.7 µg/m³) compared to pre-pandemic (16.7 µg/m³), suggesting potential long-term changes in industrial operations or emission controls.

A distinct pattern emerged for ozone (O₃). While primary pollutants such as CO declined, O₃ levels mostly remained stable or changed only slightly during the pandemic, rather than showing the marked increase commonly expected from reduced NOₓ titration. For instance:

District 1 (residential/foothill) saw O₃ rise slightly from 41.0 µg/m³ (Pre) to 43.7 µg/m³ (During).

District 6 (central core) experienced a slight dip in O₃ from 42.6 µg/m³ (Pre) to 42.1 µg/m³ (During), before rising to 47.5 µg/m³ (Post).

District 19 (receptor basin) experienced a decrease in O₃ from 36.9 µg/m³ (Pre) to 32.1 µg/m³ (During), before partially recovering to 34.2 µg/m³ in the post-pandemic period. This suggests that in this specific receptor zone, other factors (e.g., changes in precursor mix, meteorology, or background pollution) may have outweighed the simple NOₓ-titration effect.

These findings highlight the limited and non-uniform long-term impact of temporary interventions and underscore the need for sustained, structural policy actions beyond episodic reductions.
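The percentage changes quoted in this subsection follow from the period means by simple arithmetic. The sketch below recomputes them from the rounded Pre-COVID and During-COVID means given in the text. The dictionary layout and the function names are illustrative assumptions, not part of the authors' pipeline.

```python
from typing import Dict, Tuple

# Period means in µg/m³ copied from the text: (Pre-COVID, During-COVID).
# Keys are (district, pollutant); this layout is a hypothetical convenience.
PERIOD_MEANS: Dict[Tuple[int, str], Tuple[float, float]] = {
    (2, "NO2"): (80.4, 101.0),
    (2, "CO"): (2719.8, 2409.7),
    (14, "NO2"): (86.1, 81.9),
    (14, "CO"): (2034.8, 1707.9),
    (15, "SO2"): (16.7, 13.7),
    (15, "CO"): (1768.5, 1182.2),
    (19, "NO2"): (79.5, 73.9),
    (19, "CO"): (2808.0, 1726.0),
}


def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after` in percent (negative = decline)."""
    if before == 0:
        raise ValueError("baseline mean must be non-zero")
    return (after - before) / before * 100.0


def summarize(means: Dict[Tuple[int, str], Tuple[float, float]]) -> Dict[Tuple[int, str], float]:
    """Percent change per (district, pollutant), rounded to one decimal place."""
    return {key: round(percent_change(pre, during), 1)
            for key, (pre, during) in means.items()}


if __name__ == "__main__":
    for (district, pollutant), change in sorted(summarize(PERIOD_MEANS).items()):
        print(f"District {district:>2} {pollutant:<3} {change:+.1f}%")
```

Applied to the rounded means, this gives about +25.6% for District 2 NO₂ and -16.1% for District 14 CO, within rounding of the 25.7% and 16.0% reported in the text, which were presumably computed from unrounded means.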
The model’s sensitivity to these real-world perturbations validates its utility for evaluating the impact of policy measures and designing resilient, long-term air quality strategies. Prediction vs. Actual Trends Figure 4 presents a direct comparison between actual and predicted concentrations for the leading pollutant in each of the six districts over the test period (January 2024–April 2025). The leading pollutant for each district was determined by its highest mean absolute SHAP value across the full training set (as shown in Table 6 and discussed in Section 3.4 ), ensuring the plots reflect the model’s most influential predictive feature for each location. The visual comparison reveals strong alignment between actual and predicted values across all districts, demonstrating the model’s capacity to accurately reproduce real-world pollution dynamics on an hourly basis. The relatively narrow Mean Absolute Error (MAE) bands, displayed as shaded regions around the predicted line, indicate high precision and low uncertainty in the forecasts. Seasonal patterns are consistently captured: Elevated pollutant levels during the cold season (October–February) are evident in most plots, reflecting the impact of winter temperature inversions and increased heating demand. Lower concentrations and distinct diurnal cycles are observed during the warm season (March–September), consistent with enhanced atmospheric dispersion and photochemical activity. Short-term variability, including sharp spikes linked to traffic congestion, industrial activity, or dust events, is also well-represented, highlighting the model’s sensitivity to dynamic emission sources and temporal dependencies. District-level performance demonstrates robustness across Tehran’s diverse urban environments: District 1 (Leading Pollutant: SO₂): The model accurately tracks the highly variable SO₂ concentrations, capturing both baseline fluctuations and episodic peaks. 
This aligns with the district’s role as a receptor zone for regionally transported pollutants, where SO₂ can be influenced by distant industrial sources. District 2 (Leading Pollutant: CO): Predictions for CO show excellent agreement with observations, particularly during peak traffic hours. The model effectively captures the daily and weekly patterns associated with vehicular emissions, confirming its sensitivity to mobile-source drivers. District 6 (Leading Pollutant: NO₂): The periodic nature of NO₂ emissions, primarily linked to dense urban traffic and commercial activity in central Tehran, is effectively modeled. While occasional deviations occur during sudden emission surges (e.g., from heavy traffic or local events), the predictions remain largely within the MAE band. District 14 (Leading Pollutant: NO₂): The model precisely predicts the NO₂ peaks associated with morning and evening rush hours, reflecting the influence of traffic congestion in this major transit hub. The close tracking of daily cycles underscores the model’s ability to capture localized, time-dependent emission patterns. District 15 (Leading Pollutant: PM₁₀): Predictions for PM₁₀ demonstrate reasonable accuracy, maintaining forecasts within the MAE band despite periods of abrupt change. This reflects the complex interplay of stationary industrial sources and resuspended dust, which can lead to highly variable concentrations. District 19 (Leading Pollutant: O₃): The model successfully captures the seasonal and diurnal patterns of ozone, including its characteristic afternoon peaks during the warm season. The strong alignment between actual and predicted O₃ levels validates the model’s ability to forecast secondary pollutant formation in this photochemical hotspot. Model performance metrics support these qualitative observations. 
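As a rough illustration of how the MAE and the shaded band in such plots relate, the sketch below computes the mean absolute error and the share of observations that fall within a symmetric band around the prediction. The source does not say how the band is constructed; taking it as prediction ± MAE is an assumption made here, and the function names and sample values are hypothetical.

```python
from typing import Sequence


def _check(actual: Sequence[float], predicted: Sequence[float]) -> None:
    """Reject empty or mismatched series before computing any metric."""
    if len(actual) == 0 or len(actual) != len(predicted):
        raise ValueError("series must be non-empty and of equal length")


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of |actual - predicted| over all time steps."""
    _check(actual, predicted)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def fraction_within_band(actual: Sequence[float], predicted: Sequence[float],
                         half_width: float) -> float:
    """Share of observations with |actual - predicted| <= half_width.

    Assumption: the band is symmetric around the predicted value.
    """
    _check(actual, predicted)
    inside = sum(1 for a, p in zip(actual, predicted) if abs(a - p) <= half_width)
    return inside / len(actual)


if __name__ == "__main__":
    # Made-up hourly values with one sharp spike; not data from the study.
    observed = [40.0, 42.0, 55.0, 90.0, 47.0]
    forecast = [41.0, 40.0, 52.0, 70.0, 48.0]
    mae = mean_absolute_error(observed, forecast)
    print(mae, fraction_within_band(observed, forecast, mae))
```

With a band of this kind, a single large spike raises the MAE and leaves that one point outside the band, which matches the qualitative description of occasional deviations during sudden emission surges.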
The MAE values reported in the figure captions (ranging from 0.142 ppm for CO in District 2 to 8.378 µg/m³ for PM₁₀ in District 15) are consistent with the quantitative results presented in Table 3. These results affirm the effectiveness of the tree-based ensemble framework in handling nonlinear relationships and complex interactions among pollutants. The framework also exhibits resilience in predicting extreme pollution episodes such as dust storms or industrial outbursts. While minor deviations occur during very sharp spikes, most predictions remain within the uncertainty bounds, enabling users to gauge confidence levels in real-time applications. These findings support the predictive power of the interpretable ML framework and its suitability for deployment in real-world air quality management systems.

4. Discussion

This study introduces a transparent, reproducible, and highly interpretable machine learning framework for district-level air quality assessment in Tehran, Iran, grounded in a decade of high-resolution hourly monitoring data (April 2015–April 2025). A central contribution of this work is its spatiotemporal granularity in source attribution, offering a nuanced picture of urban pollution dynamics that goes beyond city-wide averages. Through SHAP-based explainable AI, we uncover district-specific and seasonally dynamic pollution drivers that closely align with Tehran’s well-documented emission landscape and atmospheric behavior.
The central commercial core (District 6) is consistently dominated by NO₂, reflecting intense vehicular and energy-related combustion in Tehran’s dense administrative hub, while the industrial cluster (District 15) is uniquely characterized by PM₁₀ dominance, attributable to dust resuspension and industrial activity. Both patterns are well established in Tehran’s environmental literature. Major traffic corridors (District 2) and transit-influenced residential zones (District 14) exhibit CO and NO₂ as primary predictive features year-round, consistent with their roles as key commuter arteries. In contrast, the northern residential/foothill zone (District 1) and the south-southwest receptor basin (District 19) display distinct warm-season regimes: SO₂ (in D1) and O₃ (in D19) emerge as dominant warm-season drivers, while CO prevails in winter, mirroring seasonal shifts in precursor availability, inversion frequency, and atmospheric stability under Tehran’s semi-arid climate. These patterns are not statistical artifacts but empirically grounded signatures of Tehran’s complex airshed. They corroborate documented evidence on industrial clustering, transportation emissions, and meteorological controls (Taghizadeh et al., 2023). Furthermore, our identification of winter inversion effects, manifested in elevated SHAP importance for CO and NO₂ during cold months, is strongly supported by recent Tehran-focused studies that document sharp pollutant entrapment under stable atmospheric conditions (Nejad et al., 2023). This alignment underscores the capacity of interpretable machine learning to recover physically meaningful mechanisms that resonate with the broader environmental science literature. The framework also demonstrates sensitivity to real-world environmental interventions. It reflects the pandemic-period changes reported in Section 3.5: CO declines of 11.4–38.5% in Districts 2, 14, 15, and 19, mixed NO₂ responses (from a 25.7% increase in District 2 to declines of roughly 5–7% in Districts 14 and 19), and a heterogeneous post-pandemic recovery.
This temporal responsiveness highlights the model’s dual utility—not only for forecasting, but also for evaluating the effectiveness of environmental policies and informing resilient, long-term air quality strategies (Le et al., 2020 ). These findings echo regional and global observations that machine learning–based spatiotemporal frameworks can reliably track high-resolution pollutant responses to both anthropogenic disruptions and policy actions (Wu et al., 2022 ; Lin et al., 2022 ; Wang et al., 2024 ), reinforcing the generalizability of our approach across diverse urban contexts. A key limitation of this study is the absence of explicit meteorological covariates—such as temperature, wind speed, or relative humidity—in the input data. This constraint stems from the design of Tehran’s existing monitoring infrastructure, which focuses on pollutant concentrations rather than co-located meteorological measurements. Importantly, however, our pipeline implicitly captures many meteorologically driven phenomena through lagged pollutant features and seasonal stratification. For instance, the elevated SHAP importance of CO and NO₂ in winter accurately reflects inversion-related accumulation, even without direct weather inputs. Despite this data limitation, the models deliver robust predictive performance across pollutants, districts, and seasons—demonstrating remarkable adaptability under real-world constraints. This performance is consistent with findings from other megacities like Beijing, Delhi, and Kuala Lumpur, where high-performing ML systems often rely primarily on pollutant time-series structure rather than external covariates (Xu et al., 2022 ; Masood et al., 2023 ; Ke et al., 2022 ). While future integration of meteorological data is expected to further refine predictions—especially for secondary pollutants like O₃—the current results affirm the practical utility of pollutant-only frameworks in regions with sparse meteorological monitoring (Mamić et al., 2023 ). 
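The lagged pollutant features and seasonal stratification mentioned here can be sketched as follows. This is a minimal illustration, not the authors' code. The cold season is taken as October–February and the warm season as March–September, as described in Section 3.5, and the column name `CO_ugm3` follows Table 6. The record layout, the function names, and the handling of gaps in the hourly series are assumptions.

```python
from datetime import datetime, timedelta
from typing import Dict, List

COLD_MONTHS = {10, 11, 12, 1, 2}  # October-February; all other months count as warm


def season_of(ts: datetime) -> str:
    """Label a timestamp as 'cold' or 'warm' using the month-based split above."""
    return "cold" if ts.month in COLD_MONTHS else "warm"


def add_lag1_features(rows: List[Dict], pollutants: List[str]) -> List[Dict]:
    """Attach previous-hour values and a season label to hourly records.

    `rows` must be sorted by 'timestamp'. A record is kept only when the
    preceding record is exactly one hour earlier, so the first record and
    any record following a gap are dropped (an assumed gap policy).
    """
    out: List[Dict] = []
    for prev, cur in zip(rows, rows[1:]):
        if cur["timestamp"] - prev["timestamp"] != timedelta(hours=1):
            continue  # previous hour missing: no valid lag-1 value
        rec = dict(cur)
        for name in pollutants:
            rec[f"{name}_lag1"] = prev[name]
        rec["season"] = season_of(cur["timestamp"])
        out.append(rec)
    return out


if __name__ == "__main__":
    # Made-up readings; the 02:00 record is deliberately missing.
    sample = [
        {"timestamp": datetime(2024, 1, 1, 0), "CO_ugm3": 1000.0},
        {"timestamp": datetime(2024, 1, 1, 1), "CO_ugm3": 1100.0},
        {"timestamp": datetime(2024, 1, 1, 3), "CO_ugm3": 900.0},
    ]
    for rec in add_lag1_features(sample, ["CO_ugm3"]):
        print(rec)
```

Fitting separate explanations per season label is one way a pollutant-only model can reflect inversion-related accumulation without direct weather inputs, since the lag-1 value carries the previous hour's build-up forward.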
This is particularly relevant for low- and middle-income countries where air quality monitoring is often prioritized over comprehensive weather instrumentation. By leveraging an ensemble of state-of-the-art tree-based models—XGBoost, CatBoost, and LightGBM—we achieve exceptional predictive accuracy (MAE: 3.03–9.02 µg/m³; R²: 0.65–0.91) and reliable AQI classification (F1-score: 0.894–0.937) without meteorological inputs. This demonstrates that high-impact, policy-actionable forecasting is feasible even in data-constrained settings—a critical insight for megacities across the Global South. Our results align with prior Tehran-based work showing that pollutant-only models can extract meaningful signals when the modeling architecture and feature engineering are robust (Ghaemi et al., 2018 ), and with global studies affirming that interpretable, structured models can match or outperform complex deep learning systems when covariate data are limited (Peng et al., 2022 ; Zaini et al., 2022 ; Mamić et al., 2023 ). Although this study does not incorporate individual-level epidemiological records, the U.S. EPA Air Quality Index (AQI) provides a scientifically validated framework for estimating population-level health risk exposure. AQI categories map directly to defined public health guidance: ‘Good’ (0–50) poses little or no risk; ‘Moderate’ (51–100) may affect unusually sensitive individuals; ‘Unhealthy for Sensitive Groups’ (101–150) can impact children, the elderly, and those with respiratory or cardiovascular conditions; ‘Unhealthy’ (151–200) is harmful to the general population; ‘Very Unhealthy’ (201–300) triggers serious health warnings; and ‘Hazardous’ (> 300) represents emergency conditions posing severe risk to all (U.S. EPA, 2020). 
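The category boundaries listed above translate directly into a lookup. The sketch below maps an AQI value to its category label. Rounding a fractional model output to the nearest integer before the lookup is an assumption made here so that values between two integer boundaries are handled, and the function name is hypothetical.

```python
import math

# Upper bound (inclusive) of each AQI category, as listed in the text.
AQI_CATEGORIES = [
    (50, "Good"),
    (100, "Moderate"),
    (150, "Unhealthy for Sensitive Groups"),
    (200, "Unhealthy"),
    (300, "Very Unhealthy"),
]


def aqi_category(aqi: float) -> str:
    """Return the category label for an AQI value.

    Assumption: fractional values are rounded half-up to an integer first.
    """
    if math.isnan(aqi) or aqi < 0:
        raise ValueError("AQI must be a non-negative number")
    rounded = math.floor(aqi + 0.5)
    for upper, label in AQI_CATEGORIES:
        if rounded <= upper:
            return label
    return "Hazardous"  # anything above 300


if __name__ == "__main__":
    for value in (42.0, 120.0, 250.0, 305.0):
        print(value, aqi_category(value))
```

Keeping the boundaries in one table makes it easy to audit the mapping against the EPA guidance or to swap in a national index with different breakpoints.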
Our district-level AQI forecasts—spanning from Good (min = 11.3) to Hazardous (max = 313.7)—thus deliver actionable, spatially resolved estimates of relative exposure burden, particularly for vulnerable subpopulations in high-risk zones like District 19. Our modular, reproducible pipeline is designed for direct scalability to other cities facing similar challenges—from Delhi to Cairo—where pollutant-only monitoring is routine but meteorological data remain scarce. The integration of high accuracy, full interpretability, and regulatory-aligned outputs (U.S. EPA AQI) ensures immediate policy relevance. Importantly, this work challenges the common assumption that forecasting quality depends on model complexity. Instead, it shows that transparent, explainable architectures can deliver superior utility for environmental governance (Peng et al., 2022 ; Wu et al., 2022 ). Structured boosting models, when paired with XAI, can match or exceed the performance of hybrid or optimization-driven systems—such as ELM–metaheuristic models or spatiotemporal attention networks—while preserving transparency and decision-making clarity (Masood et al., 2023 ; Wang et al., 2024 ). In sum, this research moves beyond technical prediction to offer a data-driven, district-aware, and seasonally informed blueprint for urban air quality governance. It demonstrates that even with limited input data, a thoughtfully designed, interpretable ML framework can yield scientifically rigorous and policy-relevant insights—paving the way for healthier, more sustainable cities in Tehran and beyond. By anchoring our findings in both Iran’s environmental reality and the evolving global discourse on ML for air quality, this study provides a robust foundation for future integration of meteorology, remote sensing, and causal inference in next-generation urban environmental intelligence. 
Crucially, it affirms that the path to equitable, high-impact environmental monitoring does not require perfect data, but rather intelligent, context-sensitive design grounded in real-world constraints and governance needs.

5. Conclusion

This study presents a scalable, interpretable, and high-fidelity machine learning framework for district-level air quality forecasting in data-constrained megacities. Drawing on a decade of high-resolution, multi-pollutant monitoring data from six socio-spatially stratified districts of Tehran, we demonstrate that tree-based ensemble models, augmented with SHAP-based explainability, can achieve high predictive accuracy without relying on meteorological covariates. Critically, the framework goes beyond mere forecasting: it reveals seasonally dynamic, district-specific pollution regimes that reflect the interplay of local emission sources, topographic trapping, and atmospheric processes. Rather than offering another opaque predictive system, this work delivers a transparent, interpretable, and policy-actionable blueprint for urban environmental governance. Its demonstrated sensitivity to real-world perturbations (such as the 11.4–38.5% CO reductions observed in Districts 2, 14, 15, and 19 during the pandemic period) and its compatibility with existing monitoring infrastructure make it readily deployable in cities across the Global South, where meteorological data are often unavailable but air quality governance is urgently needed. Our findings affirm a key methodological principle: high-impact environmental intelligence depends not on data abundance, but on context-aware design grounded in real-world monitoring constraints.
By transforming ten years of Tehran’s air quality records into a reproducible, open, and interpretable pipeline, we provide more than a forecasting tool: we establish a foundation for precision air quality management, where interventions are tailored to the unique spatiotemporal pollution fingerprint of each urban neighborhood. In doing so, this study advances Tehran’s capacity for evidence-based environmental action and offers a globally transferable model for equitable, data-driven urban resilience in the face of escalating air pollution challenges.

Declarations

Data Availability: The datasets analyzed in this study were obtained from the Tehran Air Quality Monitoring System (http://air.tehran.ir), operated by the Tehran Environmental Monitoring Center, subject to the provider’s terms. Due to regional network restrictions, this website is most likely accessible only within Iran. The data used in this study are available from the corresponding author upon reasonable request.

References

Chen T, Guestrin C (2016) XGBoost: A scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp 785–794. Association for Computing Machinery. https://doi.org/10.1145/2939672.2939785

Ghaemi Z, Alimohammadi A, Farnaghi M (2018) LaSVM-based big data learning system for dynamic prediction of air pollution in Tehran. Environ Monit Assess 190:300. https://doi.org/10.1007/s10661-018-6659-6

Ke H, Gong S, He J, Zhang L, Mo J (2022) A hybrid XGBoost-SMOTE model for optimization of operational air quality numerical model forecasts. Front Environ Sci 10:1007530. https://doi.org/10.3389/fenvs.2022.1007530

Le T, Wang Y, Liu L, Yang J, Yung YL, Li G, Seinfeld JH (2020) Unexpected air pollution with marked emission reductions during the COVID-19 outbreak in China. Science 369(6504):702–706. https://doi.org/10.1126/science.abb7431

Lelieveld J, Evans JS, Fnais M, Giannadaki D, Pozzer A (2015) The contribution of outdoor air pollution sources to premature mortality on a global scale. Nature 525(7569):367–371. https://doi.org/10.1038/nature15371

Lin S, Zhao J, Li J et al (2022) A spatial–temporal causal convolution network framework for accurate and fine-grained PM₂.₅ concentration prediction. Entropy 24(8):1125. https://doi.org/10.3390/e24081125

Lundberg SM, Lee SI (2017) A unified approach to interpreting model predictions. Adv Neural Inf Process Syst 30:4765–4774. https://proceedings.neurips.cc/paper/2017/file/8a20a8621978632d76c43dfd28b67767-Paper.pdf

Mamić L, Gašparović M, Kaplan G (2023) Developing PM₂.₅ and PM₁₀ prediction models on a national and regional scale using open-source remote sensing data. Environ Monit Assess 195:644. https://doi.org/10.1007/s10661-023-11212-x

Masood A, Hameed MM, Srivastava A, Pham QB, Ahmad K, Razali SFM, Baowidan SA (2023) Improving PM₂.₅ prediction in New Delhi using a hybrid extreme learning machine coupled with snake optimization algorithm. Sci Rep 13:21057. https://doi.org/10.1038/s41598-023-47492-z

Nejad MT, Ghalehteimouri KJ, Talkhabi H, Dolatshahi Z (2023) The relationship between atmospheric temperature inversion and urban air pollution characteristics: A case study of Tehran, Iran. Discover Environ 1:17. https://doi.org/10.1007/s44274-023-00018-w

Peng J, Han H, Yi Y, Huang H, Xie L (2022) Machine learning and deep learning modeling and simulation for predicting PM₂.₅ concentrations. Chemosphere 308:136353. https://doi.org/10.1016/j.chemosphere.2022.136353

Pope CA III, Dockery DW (2006) Health effects of fine particulate air pollution: Lines that connect. J Air Waste Manag Assoc 56(6):709–742. https://doi.org/10.1080/10473289.2006.10464485

Shi X, Brasseur GP (2020) The response in air quality to the reduction of Chinese economic activities during the COVID-19 outbreak. Geophys Res Lett 47(11):e2020GL088070. https://doi.org/10.1029/2020GL088070

Taghizadeh F, Mokhtarani B, Rahmanian N (2023) Air pollution in Iran: The current status and potential solutions. Environ Monit Assess 195:737. https://doi.org/10.1007/s10661-023-11296-5

U.S. Environmental Protection Agency (2020) Technical assistance document for the reporting of daily air quality: Air Quality Index (AQI) (EPA-454/B-20-005). https://www.epa.gov/sites/default/files/2020-09/documents/aqi-technical-assistance-document-sept2020.pdf

Wang Y, Tian S, Zhang P (2024) Novel spatio-temporal attention causal convolutional neural network for multi-site PM₂.₅ prediction. Front Environ Sci 12:1408370. https://doi.org/10.3389/fenvs.2024.1408370

World Health Organization (2021) WHO global air quality guidelines: Particulate matter (PM2.5 and PM10), ozone, nitrogen dioxide, sulfur dioxide and carbon monoxide. https://www.who.int/publications/i/item/9789240034228

Wu Y, Lin S, Shi K, Ye Z, Fang Y (2022) Seasonal prediction of daily PM₂.₅ concentrations with interpretable machine learning: a case study of Beijing, China. Environ Sci Pollut Res Int 29(30):45821–45836. https://doi.org/10.1007/s11356-022-18913-9

Xu S, Li W, Zhu Y, Xu A (2022) A novel hybrid model for six main pollutant concentrations forecasting based on improved LSTM neural networks. Sci Rep 12:14434. https://doi.org/10.1038/s41598-022-17754-3

Zaini N, Ean LW, Ahmed AN, Abdul Malek M, Chow MF (2022) PM₂.₅ forecasting for an urban area based on deep learning and decomposition method. Sci Rep 12:17565. https://doi.org/10.1038/s41598-022-21769-1

Additional Declarations

The authors declare no competing interests.
Introduction","content":"\u003cp\u003eTehran, the capital of Iran and a major Middle Eastern metropolis with over 15\u0026nbsp;million residents in its greater urban region, faces intensifying air quality challenges driven by rapid urbanization, industrial expansion, and escalating vehicular emissions. Enclosed by the Alborz mountain range, the city\u0026rsquo;s topography acts as a natural barrier that traps pollutants near the surface, amplifying health risks from elevated concentrations of PM₂.₅, PM₁₀, NO₂, CO, SO₂, and O₃\u0026mdash;substances strongly linked to respiratory and cardiovascular diseases (Lelieveld et al., \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e2015\u003c/span\u003e; Pope \u0026amp; Dockery, \u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e2006\u003c/span\u003e). Globally, air pollution ranks among the leading causes of premature mortality, with low- and middle-income countries such as Iran bearing a disproportionate burden (Lelieveld et al., \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e2015\u003c/span\u003e; World Health Organization, \u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e2021\u003c/span\u003e). In Tehran, deteriorating air quality has been correlated with increased hospital admissions for asthma and chronic obstructive pulmonary disease, underscoring the urgent need for robust, data-driven forecasting tools to protect public health and inform evidence-based environmental governance.\u003c/p\u003e\u003cp\u003eLong-term monitoring is essential to fully capture the complexity of urban air pollution\u0026mdash;including seasonal variability, multiyear trends, and the effects of major disruptions such as regulatory measures or global events like the COVID-19 pandemic (Le et al., \u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e2020\u003c/span\u003e; Shi \u0026amp; Brasseur, \u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e2020\u003c/span\u003e). 
Tehran\u0026rsquo;s pollution dynamics are further shaped by recurring meteorological phenomena, particularly winter temperature inversions that confine pollutants near the surface (Nejad et al., \u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e2023\u003c/span\u003e). This intricate interplay between anthropogenic emissions and atmospheric processes necessitates high-resolution, spatially disaggregated analysis to reveal localized exposure patterns that city-wide averages inevitably obscure.\u003c/p\u003e\u003cp\u003eThis study presents a decade-long, district-level assessment of Tehran\u0026rsquo;s air pollution (April 2015\u0026ndash;April 2025)\u0026mdash;one of the most comprehensive evaluations to date. Six districts\u0026mdash;1, 2, 6, 14, 15, and 19\u0026mdash;were strategically selected to represent the city\u0026rsquo;s spatial, socioeconomic, and emission heterogeneity. District 1 corresponds to the northern residential/foothill zone, characterized by low local emissions but high sensitivity to topographically trapped and regionally transported pollutants. District 2 serves as a major high-traffic corridor in the northwest, intersected by multiple expressways and dominated by mobile-source emissions. District 6 encompasses central Tehran\u0026rsquo;s administrative and commercial core, marked by intense vehicular activity and dense urban morphology. District 14, in the southeast, typifies high-density residential neighborhoods influenced by pollutant transport from adjacent industrial areas. District 15, located in the far south, hosts Tehran\u0026rsquo;s primary industrial cluster, generating substantial stationary-source and heavy-vehicle emissions. Finally, District 19, situated in the low-lying south-southwest basin, functions as a receptor zone highly prone to pollutant accumulation. 
This spatial stratification provides a controlled analytical framework that uncovers district-specific pollution regimes otherwise masked in aggregated city-level statistics.\u003c/p\u003e\u003cp\u003eWe develop a transparent, reproducible, and highly interpretable machine learning framework based on three state-of-the-art tree-based ensemble models: XGBoost, CatBoost, and LightGBM. These models were selected for their proven accuracy, computational efficiency, and native support for SHAP (Shapley Additive exPlanations)-based interpretability (Chen \u0026amp; Guestrin, \u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e2016\u003c/span\u003e; Lundberg \u0026amp; Lee, \u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e2017\u003c/span\u003e). Using only historical pollutant concentrations as input, our approach avoids reliance on external covariates while maintaining scientific rigor and real-world deployability. Critically, SHAP analysis transforms predictive outputs into actionable, physically interpretable insights, revealing the dominant pollution drivers in each district and season\u0026mdash;thereby advancing recent developments in explainable AI (XAI) for environmental monitoring (Wu et al., \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e2022\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eThis work delivers five key contributions, positioning it at the forefront of AI-driven environmental science:\u003c/p\u003e\u003cp\u003e\u003col\u003e\u003cspan\u003e\u003cli\u003e\u003cp\u003eA Decade of High-Resolution Data Across Strategically Selected Districts: We analyze 10 years of hourly monitoring data from six socio-spatially diverse districts, enabling robust modeling of long-term trends and heterogeneous urban responses across Tehran\u0026rsquo;s complex fabric.\u003c/p\u003e\u003c/li\u003e\u003c/span\u003e\u003cspan\u003e\u003cli\u003e\u003cp\u003eA Highly Accurate and Interpretable Modeling Framework: Our ensemble of tree-based models 
achieves precise short-term forecasting while ensuring full transparency through SHAP-based explainability\u0026mdash;offering a reproducible alternative to complex deep learning or hybrid architectures that often require extensive external covariates (Lin et al., \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e2022\u003c/span\u003e; Peng et al., \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e2022\u003c/span\u003e; Zaini et al., \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e2022\u003c/span\u003e; Xu et al., \u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e2022\u003c/span\u003e).\u003c/p\u003e\u003c/li\u003e\u003c/span\u003e\u003cspan\u003e\u003cli\u003e\u003cp\u003eContext-Aware Pollution Regime Identification: SHAP analysis uncovers interpretable, spatiotemporally dynamic pollution signatures that reflect local emission sources, seasonal patterns, and urban form\u0026mdash;providing evidence-based guidance for precision environmental policy.\u003c/p\u003e\u003c/li\u003e\u003c/span\u003e\u003cspan\u003e\u003cli\u003e\u003cp\u003eQuantification of Real-World Policy Impacts: We capture significant air quality improvements during the 2020\u0026ndash;2022 mobility restrictions and their post-pandemic rebound, demonstrating the model\u0026rsquo;s sensitivity to societal interventions\u0026mdash;consistent with global observations of pandemic-era air quality shifts (Le et al., \u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e2020\u003c/span\u003e; Taghizadeh et al., \u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e2023\u003c/span\u003e).\u003c/p\u003e\u003c/li\u003e\u003c/span\u003e\u003cspan\u003e\u003cli\u003e\u003cp\u003eA Scalable, Replicable Blueprint for Global Megacities: The end-to-end pipeline\u0026mdash;from preprocessing to prediction and explanation\u0026mdash;is designed for adaptation in other data-constrained urban environments, particularly in low- and middle-income regions where monitoring 
infrastructure is limited yet governance needs are urgent (Taghizadeh et al., \u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e2023\u003c/span\u003e).\u003c/p\u003e\u003c/li\u003e\u003c/span\u003e\u003c/ol\u003e\u003c/p\u003e\u003cp\u003eBy advancing a transparent, data-driven, and policy-ready framework, this study directly informs environmental decision-making in Tehran and provides a transferable model for cities worldwide seeking to transform air quality monitoring data into actionable intelligence for healthier, more sustainable urban futures. Our work demonstrates how existing monitoring infrastructure can be leveraged to generate high-resolution, policy-relevant exposure intelligence through modern data science.\u003c/p\u003e"},{"header":"2. Methodology","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e\u003ch2\u003e2.1 Data Source\u003c/h2\u003e\u003cp\u003eHourly air quality data were obtained from the Tehran Air Quality Monitoring System, operated by the Tehran Environmental Monitoring Center. The dataset includes measurements of six key pollutants\u0026mdash;PM₂.₅, PM₁₀, NO₂, CO, SO₂, and O₃\u0026mdash;collected from fixed monitoring stations in six districts: 1, 2, 6, 14, 15, and 19. 
These districts were selected to represent Tehran\u0026rsquo;s socio-spatial and emission heterogeneity:\u003c/p\u003e\u003cp\u003e\u003cul\u003e\u003cli\u003e\u003cp\u003eDistrict 1: northern residential/foothill zone (low local emissions, high sensitivity to regional transport),\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eDistricts 2 and 14: high-traffic corridors (dominated by mobile-source emissions),\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eDistrict 6: central administrative and commercial core (intense vehicular and energy-related activity),\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eDistrict 15: southern industrial cluster (stationary-source and heavy-vehicle emissions),\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eDistrict 19: south-southwest receptor basin (prone to pollutant accumulation).\u003c/p\u003e\u003c/li\u003e\u003c/ul\u003e\u003c/p\u003e\u003cp\u003eCovering April 2015 to April 2025, this decade-long dataset enables robust long-term trend analysis, seasonal decomposition, and rigorous out-of-sample forecasting. Data quality assessments confirmed high completeness, with missing values systematically addressed during preprocessing.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec4\" class=\"Section2\"\u003e\u003ch2\u003e2.2 Preprocessing\u003c/h2\u003e\u003cp\u003eData preprocessing ensured temporal consistency and modeling suitability. Missing values were handled using a time-series-aware imputation strategy: linear interpolation for gaps\u0026thinsp;\u0026le;\u0026thinsp;12 hours, followed by a 24-hour centered rolling mean for longer gaps, with global mean imputation as a final fallback. 
This approach preserved temporal autocorrelation while minimizing distortion of extreme events.\u003c/p\u003e\u003cp\u003eFeature engineering focused on lagged predictors (t\u0026thinsp;\u0026minus;\u0026thinsp;1) for each pollutant to capture short-term autocorrelation\u0026mdash;a critical property of urban air quality time series. All features were normalized to the [0, 1] range using MinMaxScaler to stabilize model training.\u003c/p\u003e\u003cp\u003eThe Air Quality Index (AQI) was computed following U.S. EPA (2020) guidelines:\u003cdiv id=\"Equa\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equa\" name=\"EquationSource\"\u003e\n$$\\:AQI=\\text{m}\\text{a}\\text{x}\\left({\\text{S}\\text{u}\\text{b}-\\text{I}\\text{n}\\text{d}\\text{e}\\text{x}}_{\\text{P}\\text{M}₂.₅}{,\\text{S}\\text{u}\\text{b}-\\text{I}\\text{n}\\text{d}\\text{e}\\text{x}}_{\\text{P}\\text{M}₁₀}{,\\text{S}\\text{u}\\text{b}-\\text{I}\\text{n}\\text{d}\\text{e}\\text{x}}_{\\text{N}\\text{O}₂}{,\\text{S}\\text{u}\\text{b}-\\text{I}\\text{n}\\text{d}\\text{e}\\text{x}}_{\\text{S}\\text{O}₂}{,\\text{S}\\text{u}\\text{b}-\\text{I}\\text{n}\\text{d}\\text{e}\\text{x}}_{\\text{C}\\text{O}}{,\\text{S}\\text{u}\\text{b}-\\text{I}\\text{n}\\text{d}\\text{e}\\text{x}}_{\\text{O}₃}\\right)$$\u003c/div\u003e\u003c/div\u003e,\u003c/p\u003e\u003cp\u003ewhere sub-indices were derived from EPA breakpoints\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec5\" class=\"Section2\"\u003e\u003ch2\u003e2.3 Concentration Ranking and Seasonal Analysis\u003c/h2\u003e\u003cp\u003eTo characterize spatial and temporal pollution patterns, we computed descriptive statistics for all six pollutants:\u003c/p\u003e\u003cp\u003e\u003cul\u003e\u003cli\u003e\u003cp\u003eAnnual mean concentrations (2015\u0026ndash;2025) for each district.\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eSeasonal mean concentrations, stratified into warm (March\u0026ndash;September) and cold 
(October\u0026ndash;February) periods to reflect Tehran\u0026rsquo;s distinct meteorological regimes (e.g., winter inversions and summer photochemistry).\u003c/p\u003e\u003c/li\u003e\u003c/ul\u003e\u003c/p\u003e\u003cp\u003eThese analyses provide essential context for interpreting model predictions and SHAP-based driver identification.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec6\" class=\"Section2\"\u003e\u003ch2\u003e2.4 Model Architecture\u003c/h2\u003e\u003cp\u003eA transparent and reproducible machine learning framework was developed using XGBoost (Chen \u0026amp; Guestrin, \u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e2016\u003c/span\u003e), CatBoost, and LightGBM\u0026mdash;state-of-the-art tree-based ensembles selected for their accuracy, computational efficiency, and native support for SHAP-based interpretability.\u003c/p\u003e\u003cp\u003eAll models were implemented as multi-output regressors to simultaneously predict the six pollutant concentrations. Input features consisted of the lag-1 values of all six pollutants; the target was the current-hour concentration vector. This design enables short-term forecasting using only historical pollutant data, ensuring broad applicability in urban settings equipped with monitoring infrastructure.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec7\" class=\"Section2\"\u003e\u003ch2\u003e2.5 Optimization, Model Selection, and AQI Derivation\u003c/h2\u003e\u003cp\u003eModel hyperparameters were optimized using Bayesian optimization (20 trials per model) to maximize out-of-sample R\u0026sup2; on a validation set. For each district, the best-performing model was selected based on R\u0026sup2; and MAE on the holdout test period (January 2024\u0026ndash;April 2025).\u003c/p\u003e\u003cp\u003eAir Quality Index (AQI) categories (e.g., \u003cem\u003eGood\u003c/em\u003e, \u003cem\u003eModerate\u003c/em\u003e, \u003cem\u003eUnhealthy\u003c/em\u003e) were derived post-hoc by applying U.S. 
EPA breakpoints to the predicted pollutant concentrations. This approach ensures perfect alignment between scientific forecasts (\u0026micro;g/m\u0026sup3;) and public health outputs (AQI categories), without requiring a separate classification model or introducing prediction misalignment.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec8\" class=\"Section2\"\u003e\u003ch2\u003e2.6 Training Protocol\u003c/h2\u003e\u003cp\u003eThe dataset was partitioned chronologically to preserve temporal dynamics:\u003c/p\u003e\u003cp\u003e\u003cul\u003e\u003cli\u003e\u003cp\u003eTraining: April, 2015 \u0026ndash; December, 2023\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eTest: January, 2024 \u0026ndash; April, 2025\u003c/p\u003e\u003c/li\u003e\u003c/ul\u003e\u003c/p\u003e\u003cp\u003eThis split enables rigorous out-of-sample forecasting and avoids data leakage. All models were trained on the lag-1 feature set and evaluated on their ability to predict the next hour\u0026rsquo;s pollutant concentrations.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec9\" class=\"Section2\"\u003e\u003ch2\u003e2.7 Forecasting, Evaluation, and Explainability\u003c/h2\u003e\u003cp\u003eThe best-performing model for each district was used to generate forecasts for the test period (January 2024\u0026ndash;April 2025). Evaluation was conducted at two levels:\u003c/p\u003e\u003cp\u003e\u003col\u003e\u003cspan\u003e\u003cli\u003e\u003cp\u003ePollutant-level regression performance, assessed using Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and R\u0026sup2;.\u003c/p\u003e\u003c/li\u003e\u003c/span\u003e\u003cspan\u003e\u003cli\u003e\u003cp\u003eAQI-level classification performance, evaluated using accuracy, precision, recall, and F1-score on AQI categories derived from predicted pollutant concentrations via U.S. 
EPA breakpoints.\u003c/p\u003e\u003c/li\u003e\u003c/span\u003e\u003c/ol\u003e\u003c/p\u003e\u003cp\u003eExplainability was provided through SHAP (Shapley Additive exPlanations) analysis, which quantified the contribution of each lagged pollutant feature to model predictions. Annual SHAP rankings were computed using the full training set, while seasonal SHAP rankings were derived by stratifying the training data into warm (March\u0026ndash;September) and cold (October\u0026ndash;February) periods.\u003c/p\u003e\u003cp\u003eThe framework\u0026rsquo;s sensitivity to real-world events was demonstrated through an analysis of pre/during/post-COVID pollutant trends.\u003c/p\u003e\u003c/div\u003e"},{"header":"3. Results","content":"\u003cdiv id=\"Sec11\" class=\"Section2\"\u003e\u003ch2\u003e3.1 Annual and Seasonal Pollutant Concentrations\u003c/h2\u003e\u003cp\u003eTable\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e presents the annual mean concentrations of six key pollutants across Tehran\u0026rsquo;s six socio-spatially stratified districts over the 2015\u0026ndash;2025 period. District 19 (South-Southwest receptor basin) consistently records the highest concentrations for PM₂.₅ (48.2 \u0026micro;g/m\u0026sup3;), PM₁₀ (109.7 \u0026micro;g/m\u0026sup3;), and CO (2287.3 \u0026micro;g/m\u0026sup3;), confirming its status as Tehran\u0026rsquo;s most polluted district\u0026mdash;consistent with its dense residential/commercial land use, high vehicular density, and topographical vulnerability to pollutant accumulation in the low-lying basin. 
District 2 (northwest high-traffic corridor) exhibits the highest NO₂ (90.1 \u0026micro;g/m\u0026sup3;) and the second-highest CO, reflecting its role as a major transportation artery intersected by multiple expressways.\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eAnnual Mean Pollutant Concentrations (2015\u0026ndash;2025)\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"7\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c7\" colnum=\"7\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eDistrict\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003ePM₂.₅ (\u0026micro;g/m\u0026sup3;)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003ePM₁₀ (\u0026micro;g/m\u0026sup3;)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003eNO₂ (\u0026micro;g/m\u0026sup3;)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003eCO (\u0026micro;g/m\u0026sup3;)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c6\"\u003e\u003cp\u003eSO₂ 
(\u0026micro;g/m\u0026sup3;)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c7\"\u003e\u003cp\u003eO₃ (\u0026micro;g/m\u0026sup3;)\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e22.8\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e61.3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e108.5\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e2062.3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e13.6\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e42.7\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e40.9\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e89.6\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e90.1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e2534.8\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e20.8\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e37.6\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e6\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e33.3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e82\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" 
colname=\"c4\"\u003e\u003cp\u003e77.2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e2254.5\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e16.9\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e43\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e14\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e35.1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e87.4\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e84.5\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e1918.5\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e19.8\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e42.3\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e15\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e24.4\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e70.3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e86.1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e1485.7\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e14.8\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e46.2\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e19\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e48.2\u003c/p\u003e\u003c/td\u003e\u003ctd 
align=\"left\" colname=\"c3\"\u003e\u003cp\u003e109.7\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e73.2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e2287.3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e20.1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e33.7\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003eSO₂ concentrations peak in District 2 (20.8 \u0026micro;g/m\u0026sup3;) and District 19 (20.1 \u0026micro;g/m\u0026sup3;), suggesting contributions from industrial or energy-related combustion sources beyond mobile emissions alone. In contrast, District 15 (southern industrial cluster) shows the lowest CO (1485.7 \u0026micro;g/m\u0026sup3;) and the second-lowest PM₂.₅ (24.4 \u0026micro;g/m\u0026sup3;) and PM₁₀ (70.3 \u0026micro;g/m\u0026sup3;) after District 1, which is consistent with its profile as a regulated industrial zone that likely implements emission controls despite its stationary-source activity. Critically, all districts exceed the WHO\u0026rsquo;s annual PM₂.₅ guideline (5 \u0026micro;g/m\u0026sup3;) by a factor of roughly 4.6 (District 1) to 9.6 (District 19), underscoring the severity and persistence of Tehran\u0026rsquo;s air quality crisis.\u003c/p\u003e\u003cp\u003eTable\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e reveals pronounced seasonal patterns shaped by Tehran\u0026rsquo;s semi-arid climate and recurrent meteorological phenomena:\u003c/p\u003e\u003cp\u003e\u003cul\u003e\u003cli\u003e\u003cp\u003eDuring the cold season (October\u0026ndash;February), concentrations of PM₂.₅, NO₂, CO, and SO₂ are elevated in every district, and PM₁₀ in all but District 15 (72.4 \u0026micro;g/m\u0026sup3; warm vs. 67.3 \u0026micro;g/m\u0026sup3; cold). 
For example, District 19\u0026rsquo;s PM₂.₅ increases from 44.2 \u0026micro;g/m\u0026sup3; (warm) to 53.3 \u0026micro;g/m\u0026sup3; (cold), a rise consistent with wintertime temperature inversions that suppress vertical dispersion and with increased demand for residential and commercial heating.\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eIn the warm season (March\u0026ndash;September), O₃ concentrations rise sharply due to enhanced photochemical activity under intense solar radiation. For instance, District 1\u0026rsquo;s O₃ rises from 25.9 \u0026micro;g/m\u0026sup3; (cold) to 54.9 \u0026micro;g/m\u0026sup3; (warm), reflecting the formation of secondary pollutants from NOₓ and VOC precursors.\u003c/p\u003e\u003c/li\u003e\u003c/ul\u003e\u003c/p\u003e\u003cp\u003eThese seasonal shifts exhibit notable spatial heterogeneity:\u003c/p\u003e\u003cp\u003e\u003cul\u003e\u003cli\u003e\u003cp\u003eTraffic-heavy districts (2 and 14) show marked cold-season increases in NO₂ and CO: District 2\u0026rsquo;s NO₂ rises from 80.7 \u0026micro;g/m\u0026sup3; (warm) to 103.1 \u0026micro;g/m\u0026sup3; (cold) and District 14\u0026rsquo;s CO from 1636 \u0026micro;g/m\u0026sup3; to 2307.1 \u0026micro;g/m\u0026sup3;, consistent with the compounding effect of traffic emissions and meteorological trapping. The largest seasonal CO increase, however, occurs in District 1 (1695.3 \u0026micro;g/m\u0026sup3; to 2568.6 \u0026micro;g/m\u0026sup3;).\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eAmong the industrial districts (6 and 15), District 15 shows only a modest winter increase in PM₂.₅ (23.4 to 26 \u0026micro;g/m\u0026sup3;) and a comparatively small rise in SO₂ (13.4 to 16.8 \u0026micro;g/m\u0026sup3;), suggesting regulated or consistent industrial combustion processes, whereas District 6 shows a pronounced PM₂.₅ increase (27.1 to 41.8 \u0026micro;g/m\u0026sup3;).\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eDistrict 19, despite its high precursor (NOₓ) concentrations, exhibits relatively low summer O₃ (43.2 \u0026micro;g/m\u0026sup3;) compared with the other districts (48\u0026ndash;56.2 \u0026micro;g/m\u0026sup3;), likely due to NO titration, where fresh NO emissions chemically destroy O₃, a hallmark of traffic-influenced photochemical regimes in receptor 
zones.\u003c/p\u003e\u003c/li\u003e\u003c/ul\u003e\u003c/p\u003e\u003cp\u003eThese empirically observed concentration dynamics closely validate the SHAP-based driver rankings in Table\u0026nbsp;\u003cspan refid=\"Tab6\" class=\"InternalRef\"\u003e6\u003c/span\u003e:\u003c/p\u003e\u003cp\u003e\u003cul\u003e\u003cli\u003e\u003cp\u003eIn the cold season, SHAP identifies CO and NO₂ as dominant predictive features across most districts, mirroring their elevated concentrations and primary-emission origins.\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eIn the warm season, O₃ and NO₂ emerge as top predictors\u0026mdash;particularly in photochemical hotspots like District 19\u0026mdash;directly reflecting the shift toward secondary pollution chemistry.\u003c/p\u003e\u003c/li\u003e\u003c/ul\u003e\u003c/p\u003e\u003cp\u003eThus, the concentration trends in Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e provide the physical foundation for the model\u0026rsquo;s interpretable insights, reinforcing the mechanistic coherence of our AI-driven air quality assessment framework.\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eSeasonal Mean Pollutant Concentrations (2015\u0026ndash;2025)\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"8\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" 
colnum=\"5\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c7\" colnum=\"7\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c8\" colnum=\"8\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eDistrict\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eSeason\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003ePM₂.₅ (\u0026micro;g/m\u0026sup3;)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003ePM₁₀ (\u0026micro;g/m\u0026sup3;)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003eNO₂ (\u0026micro;g/m\u0026sup3;)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c6\"\u003e\u003cp\u003eCO (\u0026micro;g/m\u0026sup3;)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c7\"\u003e\u003cp\u003eSO₂ (\u0026micro;g/m\u0026sup3;)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c8\"\u003e\u003cp\u003eO₃ (\u0026micro;g/m\u0026sup3;)\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eWarm\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e20.2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e58.7\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e103.7\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e1695.3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e12.7\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c8\"\u003e\u003cp\u003e54.9\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCold\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e26.3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e64.8\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e115.2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e2568.6\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e14.9\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e25.9\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eWarm\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e37.1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e88.2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e80.7\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e2403.4\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e18\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e48\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCold\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e46.2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e91.6\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c5\"\u003e\u003cp\u003e103.1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e2717.9\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e24.6\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e23.2\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e6\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eWarm\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e27.1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e81.2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e67.6\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e2027.9\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e14.3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e56.2\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e6\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCold\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e41.8\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e83\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e90.4\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e2563.1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e20.4\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e24.9\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e14\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c2\"\u003e\u003cp\u003eWarm\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e29.3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e86.6\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e79.6\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e1636\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e16.8\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e54.1\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e14\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCold\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e43.1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e88.6\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e91.2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e2307.1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e24\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e26.1\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e15\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eWarm\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e23.4\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e72.4\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e75.6\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e1358.5\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c7\"\u003e\u003cp\u003e13.4\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e54\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e15\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCold\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e26\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e67.3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e101\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e1668.2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e16.8\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e35\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e19\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eWarm\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e44.2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e104.3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e69.4\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e2011\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e16.6\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e43.2\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e19\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCold\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e53.3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c4\"\u003e\u003cp\u003e116.5\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e78\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e2631\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e24.6\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e21.9\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec12\" class=\"Section2\"\u003e\u003ch2\u003e3.2 Regression and AQI Classification Performance\u003c/h2\u003e\u003cp\u003eOur framework achieves moderate to high predictive accuracy for PM₂.₅ across all six socio-spatially stratified districts of Tehran, with Mean Absolute Error (MAE) ranging from 3.03 to 9.02 \u0026micro;g/m\u0026sup3;, Root Mean Squared Error (RMSE) from 4.18 to 13.91 \u0026micro;g/m\u0026sup3;, and coefficient of determination (R\u0026sup2;) from 0.65 to 0.91 (Table\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e). 
These results\u0026mdash;derived from a full decade of hourly monitoring data (April 2015\u0026ndash;April 2025)\u0026mdash;demonstrate that interpretable tree-based ensemble models (CatBoost and LightGBM) can effectively capture the nonlinear, temporally dynamic nature of urban air pollution without relying on external meteorological covariates.\u003c/p\u003e\u003cp\u003ePerformance varies systematically with district characteristics:\u003c/p\u003e\u003cp\u003e\u003cul\u003e\u003cli\u003e\u003cp\u003eDistrict 1 (northern residential/foothill zone) and District 14 (southeastern commercial/transit hub) exhibit the highest predictive performance (R\u0026sup2; = 0.914 and 0.873, respectively), reflecting relatively stable and temporally consistent emission patterns dominated by regional transport (D1) and recurrent traffic congestion (D14).\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eDistrict 15 (southern industrial cluster) shows the lowest R\u0026sup2; (0.654), attributable to the intermittent nature of industrial operations and the high volatility of PM₁₀ and SO₂ emissions associated with unregulated fugitive dust and episodic stack releases.\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eDistrict 19 (south-southwest receptor basin), despite being Tehran\u0026rsquo;s most polluted district in terms of mean PM₂.₅ (48.2 \u0026micro;g/m\u0026sup3;), achieves a moderate R\u0026sup2; (0.791) but the highest MAE (9.02 \u0026micro;g/m\u0026sup3;), consistent with its role as a pollution accumulation zone where sharp, inversion-driven spikes challenge even high-performing models.\u003c/p\u003e\u003c/li\u003e\u003c/ul\u003e\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab3\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 3\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eRegression Models Performance by 
District\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"5\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eDistrict\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eBest Model\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003eMAE (\u0026micro;g/m\u0026sup3;)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003eRMSE (\u0026micro;g/m\u0026sup3;)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003eR\u0026sup2;\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCatBoost\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e3.03\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e4.175\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0.914\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eLightGBM\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" 
colname=\"c3\"\u003e\u003cp\u003e6.762\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e9.503\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0.796\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e6\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eLightGBM\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e5.29\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e11.274\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0.741\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e14\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eLightGBM\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e5.011\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e7.62\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0.873\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e15\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eLightGBM\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e4.527\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e10.658\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0.654\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e19\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c2\"\u003e\u003cp\u003eCatBoost\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e9.015\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e13.914\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0.791\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003eAir Quality Index (AQI) categories\u0026mdash;spanning \u003cem\u003eGood\u003c/em\u003e to \u003cem\u003eHazardous\u003c/em\u003e\u0026mdash;were derived post-hoc by applying U.S. EPA (2020) breakpoints directly to the continuous PM₂.₅ predictions generated by each district\u0026rsquo;s best-performing regression model. This unified, two-stage approach (regression \u0026rarr; AQI conversion) eliminates misalignment between scientific forecasts and public health outputs, ensuring that classification results are fully traceable to physical concentration predictions.\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab4\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 4\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eClassification Models Performance by District\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"6\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" 
class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eDistrict\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eClassification Method\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003eAccuracy\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003ePrecision\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003eRecall\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c6\"\u003e\u003cp\u003eF1-Score\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCatBoost \u0026rarr; AQI\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e0.899\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.9\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0.899\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e0.894\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eLightGBM \u0026rarr; AQI\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e0.915\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.919\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0.915\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" 
colname=\"c6\"\u003e\u003cp\u003e0.913\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e6\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eLightGBM \u0026rarr; AQI\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e0.919\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.92\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0.919\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e0.912\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e14\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eLightGBM \u0026rarr; AQI\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e0.939\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.939\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0.939\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e0.936\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e15\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eLightGBM \u0026rarr; AQI\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e0.94\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.94\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0.94\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" 
colname=\"c6\"\u003e\u003cp\u003e0.937\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e19\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCatBoost \u0026rarr; AQI\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e0.92\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.928\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0.92\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e0.919\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003eThe method delivers exceptional AQI classification performance (Table\u0026nbsp;\u003cspan refid=\"Tab4\" class=\"InternalRef\"\u003e4\u003c/span\u003e), with F1-scores ranging from 0.894 to 0.937 and accuracy between 89.9% and 94.0%\u0026mdash;metrics that meet or exceed benchmarks in recent ML-based air quality studies (Wu et al., \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e2022\u003c/span\u003e; Peng et al., \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e2022\u003c/span\u003e). Notably:\u003c/p\u003e\u003cp\u003e\u003cul\u003e\u003cli\u003e\u003cp\u003eDistrict 15 achieves the highest F1-score (0.937), likely because its industrial emission profile produces more distinct, class-separable AQI levels (e.g., frequent \u003cem\u003eModerate\u003c/em\u003e vs. 
rare \u003cem\u003eHazardous\u003c/em\u003e).\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eDistrict 1, despite high regression accuracy, shows the lowest F1-score (0.894), possibly due to mixed-source complexity (regional transport, photochemistry, and residential heating) that blurs AQI category boundaries during transitional periods.\u003c/p\u003e\u003c/li\u003e\u003c/ul\u003e\u003c/p\u003e\u003cp\u003eTogether, these results confirm that a transparent, pollutant-only ML pipeline can deliver both high-fidelity concentration forecasts and operationally reliable public health alerts, even in a data-constrained megacity like Tehran. This dual capability\u0026mdash;scientific precision coupled with regulatory alignment\u0026mdash;is essential for real-world deployment in environmental monitoring and public health response systems.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec13\" class=\"Section2\"\u003e\u003ch2\u003e3.3 AQI Statistics and Health Risk\u003c/h2\u003e\u003cp\u003eOver the 10-year study period (2015\u0026ndash;2025), the Air Quality Index (AQI) across Tehran\u0026rsquo;s six districts spanned the entire U.S. EPA classification spectrum\u0026mdash;from \u0026ldquo;Good\u0026rdquo; (0\u0026ndash;50) during clean, post-rain conditions to \u0026ldquo;Hazardous\u0026rdquo; (\u0026gt;\u0026thinsp;300) during severe winter inversion episodes. District-level mean AQI values ranged from 77.9 (District 15) to 126.7 (District 19), with hourly extremes as low as 11.3 and as high as 313.7 (Table\u0026nbsp;\u003cspan refid=\"Tab5\" class=\"InternalRef\"\u003e5\u003c/span\u003e). 
This wide variability reflects Tehran\u0026rsquo;s bimodal pollution regime: in winter, temperature inversions trap primary pollutants such as PM₂.₅, NO₂, and CO near ground level, producing hazardous air quality; in summer, intense solar radiation fuels photochemical reactions that elevate ozone (O₃) concentrations.\u003c/p\u003e\u003cp\u003eDistrict 19\u0026mdash;a low-lying, densely populated receptor basin in the south-southwest\u0026mdash;emerges as Tehran\u0026rsquo;s most polluted district, with the highest mean AQI (126.7) and peak PM₂.₅ levels (48.2 \u0026micro;g/m\u0026sup3;). Its topography acts as a natural sink, accumulating emissions transported from surrounding high-traffic and industrial zones under stable atmospheric conditions. In contrast:\u003c/p\u003e\u003cp\u003e\u003cul\u003e\u003cli\u003e\u003cp\u003eDistricts 2 and 14, major traffic corridors, show consistently elevated NO₂ and CO, reflecting their role as primary transit arteries.\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eDistrict 6 (central commercial core) and District 15 (southern industrial cluster) exhibit moderate SO₂ but notable PM₁₀, largely driven by dust resuspension in arid peripheral areas and industrial activity.\u003c/p\u003e\u003c/li\u003e\u003c/ul\u003e\u003c/p\u003e\u003cp\u003e\u003cb\u003eHealth Risk Interpretation Using U.S. EPA AQI Categories\u003c/b\u003e\u003c/p\u003e\u003cp\u003eThe spatial distribution of AQI reveals stark disparities in population-level health exposure:\u003c/p\u003e\u003cp\u003e\u003cul\u003e\u003cli\u003e\u003cp\u003eDistricts 1 (northern residential/foothill zone) and 15 (industrial cluster) experience consistently \u0026ldquo;Moderate\u0026rdquo; air quality (Mean AQI\u0026thinsp;\u0026asymp;\u0026thinsp;78; EPA category: 51\u0026ndash;100). 
While air quality at this level is generally acceptable for the public, both districts occasionally experience \u0026ldquo;Very Unhealthy\u0026rdquo; episodes (AQI\u0026thinsp;\u0026gt;\u0026thinsp;200) during extreme winter inversions or regional dust storms, posing acute risks to sensitive individuals.\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eDistricts 6 and 14 hover near the upper boundary of the Moderate range (Mean AQI\u0026thinsp;=\u0026thinsp;96.0 and 98.3, respectively), frequently surpassing AQI 100 in winter. Though not classified as high-risk on average, they regularly expose children, the elderly, and those with cardiopulmonary conditions to unhealthy air.\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eDistrict 2 (high-traffic corridor) crosses into the \u0026ldquo;Unhealthy for Sensitive Groups\u0026rdquo; category (Mean AQI\u0026thinsp;=\u0026thinsp;110.1) and records \u0026ldquo;Very Unhealthy\u0026rdquo; peaks (maximum AQI 257.1), underscoring the chronic burden of traffic-related pollution.\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eDistrict 19 has the highest mean AQI of the six districts (126.7), placing it, like District 2, in the \u0026ldquo;Unhealthy for Sensitive Groups\u0026rdquo; category (AQI 101\u0026ndash;150) rather than the \u0026ldquo;Unhealthy\u0026rdquo; category for the general population (AQI 151\u0026ndash;200). Its median AQI of 132.1 and maximum of 313.7 nonetheless indicate that conditions regularly approach, and at times far exceed, the \u0026ldquo;Unhealthy\u0026rdquo; threshold, with peaks in the \u0026ldquo;Hazardous\u0026rdquo; range (AQI\u0026thinsp;\u0026gt;\u0026thinsp;300). Residents here face the highest chronic exposure among the districts studied, with implications for long-term respiratory and cardiovascular health.\u003c/p\u003e\u003c/li\u003e\u003c/ul\u003e\u003c/p\u003e\u003cp\u003eThis spatial stratification highlights a pressing environmental justice issue: vulnerable populations in low-elevation, high-density districts like 19 bear a disproportionate cumulative exposure burden, despite contributing minimally to emissions. 
Conversely, District 15, despite hosting industrial activity, achieves relatively better air quality\u0026mdash;likely due to lower residential density and possibly stricter emission controls. These findings argue compellingly for district-specific air quality management strategies that integrate not only emission inventories but also topographical vulnerability, land-use patterns, and demographic sensitivity. In a city where monitoring data are abundant but health outcomes are unequally distributed, such granular exposure intelligence is essential for equitable environmental governance.\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab5\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 5\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eAir Quality Index (AQI) Statistics by District\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"5\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eDistrict\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eMean AQI\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003eMedian AQI\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003eMin AQI\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003eMax 
AQI\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e77.9\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e73\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e17.6\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e234.7\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e110.1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e107.1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e11.3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e257.1\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e6\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e96\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e88.5\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e13.4\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e272.9\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e14\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e98.3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e93.5\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e24.5\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" 
colname=\"c5\"\u003e\u003cp\u003e248.2\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e15\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e78\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e75.6\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e20.3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e215.5\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e19\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e126.7\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e132.1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e30.2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e313.7\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec14\" class=\"Section2\"\u003e\u003ch2\u003e3.4 Seasonal SHAP Rankings: Uncovering District-Specific Pollution Drivers\u003c/h2\u003e\u003cp\u003eTable\u0026nbsp;\u003cspan refid=\"Tab6\" class=\"InternalRef\"\u003e6\u003c/span\u003e presents a comprehensive, data-driven quantification of pollutant feature importance across six socio-spatially distinct districts of Tehran, stratified by season (cold: October\u0026ndash;February; warm: March\u0026ndash;September). By leveraging SHAP (SHapley Additive exPlanations), this analysis moves beyond simple concentration metrics to reveal the \u003cem\u003epredictive influence\u003c/em\u003e of each lagged pollutant in forecasting air quality outcomes. 
The results expose clear, seasonally dynamic emission regimes that align precisely with Tehran\u0026rsquo;s documented pollution sources, land-use patterns, and meteorological context.\u003c/p\u003e\u003cp\u003e\u003cb\u003eCold Season: Traffic and Heating Emissions Dominate\u003c/b\u003e\u003c/p\u003e\u003cp\u003eDuring the cold season, temperature inversions and increased heating demand trap emissions near the surface, creating conditions where primary pollutants from combustion processes dominate. This is reflected in high SHAP values for CO and NO₂ across traffic-heavy and mixed-use districts:\u003c/p\u003e\u003cp\u003e\u003cul\u003e\u003cli\u003e\u003cp\u003eDistricts 1, 2, and 19 all show CO as the primary predictive driver (SHAP: 0.0755\u0026ndash;0.0784), directly implicating vehicular traffic and residential heating as dominant sources.\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eDistrict 14 (southeastern transit hub) shows NO₂ (0.0703) slightly edging out CO (0.0691) as the top driver\u0026mdash;a distinction reflecting its role as a major commuter corridor with persistent idling and congestion.\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eDistrict 6 (central commercial core) exhibits NO₂ as the top driver (SHAP\u0026thinsp;=\u0026thinsp;0.0851), consistent with intense vehicular activity and commercial energy use in this dense urban hub\u0026mdash;not industrial emissions.\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eDistrict 15 (southern industrial cluster) uniquely identifies PM₁₀ as the leading predictor (SHAP\u0026thinsp;=\u0026thinsp;0.0451), highlighting the role of resuspended dust and unpaved industrial surfaces in this arid, peripheral zone.\u003c/p\u003e\u003c/li\u003e\u003c/ul\u003e\u003c/p\u003e\u003cp\u003eThe minor but consistent presence of O₃ in cold-season rankings\u0026mdash;even under reduced sunlight\u0026mdash;reflects carryover precursor chemistry and NO titration dynamics, where residual NOₓ interacts 
with background ozone.\u003c/p\u003e\u003cp\u003e\u003cb\u003eWarm Season: Photochemical Smog and Persistent Dust Signatures Emerge\u003c/b\u003e\u003c/p\u003e\u003cp\u003eThe warm season is characterized by enhanced photochemical activity and reduced inversion trapping, leading to a distinct shift in predictive drivers:\u003c/p\u003e\u003cp\u003e\u003cul\u003e\u003cli\u003e\u003cp\u003eDistrict 19 (south-southwest receptor basin) exhibits a textbook photochemical regime, with O₃ as the dominant predictor (SHAP\u0026thinsp;=\u0026thinsp;0.0999), the highest O₃ value in any district or season and the second-highest SHAP value overall after NO₂ in District 6 during the warm season (0.1281). This aligns with its low-lying basin topography, which accumulates NOₓ and VOC precursors from surrounding traffic corridors, enabling intense ozone production under strong solar radiation.\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eDistrict 1 (northern residential/foothill zone) shows SO₂ as the primary driver (SHAP\u0026thinsp;=\u0026thinsp;0.0817), likely due to regional transport of combustion plumes from southern industrial and power-generation sources, with O₃ as a strong secondary influence (SHAP\u0026thinsp;=\u0026thinsp;0.0647).\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eDistricts 2 (northwest traffic corridor) and 14 (southeast transit hub) show a transition toward O₃ dominance: in District 2, CO (0.0592) and O₃ (0.0574) are nearly co-dominant; in District 14, NO₂ (0.0326) and O₃ (0.0292) lead, reflecting the shift from primary to secondary pollution chemistry in summer.\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eDistrict 15 remains PM₁₀-dominated (SHAP\u0026thinsp;=\u0026thinsp;0.0464), with CO and NO₂ as secondary drivers, underscoring the year-round influence of dust resuspension alongside traffic in this southern industrial zone.\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eDistrict 6 shows the strongest NO₂ signal citywide in summer (SHAP\u0026thinsp;=\u0026thinsp;0.1281), reinforcing its identity 
as a traffic- and energy-intensive central core, where combustion emissions remain the dominant forecasting signal even during high-photochemistry periods.\u003c/p\u003e\u003c/li\u003e\u003c/ul\u003e\u003c/p\u003e\u003cp\u003e\u003cb\u003eScientific and Policy Significance\u003c/b\u003e\u003c/p\u003e\u003cp\u003eThe seasonal SHAP rankings (Table\u0026nbsp;\u003cspan refid=\"Tab6\" class=\"InternalRef\"\u003e6\u003c/span\u003e) reveal pollutant\u0026ndash;district signatures that are physically interpretable and strongly consistent with Tehran\u0026rsquo;s known emission dynamics and atmospheric controls. These patterns are not statistical artifacts but empirically grounded insights:\u003c/p\u003e\u003cp\u003e\u003col\u003e\u003cspan\u003e\u003cli\u003e\u003cp\u003eThe cold-season dominance of CO and NO₂\u0026mdash;especially in Districts 2, 14, and 19\u0026mdash;mirrors Tehran\u0026rsquo;s vehicle-driven pollution regime, where mobile sources contribute\u0026thinsp;~\u0026thinsp;80\u0026ndash;85% of primary pollutants (Taghizadeh et al., \u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e2023\u003c/span\u003e). This aligns with inversion-enhanced accumulation (Nejad et al., \u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e2023\u003c/span\u003e) and ML-based findings in global megacities (Wu et al., \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e2022\u003c/span\u003e).\u003c/p\u003e\u003c/li\u003e\u003c/span\u003e\u003cspan\u003e\u003cli\u003e\u003cp\u003eThe intense warm-season O₃ sensitivity in District 19 reflects a secondary pollution hotspot, consistent with Tehran\u0026rsquo;s municipal air quality reports. 
Despite the absence of VOC measurements, the O₃ SHAP dominance signals active NOₓ\u0026ndash;VOC photochemistry under solar forcing\u0026mdash;a mechanism well-documented in ML-O₃ studies (Peng et al., \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e2022\u003c/span\u003e; Xu et al., \u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e2022\u003c/span\u003e).\u003c/p\u003e\u003c/li\u003e\u003c/span\u003e\u003cspan\u003e\u003cli\u003e\u003cp\u003eThe persistent PM₁₀ leadership in District 15 across both seasons corroborates satellite and ground evidence of dust entrainment in southeastern Tehran (Mamić et al., \u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e2023\u003c/span\u003e), validating the model\u0026rsquo;s sensitivity to non-combustion particulate sources.\u003c/p\u003e\u003c/li\u003e\u003c/span\u003e\u003c/ol\u003e\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab6\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 6\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eSeasonal SHAP Rankings\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"8\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c7\" colnum=\"7\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c8\" 
colnum=\"8\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eDistrict\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eSeason\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003ePrimary\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003eSecondary\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003eTertiary\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c6\"\u003e\u003cp\u003eQuaternary\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c7\"\u003e\u003cp\u003eQuinary\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c8\"\u003e\u003cp\u003eSenary\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003ecold\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eCO_ugm3 (0.0784)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003eO3_ugm3 (0.0203)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eNO2_ugm3 (0.0178)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eSO2_ugm3 (0.0004)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003ePM2.5_ugm3 (0.0003)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003ePM10_ugm3 (0.0001)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003ewarm\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eSO2_ugm3 (0.0817)\u003c/p\u003e\u003c/td\u003e\u003ctd 
align=\"left\" colname=\"c4\"\u003e\u003cp\u003eO3_ugm3 (0.0647)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eCO_ugm3 (0.0462)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eNO2_ugm3 (0.0333)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003ePM2.5_ugm3 (0.0237)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003ePM10_ugm3 (0.0147)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003ecold\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eCO_ugm3 (0.0755)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003eNO2_ugm3 (0.048)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eO3_ugm3 (0.0201)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003ePM10_ugm3 (0.0135)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003ePM2.5_ugm3 (0.009)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003eSO2_ugm3 (0.0074)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003ewarm\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eCO_ugm3 (0.0592)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003eO3_ugm3 (0.0574)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eNO2_ugm3 (0.0438)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003ePM10_ugm3 (0.0142)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c7\"\u003e\u003cp\u003ePM2.5_ugm3 (0.0134)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003eSO2_ugm3 (0.0044)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e6\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003ecold\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eNO2_ugm3 (0.0851)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003eCO_ugm3 (0.0416)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eO3_ugm3 (0.0294)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eSO2_ugm3 (0.0176)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003ePM10_ugm3 (0.0123)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003ePM2.5_ugm3 (0.0021)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e6\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003ewarm\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eNO2_ugm3 (0.1281)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003eCO_ugm3 (0.0392)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eO3_ugm3 (0.0387)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003ePM10_ugm3 (0.0291)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003eSO2_ugm3 (0.0193)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003ePM2.5_ugm3 (0.0136)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e14\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c2\"\u003e\u003cp\u003ecold\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eNO2_ugm3 (0.0703)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003eCO_ugm3 (0.0691)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eSO2_ugm3 (0.0254)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003ePM10_ugm3 (0.0101)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003ePM2.5_ugm3 (0.0019)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003eO3_ugm3 (0.0019)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e14\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003ewarm\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eNO2_ugm3 (0.0326)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003eO3_ugm3 (0.0292)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eCO_ugm3 (0.0163)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eSO2_ugm3 (0.0145)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003ePM10_ugm3 (0.0078)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003ePM2.5_ugm3 (0.0042)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e15\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003ecold\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003ePM10_ugm3 (0.0451)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003eCO_ugm3 (0.0419)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eNO2_ugm3 
(0.0156)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eSO2_ugm3 (0.0044)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003ePM2.5_ugm3 (0.0016)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003eO3_ugm3 (0.0016)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e15\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003ewarm\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003ePM10_ugm3 (0.0464)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003eCO_ugm3 (0.0389)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eNO2_ugm3 (0.0379)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003ePM2.5_ugm3 (0.0348)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003eO3_ugm3 (0.025)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003eSO2_ugm3 (0.0121)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e19\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003ecold\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eCO_ugm3 (0.078)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003eNO2_ugm3 (0.0446)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003ePM10_ugm3 (0.0122)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eSO2_ugm3 (0.0025)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003ePM2.5_ugm3 (0.0009)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003eO3_ugm3 
(0.0003)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e19\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003ewarm\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eO3_ugm3 (0.0999)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003eNO2_ugm3 (0.0705)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eCO_ugm3 (0.0486)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003ePM2.5_ugm3 (0.0316)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003ePM10_ugm3 (0.0315)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003eSO2_ugm3 (0.0137)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003ePolicy implications are spatially precise:\u003c/p\u003e\u003cp\u003e\u003cul\u003e\u003cli\u003e\u003cp\u003eTraffic-intensive districts (2, 14, 19): Prioritize low-emission zones, public transit expansion, and fleet electrification to reduce CO/NO₂ (Taghizadeh et al., \u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e2023\u003c/span\u003e).\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eCentral commercial core (6): Target energy efficiency in commercial buildings and traffic flow optimization to curb NO₂.\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eIndustrial cluster (15): Enforce dust suppression, paved surfaces, and stack emission controls to address PM₁₀ and SO₂.\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003ePhotochemical hotspot (19): Implement NOₓ/VOC precursor management, green infrastructure, and urban cooling strategies to mitigate ozone formation (Peng et al., \u003cspan citationid=\"CR11\" 
class=\"CitationRef\"\u003e2022\u003c/span\u003e).\u003c/p\u003e\u003c/li\u003e\u003c/ul\u003e\u003c/p\u003e\u003cp\u003eBy delivering decadal, district-level, seasonally resolved XAI, this framework transcends city-wide averages to provide a mechanistically coherent evidence base for precision environmental governance in complex megacities.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec15\" class=\"Section2\"\u003e\u003ch2\u003e3.5 Visual Analysis\u003c/h2\u003e\u003cp\u003e\u003cb\u003ePollutant Correlation Matrices\u003c/b\u003e\u003c/p\u003e\u003cp\u003eFigure \u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e presents correlation matrices for six strategically selected districts of Tehran, illustrating pairwise relationships between key pollutants (PM₂.₅, PM₁₀, NO₂, CO, SO₂, O₃). The color gradient (red for strong positive correlations, blue for negative or weak correlations) reveals spatially heterogeneous pollution dynamics that reflect the socio-spatial and emission profiles of each district.\u003c/p\u003e\u003cp\u003eThe analysis reveals three primary patterns:\u003c/p\u003e\u003cp\u003e\u003cem\u003e1. Industrial and Central Urban Zones (Districts 6 and 15)\u003c/em\u003e\u003c/p\u003e\u003cp\u003e\u003cul\u003e\u003cli\u003e\u003cp\u003eDistrict 6 (Central Commercial/High-Traffic Core): This district exhibits the most complex correlation structure. It shows weak to moderate positive correlations between SO₂ and other pollutants (e.g., SO₂-NO₂ = 0.16, SO₂-CO\u0026thinsp;=\u0026thinsp;0.27), suggesting shared sources from combustion processes associated with dense urban activity rather than heavy industry. 
The strongest correlations are between PM₁₀ and PM₂.₅ (0.63) and between CO and NO₂ (0.47), reinforcing its identity as a high-traffic zone where exhaust emissions and road dust are dominant.\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eDistrict 15 (Industrial Cluster): This district displays weak to moderate positive correlations across most pollutant pairs. Notably, SO₂ shows only weak correlation with other pollutants (e.g., SO₂-NO₂ = 0.21, SO₂-CO\u0026thinsp;=\u0026thinsp;0.19), which argues against treating SO₂ as a dominant, co-emitted industrial tracer in this district. Instead, the highest correlation is between PM₁₀ and PM₂.₅ (0.58), consistent with its profile as an area dominated by dust resuspension and particulate matter from industrial and heavy-vehicle activity. The relatively weak link between SO₂ and other pollutants suggests that its emissions may be more isolated or regulated compared to other sources.\u003c/p\u003e\u003c/li\u003e\u003c/ul\u003e\u003c/p\u003e\u003cp\u003e\u003cem\u003e2. Traffic-Dominated Districts (Districts 2 and 14)\u003c/em\u003e\u003c/p\u003e\u003cp\u003e\u003cul\u003e\u003cli\u003e\u003cp\u003eDistrict 2 (High-Traffic Corridor): This district shows a moderate positive correlation between CO and NO₂ (CO-NO₂ = 0.35), consistent with vehicular traffic as the primary source. It also exhibits a high correlation between PM₁₀ and PM₂.₅ (0.61), indicating co-emission from exhaust and road dust. The correlation between NO₂ and PM₂.₅ (0.29) is weak to moderate, supporting a link between traffic emissions and fine particulate matter.\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eDistrict 14 (Commercial/Transit Hub): As in District 2, CO and NO₂ are positively correlated (0.46), as are the particulate matter fractions (PM₁₀-PM₂.₅ = 0.62). However, it also shows a notable positive correlation between SO₂ and CO (0.53), which is higher than in other districts. 
This suggests that in this transit hub, there may be a contribution from diesel-powered commercial vehicles or nearby industrial activity influencing SO₂ levels alongside traffic emissions.\u003c/p\u003e\u003c/li\u003e\u003c/ul\u003e\u003c/p\u003e\u003cp\u003e\u003cem\u003e3. Residential and Receptor Zones (Districts 1 and 19)\u003c/em\u003e\u003c/p\u003e\u003cp\u003e\u003cul\u003e\u003cli\u003e\u003cp\u003eDistrict 1 (Northern Residential/Foothill Zone): This district displays weaker overall correlations compared to others, reflecting its lower local emissions and sensitivity to regional transport. The strongest correlation is between PM₁₀ and PM₂.₅ (0.60), indicating shared sources like transported dust or secondary aerosols. O₃ is moderately negatively correlated with CO (-0.38) and with NO₂ (-0.24), which is consistent with titration in a receptor zone where O₃ can be consumed by high concentrations of NOₓ.\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eDistrict 19 (South-Southwest Receptor Basin): This district shows the weakest correlations overall, particularly for O₃, which has low to negative correlations with all other pollutants (e.g., O₃-CO = -0.28, O₃-NO₂ = -0.32). This is characteristic of a photochemical hotspot, where O₃ is not emitted directly but forms photochemically from precursor gases (NOₓ, VOCs) and can be destroyed by fresh NO emissions, leading to an inverse relationship with primary pollutants. The only comparatively strong correlation is between PM₁₀ and PM₂.₅ (0.51), again pointing to common sources of particulate matter.\u003c/p\u003e\u003c/li\u003e\u003c/ul\u003e\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003e\u003cb\u003eSHAP-Based Feature Importance\u003c/b\u003e\u003c/p\u003e\u003cp\u003eFigure \u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e presents the mean absolute SHAP values for the lag-1 features of each pollutant, ranked by their predictive importance for air quality in each of the six districts. 
This analysis reveals the dominant drivers of short-term pollution fluctuations within each district, providing a clear, interpretable signature of local emission sources and atmospheric dynamics. The findings from Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e are as follows:\u003c/p\u003e\u003cp\u003eDistrict 1 (Northern residential/foothill zone): The most influential predictor is SO₂ (Mean |SHAP| = 0.083), followed by O₃ (0.066). This indicates that emissions linked to fossil fuel combustion (e.g., power generation or industrial activity) and photochemical processes are the primary drivers of short-term variability in this district, consistent with its role as a receptor zone for regionally transported pollutants.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eDistrict 2 (High-traffic corridor): The leading contributor is CO (Mean |SHAP| = 0.060), followed closely by O₃ (0.059). This is consistent with vehicular traffic as the primary source of pollution, with CO being a direct tracer of combustion and O₃ indicating active photochemistry in this major corridor.\u003c/p\u003e\u003cp\u003eDistrict 6 (Central commercial / high-traffic core): NO₂ is the most important factor (Mean |SHAP| = 0.125), followed by CO (0.038) and O₃ (0.035). This highlights the significant role of combustion processes, chiefly dense traffic and energy use in the commercial core, which emit large quantities of NOₓ, and suggests that these emissions also contribute to secondary ozone formation.\u003c/p\u003e\u003cp\u003eDistrict 14 (Commercial/transit hub): NO₂ dominates (Mean |SHAP| = 0.043), followed by O₃ (0.034) and CO (0.023). This pattern is consistent with the district's identity as a major transit hub, where traffic-related emissions (NO₂, CO) drive air quality, with O₃ reflecting secondary pollutant formation under sunny conditions.\u003c/p\u003e\u003cp\u003eDistrict 15 (Industrial): PM₁₀ stands out as the primary driver (Mean |SHAP| = 0.045), followed by CO and NO₂ (both at 0.032). 
This is consistent with the district's profile as an industrial cluster, where dust resuspension (PM₁₀) and industrial/vehicular combustion (CO, NO₂) are key sources of pollution.\u003c/p\u003e\u003cp\u003eDistrict 19 (Residential/commercial receptor): O₃ is the most influential pollutant (Mean |SHAP| = 0.101), followed by NO₂ (0.067) and CO (0.049). This identifies District 19 as a distinct photochemical hotspot, where high levels of precursor gases (NO₂, CO) from dense traffic lead to intense ozone production, a finding supported by its topographical position and summer meteorology.\u003c/p\u003e\u003cp\u003eThese results provide a granular, data-driven basis for targeted environmental management and validate the reliability of the AI/ML models. The utility of eXplainable Artificial Intelligence (XAI) is underscored, as it transforms complex model outputs into actionable, physically interpretable insights for environmental monitoring and decision-making.\u003c/p\u003e\u003cp\u003e\u003cb\u003ePandemic Impact on Pollutant Levels\u003c/b\u003e\u003c/p\u003e\u003cp\u003eFigure \u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e presents a comparative analysis of key pollutants across six districts during three distinct timeframes: Pre-COVID (April 2015 \u0026ndash; November 2019), During-COVID (December 2019 \u0026ndash; April 2023), and Post-COVID (May 2023 \u0026ndash; April 2025). 
The data reveal a complex, district- and pollutant-specific response to the pandemic, with reductions in several primary pollutants during the pandemic period, followed by a heterogeneous recovery in the post-pandemic era.\u003c/p\u003e\u003cp\u003eTraffic-dominated districts showed contrasting responses:\u003c/p\u003e\u003cp\u003e\u003cul\u003e\u003cli\u003e\u003cp\u003eDistrict 2 (high-traffic corridor) saw its mean NO₂ levels increase from 80.4 \u0026micro;g/m\u0026sup3; (Pre) to 101.0 \u0026micro;g/m\u0026sup3; (During), a rise of approximately 25.7%. This counter-intuitive increase may reflect localized measurement effects or changes in traffic composition during restrictions. Its CO levels fell from 2719.8 \u0026micro;g/m\u0026sup3; (Pre) to 2409.7 \u0026micro;g/m\u0026sup3; (During), a reduction of 11.4%.\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eDistrict 14 (commercial/transit hub) exhibited a decrease in NO₂ from 86.1 \u0026micro;g/m\u0026sup3; (Pre) to 81.9 \u0026micro;g/m\u0026sup3; (During), and a more substantial drop in CO from 2034.8 \u0026micro;g/m\u0026sup3; (Pre) to 1707.9 \u0026micro;g/m\u0026sup3; (During), a reduction of approximately 16.1%.\u003c/p\u003e\u003c/li\u003e\u003c/ul\u003e\u003c/p\u003e\u003cp\u003eCentral and industrial districts showed varied responses:\u003c/p\u003e\u003cp\u003e\u003cul\u003e\u003cli\u003e\u003cp\u003eDistrict 6 (central commercial/high-traffic core) experienced a slight decrease in SO₂ from 18.4 \u0026micro;g/m\u0026sup3; (Pre) to 16.5 \u0026micro;g/m\u0026sup3; (During), while its PM₂.₅ levels increased from 32.3 \u0026micro;g/m\u0026sup3; (Pre) to 34.4 \u0026micro;g/m\u0026sup3; (During).\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eDistrict 15 (industrial cluster) recorded a marked drop in SO₂ from 16.7 \u0026micro;g/m\u0026sup3; (Pre) to 13.7 \u0026micro;g/m\u0026sup3; (During), consistent with reduced industrial output. 
Its CO levels decreased dramatically from 1768.5 \u0026micro;g/m\u0026sup3; (Pre) to 1182.2 \u0026micro;g/m\u0026sup3; (During), a 33.2% reduction.\u003c/p\u003e\u003c/li\u003e\u003c/ul\u003e\u003c/p\u003e\u003cp\u003eThe residential/commercial receptor basin showed declines in both gases:\u003c/p\u003e\u003cp\u003e\u003cul\u003e\u003cli\u003e\u003cp\u003eDistrict 19 (receptor basin) saw its NO₂ fall from 79.5 \u0026micro;g/m\u0026sup3; (Pre) to 73.9 \u0026micro;g/m\u0026sup3; (During), while CO dropped substantially from 2808.0 \u0026micro;g/m\u0026sup3; (Pre) to 1726.0 \u0026micro;g/m\u0026sup3; (During), a 38.5% reduction.\u003c/p\u003e\u003c/li\u003e\u003c/ul\u003e\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eHowever, the post-pandemic recovery is highly variable. In many cases, concentrations have returned to, or even exceeded, pre-pandemic levels, but the pattern is not uniform:\u003c/p\u003e\u003cp\u003e\u003cul\u003e\u003cli\u003e\u003cp\u003eDistrict 6 shows PM₂.₅ settling at 32.8 \u0026micro;g/m\u0026sup3; in the Post-COVID period, slightly above its pre-pandemic level of 32.3 \u0026micro;g/m\u0026sup3;, indicating a near-complete return to baseline after the transient increase to 34.4 \u0026micro;g/m\u0026sup3; during the pandemic period.\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eDistrict 2\u0026rsquo;s CO declined further to 2278.6 \u0026micro;g/m\u0026sup3; (Post), remaining below both its During-COVID value of 2409.7 \u0026micro;g/m\u0026sup3; and its pre-pandemic value of 2719.8 \u0026micro;g/m\u0026sup3;.\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eDistrict 19\u0026rsquo;s NO₂ has fallen further to 61.4 \u0026micro;g/m\u0026sup3; (Post), well below its pre-pandemic level, while its PM₁₀ has surged to 128.4 \u0026micro;g/m\u0026sup3; (Post), far exceeding its pre-pandemic level of 89.1 \u0026micro;g/m\u0026sup3;.\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eNotably, District 15 maintained lower SO₂ levels in the post-pandemic period (12.7 \u0026micro;g/m\u0026sup3;) compared to pre-pandemic (16.7 \u0026micro;g/m\u0026sup3;), 
suggesting potential long-term changes in industrial operations or emission controls.\u003c/p\u003e\u003c/li\u003e\u003c/ul\u003e\u003c/p\u003e\u003cp\u003eA distinct pattern emerged for ozone (O₃). While several primary pollutants declined, O₃ levels mostly remained stable or changed only slightly during the pandemic, rather than showing the pronounced increase commonly expected from reduced NOₓ titration. For instance:\u003c/p\u003e\u003cp\u003e\u003cul\u003e\u003cli\u003e\u003cp\u003eDistrict 1 (residential/foothill) saw O₃ rise slightly from 41.0 \u0026micro;g/m\u0026sup3; (Pre) to 43.7 \u0026micro;g/m\u0026sup3; (During).\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eDistrict 6 (central core) experienced a slight dip in O₃ from 42.6 \u0026micro;g/m\u0026sup3; (Pre) to 42.1 \u0026micro;g/m\u0026sup3; (During), before rising to 47.5 \u0026micro;g/m\u0026sup3; (Post).\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eDistrict 19 (receptor basin) experienced a decrease in O₃ from 36.9 \u0026micro;g/m\u0026sup3; (Pre) to 32.1 \u0026micro;g/m\u0026sup3; (During), before partially recovering to 34.2 \u0026micro;g/m\u0026sup3; in the post-pandemic period. This suggests that in this specific receptor zone, other factors (e.g., changes in precursor mix, meteorology, or background pollution) may have outweighed the simple NOₓ-titration effect.\u003c/p\u003e\u003c/li\u003e\u003c/ul\u003e\u003c/p\u003e\u003cp\u003eThese findings highlight the limited and non-uniform long-term impact of temporary interventions and underscore the need for sustained, structural policy actions beyond episodic reductions. The model\u0026rsquo;s sensitivity to these real-world perturbations supports its utility for evaluating the impact of policy measures and designing resilient, long-term air quality strategies.\u003c/p\u003e\u003cp\u003e\u003cb\u003ePrediction vs. 
Actual Trends\u003c/b\u003e\u003c/p\u003e\u003cp\u003eFigure \u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e presents a direct comparison between actual and predicted concentrations for the \u003cem\u003eleading pollutant\u003c/em\u003e in each of the six districts over the test period (January 2024\u0026ndash;April 2025). The leading pollutant for each district was determined by its highest mean absolute SHAP value across the full training set (as shown in Table\u0026nbsp;\u003cspan refid=\"Tab6\" class=\"InternalRef\"\u003e6\u003c/span\u003e and discussed in Section \u003cspan refid=\"Sec14\" class=\"InternalRef\"\u003e3.4\u003c/span\u003e), ensuring the plots reflect the model\u0026rsquo;s most influential predictive feature for each location.\u003c/p\u003e\u003cp\u003eThe visual comparison reveals strong alignment between actual and predicted values across all districts, demonstrating the model\u0026rsquo;s capacity to accurately reproduce real-world pollution dynamics on an hourly basis. 
The relatively narrow Mean Absolute Error (MAE) bands, displayed as shaded regions around the predicted line, indicate high precision and low uncertainty in the forecasts.\u003c/p\u003e\u003cp\u003eSeasonal patterns are consistently captured:\u003c/p\u003e\u003cp\u003e\u003cul\u003e\u003cli\u003e\u003cp\u003eElevated pollutant levels during the cold season (October\u0026ndash;February) are evident in most plots, reflecting the impact of winter temperature inversions and increased heating demand.\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eLower concentrations and distinct diurnal cycles are observed during the warm season (March\u0026ndash;September), consistent with enhanced atmospheric dispersion and photochemical activity.\u003c/p\u003e\u003c/li\u003e\u003c/ul\u003e\u003c/p\u003e\u003cp\u003eShort-term variability, including sharp spikes linked to traffic congestion, industrial activity, or dust events, is also well-represented, highlighting the model\u0026rsquo;s sensitivity to dynamic emission sources and temporal dependencies.\u003c/p\u003e\u003cp\u003eDistrict-level performance demonstrates robustness across Tehran\u0026rsquo;s diverse urban environments:\u003c/p\u003e\u003cp\u003e\u003cul\u003e\u003cli\u003e\u003cp\u003eDistrict 1 (Leading Pollutant: SO₂): The model accurately tracks the highly variable SO₂ concentrations, capturing both baseline fluctuations and episodic peaks. This aligns with the district\u0026rsquo;s role as a receptor zone for regionally transported pollutants, where SO₂ can be influenced by distant industrial sources.\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eDistrict 2 (Leading Pollutant: CO): Predictions for CO show excellent agreement with observations, particularly during peak traffic hours. 
The model effectively captures the daily and weekly patterns associated with vehicular emissions, confirming its sensitivity to mobile-source drivers.\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eDistrict 6 (Leading Pollutant: NO₂): The periodic nature of NO₂ emissions, primarily linked to dense urban traffic and commercial activity in central Tehran, is effectively modeled. While occasional deviations occur during sudden emission surges (e.g., from heavy traffic or local events), the predictions remain largely within the MAE band.\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eDistrict 14 (Leading Pollutant: NO₂): The model precisely predicts the NO₂ peaks associated with morning and evening rush hours, reflecting the influence of traffic congestion in this major transit hub. The close tracking of daily cycles underscores the model\u0026rsquo;s ability to capture localized, time-dependent emission patterns.\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eDistrict 15 (Leading Pollutant: PM₁₀): Predictions for PM₁₀ demonstrate reasonable accuracy, maintaining forecasts within the MAE band despite periods of abrupt change. This reflects the complex interplay of stationary industrial sources and resuspended dust, which can lead to highly variable concentrations.\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eDistrict 19 (Leading Pollutant: O₃): The model successfully captures the seasonal and diurnal patterns of ozone, including its characteristic afternoon peaks during the warm season. The strong alignment between actual and predicted O₃ levels validates the model\u0026rsquo;s ability to forecast secondary pollutant formation in this photochemical hotspot.\u003c/p\u003e\u003c/li\u003e\u003c/ul\u003e\u003c/p\u003e\u003cp\u003eModel performance metrics support these qualitative observations. 
The MAE values reported in the figure captions (ranging from 0.142 ppm for CO in District 2 to 8.378 \u0026micro;g/m\u0026sup3; for PM₁₀ in District 15) are consistent with the quantitative results presented in Table\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e. These results affirm the effectiveness of the tree-based ensemble framework in handling nonlinear relationships and complex interactions among pollutants.\u003c/p\u003e\u003cp\u003eThe framework also exhibits resilience in predicting extreme pollution episodes such as dust storms or industrial outbursts. While minor deviations occur during very sharp spikes, most predictions remain within the uncertainty bounds, enabling users to gauge confidence levels in real-time applications. These findings validate the predictive power of our interpretable machine learning framework and its suitability for deployment in real-world air quality management systems.\u003c/p\u003e\u003c/div\u003e"},{"header":"4. Discussion","content":"\u003cp\u003eThis study introduces a transparent, reproducible, and highly interpretable machine learning framework for district-level air quality assessment in Tehran, Iran, grounded in a decade of high-resolution hourly monitoring data (April 2015\u0026ndash;April 2025). A central contribution of this work is its exceptional spatiotemporal granularity in source attribution\u0026mdash;offering a nuanced picture of urban pollution dynamics that transcends city-wide averages. 
Through SHAP-based explainable AI, we uncover district-specific and seasonally dynamic pollution drivers that closely align with Tehran\u0026rsquo;s well-documented emission landscape and atmospheric behavior.\u003c/p\u003e\u003cp\u003eThe central commercial core (District 6) is consistently dominated by NO₂, reflecting intense vehicular and energy-related combustion in Tehran\u0026rsquo;s dense administrative hub, while the industrial cluster (District 15) is uniquely characterized by PM₁₀ dominance, attributable to dust resuspension and industrial activity\u0026mdash;patterns well established in Tehran\u0026rsquo;s environmental literature. Major traffic corridors (District 2) and transit-influenced residential zones (District 14) exhibit CO and NO₂ as primary predictive features year-round, consistent with their roles as key commuter arteries. In contrast, the northern residential/foothill zone (District 1) and the south-southwest receptor basin (District 19) display distinct photochemical regimes: SO₂ (in D1) and O₃ (in D19) emerge as dominant warm-season drivers, while CO prevails in winter\u0026mdash;directly mirroring seasonal shifts in precursor availability, inversion frequency, and atmospheric stability under Tehran\u0026rsquo;s semi-arid climate.\u003c/p\u003e\u003cp\u003eThese patterns are not statistical artifacts but empirically grounded signatures of Tehran\u0026rsquo;s complex airshed. They corroborate documented evidence on industrial clustering, transportation emissions, and meteorological controls (Taghizadeh et al., \u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e2023\u003c/span\u003e). 
Furthermore, our identification of winter inversion effects\u0026mdash;manifested in elevated SHAP importance for CO and NO₂ during cold months\u0026mdash;is strongly supported by recent Tehran-focused studies that document sharp pollutant entrapment under stable atmospheric conditions (Nejad et al., \u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e2023\u003c/span\u003e). This alignment underscores the capacity of interpretable machine learning to recover physically meaningful mechanisms that resonate with the broader environmental science literature.\u003c/p\u003e\u003cp\u003eThe framework also demonstrates high sensitivity to real-world environmental interventions. The data show CO reductions of 11.4\u0026ndash;38.5% in Districts 2, 14, 15, and 19 during the pandemic period, alongside smaller and less uniform changes in NO₂ (including an increase in District 2) and PM₂.₅, followed by a heterogeneous post-pandemic recovery (Section 3.5). This temporal responsiveness highlights the model\u0026rsquo;s dual utility\u0026mdash;not only for forecasting, but also for evaluating the effectiveness of environmental policies and informing resilient, long-term air quality strategies (Le et al., \u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e2020\u003c/span\u003e). 
These findings echo regional and global observations that machine learning\u0026ndash;based spatiotemporal frameworks can reliably track high-resolution pollutant responses to both anthropogenic disruptions and policy actions (Wu et al., \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e2022\u003c/span\u003e; Lin et al., \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e2022\u003c/span\u003e; Wang et al., \u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e2024\u003c/span\u003e), reinforcing the generalizability of our approach across diverse urban contexts.\u003c/p\u003e\u003cp\u003eA key limitation of this study is the absence of explicit meteorological covariates\u0026mdash;such as temperature, wind speed, or relative humidity\u0026mdash;in the input data. This constraint stems from the design of Tehran\u0026rsquo;s existing monitoring infrastructure, which focuses on pollutant concentrations rather than co-located meteorological measurements. Importantly, however, our pipeline implicitly captures many meteorologically driven phenomena through lagged pollutant features and seasonal stratification. For instance, the elevated SHAP importance of CO and NO₂ in winter accurately reflects inversion-related accumulation, even without direct weather inputs. 
Despite this data limitation, the models deliver robust predictive performance across pollutants, districts, and seasons\u0026mdash;demonstrating remarkable adaptability under real-world constraints.\u003c/p\u003e\u003cp\u003eThis performance is consistent with findings from other megacities like Beijing, Delhi, and Kuala Lumpur, where high-performing ML systems often rely primarily on pollutant time-series structure rather than external covariates (Xu et al., \u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e2022\u003c/span\u003e; Masood et al., \u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e2023\u003c/span\u003e; Ke et al., \u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e2022\u003c/span\u003e). While future integration of meteorological data is expected to further refine predictions\u0026mdash;especially for secondary pollutants like O₃\u0026mdash;the current results affirm the practical utility of pollutant-only frameworks in regions with sparse meteorological monitoring (Mamić et al., \u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e2023\u003c/span\u003e). This is particularly relevant for low- and middle-income countries where air quality monitoring is often prioritized over comprehensive weather instrumentation.\u003c/p\u003e\u003cp\u003eBy leveraging an ensemble of state-of-the-art tree-based models\u0026mdash;XGBoost, CatBoost, and LightGBM\u0026mdash;we achieve exceptional predictive accuracy (MAE: 3.03\u0026ndash;9.02 \u0026micro;g/m\u0026sup3;; R\u0026sup2;: 0.65\u0026ndash;0.91) and reliable AQI classification (F1-score: 0.894\u0026ndash;0.937) without meteorological inputs. This demonstrates that high-impact, policy-actionable forecasting is feasible even in data-constrained settings\u0026mdash;a critical insight for megacities across the Global South. 
Our results align with prior Tehran-based work showing that pollutant-only models can extract meaningful signals when the modeling architecture and feature engineering are robust (Ghaemi et al., \u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2018\u003c/span\u003e), and with global studies affirming that interpretable, structured models can match or outperform complex deep learning systems when covariate data are limited (Peng et al., \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e2022\u003c/span\u003e; Zaini et al., \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e2022\u003c/span\u003e; Mamić et al., \u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e2023\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eAlthough this study does not incorporate individual-level epidemiological records, the U.S. EPA Air Quality Index (AQI) provides a scientifically validated framework for estimating population-level health risk exposure. AQI categories map directly to defined public health guidance: \u0026lsquo;Good\u0026rsquo; (0\u0026ndash;50) poses little or no risk; \u0026lsquo;Moderate\u0026rsquo; (51\u0026ndash;100) may affect unusually sensitive individuals; \u0026lsquo;Unhealthy for Sensitive Groups\u0026rsquo; (101\u0026ndash;150) can impact children, the elderly, and those with respiratory or cardiovascular conditions; \u0026lsquo;Unhealthy\u0026rsquo; (151\u0026ndash;200) is harmful to the general population; \u0026lsquo;Very Unhealthy\u0026rsquo; (201\u0026ndash;300) triggers serious health warnings; and \u0026lsquo;Hazardous\u0026rsquo; (\u0026gt;\u0026thinsp;300) represents emergency conditions posing severe risk to all (U.S. EPA, 2020). 
Our district-level AQI forecasts\u0026mdash;spanning from Good (min\u0026thinsp;=\u0026thinsp;11.3) to Hazardous (max\u0026thinsp;=\u0026thinsp;313.7)\u0026mdash;thus deliver actionable, spatially resolved estimates of relative exposure burden, particularly for vulnerable subpopulations in high-risk zones like District 19.\u003c/p\u003e\u003cp\u003eOur modular, reproducible pipeline is designed for direct scalability to other cities facing similar challenges\u0026mdash;from Delhi to Cairo\u0026mdash;where pollutant-only monitoring is routine but meteorological data remain scarce. The integration of high accuracy, full interpretability, and regulatory-aligned outputs (U.S. EPA AQI) ensures immediate policy relevance. Importantly, this work challenges the common assumption that forecasting quality depends on model complexity. Instead, it shows that transparent, explainable architectures can deliver superior utility for environmental governance (Peng et al., \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e2022\u003c/span\u003e; Wu et al., \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e2022\u003c/span\u003e). Structured boosting models, when paired with XAI, can match or exceed the performance of hybrid or optimization-driven systems\u0026mdash;such as ELM\u0026ndash;metaheuristic models or spatiotemporal attention networks\u0026mdash;while preserving transparency and decision-making clarity (Masood et al., \u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e2023\u003c/span\u003e; Wang et al., \u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e2024\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eIn sum, this research moves beyond technical prediction to offer a data-driven, district-aware, and seasonally informed blueprint for urban air quality governance. 
It demonstrates that even with limited input data, a thoughtfully designed, interpretable ML framework can yield scientifically rigorous and policy-relevant insights\u0026mdash;paving the way for healthier, more sustainable cities in Tehran and beyond. By anchoring our findings in both Iran\u0026rsquo;s environmental reality and the evolving global discourse on ML for air quality, this study provides a robust foundation for future integration of meteorology, remote sensing, and causal inference in next-generation urban environmental intelligence. Crucially, it affirms that the path to equitable, high-impact environmental monitoring does not require perfect data\u0026mdash;but rather intelligent, context-sensitive design grounded in real-world constraints and governance needs.\u003c/p\u003e"},{"header":"5. Conclusion","content":"\u003cp\u003eThis study presents a scalable, interpretable, and high-fidelity machine learning framework for district-level air quality forecasting in data-constrained megacities. Drawing on a decade of high-resolution, multi-pollutant monitoring data from six socio-spatially stratified districts of Tehran, we demonstrate that tree-based ensemble models\u0026mdash;augmented with SHAP-based explainability\u0026mdash;can achieve exceptional predictive accuracy without relying on meteorological covariates. Critically, the framework goes beyond mere forecasting: it reveals seasonally dynamic, district-specific pollution regimes that reflect the interplay of local emission sources, topographic trapping, and atmospheric processes.\u003c/p\u003e\u003cp\u003eRather than offering another opaque predictive system, this work delivers a transparent, interpretable, and policy-actionable blueprint for urban environmental governance. 
Its demonstrated sensitivity to real-world perturbations, such as the 11.4\u0026ndash;38.5% reductions in CO observed in several districts during the pandemic period, and its compatibility with existing monitoring infrastructure make it readily deployable in cities across the Global South, where meteorological data are often unavailable but air quality governance is urgently needed. Our findings affirm a key methodological principle: high-impact environmental intelligence depends not on data abundance, but on context-aware design grounded in real-world monitoring constraints.\u003c/p\u003e\u003cp\u003eBy transforming ten years of Tehran\u0026rsquo;s air quality records into a reproducible, open, and interpretable pipeline, we provide more than a forecasting tool\u0026mdash;we establish a foundation for precision air quality management, where interventions are tailored to the unique spatiotemporal pollution fingerprint of each urban neighborhood. In doing so, this study advances Tehran\u0026rsquo;s capacity for evidence-based environmental action and offers a globally transferable model for equitable, data-driven urban resilience in the face of escalating air pollution challenges.\u003c/p\u003e"},{"header":"Declarations","content":"\u003ch2\u003eData Availability:\u003c/h2\u003e\u003cp\u003eThe datasets analyzed in this study were obtained from the Tehran Air Quality Monitoring System (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://air.tehran.ir\u003c/span\u003e\u003cspan address=\"http://air.tehran.ir\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e), operated by the Tehran Environmental Monitoring Center, subject to the provider\u0026rsquo;s terms. Due to regional network restrictions, this website is most likely accessible only within Iran. 
The data used in this study are available from the corresponding author upon reasonable request.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eChen T, Guestrin C (2016) XGBoost: A scalable tree boosting system. In \u003cem\u003eProceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining\u003c/em\u003e (pp. 785\u0026ndash;794). Association for Computing Machinery. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1145/2939672.2939785\u003c/span\u003e\u003cspan address=\"10.1145/2939672.2939785\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGhaemi Z, Alimohammadi A, Farnaghi M (2018) LaSVM-based big data learning system for dynamic prediction of air pollution in Tehran. Environ Monit Assess 190 Article 300. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/s10661-018-6659-6\u003c/span\u003e\u003cspan address=\"10.1007/s10661-018-6659-6\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKe H, Gong S, He J, Zhang L, Mo J (2022) A hybrid XGBoost-SMOTE model for optimization of operational air quality numerical model forecasts. Front Environ Sci 10:1007530. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3389/fenvs.2022.1007530\u003c/span\u003e\u003cspan address=\"10.3389/fenvs.2022.1007530\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLe T, Wang Y, Liu L, Yang J, Yung YL, Li G, Seinfeld JH (2020) Unexpected air pollution with marked emission reductions during the COVID-19 outbreak in China. Science 369(6504):702\u0026ndash;706. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1126/science.abb7431\u003c/span\u003e\u003cspan address=\"10.1126/science.abb7431\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLelieveld J, Evans JS, Fnais M, Giannadaki D, Pozzer A (2015) The contribution of outdoor air pollution sources to premature mortality on a global scale. Nature 525(7569):367\u0026ndash;371. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/nature15371\u003c/span\u003e\u003cspan address=\"10.1038/nature15371\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLin S, Zhao J, Li J et al (2022) A spatial\u0026ndash;temporal causal convolution network framework for accurate and fine-grained PM₂.₅ concentration prediction. Entropy 24(8):1125. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3390/e24081125\u003c/span\u003e\u003cspan address=\"10.3390/e24081125\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLundberg SM, Lee SI (2017) A unified approach to interpreting model predictions. Adv Neural Inf Process Syst 30:4765\u0026ndash;4774. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://proceedings.neurips.cc/paper/2017/file/8a20a8621978632d76c43dfd28b67767-Paper.pdf\u003c/span\u003e\u003cspan address=\"https://proceedings.neurips.cc/paper/2017/file/8a20a8621978632d76c43dfd28b67767-Paper.pdf\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMamić L, Gašparović M, Kaplan G (2023) Developing PM₂.₅ and PM₁₀ prediction models on a national and regional scale using open-source remote sensing data. Environ Monit Assess 195:644. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/s10661-023-11212-x\u003c/span\u003e\u003cspan address=\"10.1007/s10661-023-11212-x\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMasood A, Hameed MM, Srivastava A, Pham QB, Ahmad K, Razali SFM, Baowidan SA (2023) Improving PM₂.₅ prediction in New Delhi using a hybrid extreme learning machine coupled with snake optimization algorithm. Sci Rep 13:21057. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/s41598-023-47492-z\u003c/span\u003e\u003cspan address=\"10.1038/s41598-023-47492-z\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eNejad MT, Ghalehteimouri KJ, Talkhabi H, Dolatshahi Z (2023) The relationship between atmospheric temperature inversion and urban air pollution characteristics: A case study of Tehran, Iran. Discover Environ 1:17. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/s44274-023-00018-w\u003c/span\u003e\u003cspan address=\"10.1007/s44274-023-00018-w\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePeng J, Han H, Yi Y, Huang H, Xie L (2022) Machine learning and deep learning modeling and simulation for predicting PM₂.₅ concentrations. Chemosphere 308:136353. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.chemosphere.2022.136353\u003c/span\u003e\u003cspan address=\"10.1016/j.chemosphere.2022.136353\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePope CA III, Dockery DW (2006) Health effects of fine particulate air pollution: Lines that connect. J Air Waste Manag Assoc 56(6):709\u0026ndash;742. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1080/10473289.2006.10464485\u003c/span\u003e\u003cspan address=\"10.1080/10473289.2006.10464485\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eShi X, Brasseur GP (2020) The response in air quality to the reduction of Chinese economic activities during the COVID-19 outbreak. Geophys Res Lett 47(11):e2020GL088070. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1029/2020GL088070\u003c/span\u003e\u003cspan address=\"10.1029/2020GL088070\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eTaghizadeh F, Mokhtarani B, Rahmanian N (2023) Air pollution in Iran: The current status and potential solutions. Environ Monit Assess 195:737. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/s10661-023-11296-5\u003c/span\u003e\u003cspan address=\"10.1007/s10661-023-11296-5\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eU.S. Environmental Protection Agency (2020) \u003cem\u003eTechnical assistance document for the reporting of daily air quality\u0026mdash;Air Quality Index (AQI)\u003c/em\u003e (EPA-454/B-20-005). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.epa.gov/sites/default/files/2020-09/documents/aqi-technical-assistance-document-sept2020.pdf\u003c/span\u003e\u003cspan address=\"https://www.epa.gov/sites/default/files/2020-09/documents/aqi-technical-assistance-document-sept2020.pdf\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWang Y, Tian S, Zhang P (2024) Novel spatio-temporal attention causal convolutional neural network for multi-site PM₂.₅ prediction. Front Environ Sci 12:1408370. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3389/fenvs.2024.1408370\u003c/span\u003e\u003cspan address=\"10.3389/fenvs.2024.1408370\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWorld Health Organization (2021) \u003cem\u003eWHO global air quality guidelines: Particulate matter (PM2.5 and PM10), ozone, nitrogen dioxide, sulfur dioxide and carbon monoxide\u003c/em\u003e. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.who.int/publications/i/item/9789240034228\u003c/span\u003e\u003cspan address=\"https://www.who.int/publications/i/item/9789240034228\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWu Y, Lin S, Shi K, Ye Z, Fang Y (2022) Seasonal prediction of daily PM₂.₅ concentrations with interpretable machine learning: a case study of Beijing, China. Environ Sci Pollut Res Int 29(30):45821\u0026ndash;45836. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/s11356-022-18913-9\u003c/span\u003e\u003cspan address=\"10.1007/s11356-022-18913-9\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eXu S, Li W, Zhu Y, Xu A (2022) A novel hybrid model for six main pollutant concentrations forecasting based on improved LSTM neural networks. Sci Rep 12:14434. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/s41598-022-17754-3\u003c/span\u003e\u003cspan address=\"10.1038/s41598-022-17754-3\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eZaini N, Ean LW, Ahmed AN, Abdul Malek M, Chow MF (2022) PM₂.₅ forecasting for an urban area based on deep learning and decomposition method. Sci Rep 12:17565. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/s41598-022-21769-1\u003c/span\u003e\u003cspan address=\"10.1038/s41598-022-21769-1\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":true,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":true,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"Air Pollution, Machine Learning, Explainable Artificial Intelligence (XAI), Urban Environment.","lastPublishedDoi":"10.21203/rs.3.rs-8291122/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-8291122/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"","manuscriptTitle":"Assessing Air Quality in Tehran: Explainable Artificial Intelligence for Megacities","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-12-10 08:55:35","doi":"10.21203/rs.3.rs-8291122/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"be7b96a1-5eea-4ce4-9f00-1557de94b70e","owner":[],"postedDate":"December 10th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[{"id":59184197,"name":"Artificial Intelligence and Machine Learning"},{"id":59184198,"name":"Environmental Engineering"}],"tags":[],"updatedAt":"2025-12-10T08:55:35+00:00","versionOfRecord":[],"versionCreatedAt":"2025-12-10 08:55:35","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-8291122","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-8291122","identity":"rs-8291122","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}

