A dual-objective QSAR framework integrating quantum reactivity and structural descriptors for predicting antifungal activity and honey bee toxicity of pesticide mixtures

Research Square · Research Article

Chia Ming Chang, Tien-Cheng Liu

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-9203836/v1
This work is licensed under a CC BY 4.0 License
Version 1 posted

Abstract

This study presents a unified descriptor-based QSAR framework for analyzing pesticide mixtures from the dual perspective of antifungal efficacy and honey bee toxicity. Rather than developing independent predictive models optimized for individual endpoints, the framework is designed to enable cross-endpoint comparison within a consistent chemical representation space, thereby supporting selectivity-oriented interpretation.
Two conceptually linked but independently developed QSAR models were constructed to describe (i) antifungal activity against Macrophomina phaseolina and (ii) acute contact toxicity toward Apis mellifera. Molecular representation was achieved by integrating quantum reactivity descriptors (QRDs), derived from density functional theory (DFT), with a curated subset of structural descriptors (CaPS) calculated using the PaDEL platform. Mixture properties were encoded using a composition-weighted descriptor scheme, and both endpoints were modeled using an identical genetic algorithm–multiple linear regression (GA–MLR) workflow to ensure methodological consistency at the descriptor level. The results reveal clear divergence in descriptor contributions across endpoints. Antifungal efficacy is primarily associated with structural features such as molecular size, polarity, and topological organization, whereas honey bee toxicity is more strongly governed by electronic reactivity, charge-transfer behavior, and polarity-dependent transport properties. This distinction indicates that efficacy and non-target toxicity are not intrinsically coupled, and that a descriptor-level selectivity window may be identified.

Keywords: Pesticide mixtures; Antifungal activity; Honey bee toxicity; Unified descriptor framework; Selectivity; Quantum reactivity descriptors

1. Introduction

Modern crop protection is increasingly shaped by a dual imperative: maintaining effective control of phytopathogens while minimizing unintended harm to beneficial non-target organisms. This tension is particularly evident in the context of fungal disease management and pollinator protection.
Macrophomina phaseolina is a globally distributed, soil-borne pathogen with a remarkably broad host range and strong persistence under field conditions, and disease severity is typically amplified under hot and water-limited environments, making management especially challenging in drought-prone production systems (Shirai and Eulgem 2023). At the same time, pesticide exposure is now recognized as an important driver of adverse effects on non-target organisms across biological scales, including both managed and wild bees, while landscape-scale evidence indicates that pesticide use can alter bee occurrence, colony performance, and community structure over large spatial extents (Raine and Rundlöf 2024; Nicholson et al. 2024). One practical response to declining fungicide efficacy and increasing resistance pressure has been the use of pesticide mixtures (De Mio et al. 2024 ; Ballu et al. 2024). In crop protection, mixtures are commonly deployed to broaden activity spectra, improve consistency of disease control, and delay resistance evolution. However, the same mixture-based strategy that may improve efficacy against target fungi also complicates ecological safety assessment, because pollinators are rarely exposed to single active ingredients in isolation (Topping et al. 2024). For bees, pesticide mixtures can generate additive, antagonistic, or synergistic effects, and some of the most concerning interactions involve fungicide-associated disruption of detoxification pathways, particularly cytochrome P450 (CYP)-mediated processes (Migdał et al. 2024 ; Rondeau and Raine 2024 ). Recent evidence further suggests that concentration addition provides a robust baseline expectation for many acute bee-mixture datasets, whereas deviations from additivity, although less frequent, are mechanistically informative and therefore highly relevant for risk interpretation (Schuhmann et al. 2022 ; Taenzler et al. 2023 ). 
These parallel challenges expose a major limitation of conventional experimental efficacy and toxicology testing: the combinatorial chemical space expands much faster than it can be evaluated empirically (Komprda et al. 2026). This problem is especially acute for binary and higher-order mixtures, where exhaustive laboratory testing rapidly becomes impractical (Wang et al. 2018). Quantitative structure–activity relationship (QSAR) modeling therefore offers an attractive route for prioritization, screening, and mechanistic interpretation. In the regulatory context, the Organisation for Economic Co-operation and Development (OECD) has long emphasized that credible QSAR models should be associated with a defined endpoint, an unambiguous algorithm, a defined applicability domain, appropriate measures of goodness-of-fit, robustness, and predictivity, and, where possible, a mechanistic interpretation. More recently, the second edition of the OECD (Q)SAR Assessment Framework (QAF) has strengthened guidance for evaluating model confidence, transparency, and regulatory usability (Framework 2024; Gissi et al. 2024). While QSAR methods are now widely used for many single-compound endpoints, mixture-oriented applications remain relatively limited, especially for endpoints directly related to pesticide selectivity (Evangelista and Papa 2025; Gustavsson et al. 2023). Existing research has demonstrated that the acute contact toxicity of binary organic mixtures toward Apis mellifera can be successfully modeled using QSAR methods, showing the feasibility of toxicological prediction for pollinator exposure to mixtures (Carnesecchi et al. 2020). Similarly, the antifungal activity of binary fungicide mixtures against M. phaseolina has recently been explored using data-driven QSAR and read-across strategies (Rahimi-Soujeh et al. 2024).
However, a more generalized, selectivity-oriented framework, one that can directly compare the structural determinants of "desired pesticide efficacy" and "undesired pollinator toxicity", has not yet been fully developed (Devillers 2014; Kar, Amin, and Li 2025). Such a framework is needed if pesticide design is to move beyond simply optimizing potency and toward a more clearly defined potency-toxicity tradeoff at the molecular level (Guo et al. 2025; Wang et al. 2025). Although QSAR models have been successfully applied to predict individual endpoints such as antifungal efficacy or bee toxicity, these models are typically built independently and therefore provide only limited insight into cross-endpoint relationships. In real-world agricultural contexts, the application of pesticide mixtures aims not only to maximize control over target pathogens but also, ideally, to minimize adverse effects on beneficial non-target organisms. However, the lack of an integrative approach makes it difficult to compare potency and toxicity on a common basis, especially when different endpoints use different sets of descriptors, modeling strategies, and datasets. As a result, the descriptor hierarchy governing the potency-toxicity tradeoff remains insufficiently explored. To overcome this limitation, this study proposes a unified descriptor framework that allows cross-endpoint comparisons to be conducted within a consistent chemical representation space. Two independent but conceptually interconnected QSAR models were established for binary pesticide mixtures: one describing antifungal activity against Macrophomina phaseolina, and the other describing acute contact toxicity toward Apis mellifera. Instead of forcing these two endpoints into a single response space, we modeled them separately and then compared them at the level of descriptor contributions.
To achieve this, we integrated quantum reactivity descriptors calculated using density functional theory (DFT) with a subset of structural descriptors calculated and screened using the PaDEL platform, and extended these descriptors to the mixture level through composition-weighted integration. This strategy aims to capture complementary information such as electronic structure, polarity, topology, and mixture composition within a single interpretable chemical information framework. The goal of this study is not to outperform existing endpoint-specific QSAR models, but to identify transferable descriptor patterns and mechanistic divergences to support selectivity-oriented pesticide mixture design (Sharma, Ranjan, and Chakraborty 2024; Yap 2011). Although numerous QSAR studies have reported high predictive performance using large descriptor pools and nonlinear machine learning approaches, such models are often optimized for single endpoints and may offer limited support for cross-endpoint interpretation. In pesticide mixture assessment, however, the key question is not only how accurately one endpoint can be predicted in isolation, but also how efficacy and non-target toxicity can be examined within a common analytical framework. For this reason, the present study adopts a unified descriptor strategy that prioritizes interpretability and cross-endpoint comparability over performance maximization alone. This design enables identification of descriptor-level patterns associated with efficacy–toxicity trade-offs that are not readily accessible from independently optimized endpoint-specific models.

2. Materials and Methods

2.1 Data Sources and Study Design

This study used two independent but complementary datasets to establish a unified descriptor framework for pesticide mixtures.
The first dataset described the antifungal activity of binary fungicide mixtures against the plant pathogenic fungus Macrophomina phaseolina; the second dataset described the acute contact toxicity of binary organic mixtures toward honey bees (Apis mellifera). Because the two datasets differed substantially in biological targets, endpoint definitions, and mixture design strategies, they were not merged into a single response matrix. Instead, this study treated them as complementary chemical spaces representing potency and non-target toxicity, respectively. QSAR models were built for both endpoints and then compared based on descriptor contributions and mechanistic trends. This design allows comparisons to be performed at the descriptor level, rather than directly at the compound or mixture level, thus providing a molecular characterization framework for assessing pesticide selectivity (Carnesecchi et al. 2020; Rahimi-Soujeh et al. 2024).

2.2 Antifungal Activity Dataset

The antifungal dataset was drawn from previously published experimental studies evaluating the inhibitory activity of binary fungicide mixtures against Macrophomina phaseolina. This pathogen is a soil- and seed-borne fungus of significant agricultural importance due to its wide host range and high environmental persistence, infecting a variety of economically valuable crops and causing severe yield reductions (Shirai and Eulgem 2023; Marquez et al. 2021). The original study investigated six widely used fungicides: propiconazole, difenoconazole, tebuconazole, thiophanate-methyl, iprodione, and kresoxim-methyl. These fungicides were combined pairwise to produce fifteen binary mixture systems. The mixtures were evaluated using a fixed-ratio ray design (FRRD), in which each binary mixture is tested along rays of fixed component ratio in the bioactivity assays (Rahimi-Soujeh et al. 2024). Antifungal activity was determined using the poisoned-food technique.
Different concentrations of a single fungicide or of the mixtures were added to potato dextrose agar (PDA), and 6 mm mycelial plugs obtained from 5-day-old Macrophomina phaseolina colonies were placed in the center of the Petri dish. After incubation at 25°C for 7 days, the colony diameter was measured and the percentage of mycelial growth inhibition was calculated. The dose-response curve was then fitted using nonlinear regression to determine the half-maximal effective concentration (EC₅₀, mg L⁻¹) of each mixture. To facilitate subsequent modeling, EC₅₀ values were converted to their negative logarithmic form (pEC₅₀ = −log EC₅₀) to reduce skewness and improve regression stability (Rahimi-Soujeh et al. 2024).

2.3 Honey bee Toxicity Dataset

The second dataset consists of acute contact toxicity data for binary organic mixtures tested on honey bees (Apis mellifera). These data were taken from the EFSA Mixture Toxicity Database, which integrates multiple laboratory studies reporting the combined toxicity of pesticide mixtures under acute contact exposure conditions (Carnesecchi et al. 2020; Authority et al. 2023). The toxicity endpoint used in this dataset is the median lethal dose (LD₅₀-mix), defined as the mixture dose that causes 50% mortality 24 hours after exposure. To improve the statistical distribution of the response variable in regression modeling, the LD₅₀-mix value was converted to its negative logarithmic form (pLD₅₀-mix) and used as the response variable in the bee toxicity model. To ensure comparability across datasets, all chemical structures were standardized before calculating the descriptors, and both biological endpoints were represented in logarithmic form (Carnesecchi et al. 2020).

2.4 Quantum Chemical Calculations and QRD Descriptor Derivation

Quantum chemical calculations were performed to establish a set of descriptors characterizing electronic structure, reactivity, electrostatics, and solvation behavior.
All calculations were performed using Gaussian 16 (Revision C.01) (Frisch et al. 2016). The initial molecular structures were taken from the PubChem 3D SDF archive and geometry-optimized at the M06-2X/6-31+G(d,p) level of theory. Solvent effects were treated with the SMD continuum solvation model, using water as the solvent (Zhao and Truhlar 2008; Marenich, Cramer, and Truhlar 2009). Vibrational frequencies were calculated at the same level of theory to confirm that each optimized structure is a true minimum, and to obtain thermal corrections to the Gibbs free energy at 298.15 K. A standard state correction of RT ln(24.46) (1.89 kcal mol⁻¹) was added to convert from the 1 atm to the 1 M standard state. The ionization potential (IP) and electron affinity (EA) were calculated using the ΔSCF method:

\(IP = E(N-1) - E(N), \quad EA = E(N) - E(N+1)\)

where E(N), E(N−1), and E(N+1) represent the total electronic energies of the neutral molecule, cation, and anion, respectively. Based on these quantities, conceptual density functional theory (cDFT) descriptors can be derived (Geerlings et al. 2020; Yang and Parr 1985). The chemical potential (µ), global hardness (η), and global softness (S) are calculated as follows:

\(\mu = -\frac{IP + EA}{2}, \quad \eta = IP - EA, \quad S = \frac{1}{\eta}\)

These descriptors characterize the overall electron-accepting and electron-donating tendency of a molecule, as well as its resistance to charge transfer. To further resolve directional reactivity, the chemical potential is decomposed into acceptor and donor components (µ⁺, µ⁻), representing electron-accepting and electron-donating capabilities, respectively (Gázquez, Cedillo, and Vela 2007; Chattaraj, Maiti, and Sarkar 2003). Local reactivity is described using the Fukui function and condensed local softness indices (s⁺, s⁻), which quantify the susceptibility of specific atomic sites to nucleophilic or electrophilic attack (Geerlings et al. 2020; Sharma, Ranjan, and Chakraborty 2024).
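The global descriptors defined above follow directly from three total energies. The Python sketch below is only an illustration of that arithmetic: the function and class names are hypothetical, the energies in the usage example are invented placeholder numbers rather than values from this study, and any consistent energy unit can be used.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class GlobalReactivity:
    ip: float        # ionization potential, E(N-1) - E(N)
    ea: float        # electron affinity, E(N) - E(N+1)
    mu: float        # chemical potential, -(IP + EA) / 2
    eta: float       # global hardness, IP - EA (convention used in the text)
    softness: float  # global softness, 1 / eta


def global_reactivity(e_neutral: float, e_cation: float, e_anion: float) -> GlobalReactivity:
    """Delta-SCF descriptors from the energies of the N, N-1 and N+1 electron systems."""
    ip = e_cation - e_neutral
    ea = e_neutral - e_anion
    eta = ip - ea
    if eta <= 0.0:
        raise ValueError("hardness must be positive; check the input energies")
    return GlobalReactivity(ip=ip, ea=ea, mu=-(ip + ea) / 2.0, eta=eta, softness=1.0 / eta)


# Placeholder energies (arbitrary units), chosen only to make the arithmetic easy to follow.
example = global_reactivity(e_neutral=-100.0, e_cation=-99.7, e_anion=-100.05)
print(example)
```

With these placeholder inputs, IP is about 0.30 and EA about 0.05, so µ is about −0.175, η about 0.25, and S about 4 in the corresponding inverse unit.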
Electrostatic properties were described by the molecular electrostatic potential:

\(V(\mathbf{r}) = \sum_{A} \frac{Z_{A}}{|\mathbf{r} - \mathbf{R}_{A}|} - \int \frac{\rho(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\, d\mathbf{r}'\)

with the maximum surface potential (V_max) characterizing the most electrophilic region (Politzer, Murray, and Concha 2002). Solvation and size effects are represented by the solvation Gibbs free energy (ΔG) and molecular volume (MV). These quantities collectively constitute the quantum reactivity descriptor (QRD) set, providing a physically meaningful electronic descriptor layer for the unified framework.

2.5 Structural Descriptor Calculation and CaPS Screening

Two-dimensional molecular descriptors were calculated using PaDEL software (Yap 2011). Zero-variance descriptors were removed, and redundancy was reduced by excluding highly correlated variables (|r| ≥ 0.90). Multicollinearity was further controlled using the variance inflation factor criterion (VIF ≤ 5) (Gramatica 2007; Gramatica et al. 2013; Pratim Roy et al. 2009). The final screened PaDEL subset (CaPS) was designed to retain descriptors with interpretable structural and physicochemical meaning, including those related to lipophilicity, polar surface features, molecular size, electron distribution, polarizability, topological autocorrelation, and hydrogen bond acceptor ability. In this framework, CaPS descriptors provide a structural layer for molecular characterization and complement the electronic reactivity information captured by the QRDs.

2.6 Mixture Descriptor Calculation

Compound-level descriptors were converted to mixture-level descriptors using a composition-weighted representation.
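A minimal Python sketch of such a composition-weighted encoding is given here, assuming that the weights are mole fractions summing to one and that each component carries the same ordered list of descriptors. The function name and the numbers in the example are illustrative only and are not taken from the study.

```python
import numpy as np


def mixture_descriptors(fractions, descriptors):
    """Return the mole-fraction-weighted sum of component descriptor vectors.

    fractions   : length-k sequence of mole fractions (must sum to 1)
    descriptors : k x m array, one row of m descriptors per component
    """
    x = np.asarray(fractions, dtype=float)
    d = np.asarray(descriptors, dtype=float)
    if d.ndim != 2 or x.shape != (d.shape[0],):
        raise ValueError("need one mole fraction per descriptor row")
    if np.any(x < 0) or not np.isclose(x.sum(), 1.0):
        raise ValueError("mole fractions must be non-negative and sum to 1")
    return x @ d  # weighted sum over components, one value per descriptor


# Hypothetical binary mixture (25:75) with two descriptors per component.
d_mix = mixture_descriptors([0.25, 0.75], [[100.0, 40.0], [200.0, 80.0]])
print(d_mix)  # [175.  70.]
```

Because the encoding is linear, it cannot by itself represent synergistic or antagonistic interactions between components; it only places each mixture at a composition-dependent point between its components in descriptor space.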
For multi-component mixtures, the mixture descriptor value is calculated as follows:

\(D_{mix} = \sum_{i} x_{i} D_{i}\)

where x_i is the mole fraction of the i-th component and D_i is the corresponding descriptor value of that component. This formula is applied consistently to both the structural descriptors (CaPS) and the quantum descriptors (QRD). This composition-weighted strategy provides a first-order approximation of the compositional chemical space of binary mixtures, allowing both datasets to be encoded within the same descriptor framework (Carnesecchi et al. 2020; Rahimi-Soujeh et al. 2024; Taenzler et al. 2023).

2.7 Data Set Splitting

To ensure comparability across endpoints, both datasets were split into a training set and an external prediction set using the same strategy: 70% of the samples were allocated to the training set for model building, and the remaining 30% were used for external validation. This procedure was intended to preserve a representative distribution of chemical diversity and biological activity and to reduce the risk of overfitting (Gramatica 2007; Gramatica et al. 2013; Pratim Roy et al. 2009).

2.8 QSAR Model Building

The QSAR models were built using multiple linear regression (MLR). The response variable for the antifungal model was pEC₅₀, while the response variable for the bee toxicity model was pLD₅₀-mix. Descriptor selection was performed using a genetic algorithm (GA), which searched for appropriate descriptor combinations from a unified QRD + CaPS descriptor pool. Both endpoints employed the same GA-MLR workflow. The importance of this approach lies not in the novelty of GA-MLR itself, but in the fact that methodological consistency facilitates descriptor-level comparison while minimizing the impact of differences in model architecture (Gramatica 2007; Gramatica et al. 2013; Pratim Roy et al. 2009).

2.9 Model Validation

Model robustness was assessed using standard internal and external validation metrics.
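Two metrics of this kind, the leave-one-out Q² and the concordance correlation coefficient, can be sketched in Python as follows. This is an illustrative sketch for an ordinary least-squares model and not the software used in the study: the function names are hypothetical, Q² is computed from the closed-form leave-one-out residual e_i / (1 − h_i) and referenced to the overall mean of the training responses, and the concordance coefficient uses population (ddof = 0) variances.

```python
import numpy as np


def q2_loo(X, y):
    """Leave-one-out Q^2 of an OLS model with intercept, via the hat matrix."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), X])   # design matrix with intercept
    H = A @ np.linalg.pinv(A)                   # hat matrix
    residuals = y - H @ y
    loo_residuals = residuals / (1.0 - np.diag(H))
    press = np.sum(loo_residuals ** 2)          # predictive residual sum of squares
    return 1.0 - press / np.sum((y - y.mean()) ** 2)


def ccc(observed, predicted):
    """Concordance correlation coefficient between observed and predicted values."""
    o = np.asarray(observed, dtype=float)
    p = np.asarray(predicted, dtype=float)
    covariance = np.mean((o - o.mean()) * (p - p.mean()))
    return 2.0 * covariance / (o.var() + p.var() + (o.mean() - p.mean()) ** 2)


# Toy checks: a constant offset of 1 lowers CCC even though the correlation is perfect.
print(ccc([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]))   # 4/7, about 0.571
print(q2_loo([[0.0], [1.0], [2.0], [3.0], [4.0]], [1.0, 3.0, 5.0, 7.0, 9.0]))  # exactly linear data, so close to 1
```

The first toy case illustrates why the concordance coefficient is reported alongside R²: it penalizes systematic shifts between predicted and observed values that a correlation-based measure would not detect.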
Internal validation employed leave-one-out cross-validation (LOO-CV), with the cross-validated coefficient of determination (Q²_LOO) as the primary measure of internal predictive power. External validation used independent prediction sets, evaluating predictive performance with the external coefficient of determination (R²_ext), root mean square error (RMSE), and concordance correlation coefficient (CCC) (Gramatica 2007; Gramatica et al. 2013; Pratim Roy et al. 2009). Within this framework, the validation objective is to establish the reliability of each model as a basis for cross-endpoint comparison, rather than to maximize the predictive performance of a single endpoint.

2.10 Y-scrambling Test

The Y-scrambling test (response permutation test) was used to evaluate whether the observed descriptor-response relationships could have arisen from chance correlation. In this procedure, response values are randomly permuted while the descriptor matrix remains unchanged. If the relationships in the original model are meaningful, models built from the scrambled data should have markedly lower R² and Q² values than the original model (Rücker, Rücker, and Meringer 2007).

2.11 Applicability Domain

The applicability domain (AD) of each model was evaluated using the Williams plot method. The leverage threshold is defined as \(h^{*} = 3(p+1)/n\), where p is the number of descriptors and n is the number of training samples. Observations with leverage values exceeding this threshold are considered influential, while those with standardized residuals beyond ±3 are considered potential outliers (Gramatica 2007; Gramatica et al. 2013; Pratim Roy et al. 2009). The applicability domain analysis is included to ensure that the descriptor-level comparison is performed within the structural and chemical space represented by the training data.
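The leverage part of a Williams plot can be sketched in a few lines of Python. The sketch assumes an MLR model with an intercept, so that the design matrix has p + 1 columns, in line with the threshold 3(p + 1)/n; the function names and the toy descriptor values are hypothetical.

```python
import numpy as np


def leverages(X_train, X_query):
    """Leverage h_i = a_i^T (A^T A)^-1 a_i of query rows with respect to the training design."""
    X_train = np.asarray(X_train, dtype=float)
    X_query = np.asarray(X_query, dtype=float)
    A_train = np.column_stack([np.ones(X_train.shape[0]), X_train])
    A_query = np.column_stack([np.ones(X_query.shape[0]), X_query])
    core = np.linalg.pinv(A_train.T @ A_train)
    return np.einsum("ij,jk,ik->i", A_query, core, A_query)


def leverage_threshold(n_train, n_descriptors):
    """Warning leverage h* = 3(p + 1) / n."""
    return 3.0 * (n_descriptors + 1) / n_train


# Toy one-descriptor training set and one query point far outside its range.
X_train = [[0.0], [1.0], [2.0], [3.0]]
h_train = leverages(X_train, X_train)          # [0.7, 0.3, 0.3, 0.7]
h_query = leverages(X_train, [[10.0]])         # [14.7]
h_star = leverage_threshold(n_train=4, n_descriptors=1)   # 1.5
print(h_train, h_query, h_query > h_star)
```

In this toy case the training leverages sum to p + 1 = 2, as expected for a hat matrix, and the distant query point is flagged because its leverage exceeds h*. A complete Williams plot would additionally show the standardized residual of each sample on the vertical axis.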
2.12 Comparative Descriptor Analysis

To identify the structural determinants of pesticide selectivity, the descriptor contributions in the antifungal efficacy model and the bee toxicity model were compared using standardized regression coefficients. Since both endpoints are represented in the same QRD + CaPS descriptor space and modeled using the same workflow, this analysis allows a hierarchical comparison of efficacy and toxicity drivers. The focus of this comparison is not a direct one-to-one ranking of mixtures, but rather the identification of generalizable structural and electronic patterns related to the efficacy-toxicity tradeoff in pesticide design (Carnesecchi et al. 2020; Rahimi-Soujeh et al. 2024).

2.13 QSAR Validation Criteria

Model performance was evaluated according to widely accepted QSAR validation criteria. Model fit was measured by the coefficient of determination (R²), while internal predictive power was evaluated by Q²_LOO. A model with R² > 0.6 and Q²_LOO > 0.5 was considered to have acceptable fit and robustness. External predictive performance was evaluated by R²_ext, RMSE, and CCC (Gramatica 2007; Gramatica et al. 2013; Pratim Roy et al. 2009). The overall modeling process followed the OECD principles for QSAR validation, including a defined endpoint, a transparent algorithm, a defined applicability domain, appropriate predictive performance, and mechanistic interpretability (Framework 2024; OECD 2007). In this study, these criteria were used to establish the credibility of the models as interpretable comparison tools within a unified descriptor framework. It should be emphasized that the antifungal efficacy and honey bee toxicity datasets used in this study are independent and are not directly matched at the mixture level. Accordingly, the purpose of the present framework is not to perform one-to-one comparison of identical mixtures across endpoints.
Instead, both datasets were encoded within the same descriptor space and modeled under the same analytical workflow, allowing comparison at the level of molecular representation. Under this design, differences in descriptor contributions are interpreted as endpoint-related patterns within a unified chemoinformatic framework rather than as artifacts arising from inconsistent data treatment. The focus of the present study therefore lies in identifying transferable structure–property relationships relevant to selectivity, rather than ranking the same mixtures across both endpoints.

3. Results

3.1 Basic Modeling in a Unified Descriptor Space

Two endpoint-specific models were built using the same descriptor language, combining quantum reactivity descriptors (QRDs) and screened structural descriptors (CaPS), and employing a consistent GA-MLR workflow. The purpose of this design was not to compete with previously published single-endpoint models optimized for maximum prediction accuracy, but rather to provide a common analytical space for comparing antifungal efficacy and bee toxicity at the descriptor level. For the antifungal endpoint, the resulting model met generally accepted validation criteria and demonstrated stable internal and external predictive capabilities (Table 1). For the training set, the adjusted coefficient of determination was R²_adj = 0.6387, with RMSE_tr = 0.2054, MAE_tr = 0.1723, and CCC_tr = 0.8048. The low lack-of-fit value (LOF = 0.0641) and inter-descriptor correlation index (K_xx = 0.4115) further support the stability of the model structure. Internal validation yielded Q²_LOO = 0.5743 and Q²_LMO = 0.5436, with RMSE_cv = 0.2345 and MAE_cv = 0.1964, indicating that the selected descriptor space supports consistent predictions across resampled subsets. External validation showed R²_ext = 0.6999, RMSE_ext = 0.1908, MAE_ext = 0.1524, and CCC_ext = 0.8308.
The external prediction statistics were Q²_F1 = 0.6974, Q²_F2 = 0.6953, Q²_F3 = 0.7183, and r²_m (average) = 0.5986 (Δr²_m = 0.1574), all falling within the generally accepted threshold range for predictive QSAR models (Gramatica 2007; Pratim Roy et al. 2009).

Table 1 Statistical parameters of the GA-MLR QSAR model for antifungal activity

| Parameter | Value |
|---|---|
| Model equation | pEC₅₀ = −407.9396 + 0.1918 MV − 1201.9769 µ⁻ + 0.2042 TopoPSA + 156.2841 GATS2v − 4.5894 nHBAcc |
| Descriptors | MV (cm³ mol⁻¹), µ⁻, TopoPSA, GATS2v, nHBAcc |
| R² / R²_adj | 0.6734 / 0.6387 |
| RMSE_tr / MAE_tr | 0.2054 / 0.1723 |
| Q²_LOO | 0.5743 |
| R²_ext | 0.6999 |
| RMSE_ext / MAE_ext | 0.1908 / 0.1524 |
| CCC_tr / CCC_ext | 0.8048 / 0.8308 |
| Q²_F1 / Q²_F2 / Q²_F3 | 0.6974 / 0.6953 / 0.7183 |
| r²_m (avg) / Δr²_m | 0.5986 / 0.1574 |

Notes. MV, molar volume; µ⁻, directional chemical potential descriptor; TopoPSA, topological polar surface area; GATS2v, Geary autocorrelation descriptor (lag 2, weighted by van der Waals volume); nHBAcc, number of hydrogen bond acceptors; RMSE, root mean square error; MAE, mean absolute error; CCC, concordance correlation coefficient; LOO, leave-one-out cross-validation.

Similarly, the honey bee toxicity model exhibited satisfactory internal and external performance, supporting its use as a benchmark for descriptor-level interpretation (Table 2). Its adjusted coefficient of determination was R²_adj = 0.6427, and the external validation coefficient was R²_ext = 0.7119. The fitting and prediction errors remained within acceptable ranges, with RMSE_tr = 0.3740, MAE_tr = 0.3082, RMSE_ext = 0.4031, and MAE_ext = 0.3500. Internal validation yielded Q²_LOO = 0.5884 and Q²_LMO = 0.5634, while the concordance correlation coefficients CCC_tr = 0.8084 and CCC_ext = 0.8350 indicate stable agreement between predicted and observed values.
The external validation parameters Q²_F1 = 0.7257, Q²_F2 = 0.7052, Q²_F3 = 0.6263, and r²_m (average) = 0.6265 (Δr²_m = 0.1190) further support the reliability of the model within its defined range (Gramatica 2007; Pratim Roy et al. 2009).

Table 2 Statistical parameters of the GA-MLR QSAR model for honey bee toxicity

| Parameter | Value |
|---|---|
| Model equation | pLD₅₀ = −11.3109 − 4.6634 V_max − 0.0946 ΔG − 0.2007 s⁺ − 17.2695 (−µ⁺) − 11.0294 µ⁻ − 0.0055 TopoPSA |
| Descriptors | V_max, ΔG, s⁺, −µ⁺, µ⁻, TopoPSA |
| R² / R²_adj | 0.6784 / 0.6427 |
| RMSE_tr / MAE_tr | 0.3740 / 0.3082 |
| Q²_LOO | 0.5884 |
| R²_ext | 0.7119 |
| RMSE_ext / MAE_ext | 0.4031 / 0.3500 |
| CCC_tr / CCC_ext | 0.8084 / 0.8350 |
| Q²_F1 / Q²_F2 / Q²_F3 | 0.7257 / 0.7052 / 0.6263 |
| r²_m (avg) / Δr²_m | 0.6265 / 0.1190 |

Notes. V_max, maximum surface electrostatic potential; ΔG, solvation free energy; s⁺, electrophilic reactivity descriptor (local softness); µ⁺/µ⁻, directional chemical potential descriptors; TopoPSA, topological polar surface area; RMSE, root mean square error; MAE, mean absolute error; CCC, concordance correlation coefficient; LOO, leave-one-out cross-validation.

These results show that both models are robust enough to serve as interpretable baseline models within a unified descriptor framework. This level of performance is appropriate for the purposes of this study, which focuses on comparative interpretation across endpoints rather than on maximizing predictive performance for a single endpoint.

3.2 Comparison of Descriptor Contributions Across Endpoints

The most important result of this modeling work is not the absolute predictive performance of either model, but rather the emergence of markedly different descriptor contribution patterns between the two endpoints.
Since both datasets were encoded using the same QRD + CaPS descriptor pool and modeled under the same GA-MLR strategy, the differences in the selected descriptors can be interpreted as endpoint-related mechanistic divergence, rather than as artifacts arising from data representation or model construction. In the antifungal model, descriptor contributions are distributed across structural and quantum-derived variables. The electronic descriptor µ⁻ shows the strongest contribution, while structural descriptors related to molecular size and polarity, including MV and TopoPSA, also show positive contributions. Furthermore, the selection of the topological descriptor GATS2v suggests that the spatial distribution of atomic properties within the molecular backbone plays an important role in regulating interactions with fungal targets. Overall, the antifungal endpoint appears to be governed by combined structural and electronic features, with molecular size, polarity, and topological organization all contributing to efficacy. In contrast, the bee toxicity model is dominated by electronic and electrostatic descriptors. The parameter −µ⁺ shows the strongest standardized contribution, followed by ΔG; V_max, µ⁻, s⁺, and TopoPSA make additional contributions. This model indicates a stronger association between bee toxicity and descriptors of charge-transfer behavior, electrophilic response, electrostatic distribution, and solvation-related effects. In particular, the negative contributions of polarity-related descriptors such as TopoPSA and s⁺ suggest that increased polarity may reduce a compound's ability to penetrate the hydrophobic cuticle barrier of the bee, thereby reducing effective internal exposure.
Consequently, the cross-endpoint descriptor comparison shows that antifungal efficacy correlates strongly with a set of features combining steric, polar, and topological properties, while bee toxicity is primarily controlled by descriptors characterizing electronic reactivity and environment-dependent reactivity.

3.3 Descriptor Hierarchy Divergence and Selectivity Implications

The divergence between these two sets of descriptor features provides a mechanistic basis for discussing pesticide selectivity. Antifungal efficacy appears to depend more strongly on target-binding characteristics, including steric complementarity, molecular size, and the spatial arrangement of polar functional groups. In contrast, bee toxicity is strongly correlated with electronic reactivity and physicochemical characteristics influencing bioavailability and receptor-level interactions. One particularly informative example is TopoPSA, which contributes positively in the antifungal model but negatively in the bee toxicity model. This opposite effect suggests that polar functional groups may simultaneously enhance effective interactions with fungal targets while reducing a compound's ability to penetrate the hydrophobic cuticle barrier of bees. More broadly, descriptor comparisons show that features favorable to efficacy do not necessarily increase non-target toxicity. Instead, the two endpoints occupy partially separated regions in the descriptor space, which creates the possibility of exploitable selectivity windows.

This study should not be interpreted as a direct one-to-one comparison of mixtures from two datasets. Instead, it should be viewed as a descriptor-level analysis of the efficacy-toxicity tradeoff within a unified chemical information framework. This distinction is important because the two datasets are biologically independent and not directly paired at the compound or formulation level.
Therefore, the advantage of our approach lies in identifying transferable design rules, rather than ranking identical mixtures at two endpoints.

3.4 Model Reliability and Applicability Domain

Residual analysis, applicability domain assessment, and Y-scrambling tests collectively support the reliability of the two models as comparison tools. For the antifungal model, the plot of observed versus predicted values shows that most of the training set and external samples lie close to the ideal diagonal (Fig. 1), indicating satisfactory agreement between experimental and predicted values. The angle between the regression line and the ideal diagonal is −8.2015°, further supporting this agreement. The residuals are roughly symmetrically distributed around zero and do not exhibit a pronounced funnel-shaped pattern (Fig. 2), indicating the absence of major systematic bias or heteroscedasticity within the modeled activity range. Williams plot analysis showed that most training and prediction samples fell within the defined applicability domain, and no compounds exceeded the ±3 residual limit (Fig. 3), indicating that the model was built in a representative structural space. The statistics obtained from Y-scrambling (R² Yscr = 0.0958; Q² Yscr = −0.1544) were far lower than those of the original model (R² = 0.6734; Q² LOO = 0.5743), confirming that the observed relationship is unlikely to originate from chance correlation (Fig. 4) (Rücker, Rücker, and Meringer 2007).

A similar pattern was observed in the bee toxicity model. The relationship between observed and predicted values (Fig. 5) was consistent with the statistical validation indices, while the residual plot (Fig. 6) showed a zero-centered random distribution, without significant systematic bias or error variance increasing over the toxicity range. Williams plot analysis (Fig.
7) showed that most training and external samples were within the applicability domain, and no major outliers were observed. Y-scrambling again yielded far lower values (R² Yscr = 0.0982; Q² Yscr = −0.1560) than the original model (R² = 0.6784; Q² LOO = 0.5884), supporting the existence of a genuine structure-toxicity relationship rather than a random statistical artifact (Fig. 8) (Rücker, Rücker, and Meringer 2007).

While these analyses cannot eliminate all uncertainties associated with mixture modeling, they do support using our models as stable and interpretable benchmarks for cross-endpoint descriptor comparisons. For the purposes of this study, this reliability matters more than maximizing predictive performance at the cost of transparency.

In the present study, quantum reactivity descriptors (QRDs) are not interpreted as direct representations of biological mechanisms, but as chemically meaningful proxies for electronic properties that influence molecular interactions with biological systems. Descriptors related to electrophilicity, charge redistribution, electrostatic potential, and solvation provide indirect yet informative measures of how molecules respond to polar environments and participate in charge-transfer processes. Such properties are especially relevant for toxicity-related interactions, where electronic responsiveness can shape access to, and interaction with, sensitive biological targets. By integrating QRDs with curated structural descriptors, the present framework supports interpretation at the level of physicochemical behavior, thereby bridging the gap between purely statistical QSAR models and fully mechanistic biological explanations.

4. Discussion

4.1 Mechanism of Antifungal Efficacy

The antifungal model shows that inhibitory efficacy against Macrophomina phaseolina is governed by a balance of steric, polar, topological, and electronic features, rather than by a single physicochemical parameter. The selected descriptors (molecular volume (MV), µ⁻, TopoPSA, GATS2v, and nHBAcc) collectively define an efficacy profile in which molecular size, spatial organization, and controlled polarity appear to be the core factors determining fungal growth inhibition.

The positive contribution of MV indicates that a larger molecular skeleton is beneficial for antifungal efficacy. Mechanistically, this is consistent with the modes of action of many systemic fungicides, particularly azoles. Triazole fungicides inhibit the cytochrome P450 enzyme lanosterol 14α-demethylase (CYP51), which is crucial for the biosynthesis of ergosterol in the fungal cell membrane (Rybak et al. 2024; Sun et al. 2024). Increased molecular size is generally associated with a larger hydrophobic surface area, which may enhance van der Waals interactions at the enzyme's active site and facilitate partitioning into lipid-rich membrane environments. Previous structural studies have also shown that hydrophobic substituents contribute significantly to stabilizing the fungicide–CYP51 complex and enhancing potency (Kelly and Kelly 2013; Warrilow et al. 2010).

TopoPSA also shows a positive contribution to antifungal potency, indicating that controlled polarity facilitates effective target interactions. Polar surfaces do not simply enhance hydrophilicity but may support effective binding through heteroatom-containing functional groups while remaining compatible with membrane permeability. In triazole fungicides, these heteroatoms may coordinate with the heme iron of CYP51 or form hydrogen bonds with amino acid residues in the active site pocket (Hargrove et al.
2017; Strushkevich, Usanov, and Park 2010). Similar relationships between TopoPSA and antifungal activity have been reported in QSAR studies of azole and strobilurin fungicides (Cherkasov et al. 2014).

The positive contribution of GATS2v further demonstrates the importance of topological organization in antifungal efficacy. This descriptor reflects the distribution of atomic volumes within the molecular backbone, thus capturing steric features related to ligand orientation and binding affinity. This interpretation is consistent with known fungicidal mechanisms, including the inhibitory effect of strobilurin-like compounds on the Qo site of the mitochondrial cytochrome bc₁ complex, where the positions of aromatic rings and hydrophobic substituents strongly influence activity (Bartlett et al. 2002).

In contrast, the negative contribution of µ⁻ suggests that excessive electron-donating character may be detrimental to antifungal efficacy. As a cDFT-derived descriptor related to electron-donating ability, µ⁻ reflects the tendency of a molecule to donate electron density in intermolecular interactions. Its negative contribution suggests that optimal antifungal activity may require a balanced electronic profile rather than simply a strong electron-donating ability. Similarly, the negative contribution of nHBAcc indicates that too many hydrogen bond acceptors may reduce efficacy, possibly because they raise molecular polarity beyond what efficient membrane transport permits.

Overall, the antifungal endpoint can be interpreted as a "structure-driven system modulated by electronic properties." High antifungal efficacy appears to require a molecular architecture with sufficient size, good topological organization, and moderate polarity, while avoiding excessive electron-donating character or too many hydrogen bond acceptor groups.
This interpretation is consistent with the existing understanding of the structure-activity relationships of triazole and strobilurin fungicides, namely that optimal activity depends on a balance between the hydrophobic framework and appropriately positioned heteroatoms (Bartlett et al. 2002; Copping and Duke 2007).

4.2 Mechanistic Analysis of Honey Bee Toxicity

Compared to the antifungal endpoint, the bee toxicity model is more clearly centered on electronic and electrostatic descriptors. This model shows that acute contact toxicity to Apis mellifera is influenced not only by overall physicochemical properties but also by how the molecule interacts electronically with its biological target and how effectively it reaches that target after penetrating the cuticle.

The negative relationship of V max indicates that molecules with highly localized positive electrostatic regions do not necessarily possess the highest toxicity. One possible explanation is that extreme localization of the positive potential may reduce the efficiency of interaction with toxicity-related biological interfaces or alter the balance between transport and receptor binding (Nation Sr 2022; Casida 2018). The strong contribution of ΔG suggests that solvation-related stabilization is also a significant determinant of toxicity, implying that the molecule's electronic response to the surrounding medium may modulate its effective bioavailability and biointeraction potential.

The negative contribution of s⁺ further supports the view that bee toxicity is shaped by a specific electronic response window, rather than by a simple amplification of any single reactivity index. As a descriptor related to electrophilicity, s⁺ reflects the ease with which a molecule interacts with electron-rich sites in biomolecules. The negative correlation between polarity and toxicity suggests that lower electrophilic reactivity may limit interactions with toxicity-relevant targets.
Similarly, the contributions of − µ⁺ and µ⁻ indicate that directional electron acceptance and donation tendencies are important in defining the overall molecular interaction profile in the bee system. TopoPSA again provides a particularly informative comparison. In the bee model, its negative contribution suggests that increased polarity may reduce passive transport across hydrophobic biological barriers, thereby decreasing effective internal exposure. This is consistent with general expectations: higher polarity molecules are less likely to penetrate the lipid-rich cuticle of bees and are therefore less likely to reach internal target sites at toxicologically significant concentrations (Tennekes and Sanchez-Bayo 2011 ). The honey bee toxicity model points to an endpoint "regulated by electronic properties and limited by transport-related structural factors." Non-target toxicity is not simply determined by molecular size or general lipophilicity, but rather by more specific interactions between electron distribution, charge transfer behavior, solvation effects, and polarity-dependent permeation. Within this context, the QRD layer offers clear added value because it provides mechanistic explanations that are difficult to capture directly using traditional 2D structural descriptors. 4.3 Implications of Cross-Endpoint Selective Pesticide Design The core contribution of this study lies in the hierarchical comparison of descriptors between potency and toxicity. Because the two models are built on different but methodologically consistent datasets, this framework cannot identify "safe mixtures" through direct ranking. Instead, it identifies which molecular property domains are strongly associated with desired antifungal performance and which are strongly associated with unwanted pollinator toxicity. 
From this perspective, the results of this study propose a practical design principle: features associated with 3D size, good topological organization, and controlled polarity may contribute to improved antifungal potency; while excessive electronic reactivity and electrostatic sensitivity may increase the likelihood of bee toxicity. Some descriptors, particularly polarity-related variables like TopoPSA, exhibit opposite effects at the two endpoints, indicating that improved selectivity cannot be achieved simply by maximizing activity. Instead, it requires adjusting structural features to better suit the fungal target while avoiding the formation of electronic feature profiles associated with bee damage. This shift from "endpoint-specific prediction" to "cross-endpoint design logic" is what distinguishes this study from previous single-endpoint QSAR work. Therefore, this research framework provides a foundation for selectivity scoring, prioritization, and hypothesis generation in future pesticide mixture design, especially after obtaining directly paired efficacy-toxicity datasets. In this sense, our results support a descriptor-level selectivity window where beneficial efficacy and reduced non-target harm do not necessarily have to change simultaneously. 4.4 Relationship with Previous QSAR Research Previous studies have achieved higher predictive performance at both the antifungal efficacy and bee toxicity endpoints through nonlinear machine learning, quasi-SMILES, q-RASAR, or other more flexible modeling frameworks. Regarding antifungal endpoints, Rahimi-Soujeh et al. reported strong nonlinear predictive capabilities for binary fungicide mixtures against Macrophomina phaseolina , with their Support Vector Regression (SVR) and Gaussian Process Regression (GPR) models achieving high levels of fit and internal validation (Rahimi-Soujeh et al. 2024 ). For honey bee toxicity, Carnesecchi et al. and later Chatterjee et al. 
demonstrated that machine learning models based on quasi-SMILES/CORAL and q-RASAR provide significantly stronger external predictive capabilities for organic binary mixtures (Carnesecchi et al. 2020 ; Chatterjee et al. 2023 ). This study does not attempt to numerically surpass these models. Instead, it addresses a different question: can a common descriptor language be used to compare the mechanistic drivers of pesticide efficacy and honey bee toxicity in two independent mixture datasets? In this respect, this study complements, rather than replaces, existing QSAR research. Previous studies on antifungal and honey bee research have demonstrated that each endpoint can be successfully modeled independently. The added value of this study lies in integrating QRD and CaPS descriptors into a single comparative framework, thereby enabling a descriptor-level interpretation of selective correlation trade-offs. Consequently, this study is more conservative in its predictive claims, but stronger in its mechanistic and design-oriented arguments. This emphasis on transparency, interpretability, clear applicability, and regulatory availability aligns with the current OECD QAF requirements for the reliability of computational models (Framework 2024 ; Gissi et al. 2024 ). 4.5 Research Limitations and Future Directions Several limitations of this study need to be addressed. First, the two datasets are independent and not directly paired at the mixture level. Therefore, the current framework supports descriptor-level comparisons rather than direct cross-endpoint ranking of the same formulation. Second, the composition-weighted mixture representation only provides a first-order approximation of mixture behavior and does not explicitly address synergistic or antagonistic effects. Third, current model development is limited to a finite region within the pesticide chemistry space; therefore, extrapolation beyond the applicable domain should be approached with caution. 
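To make the second limitation concrete, a composition-weighted mixture descriptor can be sketched as D_mix = Σ x_i · D_i, where x_i is the fraction of component i and D_i is its descriptor value. The Python example below is a minimal illustration under stated assumptions: the type of fraction (molar or mass), the function name, and all descriptor values are hypothetical and are not taken from this study.

```python
from typing import Dict, List, Tuple

Component = Tuple[float, Dict[str, float]]  # (relative amount, descriptor values)


def mixture_descriptors(components: List[Component]) -> Dict[str, float]:
    """Composition-weighted average of component descriptors.

    Relative amounts are normalized to fractions that sum to 1, so the
    caller may pass either fractions or raw ratios (e.g. 1:3).
    """
    total = sum(amount for amount, _ in components)
    if total <= 0:
        raise ValueError("Relative amounts must sum to a positive value.")

    names = set(components[0][1])
    mixed = {name: 0.0 for name in names}
    for amount, descriptors in components:
        if set(descriptors) != names:
            raise ValueError("All components need the same descriptor set.")
        for name, value in descriptors.items():
            mixed[name] += (amount / total) * value
    return mixed


if __name__ == "__main__":
    # Hypothetical binary mixture; the numbers are placeholders only.
    component_a = {"TopoPSA": 40.0, "MV": 200.0}
    component_b = {"TopoPSA": 80.0, "MV": 300.0}
    print(mixture_descriptors([(0.25, component_a), (0.75, component_b)]))
```

Because every mixture descriptor is a linear combination of the component values, this representation contains no interaction term. Synergistic or antagonistic behavior can therefore enter the model only indirectly, which is the first-order approximation noted above.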
These limitations also point to clear future directions. If paired potency-toxicity datasets can be established, selectivity models can be built directly at the formulation level. Interaction-aware mixture representations, hybrid descriptor/machine learning methods, and graph-based approaches may further improve predictive performance while preserving interpretability. Recent advancements in graph neural networks and the emergence of augmented datasets such as ApisTox also suggest that future models can achieve better generalization capabilities while maintaining mechanism relevance (Adamczyk, Poziemski, and Siedlecki 2025 ). Furthermore, more explicit uncertainty reporting and closer alignment with regulatory formats such as QPRF/QRRF will help enhance the practical application value of this framework in early screening and decision support (Framework 2024 ; Gissi et al. 2024 ; Barber, Heghes, and Johnston 2024 ). However, even in its current form, this study demonstrates that unified descriptor analysis can advance mixture QSAR from single-endpoint prediction to mechanistic-based selective assessment. This shift is the most significant conceptual contribution of this study and provides a more useful basis for safer pesticide formulation design than endpoint prediction alone. From an environmental and regulatory perspective, the ability to distinguish structural features associated with desired efficacy from those linked to non-target toxicity is essential for the development of safer pesticide formulations. The descriptor-level framework proposed here provides a transparent and interpretable basis for early-stage screening and prioritization of pesticide mixtures. Rather than replacing experimental testing, this approach offers a complementary tool for identifying selectivity trends and guiding rational design toward reduced ecological risk and improved formulation safety. 5. 
Conclusion This study proposes a unified descriptor framework for analyzing pesticide mixtures from two perspectives: antifungal efficacy and bee toxicity. By applying the same QRD + CaPS descriptor space and a consistent GA-MLR workflow to two independent datasets, this study establishes a consistent cross-endpoint comparative basis at the descriptor level, rather than at the directly paired mixture level. The results show that these two endpoints are governed by different but mechanistically related descriptor domains. Antifungal efficacy is primarily related to structural features such as molecular size, polarity, and topology, while bee toxicity is driven more by electronic reactivity, charge transfer behavior, solvation-related properties, and polarity-influenced transport limitations. This divergence suggests that efficacy and non-target toxicity do not necessarily increase synchronously and implies the identification of selectivity windows at the descriptor level that can contribute to safer pesticide design. Consequently, the model in this study should be considered as a comparative and mechanistic benchmark, rather than a predictive tool optimized for a single endpoint. The primary contribution of this study is not to surpass existing high-accuracy QSAR models, but rather to provide an interpretable and transferable framework for distinguishing between descriptor patterns related to target efficacy and those related to pollinating insect damage. In this sense, this study extends mixture QSAR from endpoint-specific predictions to mechanistic-based selective assessments. Declarations Acknowledgments This research was supported by Bureau of Animal and Plant Health Inspection and Quarantine (BAPHIQ), Council of Agriculture, which provided the financial resources necessary for undertaking this comprehensive study. We thank the anonymous reviewers for their constructive feedback, which greatly enhanced the quality and clarity of our manuscript. 
Funding This research received no external funding. Authors’ Contributions Conceptualization: C.M.C. and T.-C.L.; Methodology: C.M.C.; Data curation: C.M.C.; Formal analysis: C.M.C.; Writing – original draft: C.M.C.; Writing – review & editing: T.-C.L.; All authors have read and approved the final manuscript. Ethical Approval This is not applicable. Consent to Participate This is not applicable. Consent to Publish This is not applicable. Competing Interests The authors declare no competing interests. Data Availability Statement The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request. References Adamczyk J, Poziemski J, Siedlecki P (2025) ApisTox: a new benchmark dataset for the classification of small molecules toxicity on honey bees. Sci Data 12:5 Authority EF, Safety P, Adriaanse A, Arce A, Focks B, Ingels D, Jölli Sébastien Lambin, Maj Rundlöf, Dirk Süßenbach, Monica Del Aguila, Valeria Ercolano, Franco Ferilli, Alessio Ippolito, Csaba Szentes, Franco Maria Neri, Laura Padovani, Agnès Rortais, Jacoba Wassenberg, and Domenica Auteri. 2023. 'Revised guidance on the risk assessment of plant protection products on bees (Apis mellifera, Bombus spp. and solitary bees)'. EFSA J, 21: e07989 Ballu A, Ugazio C, Duplaix Clémentine, Noly A, Wullschleger J, Stefano FF, Torriani A, Dérédec Florence Carpentier, and Anne-Sophie Walker. 2024. 'Preventing multiple resistance above all: New insights for managing fungal adaptation'. Environ Microbiol, 26: e16614 Barber C, Heghes C, Johnston L (2024) A framework to support the application of the OECD guidance documents on (Q)SAR model validation and prediction assessment for regulatory decisions. Comput Toxicol 30:100305 Bartlett DW, John M, Clough, Jeremy R, Godwin, Alison A, Hall Mick Hamer, and Bob Parr-Dobrzanski. 2002. 'The strobilurin fungicides'. 
Pest Manag Sci, 58: 649–662 Carnesecchi E, Toropov AA, Toropova AP, Kramer N, Svendsen C, Dorne JL, Emilio Benfenati (2020) Predicting acute contact toxicity of organic binary mixtures in honey bees (A. mellifera) through innovative QSAR models. Sci Total Environ 704:135302 Casida JE (2018) Neonicotinoids and Other Insect Nicotinic Receptor Competitive Modulators: Progress and Prospects. Ann Rev Entomol 63:125–144 Chattaraj P, Kumar B, Maiti, Sarkar U (2003) Philicity: A Unified Treatment of Chemical Reactivity and Selectivity. J Phys Chem A 107:4973–4975 Chatterjee M, Banerjee A, Tosi S, Carnesecchi E, Benfenati E, and Kunal Roy (2023) Machine learning - based q-RASAR modeling to predict acute contact toxicity of binary organic pesticide mixtures in honey bees. J Hazard Mater 460:132358 Cherkasov A, Muratov EN, Fourches D, Varnek A, Baskin II, Cronin M, Dearden J, Gramatica P, Martin YC, Todeschini R, Consonni V, Victor E Kuz’min, Richard Cramer, Romualdo Benigni, Chihae Yang, James Rathman, Lothar Terfloth, Johann Gasteiger, Ann Richard, and Alexander Tropsha. 2014. 'QSAR Modeling: Where Have You Been? Where Are You Going To?'. J Med Chem, 57: 4977–5010 Copping LG, Stephen OD (2007) Natural products that have been used commercially as crop protection agents. Pest Manag Sci 63:524–554 De Mio LL, May NA, Peres G, Schnabel, Ishii H (2024) A special isssue on fungicide resistance and management strategies. Trop Plant Pathol 49:1–4 Devillers J (2014) 'QSAR Modeling of Pesticide Toxicity to Bees.' in Evangelista M, Papa E (2025) A Review of Quantitative Structure–Activity Relationship (QSAR) Models to Predict Thyroid Hormone System Disruption by Chemical Substances. Toxics 13:799 Framework SARA, Paris (2024) France Frisch MJ, ea GW, Trucks HB, Schlegel GE, Scuseria MA, Robb JR, Cheeseman G, Scalmani VPGA, Barone GA, Petersson, Nakatsuji HJRA (2016) Gaussian 16. In.: Gaussian, inc. 
Wallingford, CT Gázquez José, Cedillo, Vela A (2007) 'Electrodonating and Electroaccepting Powers', The journal of physical chemistry. A , 111: 1966-70 Geerlings P, Chamorro E, Chattaraj PK, De Proft F, Gázquez JoséL, Liu S Christophe Morell, Alejandro Toro-Labbé, Alberto Vela, and Paul Ayers. 2020. 'Conceptual density functional theory: status, prospects, issues'. Theor Chem Acc, 139: 36 Gissi A, Tcheremenskaia O, Bossa C, Battistelli CL, and Patience Browne (2024) The OECD (Q)SAR Assessment Framework: A tool for increasing regulatory uptake of computational approaches. Comput Toxicol 31:100326 Gramatica P (2007) Principles of QSAR models validation: internal and external. QSAR Comb Sci 26:694–701 Gramatica P, Chirico N, Papa E, Cassani S, Kovarich S (2013) QSARINS: A new software for the development, analysis, and validation of QSAR MLR models. J Comput Chem 34:2121–2132 Guo W, Song X, Gao Y, Yang S, Tang J, Zhao C, Wang H, Ren J, Zeng L, Hanhong Xu (2025) Exploring Insecticidal Molecules with Random Forest: Toward High Insecticidal Activity and Low Bee Toxicity. J Agric Food Chem 73:5573–5584 Gustavsson M, Molander S, Backhaus T, Kristiansson E (2023) Risk assessment of chemicals and their mixtures are hindered by scarcity and inconsistencies between different environmental exposure limits. Environ Res 225:115372 Hargrove TY, Friggeri L, Wawrzak Z, Qi A, Hoekstra WJ, Schotzinger RJ, York JD, Peter F, Guengerich, Galina I, Lepesheva (2017) Structural analyses of Candida albicans sterol 14α-demethylase complexed with azole drugs address the molecular basis of azole-mediated inhibition of fungal sterol biosynthesis. J Biol Chem 292:6728–6743 Kar S, Amin SA, Li L (2025) Unveiling chemical space, scaffold diversity, critical structural features of pesticides: A comprehensive QSAR, qRASAR, machine learning studies to predict pesticides toxicity. Sci Total Environ 1001:180489 Kelly SL, Kelly DE (2013) 'Microbial cytochromes P450: biodiversity and biotechnology. 
Where do cytochromes P450 come from, what do they do and what can they do for us?'. Philosophical Trans Royal Soc B: Biol Sci, 368 Komprda Jiří, Lörinczová Katarína, Toušová Z, Smutná M Soňa Smetanová, Klára Komprdová, and Klára Hilscherová. 2026. 'Methodological Challenges in the Application of QSAR Models for Chemical Prioritization and Toxicity Assessment: A Case Study on Aryl Hydrocarbon Receptor Activity in Environmental Pollutant Mixtures', ACS Environmental Au Marenich AV, Cramer CJ, Truhlar DG (2009) Universal Solvation Model Based on Solute Electron Density and on a Continuum Model of the Solvent Defined by the Bulk Dielectric Constant and Atomic Surface Tensions. J Phys Chem B 113:6378–6396 Marquez N, Giachero MaríaL, Stéphane, Declerck, Ducasse DA (2021) 'Macrophomina phaseolina: General Characteristics of Pathogenicity and Methods of Control', Frontiers in Plant Science , Volume 12–2021 Migdał Paweł, Murawska A, Berbeć E, Zarębski K, Ratajczak N, Roman A, Krzysztof Latarowski (2024) Biochemical Indicators and Mortality in Honey Bee (Apis mellifera) Workers after Oral Exposure to Plant Protection Products and Their Mixtures. Agriculture 14:5 Nation Sr JL (2022) Insect physiology and biochemistry. CRC Nicholson CC, Knapp J, Kiljanek T, Albrecht M, Chauzat M-P, Costa C, De la Rúa P, Klein A-M, Marika Mänd SG, Potts O, Schweiger I, Bottero E, Cini, Joachim R, de Miranda JC Stout, and Maj Rundlöf. 2024. 'Pesticide use negatively affects bumble bees across European landscapes', Nature , 628: 355 – 58 Oecd DE (2007) 'Guidance document on the validation of (quantitative) structure-activity relationship [(Q) SAR] models', Organ. Econ. Co-operation Dev. Paris Fr Politzer P, Murray JS, Concha MC (2002) The complementary roles of molecular surface electrostatic potentials and average local ionization energies with respect to electrophilic processes. 
Int J Quantum Chem 88:19–27
Pratim Roy P, Paul S, Mitra I, Roy K (2009) On Two Novel Parameters for Validation of Predictive QSAR Models. Molecules 14:1660–1701
Rahimi-Soujeh Z, Safaie N, Moradi S, Abbod M, Sharifi R, Mojerlou S, Mokhtassi-Bidgoli A (2024) New binary mixtures of fungicides against Macrophomina phaseolina: Machine learning-driven QSAR, read-across prediction, and molecular dynamics simulation. Chemosphere 366:143533
Raine NE, Rundlöf M (2024) Pesticide Exposure and Effects on Non-Apis Bees. Annu Rev Entomol 69:551–576
Rondeau S, Raine NE (2024) Single and combined exposure to ‘bee safe’ pesticides alter behaviour and offspring production in a ground-nesting solitary bee (Xenoglossa pruinosa). Proc R Soc B 291
Rücker C, Rücker G, Meringer M (2007) y-Randomization and Its Variants in QSPR/QSAR. J Chem Inf Model 47:2345–2357
Rybak JM, Xie J, Martin-Vicente A, Guruceaga X, Thorn HI, Nywening AV, Ge W, Souza ACO, Shetty AC, McCracken C, Bruno VM, Parker JE, Kelly SL, Snell HM, Cuomo CA, Rogers PD, Fortwendel JR (2024) A secondary mechanism of action for triazole antifungals in Aspergillus fumigatus mediated by hmg1. Nat Commun 15:3642
Schuhmann A, Schmid AP, Manzer S, Schulte J, Scheiner R (2022) Interaction of Insecticides and Fungicides in Bees. Front Insect Sci 1
Sharma P, Ranjan P, Chakraborty T (2024) Applications of conceptual density functional theory in reference to quantitative structure–activity / property relationship. Mol Phys 122:e2331620
Shirai M, Eulgem T (2023) Molecular interactions between the soilborne pathogenic fungus Macrophomina phaseolina and its host plants. Front Plant Sci 14
Strushkevich N, Usanov SA, Park HW (2010) Structural Basis of Human CYP51 Inhibition by Antifungal Azoles. J Mol Biol 397:1067–1078
Sun Y, Liu R, Luo Z, Zhang J, Gao Z, Liu R, Liu N, Zhang H, Li K, Wu X, Yin W, Qin Q, Su X, Zhao D, Cheng M (2024) Identification of novel and potent triazoles targeting CYP51 for antifungal: Design, synthesis, and biological study. Eur J Med Chem 280:116942
Taenzler V, Weyers A, Maus C, Ebeling M, Levine S, Cabrera A, Schmehl D, Gao Z, Rodea-Palomares I (2023) Acute toxicity of pesticide mixtures to honey bees is generally additive, and well predicted by Concentration Addition. Sci Total Environ 857:159518
Tennekes HA, Sanchez-Bayo FP (2011) Time-dependent toxicity of neonicotinoids and other toxicants: implications for a new approach to risk assessment. J Environ Anal Toxicol
Topping C, Bednarska A, Benfenati E, Chetcuti J, Simon-Delso N, Duan X, Focks A, Laskowski R, Lombardo A, Marcussen L, Metodiev T, Rubinigg M, Rundlöf M, Sgolastra F, Stoyanova C, Sušanj G, Williams J, Ziolkowska E (2024) PollinERA: Understanding pesticide-Pollinator interactions to support EU Environmental Risk Assessment and policy. Res Ideas Outcomes 10
Wang C, Luo X, Zhang Y, Pu X, Liu J, Zhang S, Ye M, Li X (2025) Green Pesticide Research and Development Integrating Molecular Targets, Mechanisms, Resistance, and Innovation in Theory and Technology. J Agric Food Chem 73:32460–32489
Wang T, Tang L, Luan F, Cordeiro MNDS (2018) Prediction of the Toxicity of Binary Mixtures by QSAR Approach Using the Hypothetical Descriptors. Int J Mol Sci 19:3423
Warrilow AGS, Martel CM, Parker JE, Melo N, Lamb DC, Nes WD, Kelly DE, Kelly SL (2010) Azole binding properties of Candida albicans sterol 14-α demethylase (CaCYP51). Antimicrob Agents Chemother 54:4235–4245
Yang W, Parr RG (1985) Hardness, softness, and the Fukui function in the electronic theory of metals and catalysis. Proc Natl Acad Sci USA 82:6723–6726
Yap CW (2011) PaDEL-descriptor: An open source software to calculate molecular descriptors and fingerprints. J Comput Chem 32:1466–1474
Zhao Y, Truhlar DG (2008) The M06 suite of density functionals for main group thermochemistry, thermochemical kinetics, noncovalent interactions, excited states, and transition elements: two new functionals and systematic testing of four M06-class functionals and 12 other functionals. Theor Chem Acc 120:215–241
Supplementary Files: Supplementarymaterialfinal.xlsx
© Research Square 2026 | ISSN 2693-5015 (online) {"props":{"pageProps":{"initialData":{"identity":"rs-9203836","acceptedTermsAndConditions":true,"allowDirectSubmit":true,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":620070479,"identity":"861e4bb8-a365-4767-ae50-e85735b83660","order_by":0,"name":"Chia Ming Chang","email":"","orcid":"https://orcid.org/0000-0002-3383-4517","institution":"National Chung Hsing University","correspondingAuthor":true,"prefix":"","firstName":"Chia","middleName":"Ming","lastName":"Chang","suffix":""},{"id":620070480,"identity":"719f7ede-ae0a-4a3b-8ccf-e3887565c499","order_by":1,"name":"Tien-Cheng Liu","email":"","orcid":"","institution":"BAPHIQ: Bureau of Animal and Plant Health Inspection and Quarantine","correspondingAuthor":false,"prefix":"","firstName":"Tien-Cheng","middleName":"","lastName":"Liu","suffix":""}],"badges":[],"createdAt":"2026-03-23 
18:41:47","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-9203836/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-9203836/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":107124159,"identity":"4c6c6e82-03d9-4ef7-8eb9-bcda72a4f3d6","added_by":"auto","created_at":"2026-04-17 05:28:55","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":158001,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eComparison between observed and predicted antifungal activity (\u003c/strong\u003e\u003cem\u003e\u003cstrong\u003epEC₅₀\u003c/strong\u003e\u003c/em\u003e\u003cstrong\u003e) obtained from the GA-MLR QSAR model. Yellow circles represent training-set samples and blue circles represent external prediction-set samples. The solid diagonal line indicates ideal agreement between experimental and predicted values.\u003c/strong\u003e\u003c/p\u003e","description":"","filename":"floatimage1.png","url":"https://assets-eu.researchsquare.com/files/rs-9203836/v1/ba33167ab112f7fb1e9d6e96.png"},{"id":107124160,"identity":"2c8cd8e6-c9ee-4474-857f-b36e838f4552","added_by":"auto","created_at":"2026-04-17 05:28:55","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":100861,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eResidual distribution of the antifungal QSAR model. Residuals are plotted against predicted \u003c/strong\u003e\u003cem\u003e\u003cstrong\u003epEC₅₀\u003c/strong\u003e\u003c/em\u003e\u003cstrong\u003e values. Yellow circles represent training-set samples and blue circles represent external prediction-set samples. 
The horizontal line indicates zero residual error.\u003c/strong\u003e\u003c/p\u003e","description":"","filename":"floatimage2.png","url":"https://assets-eu.researchsquare.com/files/rs-9203836/v1/36dd1dd9179be43eb825caac.png"},{"id":107124161,"identity":"0290f6c0-3cbf-4f79-a7ac-40b43492a700","added_by":"auto","created_at":"2026-04-17 05:28:55","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":154926,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eWilliams plot illustrating the applicability domain of the antifungal QSAR model. Standardized residuals are plotted against leverage (HAT) values. The vertical line indicates the leverage threshold (\u003c/strong\u003e\u003cem\u003e\u003cstrong\u003eh\u003c/strong\u003e\u003c/em\u003e\u003cstrong\u003e★), while the horizontal dashed lines represent standardized residual limits (±3). Yellow circles represent training-set samples and blue circles represent external prediction-set samples.\u003c/strong\u003e\u003c/p\u003e","description":"","filename":"floatimage3.png","url":"https://assets-eu.researchsquare.com/files/rs-9203836/v1/c92b6a85dba7afab8cbffe5a.png"},{"id":107481966,"identity":"26ba6ffe-fae1-4683-b35c-a7bde330f8a1","added_by":"auto","created_at":"2026-04-22 02:21:14","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":223332,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eY-scrambling validation results for the antifungal QSAR model. 
The coefficients of determination (\u003c/strong\u003e\u003cem\u003e\u003cstrong\u003eR²\u003c/strong\u003e\u003c/em\u003e\u003cstrong\u003e) and cross-validated coefficients (\u003c/strong\u003e\u003cem\u003e\u003cstrong\u003eQ²\u003c/strong\u003e\u003c/em\u003e\u003cstrong\u003e) of models generated from randomized response variables are plotted against the correlation coefficient (\u003c/strong\u003e\u003cem\u003e\u003cstrong\u003eK\u003c/strong\u003e\u003c/em\u003e\u003cstrong\u003exy) between original and permuted responses.\u003c/strong\u003e\u003c/p\u003e","description":"","filename":"floatimage4.png","url":"https://assets-eu.researchsquare.com/files/rs-9203836/v1/59b1250da0b477ef2e595854.png"},{"id":107481263,"identity":"cb2dc0ed-4d41-496f-8d6d-b94d87e5b877","added_by":"auto","created_at":"2026-04-22 02:16:46","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":155603,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eComparison between observed and predicted honey bee toxicity (\u003c/strong\u003e\u003cem\u003e\u003cstrong\u003epLD₅₀\u003c/strong\u003e\u003c/em\u003e\u003cstrong\u003e) obtained from the GA-MLR QSAR model. Yellow circles represent training-set samples and blue circles represent external prediction-set samples. The solid diagonal line indicates ideal agreement between experimental and predicted values.\u003c/strong\u003e\u003c/p\u003e","description":"","filename":"floatimage5.png","url":"https://assets-eu.researchsquare.com/files/rs-9203836/v1/de645bd6c91a33290fdcfe55.png"},{"id":107481928,"identity":"25a91510-2929-4b8a-8f10-8e66a4808cbe","added_by":"auto","created_at":"2026-04-22 02:20:57","extension":"png","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":110262,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eResidual distribution of the honey bee toxicity QSAR model. 
Residuals are plotted against predicted \u003c/strong\u003e\u003cem\u003e\u003cstrong\u003epLD₅₀\u003c/strong\u003e\u003c/em\u003e\u003cstrong\u003e values. Yellow circles represent training-set samples and blue circles represent external prediction-set samples. The horizontal line indicates zero residual error.\u003c/strong\u003e\u003c/p\u003e","description":"","filename":"floatimage6.png","url":"https://assets-eu.researchsquare.com/files/rs-9203836/v1/0a836d73a5b69a2ef589accd.png"},{"id":107124166,"identity":"7136a0bf-def9-4b82-9a1b-195932fb7059","added_by":"auto","created_at":"2026-04-17 05:28:55","extension":"png","order_by":7,"title":"Figure 7","display":"","copyAsset":false,"role":"figure","size":152499,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eWilliams plot illustrating the applicability domain of the honey bee toxicity QSAR model. Standardized residuals are plotted against leverage (HAT) values. The vertical line indicates the leverage threshold (\u003c/strong\u003e\u003cem\u003e\u003cstrong\u003eh\u003c/strong\u003e\u003c/em\u003e\u003cstrong\u003e★), while the horizontal dashed lines represent standardized residual limits (±3). Yellow circles represent training-set samples and blue circles represent external prediction-set samples.\u003c/strong\u003e\u003c/p\u003e","description":"","filename":"floatimage7.png","url":"https://assets-eu.researchsquare.com/files/rs-9203836/v1/17ca83e9307211444b14801d.png"},{"id":107124167,"identity":"2f9775ca-32d4-42f6-9587-c905c0faf36e","added_by":"auto","created_at":"2026-04-17 05:28:55","extension":"png","order_by":8,"title":"Figure 8","display":"","copyAsset":false,"role":"figure","size":222415,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eY-scrambling validation results for the honey bee toxicity QSAR model. 
The coefficients of determination (\u003c/strong\u003e\u003cem\u003e\u003cstrong\u003eR²\u003c/strong\u003e\u003c/em\u003e\u003cstrong\u003e) and cross-validated coefficients (\u003c/strong\u003e\u003cem\u003e\u003cstrong\u003eQ²\u003c/strong\u003e\u003c/em\u003e\u003cstrong\u003e) of models generated from randomized response variables are plotted against the correlation coefficient (\u003c/strong\u003e\u003cem\u003e\u003cstrong\u003eK\u003c/strong\u003e\u003c/em\u003e\u003cstrong\u003exy) between original and permuted responses.\u003c/strong\u003e\u003c/p\u003e","description":"","filename":"floatimage8.png","url":"https://assets-eu.researchsquare.com/files/rs-9203836/v1/e05484e6f4319024614a64df.png"},{"id":107124168,"identity":"a5a4f10d-cc2a-4e06-baf5-061846f3ebdc","added_by":"auto","created_at":"2026-04-17 05:28:55","extension":"png","order_by":9,"title":"Figure 9","display":"","copyAsset":false,"role":"figure","size":70283,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eRelative importance of molecular descriptors in the honey bee toxicity QSAR model based on standardized regression coefficients. 
Bars represent the magnitude of standardized coefficients, indicating the contribution of each descriptor to the predicted toxicity endpoint.\u003c/strong\u003e\u003c/p\u003e","description":"","filename":"floatimage9.png","url":"https://assets-eu.researchsquare.com/files/rs-9203836/v1/3d92f8706966021cef1eba96.png"},{"id":109204452,"identity":"7f3bfb1f-5ca3-4d0f-9859-1aeb6bd5315a","added_by":"auto","created_at":"2026-05-13 15:00:13","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1545593,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-9203836/v1/d46a3d05-51c8-4a85-8d53-983e8ce4dfb9.pdf"},{"id":107124164,"identity":"7b96f716-b539-494b-8762-d8c75a503142","added_by":"auto","created_at":"2026-04-17 05:28:55","extension":"xlsx","order_by":5,"title":"","display":"","copyAsset":false,"role":"supplement","size":25367,"visible":true,"origin":"","legend":"","description":"","filename":"Supplementarymaterialfinal.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-9203836/v1/db459800460df524578c3442.xlsx"}],"financialInterests":"","formattedTitle":"A dual-objective QSAR framework integrating quantum reactivity and structural descriptors for predicting antifungal activity and honey bee toxicity of pesticide mixtures","fulltext":[{"header":"1. Introduction","content":"\u003cp\u003eModern crop protection is increasingly shaped by a dual imperative: maintaining effective control of phytopathogens while minimizing unintended harm to beneficial non-target organisms. This tension is particularly evident in the context of fungal disease management and pollinator protection. 
\u003cem\u003eMacrophomina phaseolina\u003c/em\u003e is a globally distributed, soil-borne pathogen with a remarkably broad host range and strong persistence under field conditions, and disease severity is typically amplified under hot and water-limited environments, making management especially challenging in drought-prone production systems (Shirai and Eulgem 2023). At the same time, pesticide exposure is now recognized as an important driver of adverse effects on non-target organisms across biological scales, including both managed and wild bees, while landscape-scale evidence indicates that pesticide use can alter bee occurrence, colony performance, and community structure over large spatial extents (Raine and Rundl\u0026ouml;f 2024; Nicholson et al. 2024).\u003c/p\u003e \u003cp\u003eOne practical response to declining fungicide efficacy and increasing resistance pressure has been the use of pesticide mixtures (De Mio et al. \u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e2024\u003c/span\u003e; Ballu et al. 2024). In crop protection, mixtures are commonly deployed to broaden activity spectra, improve consistency of disease control, and delay resistance evolution. However, the same mixture-based strategy that may improve efficacy against target fungi also complicates ecological safety assessment, because pollinators are rarely exposed to single active ingredients in isolation (Topping et al. 2024). For bees, pesticide mixtures can generate additive, antagonistic, or synergistic effects, and some of the most concerning interactions involve fungicide-associated disruption of detoxification pathways, particularly cytochrome P450 (CYP)-mediated processes (Migdał et al. \u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e2024\u003c/span\u003e; Rondeau and Raine \u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e2024\u003c/span\u003e). 
Recent evidence further suggests that concentration addition provides a robust baseline expectation for many acute bee-mixture datasets, whereas deviations from additivity, although less frequent, are mechanistically informative and therefore highly relevant for risk interpretation (Schuhmann et al. \u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e2022\u003c/span\u003e; Taenzler et al. \u003cspan citationid=\"CR46\" class=\"CitationRef\"\u003e2023\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eThese parallel challenges expose a major limitation of conventional experimental efficacy and toxicology testing: the combinatorial chemical space expands much faster than it can be evaluated empirically (Komprda et al. 2026). This problem is especially acute for binary and higher-order mixtures, where exhaustive laboratory testing rapidly becomes impractical (Wang et al. \u003cspan citationid=\"CR50\" class=\"CitationRef\"\u003e2018\u003c/span\u003e). Quantitative structure\u0026ndash;activity relationship (QSAR) modeling therefore offers an attractive route for prioritization, screening, and mechanistic interpretation. In the regulatory context, the Organisation for Economic Co-operation and Development (OECD) has long emphasized that credible QSAR models should be associated with a defined endpoint, an unambiguous algorithm, a defined applicability domain, appropriate measures of goodness-of-fit, robustness, and predictivity, and, where possible, a mechanistic interpretation. More recently, the second edition of the OECD (Q)SAR Assessment Framework (QAF) has strengthened guidance for evaluating model confidence, transparency, and regulatory usability (Framework \u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e2024\u003c/span\u003e; Gissi et al. 
\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e2024\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eWhile QSAR methods are now widely used for many single-compound endpoints, mixture-oriented applications remain relatively limited, especially for endpoints directly related to pesticide selectivity (Evangelista and Papa \u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e2025\u003c/span\u003e; Gustavsson et al. \u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e2023\u003c/span\u003e). Existing research has demonstrated that the acute contact toxicity of binary organic mixtures to \u003cem\u003eApis mellifera\u003c/em\u003e can be successfully modeled using QSAR methods, showing that toxicity prediction for mixtures affecting pollinators is feasible (Carnesecchi et al. \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e2020\u003c/span\u003e). Similarly, the antifungal activity of binary fungicide mixtures against \u003cem\u003eM. phaseolina\u003c/em\u003e has recently been explored using data-driven QSAR and read-across strategies (Rahimi-Soujeh et al. \u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e2024\u003c/span\u003e). However, a more generalized, selectivity-oriented framework, one that can directly compare the structural determinants of \"desired pesticide efficacy\" and \"undesired pollinator toxicity\", has not yet been fully developed (Devillers \u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e2014\u003c/span\u003e; Kar, Amin, and Li \u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e2025\u003c/span\u003e). If pesticide design is to move beyond simply optimizing potency and toward a more clearly defined potency-toxicity tradeoff at the molecular level, such a framework is needed (Guo et al. \u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e2025\u003c/span\u003e; Wang et al. \u003cspan citationid=\"CR49\" class=\"CitationRef\"\u003e2025\u003c/span\u003e). 
Although QSAR models have been successfully applied to predict individual endpoints such as antifungal efficacy or bee toxicity, these models are typically built independently and thus provide only limited insight into cross-endpoint relationships. In real-world agricultural contexts, the application of pesticide mixtures aims not only to maximize control over target pathogens but also, ideally, to minimize adverse effects on beneficial non-target organisms. However, the lack of an integrative approach makes it difficult for researchers to compare potency and toxicity on a common basis, especially when different endpoints use different sets of descriptors, modeling strategies, and datasets. Therefore, the descriptor hierarchy governing the potency-toxicity tradeoff remains insufficiently explored.\u003c/p\u003e \u003cp\u003eTo overcome this limitation, this study proposes a unified descriptor framework that allows cross-endpoint comparisons to be conducted within a consistent chemical characterization space. Two independent but conceptually interconnected QSAR models were established for binary pesticide mixtures: one describing antifungal activity against \u003cem\u003eMacrophomina phaseolina\u003c/em\u003e, and the other describing acute contact toxicity to \u003cem\u003eApis mellifera\u003c/em\u003e. Instead of forcing the two endpoints into a single response space, we modeled them separately and then compared them at the descriptor contribution level. To achieve this, we integrated quantum reactivity descriptors calculated using density functional theory (DFT) with a subset of structural descriptors calculated and screened using the PaDEL platform, and extended these descriptors to the mixture level through composition-weighted integration. This strategy aims to capture complementary information such as electronic structure, polarity, topology, and mixture composition within a single interpretable chemical information framework. 
The goal of this study is not to outperform existing endpoint-specific QSAR models, but to identify transferable descriptor patterns and mechanistic divergences to support selectivity-oriented pesticide mixture design (Sharma, Ranjan, and Chakraborty \u003cspan citationid=\"CR42\" class=\"CitationRef\"\u003e2024\u003c/span\u003e; Yap \u003cspan citationid=\"CR53\" class=\"CitationRef\"\u003e2011\u003c/span\u003e). Although numerous QSAR studies have reported high predictive performance using large descriptor pools and nonlinear machine learning approaches, such models are often optimized for single endpoints and may offer limited support for cross-endpoint interpretation. In pesticide mixture assessment, however, the key question is not only how accurately one endpoint can be predicted in isolation, but also how efficacy and non-target toxicity can be examined within a common analytical framework. For this reason, the present study adopts a unified descriptor strategy that prioritizes interpretability and cross-endpoint comparability over performance maximization alone. This design enables identification of descriptor-level patterns associated with efficacy\u0026ndash;toxicity trade-offs that are not readily accessible from independently optimized endpoint-specific models.\u003c/p\u003e"},{"header":"2. Materials and Methods","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e\n \u003ch2\u003e2.1 Data Sources and Study Design\u003c/h2\u003e\n \u003cp\u003eThis study used two independent but complementary datasets to establish a unified descriptor framework for pesticide mixtures. The first dataset described the antifungal activity of binary fungicide mixtures against the plant pathogenic fungus \u003cem\u003eMacrophomina phaseolina\u003c/em\u003e; the second dataset described the acute contact toxicity of organic binary mixtures to honey bees (\u003cem\u003eApis mellifera\u003c/em\u003e). 
Because the two datasets differed substantially in biological targets, endpoint definitions, and mixture design strategies, they were not merged into a single response matrix. Instead, this study treated them as complementary chemical spaces representing potency and non-target toxicity, respectively. QSAR models were built for each endpoint and then compared based on descriptor contributions and mechanistic trends. This design allows comparisons to be performed at the descriptor level, rather than directly at the compound or mixture level, thus providing a molecular characterization framework for assessing pesticide selectivity (Carnesecchi et al. \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e2020\u003c/span\u003e; Rahimi-Soujeh et al. \u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e2024\u003c/span\u003e).\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec4\" class=\"Section2\"\u003e\n \u003ch2\u003e2.2 Antifungal Activity Dataset\u003c/h2\u003e\n \u003cp\u003eThe antifungal dataset was drawn from previously published experimental studies evaluating the inhibitory activity of binary fungicide mixtures against \u003cem\u003eMacrophomina phaseolina\u003c/em\u003e. This pathogen is a soil- and seed-borne fungus of significant agricultural importance due to its wide host range and high environmental persistence, infecting a variety of economically valuable crops and causing severe yield reductions (Shirai and Eulgem 2023; Marquez et al. \u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e2021\u003c/span\u003e). The original study investigated six widely used fungicides: propiconazole, difenoconazole, tebuconazole, thiophanate-methyl, iprodione, and kresoxim-methyl. These fungicides were combined pairwise to produce fifteen binary mixture systems. The mixtures were evaluated using a fixed-ratio ray design (FRRD), in which each binary system was tested along fixed-ratio mixture rays in the bioactivity assays (Rahimi-Soujeh et al. 
\u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e2024\u003c/span\u003e). Antifungal activity was determined using the poisoned food technique. Different concentrations of a single fungicide or a mixture were added to potato dextrose agar (PDA), and 6 mm mycelial plugs taken from 5-day-old \u003cem\u003eMacrophomina phaseolina\u003c/em\u003e colonies were placed in the center of each Petri dish. After incubation at 25\u0026deg;C for 7 days, the colony diameter was measured and the percentage of mycelial growth inhibition was calculated. The dose-response curve was then fitted using nonlinear regression to determine the half-maximal effective concentration (EC₅₀, mg L⁻\u0026sup1;) of each mixture. To facilitate subsequent modeling, EC₅₀ values were converted to their negative logarithmic form (pEC₅₀ = \u0026minus;log EC₅₀) to reduce skewness and improve regression stability (Rahimi-Soujeh et al. \u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e2024\u003c/span\u003e).\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec5\" class=\"Section2\"\u003e\n \u003ch2\u003e2.3 Honey bee Toxicity Dataset\u003c/h2\u003e\n \u003cp\u003eThe second dataset consists of acute contact toxicity data for organic binary mixtures tested on honey bees (\u003cem\u003eApis mellifera\u003c/em\u003e). These data were taken from the EFSA Mixture Toxicity Database, which integrates multiple laboratory studies reporting the combined toxicity of pesticide mixtures under acute contact exposure conditions (Carnesecchi et al. \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e2020\u003c/span\u003e; Authority et al. 2023). The toxicity endpoint used in this dataset is the median lethal dose (LD₅₀-mix), defined as the mixture dose that causes 50% mortality 24 hours after exposure. 
To improve the statistical distribution of the response variable in regression modeling, the LD₅₀-mix value was converted to its negative logarithmic form (pLD₅₀-mix) and used as the response variable in the bee toxicity model. To ensure comparability across datasets, all chemical structures were standardized before calculating the descriptors, and both biological endpoints were represented in logarithmic form (Carnesecchi et al. \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e2020\u003c/span\u003e).\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec6\" class=\"Section2\"\u003e\n \u003ch2\u003e2.4 Quantum Chemical Calculations and QRD Descriptor Derivation\u003c/h2\u003e\n \u003cp\u003eQuantum chemical calculations were performed to establish a set of descriptors characterizing electronic structure, reactivity, electrostatics, and solvation behavior. All calculations were performed using Gaussian 16 (Revision C.01) (Frisch et al. \u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e2016\u003c/span\u003e). Initial molecular structures were taken from the PubChem 3D SDF archive and geometry-optimized at the M06-2X/6-31+G(d,p) level of theory. Solvent effects were included using the SMD continuum solvation model with water as the solvent (Zhao and Truhlar 2008; Marenich, Cramer, and Truhlar \u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e2009\u003c/span\u003e). Vibrational frequencies were calculated at the same level of theory to confirm that each optimized structure was a true minimum and to obtain the thermal correction to the Gibbs free energy at 298.15 K. 
A standard state correction of RT ln(24.46) (1.89 kcal mol⁻\u0026sup1;) was added to convert from the 1 atm standard state to the 1 M standard state.\u003c/p\u003e\n \u003cp\u003eThe ionization potential (IP) and electron affinity (EA) were calculated using the \u0026Delta;SCF method, expressed as:\u003c/p\u003e\n \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u0026nbsp;\u003cspan class=\"mathinline\"\u003e\\(IP=E(N-1)-E(N),\\quad EA=E(N)-E(N+1)\\)\u003c/span\u003e\u0026nbsp;\u003c/span\u003e\u003c/p\u003e\n \u003cp\u003ewhere E(N), E(N\u0026minus;1), and E(N+1) represent the total electronic energies of the neutral molecule, cation, and anion, respectively. Based on these quantities, conceptual density functional theory (cDFT) descriptors can be further derived (Geerlings et al. 2020; Yang and Parr \u003cspan citationid=\"CR52\" class=\"CitationRef\"\u003e1985\u003c/span\u003e). The chemical potential (\u0026micro;), global hardness (\u0026eta;), and global softness (S) are calculated as follows:\u003c/p\u003e\n \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u0026nbsp;\u003cspan class=\"mathinline\"\u003e\\(\\mu=-\\frac{IP+EA}{2},\\quad\\eta=IP-EA,\\quad S=\\frac{1}{\\eta}\\)\u003c/span\u003e\u0026nbsp;\u003c/span\u003e\u003c/p\u003e\n \u003cp\u003eThese descriptors characterize the overall electron acceptance and donation tendency of a molecule, as well as its resistance to charge transfer.\u003c/p\u003e\n \u003cp\u003eTo further resolve directional reactivity, the chemical potential is decomposed into acceptor and donor components (\u0026micro;⁺, \u0026micro;⁻), representing electron acceptance and electron donation capabilities, respectively (G\u0026aacute;zquez, Cedillo, and Vela \u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e2007\u003c/span\u003e; Chattaraj, Maiti, and Sarkar 2003). 
Local reactivity is described using the Fukui function and condensed local softness indices (s⁺, s⁻), which quantify the susceptibility of specific molecular sites to nucleophilic or electrophilic attack (Geerlings et al. 2020; Sharma, Ranjan, and Chakraborty \u003cspan citationid=\"CR42\" class=\"CitationRef\"\u003e2024\u003c/span\u003e).\u003c/p\u003e\n \u003cp\u003eElectrostatic properties were described by the molecular electrostatic potential:\u003c/p\u003e\n \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u0026nbsp;\u003cspan class=\"mathinline\"\u003e\\(V(\\mathbf{r})=\\sum_{A}\\frac{Z_{A}}{\\left|\\mathbf{r}-\\mathbf{R}_{A}\\right|}-\\int\\frac{\\rho(\\mathbf{r}')}{\\left|\\mathbf{r}-\\mathbf{r}'\\right|}\\,d\\mathbf{r}'\\)\u003c/span\u003e\u0026nbsp;\u003c/span\u003e\u003c/p\u003e\n \u003cp\u003ewith the maximum surface potential (V\u003csub\u003emax\u003c/sub\u003e) characterizing the most electrophilic region (Politzer, Murray, and Concha \u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e2002\u003c/span\u003e). Solvation and size effects are represented by the solvation Gibbs free energy (\u0026Delta;G) and molecular volume (MV). These quantities collectively constitute the quantum reactivity descriptor (QRD) set, providing an electronically meaningful descriptor layer for a unified framework.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec7\" class=\"Section2\"\u003e\n \u003ch2\u003e2.5 Structural Descriptor Calculation and CaPS Screening\u003c/h2\u003e\n \u003cp\u003eTwo-dimensional molecular descriptors were calculated using PaDEL software (Yap \u003cspan citationid=\"CR53\" class=\"CitationRef\"\u003e2011\u003c/span\u003e). Zero-variance descriptors were removed, and redundancy was reduced by excluding highly correlated variables (|r| \u0026ge; 0.90). 
Multicollinearity was further controlled using the variance inflation factor criterion (VIF\u0026thinsp;\u0026le;\u0026thinsp;5) (Gramatica \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e2007\u003c/span\u003e; Gramatica et al. \u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e2013\u003c/span\u003e; Pratim Roy et al. 2009). The final screened PaDEL subset (CaPS) was designed to retain descriptors with interpretable structural and physicochemical meaning, including those related to lipophilicity, polar surface features, molecular size, electron distribution, polarizability, topological autocorrelation, and hydrogen bond acceptor ability. In this research framework, CaPS descriptors provide a structural layer for molecular characterization and complement the electronic reactivity information captured by QRD.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec8\" class=\"Section2\"\u003e\n \u003ch2\u003e2.6 Mixture Descriptor Calculation\u003c/h2\u003e\n \u003cp\u003eCompound-level descriptors were converted to mixture-level descriptors using a composition-weighted representation. For multi-component mixtures, the mixture descriptor value is calculated as follows:\u003c/p\u003e\n \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u0026nbsp;\u003cspan class=\"mathinline\"\u003e\\(D_{mix}=\\sum_{i}x_{i}D_{i}\\)\u003c/span\u003e\u0026nbsp;\u003c/span\u003e\u003c/p\u003e\n \u003cp\u003ewhere x\u003csub\u003ei\u003c/sub\u003e is the mole fraction of the i-th component and D\u003csub\u003ei\u003c/sub\u003e is the descriptor value of that component. This formula is applied consistently to both the structural descriptors (CaPS) and the quantum descriptors (QRD). This composition-weighted strategy provides a first-order approximation of the complex chemical space of binary mixtures, allowing both datasets to be encoded within the same descriptor framework (Carnesecchi et al. \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e2020\u003c/span\u003e; Rahimi-Soujeh et al. 
\u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e2024\u003c/span\u003e; Taenzler et al. \u003cspan citationid=\"CR46\" class=\"CitationRef\"\u003e2023\u003c/span\u003e).\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec9\" class=\"Section2\"\u003e\n \u003ch2\u003e2.7 Data Set Splitting\u003c/h2\u003e\n \u003cp\u003eTo ensure comparability across endpoints, both datasets were split into a training set and an external prediction set using the same strategy. Seventy percent of the samples were allocated to the training set for model building, and the remaining 30% were used for external validation. This procedure is intended to preserve a representative distribution of chemical diversity and biological activity and to reduce the risk of overfitting (Gramatica \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e2007\u003c/span\u003e; Gramatica et al. \u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e2013\u003c/span\u003e; Pratim Roy et al. 2009).\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec10\" class=\"Section2\"\u003e\n \u003ch2\u003e2.8 QSAR Model Building\u003c/h2\u003e\n \u003cp\u003eThe QSAR models were built using multiple linear regression (MLR). The response variable for the antifungal model was pEC₅₀, while the response variable for the bee toxicity model was pLD₅₀-mix. Descriptor selection was performed using a genetic algorithm (GA), which searched for appropriate descriptor combinations from a unified QRD\u0026thinsp;+\u0026thinsp;CaPS descriptor pool. In this study, both endpoints employed the same GA-MLR workflow. The importance of this approach lies not in the novelty of GA-MLR itself, but in the fact that methodological consistency facilitates descriptor-level comparison while minimizing the impact of differences in model architecture (Gramatica \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e2007\u003c/span\u003e; Gramatica et al. \u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e2013\u003c/span\u003e; Pratim Roy et al. 
2009).\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec11\" class=\"Section2\"\u003e\n \u003ch2\u003e2.9 Model Validation\u003c/h2\u003e\n \u003cp\u003eModel robustness was assessed using standard internal and external validation metrics. Internal validation employed leave-one-out cross-validation (LOO-CV), with the cross-validated coefficient of determination (Q\u0026sup2;\u003csub\u003eLOO\u003c/sub\u003e) as the primary measure of internal predictive power. External validation used independent prediction sets, evaluating predictive performance using the external coefficient of determination (R\u0026sup2;\u003csub\u003eext\u003c/sub\u003e), root mean square error (RMSE), and concordance correlation coefficient (CCC) (Gramatica \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e2007\u003c/span\u003e; Gramatica et al. \u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e2013\u003c/span\u003e; Pratim Roy et al. 2009). Within this research framework, the validation objective is to establish the reliability of each model as a basis for cross-endpoint comparisons, rather than to maximize the predictive performance for a single endpoint.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec12\" class=\"Section2\"\u003e\n \u003ch2\u003e2.10 Y-scrambling Test\u003c/h2\u003e\n \u003cp\u003eThe Y-scrambling test (response value permutation test) is used to evaluate whether the observed descriptor-response relationships are likely to arise from chance correlations. In this procedure, response values are randomly permuted, while the descriptor matrix remains unchanged. 
If the relationships in the original model are meaningful, models built from the permuted data should have substantially lower R\u0026sup2; and Q\u0026sup2; values than the original model (R\u0026uuml;cker, R\u0026uuml;cker, and Meringer 2007).\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec13\" class=\"Section2\"\u003e\n \u003ch2\u003e2.11 Applicability Domain\u003c/h2\u003e\n \u003cp\u003eThe applicability domain (AD) of each model is evaluated using the Williams plot method. The leverage threshold is defined as follows:\u003c/p\u003e\n \u003cp\u003eh* = 3(p\u0026thinsp;+\u0026thinsp;1) / n\u003c/p\u003e\n \u003cp\u003ewhere p represents the number of descriptors and n represents the number of training samples. Observations with leverage values exceeding this threshold are considered influential, while those with standardized residuals exceeding\u0026thinsp;\u0026plusmn;\u0026thinsp;3 are considered potential outliers (Gramatica \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e2007\u003c/span\u003e; Gramatica et al. \u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e2013\u003c/span\u003e; Pratim Roy et al. 2009). The purpose of including the applicability domain analysis is to ensure that the descriptor-level comparison is performed within the structural-chemical space represented by the training data.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec14\" class=\"Section2\"\u003e\n \u003ch2\u003e2.12 Comparative Descriptor Analysis\u003c/h2\u003e\n \u003cp\u003eTo identify the structural determinants of pesticide selectivity, this study compared the descriptor contributions in the antifungal efficacy and bee toxicity models using standardized regression coefficients. Since both endpoints are represented in the same QRD\u0026thinsp;+\u0026thinsp;CaPS descriptor space and modeled using the same workflow, this analysis allows for a hierarchical comparison of efficacy and toxicity drivers. 
The focus of this comparison is not a direct one-to-one ranking of mixtures, but rather the identification of generalizable structural and electronic patterns related to the efficacy-toxicity tradeoff in pesticide design (Carnesecchi et al. \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e2020\u003c/span\u003e; Rahimi-Soujeh et al. \u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e2024\u003c/span\u003e).\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec15\" class=\"Section2\"\u003e\n \u003ch2\u003e2.13 QSAR Validation Criteria\u003c/h2\u003e\n \u003cp\u003eModel performance was evaluated according to widely accepted QSAR validation criteria. Model fit was measured by the coefficient of determination (R\u0026sup2;), while internal predictive power was evaluated by Q\u0026sup2;\u003csub\u003eLOO\u003c/sub\u003e. A model with R\u0026sup2; \u0026gt; 0.6 and Q\u0026sup2;\u003csub\u003eLOO\u003c/sub\u003e \u0026gt; 0.5 was considered to have acceptable goodness of fit and robustness. External predictive performance was evaluated by R\u0026sup2;\u003csub\u003eext\u003c/sub\u003e, RMSE, and CCC (Gramatica \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e2007\u003c/span\u003e; Gramatica et al. \u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e2013\u003c/span\u003e; Pratim Roy et al. 2009). The overall modeling process followed the OECD principles for QSAR validation: a defined endpoint, an unambiguous algorithm, a defined domain of applicability, appropriate measures of goodness of fit, robustness, and predictivity, and a mechanistic interpretation where possible (Framework \u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e2024\u003c/span\u003e; OECD \u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e2007\u003c/span\u003e). 
In this study, these criteria were used to establish the credibility of the model as an interpretable comparison tool within a unified descriptor framework.\u003c/p\u003e\n \u003cp\u003eIt should be emphasized that the antifungal efficacy and honey bee toxicity datasets used in this study are independent and are not directly matched at the mixture level. Accordingly, the purpose of the present framework is not to perform one-to-one comparison of identical mixtures across endpoints. Instead, both datasets were encoded within the same descriptor space and modeled under the same analytical workflow, allowing comparison at the level of molecular representation. Under this design, differences in descriptor contributions are interpreted as endpoint-related patterns within a unified chemoinformatic framework rather than as artifacts arising from inconsistent data treatment. The focus of the present study therefore lies in identifying transferable structure\u0026ndash;property relationships relevant to selectivity, rather than ranking the same mixtures across both endpoints.\u003c/p\u003e\n\u003c/div\u003e"},{"header":"3. Results","content":"\u003cdiv id=\"Sec17\" class=\"Section2\"\u003e \u003ch2\u003e3.1 Basic Modeling in a Unified Descriptor Space\u003c/h2\u003e \u003cp\u003eTwo endpoint-specific models were built using the same descriptor language, combining quantum reactivity descriptors (QRDs) and screened structural descriptors (CaPS), and employing a consistent GA-MLR workflow. 
The purpose of this design was not to compete with previously published single-endpoint models optimized for maximum prediction accuracy, but rather to provide a common analytical space for comparing antifungal efficacy and bee toxicity at the descriptor level.\u003c/p\u003e \u003cp\u003eFor the antifungal endpoint, the resulting model met generally accepted validation criteria and demonstrated stable internal and external predictive capabilities (Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e). The training-set statistics were R\u0026sup2;\u003csub\u003eadj\u003c/sub\u003e = 0.6387, RMSE\u003csub\u003etr\u003c/sub\u003e = 0.2054, MAE\u003csub\u003etr\u003c/sub\u003e = 0.1723, and CCC\u003csub\u003etr\u003c/sub\u003e = 0.8048. The low lack-of-fit value (LOF\u0026thinsp;=\u0026thinsp;0.0641) and collinearity index (K\u003csub\u003exx\u003c/sub\u003e = 0.4115) further support the stability of the model structure. Internal validation yielded Q\u0026sup2;\u003csub\u003eLOO\u003c/sub\u003e = 0.5743 and Q\u0026sup2;\u003csub\u003eLMO\u003c/sub\u003e = 0.5436, with RMSE\u003csub\u003ecv\u003c/sub\u003e = 0.2345 and MAE\u003csub\u003ecv\u003c/sub\u003e = 0.1964, indicating that the selected descriptor space supports consistent predictions across the resampled subsets. External validation showed R\u0026sup2;\u003csub\u003eext\u003c/sub\u003e = 0.6999, RMSE\u003csub\u003eext\u003c/sub\u003e = 0.1908, MAE\u003csub\u003eext\u003c/sub\u003e = 0.1524, and CCC\u003csub\u003eext\u003c/sub\u003e = 0.8308. 
The external prediction statistics (Q\u0026sup2;\u003csub\u003eF1\u003c/sub\u003e = 0.6974, Q\u0026sup2;\u003csub\u003eF2\u003c/sub\u003e = 0.6953, Q\u0026sup2;\u003csub\u003eF3\u003c/sub\u003e = 0.7183, and r\u0026sup2;\u003csub\u003em (average)\u003c/sub\u003e\u0026thinsp;=\u0026thinsp;0.5986, with Δr\u0026sup2;\u003csub\u003em\u003c/sub\u003e = 0.1574) all fall within the generally accepted threshold ranges for predictive QSAR models (Gramatica \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e2007\u003c/span\u003e; Pratim Roy et al. 2009).\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eStatistical parameters of the GA-MLR QSAR model for antifungal activity\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"2\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eParameter\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eValue\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eModel equation\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003epEC₅₀ = \u0026minus;407.9396\u0026thinsp;+\u0026thinsp;0.1918 MV\u0026thinsp;\u0026minus;\u0026thinsp;1201.9769 \u0026micro;⁻ + 0.2042 TopoPSA\u0026thinsp;+\u0026thinsp;156.2841 GATS2v\u0026thinsp;\u0026minus;\u0026thinsp;4.5894 nHBAcc\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e 
\u003cp\u003eDescriptors\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eMV (cm\u0026sup3; mol⁻\u0026sup1;), \u0026micro;⁻, TopoPSA, GATS2v, nHBAcc\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eR\u0026sup2; / R\u0026sup2;\u003csub\u003eadj\u003c/sub\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e0.6734 / 0.6387\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRMSE\u003csub\u003etr\u003c/sub\u003e / MAE\u003csub\u003etr\u003c/sub\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e0.2054 / 0.1723\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eQ\u0026sup2;\u003csub\u003eLOO\u003c/sub\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e0.5743\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eR\u0026sup2;\u003csub\u003eext\u003c/sub\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e0.6999\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRMSE\u003csub\u003eext\u003c/sub\u003e / MAE\u003csub\u003eext\u003c/sub\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e0.1908 / 0.1524\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCCC\u003csub\u003etr\u003c/sub\u003e / CCC\u003csub\u003eext\u003c/sub\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e0.8048 / 0.8308\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" 
colname=\"c1\"\u003e \u003cp\u003eQ\u0026sup2;\u003csub\u003eF1\u003c/sub\u003e / Q\u0026sup2;\u003csub\u003eF2\u003c/sub\u003e / Q\u0026sup2;\u003csub\u003eF3\u003c/sub\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e0.6974 / 0.6953 / 0.7183\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003er\u0026sup2;\u003csub\u003em\u003c/sub\u003e (avg) / Δr\u0026sup2;\u003csub\u003em\u003c/sub\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e0.5986 / 0.1574\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003ctfoot\u003e \u003ctr\u003e\u003ctd colspan=\"2\"\u003e\u003cb\u003eNotes.\u003c/b\u003e MV, molar volume; \u0026micro;⁻, directional chemical potential descriptors; TopoPSA, topological polar surface area; GATS2v, Geary autocorrelation descriptor (lag 2, weighted by van der Waals volume); nHBAcc, number of hydrogen bond acceptors; RMSE, root mean square error; MAE, mean absolute error; CCC, concordance correlation coefficient; LOO, leave-one-out cross validation.\u003c/td\u003e\u003c/tr\u003e \u003c/tfoot\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eSimilarly, the honey bee toxicity model also exhibited satisfactory internal and external performance, supporting its use as a benchmark for descriptor-level interpretation (Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e). Its adjusted coefficient of determination was R\u0026sup2;\u003csub\u003eadj\u003c/sub\u003e = 0.6427, and the external validation coefficient was R\u0026sup2;\u003csub\u003eext\u003c/sub\u003e = 0.7119. 
The fitting and prediction errors remained within acceptable ranges, with RMSE\u003csub\u003etr\u003c/sub\u003e = 0.3740, MAE\u003csub\u003etr\u003c/sub\u003e = 0.3082, RMSE\u003csub\u003eext\u003c/sub\u003e = 0.4031, and MAE\u003csub\u003eext\u003c/sub\u003e = 0.3500. Internal validation yielded Q\u0026sup2;\u003csub\u003eLOO\u003c/sub\u003e = 0.5884 and Q\u0026sup2;\u003csub\u003eLMO\u003c/sub\u003e = 0.5634, while the concordance correlation coefficients CCC\u003csub\u003etr\u003c/sub\u003e = 0.8084 and CCC\u003csub\u003eext\u003c/sub\u003e = 0.8350 indicate stable agreement between predicted and observed values. The external validation parameters Q\u0026sup2;\u003csub\u003eF1\u003c/sub\u003e = 0.7257, Q\u0026sup2;\u003csub\u003eF2\u003c/sub\u003e = 0.7052, Q\u0026sup2;\u003csub\u003eF3\u003c/sub\u003e = 0.6263, and r\u0026sup2;\u003csub\u003em (average)\u003c/sub\u003e\u0026thinsp;=\u0026thinsp;0.6265 (Δr\u0026sup2;\u003csub\u003em\u003c/sub\u003e = 0.1190) further support the reliability of the model within its defined range (Gramatica \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e2007\u003c/span\u003e; Pratim Roy et al. 
2009).\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eStatistical parameters of the GA-MLR QSAR model for honey bee toxicity\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"2\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eParameter\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eValue\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eModel equation\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003epLD₅₀ = \u0026minus;11.3109\u0026thinsp;\u0026minus;\u0026thinsp;4.6634 V\u003csub\u003emax\u003c/sub\u003e \u0026minus; 0.0946 ΔG\u0026thinsp;\u0026minus;\u0026thinsp;0.2007 s⁺ \u0026minus; 17.2695 (-\u0026micro;⁺)\u0026thinsp;\u0026minus;\u0026thinsp;11.0294 \u0026micro;⁻ \u0026minus; 0.0055 TopoPSA\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eDescriptors\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eV\u003csub\u003emax\u003c/sub\u003e, ΔG, s⁺, -\u0026micro;⁺, \u0026micro;⁻, TopoPSA\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eR\u0026sup2; / R\u0026sup2;\u003csub\u003eadj\u003c/sub\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e0.6784 / 0.6427\u003c/p\u003e \u003c/td\u003e 
\u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRMSE\u003csub\u003etr\u003c/sub\u003e / MAE\u003csub\u003etr\u003c/sub\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e0.3740 / 0.3082\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eQ\u0026sup2;\u003csub\u003eLOO\u003c/sub\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e0.5884\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eR\u0026sup2;\u003csub\u003eext\u003c/sub\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e0.7119\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRMSE\u003csub\u003eext\u003c/sub\u003e / MAE\u003csub\u003eext\u003c/sub\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e0.4031 / 0.3500\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCCC\u003csub\u003etr\u003c/sub\u003e / CCC\u003csub\u003eext\u003c/sub\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e0.8084 / 0.8350\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eQ\u0026sup2;\u003csub\u003eF1\u003c/sub\u003e / Q\u0026sup2;\u003csub\u003eF2\u003c/sub\u003e / Q\u0026sup2;\u003csub\u003eF3\u003c/sub\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e0.7257 / 0.7052 / 0.6263\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003er\u0026sup2;\u003csub\u003em\u003c/sub\u003e (avg) / Δr\u0026sup2;\u003csub\u003em\u003c/sub\u003e\u003c/p\u003e 
\u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e0.6265 / 0.1190\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003ctfoot\u003e \u003ctr\u003e\u003ctd colspan=\"2\"\u003e\u003cb\u003eNotes.\u003c/b\u003e V\u003csub\u003emax\u003c/sub\u003e, maximum surface electrostatic potential; ΔG, solvation Gibbs free energy; s⁺, electrophilic reactivity descriptor; \u0026micro;⁺/\u0026micro;⁻, directional chemical potential descriptors; TopoPSA, topological polar surface area; RMSE, root mean square error; MAE, mean absolute error; CCC, concordance correlation coefficient; LOO, leave-one-out cross validation.\u003c/td\u003e\u003c/tr\u003e \u003c/tfoot\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eThese results show that both models are robust enough to serve as interpretable baseline models within a unified descriptor framework. This level of performance is appropriate for the purposes of this study, which focuses on comparative interpretation across endpoints rather than on maximizing predictive performance for a single endpoint.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec18\" class=\"Section2\"\u003e \u003ch2\u003e3.2 Comparison of Descriptor Contributions Across Endpoints\u003c/h2\u003e \u003cp\u003eThe most important result of this modeling work is not the absolute predictive performance of either model, but rather the emergence of markedly different descriptor contribution patterns between the two endpoints. 
Since both datasets were encoded using the same QRD\u0026thinsp;+\u0026thinsp;CaPS descriptor pool and modeled under the same GA-MLR strategy, the differences in the selected descriptors can be interpreted as reflecting endpoint-specific mechanisms, rather than artifacts arising from data representation or model construction.\u003c/p\u003e \u003cp\u003eIn the antifungal model, descriptor contributions are distributed across both structural and quantum-derived variables. The electronic descriptor \u0026micro;⁻ shows the strongest contribution, while structural descriptors related to molecular size and polarity, including MV and TopoPSA, contribute positively. Furthermore, the selection of the topological descriptor GATS2v suggests that the spatial distribution of atomic properties within the molecular backbone plays an important role in regulating interactions with fungal targets. Overall, the antifungal endpoint appears to be governed by combined structural and electronic features, with molecular size, polarity, and topological organization all contributing to efficacy.\u003c/p\u003e \u003cp\u003eIn contrast, the bee toxicity model is dominated by electronic and electrostatic descriptors. The parameter\u0026thinsp;\u0026minus;\u0026thinsp;\u0026micro;⁺ shows the largest standardized contribution, followed by ΔG; V\u003csub\u003emax\u003c/sub\u003e, \u0026micro;⁻, s⁺, and TopoPSA make additional contributions. This model indicates a stronger association between bee toxicity and descriptors of charge-transfer behavior, electrophilic response, electrostatic distribution, and solvation-related effects. 
In particular, the negative contributions of polarity-related descriptors such as TopoPSA and s⁺ suggest that increased polarity may reduce a compound's ability to penetrate the bee's hydrophobic cuticle barrier, thereby reducing effective internal exposure.\u003c/p\u003e \u003cp\u003eConsequently, the cross-endpoint descriptor comparison associates antifungal efficacy with a set of features combining steric, polar, and topological properties, whereas bee toxicity is primarily controlled by descriptors characterizing electronic reactivity and environment-dependent reactivity.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec19\" class=\"Section2\"\u003e \u003ch2\u003e3.3 Descriptor Hierarchy Divergence and Selectivity Implications\u003c/h2\u003e \u003cp\u003eThe divergence between these two sets of descriptor features provides a mechanistic basis for discussing pesticide selectivity. Antifungal efficacy appears to be more strongly dependent on target-binding characteristics, including steric complementarity, molecular size, and the spatial arrangement of polar functional groups. In contrast, bee toxicity is strongly associated with electronic reactivity and physicochemical characteristics influencing bioavailability and receptor-level interactions.\u003c/p\u003e \u003cp\u003eOne particularly informative example is TopoPSA, which contributes positively in the antifungal model but negatively in the bee toxicity model. This opposite effect suggests that polar functional groups may enhance effective interactions with fungal targets while at the same time reducing a compound's ability to penetrate the hydrophobic cuticle barrier of bees. More broadly, descriptor comparisons show that features favorable to efficacy do not necessarily increase non-target toxicity. 
Instead, these two endpoints occupy partially separated regions in the descriptor space, thus creating the possibility of exploitable selectivity windows.\u003c/p\u003e \u003cp\u003eThis study should not be interpreted as a direct one-to-one comparison of mixtures from two datasets. Instead, it should be viewed as a descriptor-level analysis of the efficacy-toxicity tradeoff within a unified chemoinformatic framework. This distinction is important because the two datasets are biologically independent and not directly paired at the compound or formulation level. Therefore, the advantage of our approach lies in identifying transferable design rules, rather than ranking identical mixtures at two endpoints.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec20\" class=\"Section2\"\u003e \u003ch2\u003e3.4 Model Reliability and Applicability Domain\u003c/h2\u003e \u003cp\u003eResidual analysis, applicability domain assessment, and Y-scrambling tests collectively support the reliability of the two models as a comparison tool. For the antifungal model, the plot of observed versus predicted values shows that most of the training set and external samples are distributed close to the ideal diagonal (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e), indicating satisfactory agreement between experimental and predicted values. The angle between the regression line and the ideal diagonal is \u0026minus;\u0026thinsp;8.2015\u0026deg;, further supporting this agreement. The residuals are roughly symmetrically distributed around zero and do not exhibit a pronounced funnel-shaped pattern (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e), indicating the absence of major systematic bias or heteroscedasticity within the modeled activity range. 
Williams plot analysis showed that most training and prediction samples fell within the defined applicability domain, and no compounds exceeded the \u0026plusmn;\u0026thinsp;3 residual limit (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e), indicating that the model was built in a representative structural space. The statistics obtained from Y-scrambling (R\u0026sup2;\u003csub\u003eYscr\u003c/sub\u003e = 0.0958; Q\u0026sup2;\u003csub\u003eYscr\u003c/sub\u003e\u0026thinsp;=\u0026thinsp;\u0026minus;\u0026thinsp;0.1544) were substantially lower than those of the original model (R\u0026sup2; = 0.6734; Q\u0026sup2;\u003csub\u003eLOO\u003c/sub\u003e = 0.5743), indicating that the observed relationship is unlikely to originate from chance correlation (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e) (R\u0026uuml;cker, R\u0026uuml;cker, and Meringer 2007).\u003c/p\u003e\u003cp\u003eA similar pattern was observed in the bee toxicity model. The plot of observed versus predicted values (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e) was consistent with the statistical validation indices, while the residual plot (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e) showed a zero-centered random distribution, without pronounced systematic bias or error variance increasing across the toxicity range. Williams plot analysis (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003e) showed that most training and external samples were within the applicability domain, and no major outlier was observed. 
Y-scrambling again yielded substantially lower values (R\u0026sup2;\u003csub\u003eYscr\u003c/sub\u003e = 0.0982; Q\u0026sup2;\u003csub\u003eYscr\u003c/sub\u003e\u0026thinsp;=\u0026thinsp;\u0026minus;\u0026thinsp;0.1560) than the original model (R\u0026sup2; = 0.6784; Q\u0026sup2;\u003csub\u003eLOO\u003c/sub\u003e = 0.5884), supporting the existence of a genuine structure-toxicity relationship rather than a random statistical artifact (Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e8\u003c/span\u003e) (R\u0026uuml;cker, R\u0026uuml;cker, and Meringer 2007).\u003c/p\u003e\u003cp\u003eWhile these analyses cannot eliminate all uncertainties associated with mixture modeling, they do support using our models as stable and interpretable benchmarks for cross-endpoint descriptor comparisons. For the purposes of this study, this reliability matters more than maximizing predictive performance at the expense of transparency.\u003c/p\u003e \u003cp\u003eIn the present study, quantum reactivity descriptors (QRDs) are not interpreted as direct representations of biological mechanisms, but as chemically meaningful proxies for electronic properties that influence molecular interactions with biological systems. Descriptors related to electrophilicity, charge redistribution, electrostatic potential, and solvation provide indirect yet informative measures of how molecules respond to polar environments and participate in charge-transfer processes. Such properties are especially relevant for toxicity-related interactions, where electronic responsiveness can shape access to, and interaction with, sensitive biological targets. By integrating QRDs with curated structural descriptors, the present framework supports interpretation at the level of physicochemical behavior, thereby bridging the gap between purely statistical QSAR models and fully mechanistic biological explanations.\u003c/p\u003e \u003c/div\u003e"},{"header":"4. 
Discussion","content":"\u003cdiv id=\"Sec22\" class=\"Section2\"\u003e \u003ch2\u003e4.1 Mechanism of Antifungal Efficacy\u003c/h2\u003e \u003cp\u003eThe antifungal model shows that the inhibitory efficacy against \u003cem\u003eMacrophomina phaseolina\u003c/em\u003e is governed by a balance of multiple features involving steric factors, polarity, topology, and electronic properties, rather than by a single physicochemical parameter. The selected descriptors, namely molecular volume (MV), \u0026micro;⁻, TopoPSA, GATS2v, and nHBAcc, collectively define an efficacy profile in which molecular size, spatial organization, and controlled polarity appear to be the core factors determining fungal growth inhibition.\u003c/p\u003e \u003cp\u003eThe positive contribution of MV indicates that a larger molecular skeleton is beneficial for antifungal efficacy. This is consistent with the mechanisms of action of many systemic fungicides, particularly azoles. Triazole fungicides inhibit the cytochrome P450 enzyme lanosterol 14α-demethylase (CYP51), which is crucial for the biosynthesis of ergosterol in the fungal cell membrane (Rybak et al. 2024; Sun et al. \u003cspan citationid=\"CR45\" class=\"CitationRef\"\u003e2024\u003c/span\u003e). Increased molecular size is generally associated with a larger hydrophobic surface area, which may enhance van der Waals interactions at the enzyme's active site and facilitate partitioning into lipid-rich membrane environments. Previous structural studies have also shown that hydrophobic substituents contribute substantially to stabilizing the fungicide\u0026ndash;CYP51 complex and enhancing potency (Kelly and Kelly \u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e2013\u003c/span\u003e; Warrilow et al. 
\u003cspan citationid=\"CR51\" class=\"CitationRef\"\u003e2010\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eTopoPSA also shows a positive contribution to antifungal potency, indicating that controlled polarity facilitates effective target interactions. Polar surfaces do not simply enhance hydrophilicity but may support effective binding through heteroatom-containing functional groups while remaining compatible with membrane permeation. In triazole fungicides, these heteroatoms may coordinate with the heme iron of CYP51 or form hydrogen bonds with amino acid residues in the active site pocket (Hargrove et al. \u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e2017\u003c/span\u003e; Strushkevich, Usanov, and Park 2010). Similar relationships between TopoPSA and antifungal activity have been reported in QSAR studies of azole and strobilurin fungicides (Cherkasov et al. 2014).\u003c/p\u003e \u003cp\u003eThe positive contribution of GATS2v further points to the importance of topological organization in antifungal efficacy. This descriptor (a Geary autocorrelation of lag 2 weighted by van der Waals volumes) reflects how atomic volumes are distributed along the molecular backbone, thus capturing steric features related to ligand orientation and binding affinity. This interpretation is consistent with known fungicidal mechanisms, including the inhibitory effect of strobilurin-like compounds on the Qo site of the mitochondrial cytochrome bc₁ complex, where the positions of aromatic rings and hydrophobic substituents strongly influence activity (Bartlett et al. 2002).\u003c/p\u003e \u003cp\u003eIn contrast, the negative contribution of \u0026micro;⁻ suggests that excessive electron-donating properties may be detrimental to antifungal efficacy. As a cDFT-derived descriptor related to electron-donating ability, \u0026micro;⁻ reflects the tendency of a molecule to donate electron density in intermolecular interactions. 
Its negative contribution suggests that optimal antifungal activity may require a more balanced electronic profile, rather than simply a strong electron-donating ability. Similarly, the negative contribution of nHBAcc indicates that an excess of hydrogen bond acceptors may reduce efficacy, possibly because they raise molecular polarity beyond what efficient membrane transport permits.\u003c/p\u003e \u003cp\u003eThe antifungal endpoint can be interpreted as a \"structure-driven system modulated by electronic properties.\" High antifungal efficacy appears to require a molecular architecture with sufficient size, good topological organization, and moderate polarity, while avoiding excessive electron-donating properties or too many hydrogen bond acceptor functional groups. This interpretation is consistent with existing understanding of the structure-activity relationship of triazole and strobilurin fungicides, namely that optimal activity depends on a synergistic balance between the hydrophobic framework and appropriately configured heteroatoms (Bartlett et al. 2002; Copping and Duke 2007).\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec23\" class=\"Section2\"\u003e \u003ch2\u003e4.2 Mechanism Analysis of Honey Bee Toxicity\u003c/h2\u003e \u003cp\u003eCompared to the antifungal endpoint, the bee toxicity model is more clearly centered on electronic and electrostatic descriptors. This model shows that acute contact toxicity toward \u003cem\u003eApis mellifera\u003c/em\u003e is influenced not only by a molecule's overall physicochemical properties but also by how it interacts electronically with the biological target and how effectively it reaches that target after penetrating the cuticle.\u003c/p\u003e \u003cp\u003eThe negative contribution of V\u003csub\u003emax\u003c/sub\u003e indicates that molecules with highly localized positive electrostatic regions do not necessarily possess the highest toxicity. 
One possible explanation is that extreme localization of the positive potential may reduce its efficiency in interacting effectively with toxicity-related biological interfaces or alter the balance between transport and receptor binding (Nation Sr \u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e2022\u003c/span\u003e; Casida \u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e2018\u003c/span\u003e). The strong contribution of ΔG suggests that solvation-related stabilization is also a significant determinant of toxicity, implying that the molecule's electronic response to the surrounding medium may modulate its effective bioavailability and biointeraction potential.\u003c/p\u003e \u003cp\u003eThe negative contribution of s⁺ further supports that bee toxicity is shaped by a specific electronic response window, rather than a simple amplification of any single reactivity index. As a descriptor related to electrophilicity, s⁺ reflects the ease with which a molecule interacts with electron-rich sites in biomolecules. The negative correlation between polarity and toxicity suggests that lower electrophilic reactivity may limit interactions with toxic targets. Similarly, the contributions of\u0026thinsp;\u0026minus;\u0026thinsp;\u0026micro;⁺ and \u0026micro;⁻ indicate that directional electron acceptance and donation tendencies are important in defining the overall molecular interaction profile in the bee system.\u003c/p\u003e \u003cp\u003eTopoPSA again provides a particularly informative comparison. In the bee model, its negative contribution suggests that increased polarity may reduce passive transport across hydrophobic biological barriers, thereby decreasing effective internal exposure. 
This is consistent with general expectations: higher polarity molecules are less likely to penetrate the lipid-rich cuticle of bees and are therefore less likely to reach internal target sites at toxicologically significant concentrations (Tennekes and Sanchez-Bayo \u003cspan citationid=\"CR47\" class=\"CitationRef\"\u003e2011\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eThe honey bee toxicity model points to an endpoint \"regulated by electronic properties and limited by transport-related structural factors.\" Non-target toxicity is not simply determined by molecular size or general lipophilicity, but rather by more specific interactions between electron distribution, charge transfer behavior, solvation effects, and polarity-dependent permeation. Within this context, the QRD layer offers clear added value because it provides mechanistic explanations that are difficult to capture directly using traditional 2D structural descriptors.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec24\" class=\"Section2\"\u003e \u003ch2\u003e4.3 Implications of Cross-Endpoint Selective Pesticide Design\u003c/h2\u003e \u003cp\u003eThe core contribution of this study lies in the hierarchical comparison of descriptors between potency and toxicity. Because the two models are built on different but methodologically consistent datasets, this framework cannot identify \"safe mixtures\" through direct ranking. Instead, it identifies which molecular property domains are strongly associated with desired antifungal performance and which are strongly associated with unwanted pollinator toxicity.\u003c/p\u003e \u003cp\u003eFrom this perspective, the results of this study propose a practical design principle: features associated with 3D size, good topological organization, and controlled polarity may contribute to improved antifungal potency; while excessive electronic reactivity and electrostatic sensitivity may increase the likelihood of bee toxicity. 
Some descriptors, particularly polarity-related variables like TopoPSA, exhibit opposite effects at the two endpoints, indicating that improved selectivity cannot be achieved simply by maximizing activity. Instead, it requires adjusting structural features to better suit the fungal target while avoiding the formation of electronic feature profiles associated with bee damage.\u003c/p\u003e \u003cp\u003eThis shift from \"endpoint-specific prediction\" to \"cross-endpoint design logic\" is what distinguishes this study from previous single-endpoint QSAR work. Therefore, this research framework provides a foundation for selectivity scoring, prioritization, and hypothesis generation in future pesticide mixture design, especially after obtaining directly paired efficacy-toxicity datasets. In this sense, our results support a descriptor-level selectivity window where beneficial efficacy and reduced non-target harm do not necessarily have to change simultaneously.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec25\" class=\"Section2\"\u003e \u003ch2\u003e4.4 Relationship with Previous QSAR Research\u003c/h2\u003e \u003cp\u003ePrevious studies have achieved higher predictive performance at both the antifungal efficacy and bee toxicity endpoints through nonlinear machine learning, quasi-SMILES, q-RASAR, or other more flexible modeling frameworks. Regarding antifungal endpoints, Rahimi-Soujeh et al. reported strong nonlinear predictive capabilities for binary fungicide mixtures against \u003cem\u003eMacrophomina phaseolina\u003c/em\u003e, with their Support Vector Regression (SVR) and Gaussian Process Regression (GPR) models achieving high levels of fit and internal validation (Rahimi-Soujeh et al. \u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e2024\u003c/span\u003e). For honey bee toxicity, Carnesecchi et al. and later Chatterjee et al. 
demonstrated that machine learning models based on quasi-SMILES/CORAL and q-RASAR provide substantially stronger external predictive performance for organic binary mixtures (Carnesecchi et al. \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e2020\u003c/span\u003e; Chatterjee et al. \u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e2023\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eThis study does not attempt to numerically surpass these models. Instead, it addresses a different question: can a common descriptor language be used to compare the mechanistic drivers of pesticide efficacy and honey bee toxicity in two independent mixture datasets? In this respect, this study complements, rather than replaces, existing QSAR research. Previous antifungal and honey bee studies have demonstrated that each endpoint can be successfully modeled independently. The added value of this study lies in integrating QRD and CaPS descriptors into a single comparative framework, thereby enabling a descriptor-level interpretation of selectivity-related trade-offs.\u003c/p\u003e \u003cp\u003eConsequently, this study is more conservative in its predictive claims, but stronger in its mechanistic and design-oriented arguments. This emphasis on transparency, interpretability, a clearly defined applicability domain, and regulatory usability aligns with the current OECD (Q)SAR Assessment Framework (QAF) requirements for the reliability of computational models (Framework \u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e2024\u003c/span\u003e; Gissi et al. \u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e2024\u003c/span\u003e).\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec26\" class=\"Section2\"\u003e \u003ch2\u003e4.5 Research Limitations and Future Directions\u003c/h2\u003e \u003cp\u003eSeveral limitations of this study should be acknowledged. First, the two datasets are independent and not directly paired at the mixture level. 
Therefore, the current framework supports descriptor-level comparisons rather than direct cross-endpoint ranking of the same formulation. Second, the composition-weighted mixture representation only provides a first-order approximation of mixture behavior and does not explicitly address synergistic or antagonistic effects. Third, current model development is limited to a finite region within the pesticide chemistry space; therefore, extrapolation beyond the applicability domain should be approached with caution.\u003c/p\u003e \u003cp\u003eThese limitations also point to clear future directions. If paired potency-toxicity datasets can be established, selectivity models can be built directly at the formulation level. Interaction-aware mixture representations, hybrid descriptor/machine learning methods, and graph-based approaches may further improve predictive performance while preserving interpretability. Recent advancements in graph neural networks and the emergence of augmented datasets such as ApisTox also suggest that future models can achieve better generalization while maintaining mechanistic relevance (Adamczyk, Poziemski, and Siedlecki \u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e2025\u003c/span\u003e). Furthermore, more explicit uncertainty reporting and closer alignment with regulatory formats such as QPRF/QRRF will help enhance the practical value of this framework in early screening and decision support (Framework \u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e2024\u003c/span\u003e; Gissi et al. \u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e2024\u003c/span\u003e; Barber, Heghes, and Johnston \u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e2024\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eHowever, even in its current form, this study demonstrates that unified descriptor analysis can advance mixture QSAR from single-endpoint prediction to mechanism-based selectivity assessment. 
This shift is the most significant conceptual contribution of this study and provides a more useful basis for safer pesticide formulation design than endpoint prediction alone.\u003c/p\u003e \u003cp\u003eFrom an environmental and regulatory perspective, the ability to distinguish structural features associated with desired efficacy from those linked to non-target toxicity is essential for the development of safer pesticide formulations. The descriptor-level framework proposed here provides a transparent and interpretable basis for early-stage screening and prioritization of pesticide mixtures. Rather than replacing experimental testing, this approach offers a complementary tool for identifying selectivity trends and guiding rational design toward reduced ecological risk and improved formulation safety.\u003c/p\u003e \u003c/div\u003e"},{"header":"5. Conclusion","content":"\u003cp\u003eThis study proposes a unified descriptor framework for analyzing pesticide mixtures from two perspectives: antifungal efficacy and bee toxicity. By applying the same QRD\u0026thinsp;+\u0026thinsp;CaPS descriptor space and a consistent GA-MLR workflow to two independent datasets, this study establishes a consistent cross-endpoint comparative basis at the descriptor level, rather than at the directly paired mixture level.\u003c/p\u003e \u003cp\u003eThe results show that these two endpoints are governed by different but mechanistically related descriptor domains. Antifungal efficacy is primarily related to structural features such as molecular size, polarity, and topology, while bee toxicity is driven more by electronic reactivity, charge transfer behavior, solvation-related properties, and polarity-influenced transport limitations. 
This divergence suggests that efficacy and non-target toxicity do not necessarily increase together and points to selectivity windows at the descriptor level that can contribute to safer pesticide design.\u003c/p\u003e \u003cp\u003eConsequently, the models in this study should be considered a comparative and mechanistic benchmark, rather than predictive tools optimized for a single endpoint. The primary contribution of this study is not to surpass existing high-accuracy QSAR models, but rather to provide an interpretable and transferable framework for distinguishing between descriptor patterns related to target efficacy and those related to pollinator harm. In this sense, this study extends mixture QSAR from endpoint-specific predictions to mechanism-based selectivity assessments.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eAcknowledgments\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis research was supported by the Bureau of Animal and Plant Health Inspection and Quarantine (BAPHIQ), Council of Agriculture, which provided the financial resources necessary for undertaking this comprehensive study. We thank the anonymous reviewers for their constructive feedback, which greatly enhanced the quality and clarity of our manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis research received no external funding.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthors\u0026rsquo; Contributions\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eConceptualization: C.M.C. 
and T.-C.L.;\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eMethodology: C.M.C.;\u003c/p\u003e\n\u003cp\u003eData curation: C.M.C.;\u003c/p\u003e\n\u003cp\u003eFormal analysis: C.M.C.;\u003c/p\u003e\n\u003cp\u003eWriting \u0026ndash; original draft: C.M.C.;\u003c/p\u003e\n\u003cp\u003eWriting \u0026ndash; review \u0026amp; editing: T.-C.L.;\u003c/p\u003e\n\u003cp\u003eAll authors have read and approved the final manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eEthical Approval\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis is not applicable.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConsent to Participate\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis is not applicable.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConsent to Publish\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis is not applicable.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting Interests\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare no competing interests.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eData Availability Statement\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eAdamczyk J, Poziemski J, Siedlecki P (2025) ApisTox: a new benchmark dataset for the classification of small molecules toxicity on honey bees. Sci Data 12:5\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAuthority EF, Safety P, Adriaanse A, Arce A, Focks B, Ingels D, J\u0026ouml;lli S\u0026eacute;bastien Lambin, Maj Rundl\u0026ouml;f, Dirk S\u0026uuml;\u0026szlig;enbach, Monica Del Aguila, Valeria Ercolano, Franco Ferilli, Alessio Ippolito, Csaba Szentes, Franco Maria Neri, Laura Padovani, Agn\u0026egrave;s Rortais, Jacoba Wassenberg, and Domenica Auteri. 2023. 
'Revised guidance on the risk assessment of plant protection products on bees (Apis mellifera, Bombus spp. and solitary bees)'. EFSA J, 21: e07989\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBallu A, Ugazio C, Duplaix Cl\u0026eacute;mentine, Noly A, Wullschleger J, Stefano FF, Torriani A, D\u0026eacute;r\u0026eacute;dec Florence Carpentier, and Anne-Sophie Walker. 2024. 'Preventing multiple resistance above all: New insights for managing fungal adaptation'. Environ Microbiol, 26: e16614\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBarber C, Heghes C, Johnston L (2024) A framework to support the application of the OECD guidance documents on (Q)SAR model validation and prediction assessment for regulatory decisions. Comput Toxicol 30:100305\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBartlett DW, John M, Clough, Jeremy R, Godwin, Alison A, Hall Mick Hamer, and Bob Parr-Dobrzanski. 2002. 'The strobilurin fungicides'. Pest Manag Sci, 58: 649\u0026ndash;662\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCarnesecchi E, Toropov AA, Toropova AP, Kramer N, Svendsen C, Dorne JL, Emilio Benfenati (2020) Predicting acute contact toxicity of organic binary mixtures in honey bees (A. mellifera) through innovative QSAR models. Sci Total Environ 704:135302\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCasida JE (2018) Neonicotinoids and Other Insect Nicotinic Receptor Competitive Modulators: Progress and Prospects. Ann Rev Entomol 63:125\u0026ndash;144\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChattaraj P, Kumar B, Maiti, Sarkar U (2003) Philicity: A Unified Treatment of Chemical Reactivity and Selectivity. 
J Phys Chem A 107:4973\u0026ndash;4975\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChatterjee M, Banerjee A, Tosi S, Carnesecchi E, Benfenati E, and Kunal Roy (2023) Machine learning - based q-RASAR modeling to predict acute contact toxicity of binary organic pesticide mixtures in honey bees. J Hazard Mater 460:132358\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCherkasov A, Muratov EN, Fourches D, Varnek A, Baskin II, Cronin M, Dearden J, Gramatica P, Martin YC, Todeschini R, Consonni V, Victor E Kuz\u0026rsquo;min, Richard Cramer, Romualdo Benigni, Chihae Yang, James Rathman, Lothar Terfloth, Johann Gasteiger, Ann Richard, and Alexander Tropsha. 2014. 'QSAR Modeling: Where Have You Been? Where Are You Going To?'. J Med Chem, 57: 4977\u0026ndash;5010\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCopping LG, Stephen OD (2007) Natural products that have been used commercially as crop protection agents. Pest Manag Sci 63:524\u0026ndash;554\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDe Mio LL, May NA, Peres G, Schnabel, Ishii H (2024) A special isssue on fungicide resistance and management strategies. Trop Plant Pathol 49:1\u0026ndash;4\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDevillers J (2014) 'QSAR Modeling of Pesticide Toxicity to Bees.' in\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eEvangelista M, Papa E (2025) A Review of Quantitative Structure\u0026ndash;Activity Relationship (QSAR) Models to Predict Thyroid Hormone System Disruption by Chemical Substances. Toxics 13:799\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFramework SARA, Paris (2024) France\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFrisch MJ, ea GW, Trucks HB, Schlegel GE, Scuseria MA, Robb JR, Cheeseman G, Scalmani VPGA, Barone GA, Petersson, Nakatsuji HJRA (2016) Gaussian 16. In.: Gaussian, inc. 
Wallingford, CT\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eG\u0026aacute;zquez Jos\u0026eacute;, Cedillo, Vela A (2007) 'Electrodonating and Electroaccepting Powers', \u003cem\u003eThe journal of physical chemistry. A\u003c/em\u003e, 111: 1966-70\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGeerlings P, Chamorro E, Chattaraj PK, De Proft F, G\u0026aacute;zquez Jos\u0026eacute;L, Liu S Christophe Morell, Alejandro Toro-Labb\u0026eacute;, Alberto Vela, and Paul Ayers. 2020. 'Conceptual density functional theory: status, prospects, issues'. Theor Chem Acc, 139: 36\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGissi A, Tcheremenskaia O, Bossa C, Battistelli CL, and Patience Browne (2024) The OECD (Q)SAR Assessment Framework: A tool for increasing regulatory uptake of computational approaches. Comput Toxicol 31:100326\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGramatica P (2007) Principles of QSAR models validation: internal and external. QSAR Comb Sci 26:694\u0026ndash;701\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGramatica P, Chirico N, Papa E, Cassani S, Kovarich S (2013) QSARINS: A new software for the development, analysis, and validation of QSAR MLR models. J Comput Chem 34:2121\u0026ndash;2132\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGuo W, Song X, Gao Y, Yang S, Tang J, Zhao C, Wang H, Ren J, Zeng L, Hanhong Xu (2025) Exploring Insecticidal Molecules with Random Forest: Toward High Insecticidal Activity and Low Bee Toxicity. J Agric Food Chem 73:5573\u0026ndash;5584\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGustavsson M, Molander S, Backhaus T, Kristiansson E (2023) Risk assessment of chemicals and their mixtures are hindered by scarcity and inconsistencies between different environmental exposure limits. 
Environ Res 225:115372\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHargrove TY, Friggeri L, Wawrzak Z, Qi A, Hoekstra WJ, Schotzinger RJ, York JD, Peter F, Guengerich, Galina I, Lepesheva (2017) Structural analyses of Candida albicans sterol 14α-demethylase complexed with azole drugs address the molecular basis of azole-mediated inhibition of fungal sterol biosynthesis. J Biol Chem 292:6728\u0026ndash;6743\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKar S, Amin SA, Li L (2025) Unveiling chemical space, scaffold diversity, critical structural features of pesticides: A comprehensive QSAR, qRASAR, machine learning studies to predict pesticides toxicity. Sci Total Environ 1001:180489\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKelly SL, Kelly DE (2013) 'Microbial cytochromes P450: biodiversity and biotechnology. Where do cytochromes P450 come from, what do they do and what can they do for us?'. Philosophical Trans Royal Soc B: Biol Sci, 368\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKomprda Jiř\u0026iacute;, L\u0026ouml;rinczov\u0026aacute; Katar\u0026iacute;na, Toušov\u0026aacute; Z, Smutn\u0026aacute; M Soňa Smetanov\u0026aacute;, Kl\u0026aacute;ra Komprdov\u0026aacute;, and Kl\u0026aacute;ra Hilscherov\u0026aacute;. 2026. 'Methodological Challenges in the Application of QSAR Models for Chemical Prioritization and Toxicity Assessment: A Case Study on Aryl Hydrocarbon Receptor Activity in Environmental Pollutant Mixtures', \u003cem\u003eACS Environmental Au\u003c/em\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMarenich AV, Cramer CJ, Truhlar DG (2009) Universal Solvation Model Based on Solute Electron Density and on a Continuum Model of the Solvent Defined by the Bulk Dielectric Constant and Atomic Surface Tensions. 
J Phys Chem B 113:6378\u0026ndash;6396\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMarquez N, Giachero Mar\u0026iacute;aL, St\u0026eacute;phane, Declerck, Ducasse DA (2021) 'Macrophomina phaseolina: General Characteristics of Pathogenicity and Methods of Control', \u003cem\u003eFrontiers in Plant Science\u003c/em\u003e, Volume 12\u0026ndash;2021\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMigdał Paweł, Murawska A, Berbeć E, Zarębski K, Ratajczak N, Roman A, Krzysztof Latarowski (2024) Biochemical Indicators and Mortality in Honey Bee (Apis mellifera) Workers after Oral Exposure to Plant Protection Products and Their Mixtures. Agriculture 14:5\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eNation Sr JL (2022) Insect physiology and biochemistry. CRC\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eNicholson CC, Knapp J, Kiljanek T, Albrecht M, Chauzat M-P, Costa C, De la R\u0026uacute;a P, Klein A-M, Marika M\u0026auml;nd SG, Potts O, Schweiger I, Bottero E, Cini, Joachim R, de Miranda JC Stout, and Maj Rundl\u0026ouml;f. 2024. 'Pesticide use negatively affects bumble bees across European landscapes', \u003cem\u003eNature\u003c/em\u003e, 628: 355\u0026thinsp;\u0026ndash;\u0026thinsp;58\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eOecd DE (2007) 'Guidance document on the validation of (quantitative) structure-activity relationship [(Q) SAR] models', \u003cem\u003eOrgan. Econ. Co-operation Dev. Paris Fr\u003c/em\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePolitzer P, Murray JS, Concha MC (2002) The complementary roles of molecular surface electrostatic potentials and average local ionization energies with respect to electrophilic processes. Int J Quantum Chem 88:19\u0026ndash;27\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePratim Roy, Partha S, Paul I, Mitra, Kunal, Roy (2009) On Two Novel Parameters for Validation of Predictive QSAR Models. 
Molecules 14:1660\u0026ndash;1701\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRahimi-Soujeh Z, Safaie N, Moradi S, Abbod M, Sharifi R, Mojerlou S, Mokhtassi-Bidgoli A (2024) New binary mixtures of fungicides against Macrophomina phaseolina: Machine learning-driven QSAR, read-across prediction, and molecular dynamics simulation. Chemosphere 366:143533\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRaine NE, Maj, Rundl\u0026ouml;f (2024) Pesticide Exposure and Effects on Non-Apis Bees. Ann Rev Entomol 69:551\u0026ndash;576\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRondeau S, Raine NE (2024) 'Single and combined exposure to \u0026lsquo;bee safe\u0026rsquo; pesticides alter behaviour and offspring production in a ground-nesting solitary bee (Xenoglossa pruinosa)', \u003cem\u003eProceedings of the Royal Society B: Biological Sciences\u003c/em\u003e, 291\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eR\u0026uuml;cker C, R\u0026uuml;cker G, Markus Meringer (2007) y-Randomization and Its Variants in QSPR/QSAR. J Chem Inf Model 47:2345\u0026ndash;2357\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRybak JM, Xie J, Martin-Vicente A, Guruceaga X, Thorn HI, Nywening AV, Ge W, Souza ACO, Shetty AC, Carrie, McCracken VM, Bruno JE, Parker SL, Kelly HM, Snell CA, Cuomo PD, Rogers and Jarrod R. Fortwendel. 2024. 'A secondary mechanism of action for triazole antifungals in Aspergillus fumigatus mediated by hmg1', \u003cem\u003eNature Communications\u003c/em\u003e, 15: 3642\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSchuhmann A, Schmid AP, Manzer S, Schulte J, and Ricarda Scheiner (2022). 'Interaction of InsecticidesFungicides in Bees'. 
Front Insect Sci 1
Sharma P, Ranjan P, Chakraborty T (2024) Applications of conceptual density functional theory in reference to quantitative structure–activity / property relationship. Mol Phys 122:e2331620
Shirai M, Eulgem T (2023) Molecular interactions between the soilborne pathogenic fungus Macrophomina phaseolina and its host plants. Front Plant Sci 14
Strushkevich N, Usanov SA, Park H-W (2010) Structural Basis of Human CYP51 Inhibition by Antifungal Azoles. J Mol Biol 397:1067–1078
Sun Y, Liu R, Luo Z, Zhang J, Gao Z, Liu R, Liu N, Zhang H, Li K, Wu X, Yin W, Qin Q, Su X, Zhao D, Cheng M (2024) Identification of novel and potent triazoles targeting CYP51 for antifungal: Design, synthesis, and biological study. Eur J Med Chem 280:116942
Taenzler V, Weyers A, Maus C, Ebeling M, Levine S, Cabrera A, Schmehl D, Gao Z, Rodea-Palomares I (2023) Acute toxicity of pesticide mixtures to honey bees is generally additive, and well predicted by Concentration Addition. Sci Total Environ 857:159518
Tennekes HA, Sanchez-Bayo FP (2011) Time-dependent toxicity of neonicotinoids and other toxicants: implications for a new approach to risk assessment. J Environ Anal Toxicol
Topping C, Bednarska A, Benfenati E, Chetcuti J, Simon-Delso N, Duan X, Focks A, Laskowski R, Lombardo A, Marcussen L, Metodiev T, Rubinigg M, Rundlöf M, Sgolastra F, Stoyanova C, Sušanj G, Williams J, Ziolkowska E (2024) PollinERA: Understanding pesticide-Pollinator interactions to support EU Environmental Risk Assessment and policy. Res Ideas Outcomes 10
Wang C, Luo X, Zhang Y, Pu X, Liu J, Zhang S, Ye M, Li X (2025) Green Pesticide Research and Development Integrating Molecular Targets, Mechanisms, Resistance, and Innovation in Theory and Technology. J Agric Food Chem 73:32460–32489
Wang T, Tang L, Luan F, Cordeiro MNDS (2018) Prediction of the Toxicity of Binary Mixtures by QSAR Approach Using the Hypothetical Descriptors. Int J Mol Sci 19:3423
Warrilow AGS, Martel CM, Parker JE, Melo N, Lamb DC, Nes WD, Kelly DE, Kelly SL (2010) Azole binding properties of Candida albicans sterol 14-α demethylase (CaCYP51). Antimicrob Agents Chemother 54:4235–4245
Yang W, Parr RG (1985) Hardness, softness, and the Fukui function in the electronic theory of metals and catalysis. Proc Natl Acad Sci 82:6723–6726
Yap CW (2011) PaDEL-descriptor: An open source software to calculate molecular descriptors and fingerprints. J Comput Chem 32:1466–1474
Zhao Y, Truhlar DG (2008) The M06 suite of density functionals for main group thermochemistry, thermochemical kinetics, noncovalent interactions, excited states, and transition elements: two new functionals and systematic testing of four M06-class functionals and 12 other functionals. Theor Chem Acc 120:215–241

Keywords: Pesticide mixtures, Antifungal activity, Honey bee toxicity, Unified descriptor framework, Selectivity, Quantum reactivity descriptors

Posted: April 17th, 2026
