Morphological Analysis of Tumor Microenvironment in HER2-Positive Breast Cancer: Predicting Response to Neoadjuvant Chemotherapy on Histopathological Images

doi:10.21203/rs.3.rs-5786592/v1

2025 · doi:10.21203/rs.3.rs-5786592/v1

preprint OA: closed

Research Article

Wensheng Cui, Ming Fan, Lihua Li

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-5786592/v1
This work is licensed under a CC BY 4.0 License.
Status: Published. Journal publication published 21 Oct, 2025; the published version appears in Breast Cancer Research. This is Version 1, the latest preprint version.

Abstract

Background

Tumor microenvironment (TME) biomarkers derived from histopathological images of HER2+ breast cancer (HER2+BC) can effectively predict pathological complete response (pCR) following neoadjuvant chemotherapy (NAC), thereby enhancing patient prognosis.
In this study, we quantitatively assessed the morphological information of critical regions in the TME and analyzed their predictive potential for pCR.

Methods

The retrospective analysis included 147 HER2+BC patients treated with NAC, comprising 85 from the Yale Response dataset for training and 62 from the IMPRESS HER2+ dataset for external validation. First, VGG-16 and Xception networks were used to segment hematoxylin and eosin-stained histopathology images, generating tissue segmentation images (TS-images). Tumor and non-tumor regions were identified from the TS-images, and tumor-infiltrating lymphocytes (TILs) and non-tumor-infiltrating lymphocytes (non-TILs) were extracted from them, respectively. The morphological information of these regions was then quantified by measuring connected components. Feature selection was performed on the combined morphological and clinical information using the least absolute shrinkage and selection operator. Finally, the selected features were input into a multilayer perceptron for training and validated on an external test cohort.

Results

In external validation, models derived from non-TILs achieved an area under the curve (AUC) of 0.873 in predicting pCR, with F1 score, PPV, recall, and NPV of 0.889, 0.821, 0.970, and 0.933, respectively. This performance exceeded that of models trained on non-tumor (AUC = 0.779), tumor (AUC = 0.732), TILs (AUC = 0.594), and lymphocytes (AUC = 0.668). Furthermore, even when only 20% of the training samples were used, the model trained on non-TILs retained useful performance (AUC = 0.722). Univariate analyses of pCR revealed significant morphological features, such as the significance area filled mean for non-TILs (p value = 0.026) and the significance number for non-tumor (p value = 0.003).
Conclusion

The TME-based morphological information from histopathological images accurately predicts pCR, offering considerable potential for more precise patient stratification for NAC.

Keywords: Breast cancer, Deep learning, Histopathological images, Neoadjuvant chemotherapy, Tumor microenvironment

Background

HER2-positive breast cancer (HER2+BC) is a distinct subtype of breast cancer characterized by overexpression or amplification of human epidermal growth factor receptor 2, accounting for approximately 20–25% of breast cancer cases[1, 2]. Compared to HER2-negative breast cancer, HER2+BC exhibits a higher propensity for aggressiveness and recurrence[3–6]. Neoadjuvant chemotherapy (NAC) is currently considered an effective treatment option for patients with HER2+BC[7–9]. Pathological complete response (pCR) serves as one of the evaluation criteria used to assess the effectiveness of NAC[10]. Patients achieving pCR are expected to have a more favorable outcome than those with pathological noncomplete response (non-pCR)[11]. However, approximately 30% of HER2+BC patients do not attain pCR[12]. Consequently, predicting pCR before NAC in HER2+BC is crucial for guiding personalized treatment strategies and avoiding ineffective interventions. Currently, conventional clinical indicators such as tumor size[13], pathologic grade[14], Ki-67 proliferation index[15], and tumor-infiltrating lymphocytes (TILs)[16–18] lack precision in predicting pCR. To address this limitation, researchers have introduced molecular biomarker approaches, including CALGB 40601[19], SPAG5[20], and PD-L1[21]. However, these methods are costly and time-consuming.
Beyond clinical indicators and molecular biomarkers for pCR prediction, several studies have demonstrated the effectiveness of artificial intelligence (AI) in analyzing medical images to forecast responses to NAC. These applications extend to radiological images, encompassing MRI[ 22 , 23 ] and PET/CT[ 24 ]. In comparison to radiological images, histopathological images, considered the gold standard for disease diagnosis, offer extensive information on the tumor microenvironment (TME), providing insights into disease progression[ 25 , 26 ]. AI technology has been employed by researchers to characterize TME information in immunohistochemical (IHC)-stained breast cancer histopathological images[ 27 , 28 ]. While non-routine IHC staining methods (e.g., Ki-67 and PHH3) can capture high-value biomarkers within the TME, enhancing the predictive accuracy of pCR, they present challenges such as increased medical costs, extended assay times, and dependence on manual processes. Consequently, an increasing number of researchers are utilizing routine (e.g., hematoxylin and eosin, H&E) stained histopathological images to identify biomarkers within the TME. Fisher et al. extracted TME information from triple-negative breast cancer histopathology images, discovering that the interaction between tumor and TILs held significant predictive value for pCR[ 29 ]. Shen et al. employed AI to analyze histopathological images, extracting cell nuclei features as input for machine learning to predict pCR to NAC[ 30 ]. Li et al. introduced the tumor-associated stroma score to forecast the response of breast cancer patients to NAC[ 31 ]. Studies have demonstrated that morphological information of tissue in the TME correlates with the treatment response of NAC for breast cancer[ 32 , 33 ]. However, considering current research, there remains potential for further exploration of the TME based on histopathological images. 
In this study, we used deep learning techniques to segment H&E-stained core needle biopsy histopathological images of HER2+BC, generating tissue segmentation images (TS-images). From these TS-images, we extracted morphological information from various tissue components in the TME, including tumor, non-tumor, TILs, non-tumor-infiltrating lymphocytes (non-TILs), and lymphocytes, to predict pCR to NAC. The primary aim of this research is to support clinicians in patient selection, improve NAC treatment response rates, and contribute to the advancement of precision tumor therapy.

Materials and Methods

Datasets

A cohort of 147 patients who underwent NAC at Yale University and Purdue University was retrospectively analyzed. The model was developed on the Yale Response dataset from Yale University, while external validation was conducted on the IMPRESS HER2+ dataset from Purdue University to assess the model's generalization capabilities. Table 1 presents a summary of the clinical and histopathological characteristics of these patients.

Table 1 Clinical and histopathological characteristics of HER2+ patients in the two datasets.
Yale Response dataset

| Characteristics | Case#/Median | %/Range |
|---|---|---|
| Total case number | 85 | - |
| Cases with pCR | 36 | 42.35% |
| Cases with residual tumor | 49 | 57.65% |
| Estrogen receptor (ER) positive | 69 | 81.18% |
| Progesterone receptor (PR) positive | 66 | 77.65% |
| HER2/CEP17 ratio | 3.14 | 0.00–17.40 |
| Residual tumor size (cm) | 1.35 | 0.02–11.00 |
| HER2CN (signals/cell) | 11.0 | 3.3–6.7 |

IMPRESS HER2+ dataset

| Characteristics | Case#/Median | %/Range |
|---|---|---|
| Total case number | 62 | - |
| Cases with pCR | 38 | 61.29% |
| Cases with residual tumor | 24 | 38.71% |
| Estrogen receptor (ER) positive | 30 | 48.39% |
| Progesterone receptor (PR) positive | 19 | 30.65% |
| HER2/CEP17 ratio | 6.73 | 1.23–22.98 |
| Residual tumor size (cm) | 0.80 | 0.10–7.00 |
| Residual cancer burden | 1.39 | 0.91–4.14 |
| Age (years) | 56 | 30–76 |
| Nottingham Grade Ⅰ | 1 | 1.61% |
| Nottingham Grade Ⅱ | 27 | 43.55% |
| Nottingham Grade Ⅲ | 34 | 54.84% |
| Nuclear grade Ⅰ | 0 | 0.00% |
| Nuclear grade Ⅱ | 10 | 16.13% |
| Nuclear grade Ⅲ | 52 | 83.87% |

The Yale Response dataset comprises whole slide images (WSIs) of pretreatment core needle biopsies and associated clinical information from 85 female patients with HER2+BC[34]. All WSIs were scanned at 20× magnification using the Aperio ScanScope Console (v10.2.0.2352). Prior to surgery, each patient received treatment with trastuzumab ± pertuzumab. Pathologists evaluated the efficacy of NAC based on surgical resection pathology reports. They defined pCR as the absence of residual invasive, lymphovascular, or metastatic carcinoma (36 cases, 42.35%). Conversely, cases exhibiting any residual invasive, lymphovascular, or metastatic carcinoma were classified as non-pCR (49 cases, 57.65%).

The IMPRESS HER2+ dataset comprises H&E-stained WSIs of pretreatment core needle biopsies and associated clinical information from 62 patients with HER2+BC[27]. Histopathologic slides were scanned at 20× magnification using a Hamamatsu scanner. The majority of patients underwent NAC with Taxol (paclitaxel/docetaxel) and trastuzumab. A subset of 7 cases followed an alternative 4-cycle regimen of pertuzumab, trastuzumab, and docetaxel, with 3 cases not achieving pCR and 4 cases achieving pCR.
The definition of pCR followed the Yale University standard. The dataset exhibited a pCR rate of 61.29% (38 cases), while non-pCR cases constituted 38.71% (24 cases).

Overview of the framework

Our methodology comprises three primary components: histopathological image pre-processing, generation of TS-images and extraction of morphological information, and modeling for pCR prediction and evaluation. Figure 1 illustrates this pipeline.

Histopathological image pre-processing

Digital pathology tissue images contain billions of pixels, making direct input into deep learning models computationally infeasible[35–37]. To address this challenge, we used the OpenSlide toolkit to split WSIs into smaller tiles measuring 175 × 175 × 3 pixels[38]. WSIs often contain ink contamination, blank areas, sparse tissue, and fat regions. To reduce noise, we applied Otsu's thresholding method to segment tissue regions and quantify the proportion of tissue pixels in each tile. Tiles with a tissue pixel fraction below 20% were excluded from further analysis so that only research-relevant regions were retained.

Tissue classification and generating tissue segmentation image

We developed a transfer learning model using TensorFlow to classify tumor, non-tumor, and necrosis regions. Subsequently, we reconstructed WSIs from tile predictions to generate TS-images for feature extraction. These TS-images categorize regions into tumor, non-tumor, TILs, non-TILs, and lymphocytes, facilitating analysis, enhancing model generalization, and offering a more comprehensive analytical perspective than pixel-level analysis. The non-tumor region encompasses all tissues not classified as tumor, including stroma, lymphocytes that are not tumor-infiltrating, vascular structures, and other supportive cells such as fibroblasts and mesenchymal cells. Fat tissue was excluded during preprocessing and is not considered part of the non-tumor region.
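The tile-filtering step described under histopathological image pre-processing can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the slide has already been read into a grayscale NumPy array, that the Otsu threshold is computed once over the whole image, and that tissue is darker than background. The function name `select_tissue_tiles` and its parameters are hypothetical.

```python
import numpy as np
from skimage.filters import threshold_otsu


def select_tissue_tiles(gray, tile_size, min_fraction=0.20):
    """Return (row, col) indices of tiles whose tissue fraction is >= min_fraction.

    Assumptions (not from the source): `gray` is a 2D grayscale image,
    the Otsu threshold is computed on the whole image, and pixels at or
    below the threshold are treated as tissue (dark stain on bright glass).
    """
    threshold = threshold_otsu(gray)
    tissue = gray <= threshold
    kept = []
    # Walk the image in non-overlapping tiles; partial edge tiles are skipped.
    for r in range(0, gray.shape[0] - tile_size + 1, tile_size):
        for c in range(0, gray.shape[1] - tile_size + 1, tile_size):
            fraction = tissue[r:r + tile_size, c:c + tile_size].mean()
            if fraction >= min_fraction:
                kept.append((r // tile_size, c // tile_size))
    return kept
```

With `tile_size=175` and `min_fraction=0.20` this mirrors the tile size and exclusion rule stated in the text; how the slide is read and converted to grayscale is left open here.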
Non-TILs refer to lymphocytes located in the non-tumor regions but still within the TME, playing a role in the immune response. These include lymphocytes present in the peritumoral stroma, inflammatory infiltration zones, and immune cell clusters outside the tumor boundary yet within the TME. Non-TILs do not include lymphocytes directly infiltrating tumor regions. The datasets for the classification tasks of each tissue type differ substantially in sample size, suggesting that multi-task classification may not necessarily yield improved performance. We therefore implemented a sequential binary classification approach to classify each region, as illustrated in Figure 2. First, tumor and non-tumor areas were distinguished. Given the morphological similarity between necrosis and TILs, a convolutional neural network (CNN) was developed to identify necrotic regions[39]. Subsequently, TILs were classified. Following this, non-TILs were classified based on the non-tumor classification results. Finally, TILs and non-TILs were combined to form the lymphocyte category. Since the primary contribution of this study lies in morphological analysis rather than the classification process, the technical details of tissue region classification are given in Additional File 2.

Feature extraction and construction

We used the publicly available regionprops function of the scikit-image measure module (skimage.measure.regionprops) to extract morphological features from different regions of TS-images, encompassing tumor, non-tumor, non-TILs, TILs, and lymphocytes. Instead of pixel-level analysis, we treated each tile as the fundamental unit for feature extraction and conducted a whole-slide-based evaluation of tissue morphology at the tile level. By applying eight-connectivity at the tile level, we identified connected components of each tissue type, facilitating a comprehensive assessment of morphological structures across the WSI.
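To make the role of eight-connectivity concrete, the sketch below labels a toy tile-level map in which each array element stands for one tile of a given tissue type. It is an illustration under assumptions: the array `tile_map` and the helper `count_components` are hypothetical, and `skimage.measure.label` with `connectivity=2` is used as the eight-connected labeling in 2D.

```python
import numpy as np
from skimage.measure import label

# Toy tile-level map: 1 marks a tile predicted as the tissue type of interest.
tile_map = np.array([
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
], dtype=np.uint8)


def count_components(mask, eight_connected=True):
    """Count connected components of non-zero tiles.

    connectivity=2 joins diagonal neighbors (eight-connectivity in 2D);
    connectivity=1 joins only edge-sharing neighbors (four-connectivity).
    """
    connectivity = 2 if eight_connected else 1
    _, number = label(mask, connectivity=connectivity, return_num=True)
    return number
```

On this toy map the two diagonally adjacent tiles form one component under eight-connectivity and two under four-connectivity, so the choice of connectivity directly changes component counts and every feature derived from them.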
Using methods provided by the library, we extracted 46 sets of morphological features for each tissue region, following an approach similar to those proposed by Diao et al. [ 40 ] and Wang et al. [ 41 ]. These morphological features capture a wide range of tissue characteristics, including quantitative, size-based metrics such as the number of connected components, major and minor axis lengths, as well as qualitative, shape-based metrics such as Euler number and eccentricity. To comprehensively characterize the spatial properties of the TME, we adopted three feature extraction perspectives. The all region computes the morphological features of all connected tissue components, providing an overview of the overall tissue structure. The largest region extracts morphological features from the largest connected component within each tissue region, reflecting the predominant structural characteristics. The significant region focuses on components that exceed 5% of the largest region’s area, calculating the mean and standard deviation of their morphological features to analyze local morphological variations and TME heterogeneity. The configuration of the percentage threshold for significant regions plays a crucial role in extracting meaningful morphological features. A threshold that is too high may reduce the sensitivity of spatial feature extraction, leading to the omission of critical structural details, while a threshold that is too low may introduce excessive noise, compromising the model’s generalization ability. For details on the determination of the significant region threshold, please refer to the experimental section below. For each tissue region, we extracted 11 feature sets from the largest region, 12 feature sets from the all-region perspective, and 23 feature sets from the significant-region perspective. The feature names are detailed in Additional File 1. 
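The three perspectives (all, largest, significant) can be sketched with `skimage.measure.regionprops` as below. This is a minimal illustration rather than the authors' feature set: only a handful of descriptors are computed, the dictionary keys are hypothetical names, and it is an assumption that the largest component itself counts among the significant regions.

```python
import numpy as np
from skimage.measure import label, regionprops


def perspective_features(mask, significant_ratio=0.05):
    """Compute a few illustrative features from the three perspectives.

    `mask` is a binary tile-level map for one tissue type. Components are
    found with eight-connectivity; "significant" components are those whose
    area exceeds `significant_ratio` times the largest component's area.
    """
    regions = regionprops(label(mask, connectivity=2))
    if not regions:
        return {}
    areas = np.array([r.area for r in regions], dtype=float)
    largest = regions[int(areas.argmax())]
    significant = [r for r in regions if r.area > significant_ratio * largest.area]
    sig_areas = np.array([r.area for r in significant], dtype=float)
    return {
        # All-region perspective: every connected component.
        "all_number": len(regions),
        "all_area_sum": float(areas.sum()),
        # Largest-region perspective: the single largest component.
        "largest_area": float(largest.area),
        "largest_eccentricity": float(largest.eccentricity),
        # Significant-region perspective: mean and std over retained components.
        "significance_number": len(significant),
        "significance_area_mean": float(sig_areas.mean()),
        "significance_area_std": float(sig_areas.std()),
    }
```

Raising `significant_ratio` discards more small components and lowering it admits more of them, which is the trade-off between lost detail and added noise that the threshold discussion describes.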
Figure 3 illustrates the three perspectives used for extracting morphological features from non-tumor and non-TIL regions.

In addition to morphological information, we integrated clinical features into our analysis. The primary motivation is to complement the morphological characteristics extracted from pathological data and enhance the predictive performance for pCR. Studies have demonstrated that clinical variables, such as ER/PR status and HER2/CEP17 ratio, are significantly associated with patient prognosis[42–44]. Integrating these variables is therefore expected to improve the model's predictive capability and provide a more comprehensive assessment of treatment response. We used the clinical features available in both the Yale Response and IMPRESS HER2+ datasets. Specifically, we examined estrogen receptor status (ER+/-), ER%, progesterone receptor status (PR+/-), PR%, and the ratio of HER2 expression to chromosome 17 expression (HER2/CEP17)[45, 46] as clinical features.

Machine learning framework for predicting pathological complete response to neoadjuvant chemotherapy

First, we standardized the data, rescaling each feature to a mean of 0 and a standard deviation of 1. Subsequently, we used the least absolute shrinkage and selection operator (LASSO) algorithm for feature selection, eliminating redundant features to enhance the model's generalization performance. The alpha value of the LASSO was determined using a stratified 10-fold cross-validated grid search. Finally, we input the selected features into the multilayer perceptron (MLP) for training and validated the model on an external test cohort. The MLP hyperparameters were determined in the same way as the LASSO alpha.
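The standardize, select, and classify sequence above can be sketched with scikit-learn as follows. This is a sketch under stated assumptions, not the authors' implementation: LASSO is fitted as a regression on the 0/1 pCR label, the alpha grid and MLP architecture are placeholders, and the helper names `fit_lasso_mlp` and `predict_pcr_probability` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import Lasso
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler


def fit_lasso_mlp(X_train, y_train, alphas=(0.01, 0.03, 0.1), seed=0):
    """Standardize, pick LASSO alpha by stratified 10-fold CV, then train an MLP."""
    scaler = StandardScaler().fit(X_train)
    X_scaled = scaler.transform(X_train)
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
    search = GridSearchCV(Lasso(max_iter=10000), {"alpha": list(alphas)}, cv=cv)
    search.fit(X_scaled, y_train)
    # Features with non-zero LASSO coefficients are retained.
    selected = np.flatnonzero(search.best_estimator_.coef_)
    if selected.size == 0:
        raise ValueError("LASSO removed every feature; try smaller alphas")
    # Placeholder architecture; the source does not state the MLP layout here.
    mlp = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=seed)
    mlp.fit(X_scaled[:, selected], y_train)
    return scaler, selected, mlp


def predict_pcr_probability(model, X):
    """Apply the fitted scaler and feature mask, then return P(pCR)."""
    scaler, selected, mlp = model
    return mlp.predict_proba(scaler.transform(X)[:, selected])[:, 1]


# Synthetic stand-in data (NOT the study data): 147 cases, 46 + 5 features.
X, y = make_classification(n_samples=147, n_features=51, n_informative=5,
                           n_redundant=0, n_clusters_per_class=1,
                           class_sep=1.5, random_state=0)
```

Fitting on the first 85 synthetic rows and predicting on the remaining 62 mimics the train and external-validation split sizes. The scaler and feature mask are fitted on the training cohort only, so nothing from the validation cohort leaks into preprocessing.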
Statistical analysis methodology

We analyzed morphological and clinical features from the Yale Response and IMPRESS HER2+ datasets using two-sample t-tests. Furthermore, we used Spearman's rank correlation coefficients to evaluate the relationships between morphological features and residual infiltration size (RIS), as well as among the morphological features themselves. These coefficients yield a correlation measure (R) and a two-sided P-value. For all statistical analyses, a P-value below 0.05 denoted statistical significance.

Model performance evaluation metrics

We employed several metrics to evaluate model performance, including area under the curve (AUC), F1 score, positive predictive value (PPV), recall, and negative predictive value (NPV). In validation on multicenter independent datasets, an AUC exceeding 0.800 was considered evidence of excellent model performance, while an AUC greater than 0.700 indicated acceptable discriminative ability.

Experimental environment

All experiments ran on a high-performance computing cluster with two NVIDIA Quadro RTX 6000 GPUs for parallel computing and a 2.0 TB hard disk for storage. For software, the OpenSlide (version: 1.2.0) toolkit was employed for WSI tiling, while TensorFlow (version: 2.4.1) was used for data loading, deep model training, and testing. Python was used in conjunction with the Scikit-learn (version: 1.2.2) and SciPy (version: 1.8.1) libraries for machine learning and statistical analysis. Furthermore, the OpenCV (version: 4.7.0), Pandas (version: 2.0.2), and Scikit-image (version: 0.19.3) libraries were used for image morphology feature extraction and data processing.

Results

Evaluation of significant region thresholds

The appropriate selection of the significant region threshold is crucial for accurately capturing the spatial characteristics of the TME.
To systematically evaluate the impact of threshold selection, we tested four threshold values (1%, 3%, 5%, and 10%) for significant region measurement, extracted the corresponding morphological features, and integrated them with clinical features to predict pCR to NAC in HER2+ breast cancer. The experimental methodology is described in the Methods section and is not repeated here. The results on the external validation dataset IMPRESS HER2+ are presented in Table 2, while the results of morphology-only modeling are provided in Additional File 2: Table S2.

Table 2 Comparison of the generalization performance of morphological and clinical feature combinations for five tissue types under different significant region thresholds in the IMPRESS HER2+ dataset.

| Region | Sign. threshold | AUC | F1 score | PPV | Recall | NPV |
|---|---|---|---|---|---|---|
| Tumor | 1% | 0.716 | 0.787 | 0.661 | 0.974 | 0.750 |
| Tumor | 3% | 0.708 | 0.764 | 0.667 | 0.895 | 0.615 |
| Tumor | 5% | 0.732 | 0.740 | 0.771 | 0.711 | 0.586 |
| Tumor | 10% | 0.724 | 0.805 | 0.795 | 0.816 | 0.680 |
| Non-tumor | 1% | 0.717 | 0.784 | 0.644 | 1.000 | 0.800 |
| Non-tumor | 3% | 0.726 | 0.795 | 0.775 | 0.816 | 0.667 |
| Non-tumor | 5% | 0.779 | 0.883 | 0.745 | 0.946 | 0.846 |
| Non-tumor | 10% | 0.734 | 0.806 | 0.714 | 0.926 | 0.727 |
| TILs | 1% | 0.684 | 0.780 | 0.727 | 0.842 | 0.650 |
| TILs | 3% | 0.651 | 0.714 | 0.694 | 0.735 | 0.583 |
| TILs | 5% | 0.594 | 0.654 | 0.708 | 0.607 | 0.571 |
| TILs | 10% | 0.620 | 0.692 | 0.529 | 1.000 | 0.000 |
| Non-TILs | 1% | 0.774 | 0.805 | 0.775 | 0.838 | 0.683 |
| Non-TILs | 3% | 0.822 | 0.827 | 0.756 | 0.912 | 0.778 |
| Non-TILs | 5% | 0.873 | 0.889 | 0.821 | 0.970 | 0.933 |
| Non-TILs | 10% | 0.719 | 0.684 | 0.650 | 0.722 | 0.700 |
| Lym. | 1% | 0.675 | 0.778 | 0.673 | 0.921 | 0.667 |
| Lym. | 3% | 0.691 | 0.698 | 0.759 | 0.647 | 0.581 |
| Lym. | 5% | 0.668 | 0.730 | 0.676 | 0.793 | 0.632 |
| Lym. | 10% | 0.676 | 0.688 | 0.647 | 0.733 | 0.667 |

In the original table, best-performing values are set in boldface. Sign.: significance, TILs: tumor infiltrating lymphocytes, non-TILs: non-tumor infiltrating lymphocytes, Lym.: lymphocytes.

Table 2 compares model performance across five tissue types under different significant region thresholds. The results indicate that the 5% threshold achieves the best overall performance, particularly in the non-tumor and non-TILs regions.
In the non-tumor region, the 5% threshold yields the highest AUC (0.779), F1-score (0.883), and NPV (0.846), demonstrating its effectiveness in capturing spatial features critical for pCR prediction while minimizing irrelevant tissue interference. For non-TILs, the 5% threshold achieves the best AUC (0.873), F1-score (0.889), PPV (0.821), recall (0.970), and NPV (0.933), indicating its ability to optimally retain TME features. In contrast, the 1% threshold increases recall in the non-tumor region (1.000) but reduces AUC and NPV, suggesting that excessive noise negatively impacts predictive performance. The 10% threshold, on the other hand, may degrade performance through excessive loss of critical information. Although the 3% threshold shows some improvement, it remains less stable than the 5% threshold. Therefore, the 5% threshold is identified as the optimal choice, striking the best balance between noise reduction and feature retention and thereby enhancing model generalization capability.

Machine learning utilizing morphological and clinical features to predict the pCR

We employed deep convolutional neural networks to segment histopathological images into five distinct regions: tumor, non-tumor, non-TILs, TILs, and lymphocytes. For each region, the feature set comprises 46 morphological and 5 clinical features (HER2/CEP17 ratio, ER, PR, ER%, and PR%). Feature selection was performed using LASSO, and the selected features were input into an MLP classifier to predict the pCR to NAC. Table 3 presents the performance evaluation of the model on the external validation set, along with a comparative analysis against baseline models and the latest state-of-the-art approaches.

Table 3 Generalization performance validation of pCR predictions in the IMPRESS HER2+ dataset.

| Type | Method | AUC | F1 score | PPV | Recall | NPV |
|---|---|---|---|---|---|---|
| Mor. + Clinical | Tumor | 0.732 | 0.740 | 0.771 | 0.711 | 0.586 |
| Mor. + Clinical | Non-tumor | 0.779 | 0.883 | 0.745 | 0.946 | 0.846 |
| Mor. + Clinical | Non-TILs | 0.873 | 0.889 | 0.821 | 0.970 | 0.933 |
| Mor. + Clinical | TILs | 0.594 | 0.654 | 0.708 | 0.607 | 0.571 |
| Mor. + Clinical | Lym. | 0.668 | 0.730 | 0.676 | 0.793 | 0.632 |
| Mor. | Tumor | 0.656 | 0.656 | 0.808 | 0.553 | 0.526 |
| Mor. | Non-tumor | 0.746 | 0.843 | 0.761 | 0.946 | 0.857 |
| Mor. | Non-TILs | 0.766 | 0.817 | 0.763 | 0.879 | 0.750 |
| Mor. | TILs | 0.562 | 0.719 | 0.639 | 0.821 | 0.625 |
| Mor. | Lym. | 0.625 | 0.716 | 0.632 | 0.828 | 0.600 |
| Clinical | ER | 0.658 | 0.765 | 0.721 | 0.816 | 0.632 |
| Clinical | ER% | 0.668 | 0.800 | 0.692 | 0.947 | 0.800 |
| Clinical | PR | 0.669 | 0.795 | 0.700 | 0.921 | 0.750 |
| Clinical | PR% | 0.645 | 0.686 | 0.750 | 0.632 | 0.533 |
| Clinical | HER2/CEP17 ratio | 0.668 | 0.760 | 0.613 | 1.000 | 0.000 |
| Clinical | Total Clinical | 0.716 | 0.805 | 0.750 | 0.868 | 0.722 |
| Tiles Count | Tumor | 0.664 | 0.760 | 0.613 | 1.000 | 0.000 |
| Tiles Count | Non-Tumor | 0.680 | 0.787 | 0.661 | 0.974 | 0.833 |
| Tiles Count | TILs | 0.552 | 0.768 | 0.623 | 1.000 | 0.999 |
| Tiles Count | Non-TILs | 0.651 | 0.688 | 0.846 | 0.579 | 0.556 |
| Tiles Count | Lym. | 0.596 | 0.763 | 0.627 | 0.974 | 0.666 |
| Relative | TNR | 0.576 | 0.769 | 0.660 | 0.921 | 0.667 |
| Relative | ILTR | 0.562 | 0.784 | 0.644 | 1.000 | 1.000 |
| Relative | NTLR | 0.571 | 0.758 | 0.632 | 0.947 | 0.600 |
| Relative | LNTR | 0.587 | 0.768 | 0.623 | 1.000 | 0.999 |
| Relative | LD | 0.518 | 0.761 | 0.648 | 0.921 | 0.625 |
| Latest Work | IMPRESS (H&E only)[27] | 0.812 | 0.827 | 0.906 | 0.761 | 0.698 |
| Latest Work | Pathologists' features[27] | 0.788 | 0.782 | 0.870 | 0.711 | 0.645 |
| Latest Work | LTR[47] | 0.541 | 0.760 | 0.613 | 1.000 | 0.000 |

Mor.: Morphological. PPV: positive predictive value, NPV: negative predictive value. In the original table, best-performing values are set in boldface.

The results indicate that the most effective predictive model is a hybrid approach integrating morphological and clinical features. Among these, the non-TILs model achieved the highest AUC (0.873) and F1-score (0.889), demonstrating superior predictive performance. Within purely clinical features, the HER2/CEP17 ratio exhibited the highest recall (1.000); however, its PPV of 0.613 equals the pCR prevalence in the validation cohort (38/62) and its NPV is 0.000, which is the pattern expected when every case is predicted as pCR, so this recall should be interpreted with caution. Regarding Tiles Count features, the non-tumor tile count (AUC = 0.680) outperformed the tumor tile count (AUC = 0.664), suggesting that information from non-tumor regions may contribute to pCR prediction. However, compared to models incorporating both morphological and clinical features, tile count-based models exhibit limited predictive power.
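As a worked check on how the reported metrics relate to one another, the sketch below computes PPV, recall, NPV, and F1 from confusion-matrix counts. The cohort counts (38 pCR, 24 non-pCR) come from Table 1; the scenario of a classifier that labels every case as pCR is an illustrative assumption, as is the convention of reporting an undefined ratio as 0. The function name `binary_metrics` is hypothetical.

```python
def binary_metrics(tp, fp, fn, tn):
    """PPV, recall, NPV, and F1 from confusion counts (pCR = positive class).

    Undefined ratios (zero denominator) are returned as 0.0 by convention;
    this convention is an assumption, not something stated in the source.
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    return {
        "ppv": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "npv": ratio(tn, tn + fn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Hypothetical classifier that predicts pCR for all 62 validation cases.
all_positive = binary_metrics(tp=38, fp=24, fn=0, tn=0)
```

Under this assumption PPV is 38/62 ≈ 0.613, recall is 1.000, NPV is 0.000, and F1 is 0.760, which matches the rows in Table 3 that report a recall of 1.000 together with an NPV of 0.000.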
Similarly, relative tissue metrics (e.g., TNR, ILTR) demonstrated lower predictive performance, with AUC values ranging from 0.518 to 0.587, indicating that relying solely on tissue proportion metrics provides limited discriminative ability. The definitions and computational formulas for these relative features are detailed in Additional File 2. Comprehensive external validation further confirms that features derived from "non-tumor" regions (including both morphological and clinical features) exhibit the most promising performance in pCR prediction, with AUC values ranging from 0.779 to 0.873. Although some models showed improvements in specific metrics, their overall predictive ability remained inferior to the best-performing hybrid model integrating both morphological and clinical features. Furthermore, the top-performing model surpasses the model based on pathologists' features[27] (AUC 0.873 vs. 0.788), reinforcing that "non-tumor" morphological features and clinical data are effective indicators for predicting pCR in HER2+BC patients undergoing NAC.

To validate the effectiveness of the method, we compared its performance with recent studies. Table 3 shows that the model trained on morphological and clinical features of non-TILs exhibits the highest overall performance. Furthermore, the non-tumor model outperformed the reference method[47]. In conclusion, based on the comprehensive evaluation presented in Table 3, Fig. 4.a, and Fig. 4.b, the model trained on the combined features of the "non-tumor" regions demonstrates the highest generalization capability. Consequently, our subsequent analyses focused on the non-tumor and non-TILs components.

We further tested the generalization ability of the model by training on randomly selected percentages of the training set.
As illustrated in Fig. 4, the model's performance exhibits a general upward trend as the percentage of the training set increases. Notably, the model performs effectively even with a small proportion of the training set. Using only the morphological features extracted from the non-TILs regions of 20% of the training set (Yale Response dataset) WSIs, the AUC on the external validation set reached 0.722 (Fig. 4.c). Furthermore, the non-TILs consistently outperformed the non-tumor regions. These experimental results indicate that information derived from "non-tumor" regions possesses substantial generalization capability for predicting pCR to NAC.

Feature importance analysis

To understand how the combined features of non-tumor and non-TILs contribute to prediction, we analyzed the weights of the features selected by the LASSO. Figure 5 illustrates the weights assigned to the LASSO-selected features, while Additional File 2: Table S1 presents the three most favorable and unfavorable features. Furthermore, we employed Spearman's test to examine the correlation between paired features, identifying potential dependencies. The results for the Yale Response dataset are displayed in Additional File 2: Figure S1.

Figure 5.a illustrates spatial characteristics of the tumor immune microenvironment (TIME). The three most favorable features, all associated with significance regions, are the significance region number, significance area filled mean, and significance area convex mean. A higher number of significant non-TIL regions may indicate a more robust immune response to cancer cells, potentially correlating with a positive response to NAC. Furthermore, a greater area filled mean of significant non-TILs suggests a larger distribution area of lymphocytes, possibly contributing to a favorable pCR.
Similarly, a larger mean convex hull area of significant regions indicates a broader spatial distribution of lymphocytes or more complex margins, which may facilitate an improved treatment response. The most adverse factors include the HER2/CEP17 ratio, the largest eccentricity, and the significance eccentricity mean. While pCR and the HER2/CEP17 ratio typically exhibit a positive correlation, this feature set, comprising only immunomorphological features, demonstrates a negative correlation. This suggests that non-TIL morphological features may be more indicative for pCR prediction. Eccentricity, which describes the shape of the ellipse fitted to a region, ranges from 0 to 1, with lower values indicating a more circular shape. Consequently, lymphocyte infiltration patterns closer to circular may be more favorable for predicting treatment outcomes.

Figure 5.b provides spatial information for the TME. The three most favorable features are the largest area, the largest area convex, and the significance area convex mean. A larger area potentially indicates a higher proportion of relevant immune cells. Moreover, larger values for the largest area convex and significance area convex mean suggest a wider distribution of non-tumor cells. This spatial configuration positively correlates with the probability of pCR. Conversely, three adverse factors are identified: the significance Euler number std, the largest axis minor length, and the HER2/CEP17 ratio. The Euler number, computed as the number of connected objects minus the number of holes in a region, reflects spatial completeness and serves as an indicator of connectivity in the non-tumor spatial distribution. A greater standard deviation of the Euler number in significant regions indicates more substantial differences in connectivity between regions. This suggests a potential relationship between pCR and the variability of connectivity in the non-tumor spatial distribution.
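The two shape descriptors discussed above can be illustrated on toy masks with `skimage.measure.regionprops`. This is a small sketch for intuition only; the helper `shape_descriptors` and the toy masks are hypothetical, and each mask is assumed to contain a single connected component.

```python
import numpy as np
from skimage.measure import label, regionprops


def shape_descriptors(mask):
    """Return (euler_number, eccentricity) of the first component in `mask`."""
    region = regionprops(label(mask, connectivity=2))[0]
    return region.euler_number, region.eccentricity


# A solid square: one object, no holes, equal axes.
solid = np.ones((7, 7), dtype=bool)

# The same square with a hole punched in the middle: one object, one hole.
ring = solid.copy()
ring[2:5, 2:5] = False

# A long thin bar: one object, no holes, very unequal axes.
bar = np.ones((3, 15), dtype=bool)
```

The solid square has an Euler number of 1 and an eccentricity near 0, the ring has an Euler number of 0 because its single hole is subtracted, and the bar keeps an Euler number of 1 while its eccentricity approaches 1.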
The minor axis length is the length of the short axis of the ellipse, and a smaller minor axis length in the largest region implies a flatter morphological distribution. Accordingly, a flatter distribution of non-tumor regions appears to be more conducive to pCR.

In the "non-tumor" region, the TME correlates with the TIME. Consequently, the features of these two regions, as illustrated in Fig. 5, overlap. PR, the significance area convex mean, and the significance number emerge as common favorable features. Conversely, the largest eccentricity, the significance eccentricity mean, and the HER2/CEP17 ratio are common unfavorable features.

Univariate analysis

We employed statistical tools to examine the relationship between pCR and the combined features of non-tumor and non-TILs regions. Initially, we used the normal test function from the SciPy package to test the features for normality. The results indicate that all features conform to a normal distribution. Subsequently, we analyzed feature differences between the pCR and non-pCR groups using a two-sample t-test, focusing on the most significant favorable and adverse features of non-tumor and non-TILs regions selected by LASSO. Figure 6 illustrates the statistical analysis results through box plots, while Additional File 3 presents the comprehensive findings.

In the Yale Response dataset analysis (Fig. 6a and Fig. 6b), we observed significant differences for most features, with the exception of the HER2/CEP17 ratio. Furthermore, these features carry the larger favorable and unfavorable weights depicted in Fig. 5. Examples include the significance area filled mean in the Yale Response dataset (p-value = 0.026) and the significance number in the IMPRESS HER2+ dataset (p-value = 0.003). However, some significant features have small weights, such as PR (p-value = 0.008) in the IMPRESS HER2+ dataset.
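The univariate comparison described above can be sketched as follows. This is a minimal illustration, assuming that the normality check is SciPy's `scipy.stats.normaltest` applied to each group and that the t-test is the default equal-variance form of `scipy.stats.ttest_ind`; the text does not state either detail. The function name `compare_pcr_groups` and the dictionary keys are hypothetical.

```python
import numpy as np
from scipy import stats


def compare_pcr_groups(feature_pcr, feature_non_pcr):
    """Normality check per group, then a two-sample t-test on one feature."""
    pcr = np.asarray(feature_pcr, dtype=float)
    non_pcr = np.asarray(feature_non_pcr, dtype=float)
    # Null hypothesis of normaltest: the sample comes from a normal distribution,
    # so a large p-value means normality is not rejected.
    _, p_normal_pcr = stats.normaltest(pcr)
    _, p_normal_non_pcr = stats.normaltest(non_pcr)
    # Assumption: equal variances (SciPy's default); pass equal_var=False to
    # ttest_ind for Welch's test instead.
    t_stat, p_value = stats.ttest_ind(pcr, non_pcr)
    return {
        "p_normal_pcr": float(p_normal_pcr),
        "p_normal_non_pcr": float(p_normal_non_pcr),
        "t_stat": float(t_stat),
        "p_value": float(p_value),
    }
```

Calling the function once per LASSO-selected feature, with the feature values of pCR patients and of non-pCR patients as the two arguments, gives one p-value per feature of the kind reported in the text. The normality test needs a reasonable sample size in each group to be meaningful.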
The experimental results demonstrate that individual features can also distinguish between pCR and non-pCR cases.

In addition to the two-sample t-test, we employed Spearman's test to analyze the correlation between the morphological features of non-tumor and non-TILs regions and the residual invasion size (RIS). For patients with pCR, RIS was defined as 0. In the Yale Response and IMPRESS HER2+ datasets, the median RIS values were 1.35 cm (range 0.02–11.00 cm) and 0.80 cm (range 0.10–7.00 cm), respectively. We identified the most significant favorable and adverse morphological features of non-tumor and non-TILs regions selected by LASSO. These features were analyzed using scatter plots, as illustrated in Fig. 6. Additional File 4 presents the complete results. In Fig. 6f, the significance area convex mean (p-value = 0.006, R = -0.299) of non-tumor regions in the Yale Response dataset exhibited a significant negative correlation with RIS. In Fig. 6g, the significance number (p-value = 0.009, R = 0.336) of non-TILs regions in the IMPRESS HER2+ dataset demonstrated a significant positive correlation with RIS. In Fig. 6h, the all filled area number (p-value = 0.012, R = -0.317) of non-tumor regions in the IMPRESS HER2+ dataset showed a significant negative correlation with RIS. The significance area convex mean of non-tumor regions yielded different results in Figs. 6f and 6h. This discrepancy likely arose from the distinct distributions of the two datasets. Furthermore, additional features significantly correlated with RIS can be used for quantitative prediction of RIS values (e.g., the significance area number in Fig. 6g).

Discussion

Computational pathology methodologies have been extensively employed in clinical applications to support diagnostic, outcome, and predictive objectives [48, 49]. Moreover, the analysis and assessment of TME biomarkers in histopathological images of breast cancer can contribute to improved patient outcomes.
To our knowledge, limited research has focused on predicting pCR to NAC in HER2+ BC patients through quantification of TME morphological information. This study integrates morphological features with clinical data to construct combined features, which are subsequently input into LASSO for feature selection. The selected features are then fed into an MLP to predict the response to NAC treatment.

This research offers several advantages. Firstly, it addresses the challenge of analyzing large-scale digital histopathological images using machine learning algorithms. While researchers typically study WSIs by tiling, the size of the tiles can lead to information biases and unstable model performance. This study employed deep learning to segment the WSIs and generate TS-images, which then replaced the WSIs as the basis for extracting spatial features of the TME. This approach provides a more robust and repeatable method for feature extraction than tile-level analysis, reducing data complexity and minimizing noise. Secondly, this study conducts a comprehensive analysis of WSI regions by segmenting TS-images into five areas: tumor, non-tumor, TILs, non-TILs, and lymphocytes. The research demonstrates that the combined morphological and clinical features of non-tumor and non-TILs regions performed better in predicting the response to NAC. Thirdly, while AI is considered a potential tool for guiding clinical decisions, many deep learning-based studies lack interpretability and repeatability. This study uses deep learning methods for foundational image segmentation tasks on WSIs and employs image analysis methods to extract interpretable features from the significant and largest regions. In external validation, the best-performing model achieved an AUC of 0.873, outperforming various baseline models (Clinical, Tiles Count, and Relative), pathologists, and the latest state-of-the-art approaches.
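The two-stage design summarized above, LASSO feature selection followed by an MLP classifier, can be sketched with scikit-learn. This is a hedged illustration rather than the study's configuration: the regularization strength, hidden-layer size, standardization step, and the function names `build_pcr_classifier` and `selected_feature_indices` are all assumptions introduced here, and LASSO is fitted on the binary pCR label as a regression target, which is one common convention.

```python
import numpy as np
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def build_pcr_classifier(lasso_alpha=0.05, hidden_layer_sizes=(16,), seed=0):
    """Standardize features, keep those with non-zero LASSO weights,
    then classify pCR vs. non-pCR with a small MLP."""
    return make_pipeline(
        StandardScaler(),
        # Features whose LASSO coefficient is (numerically) zero are dropped.
        SelectFromModel(Lasso(alpha=lasso_alpha)),
        MLPClassifier(
            hidden_layer_sizes=hidden_layer_sizes,
            max_iter=2000,
            random_state=seed,
        ),
    )


def selected_feature_indices(fitted_pipeline):
    """Column indices of the input features that survived LASSO selection."""
    selector = fitted_pipeline.named_steps["selectfrommodel"]
    return np.flatnonzero(selector.get_support())
```

After fitting on a training matrix of combined morphological and clinical features, `predict_proba` on an external cohort gives the scores from which an AUC can be computed. The surviving LASSO coefficients play the role of the feature weights examined in the feature importance analysis. If the regularization is too strong, no feature survives and the classifier cannot be fitted, so `lasso_alpha` would need tuning on real data.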
Additionally, we evaluated the appropriateness of the significant region threshold selection and conducted a comprehensive analysis of interpretable morphological features from multiple perspectives, including the contribution of LASSO-selected features. Lastly, this paper addresses the lack of external validation in most studies, which stems from the limited number of public datasets on NAC. It uses the open-source Yale University dataset and the IMPRESS HER2+ dataset for multicenter validation, and verifies the model's robustness through percentage training. Experimental results demonstrate the model's good generalizability when trained on "non-tumor" morphological and clinical combined features.

This study has several limitations. First, the relatively small dataset may impact the statistical significance of our findings. Second, our focus on a single molecular profile to predict the response in HER2+ BC patients might limit the generalizability of our results to other contexts. Third, our approach prioritizes morphological features because of their interpretability and computational efficiency, without incorporating additional vision-based features such as Haralick texture features or deep learning-based embeddings. Although these features could provide complementary information, they were not included in this study. Future work could explore hybrid feature representations that integrate deep learning-derived features with traditional features to further enhance predictive performance.

Conclusions

Our study introduces a novel approach for predicting NAC response in HER2+ breast cancer by integrating morphological and clinical features from histopathological images. Compared to traditional molecular biomarker methods, our approach is cost-effective, interpretable, and highly generalizable across independent datasets. Initially, deep learning techniques are employed to segment the WSI into tumor, necrosis, and lymphocyte regions, generating TS-images.
Subsequently, morphological features are extracted from tumor, non-tumor, TILs, non-TILs, and lymphocyte areas, enabling a thorough exploration of TME and TIME information. To enhance the model's pCR prediction performance, morphological features are combined with clinical data. Experimental results demonstrate that the integration of non-tumor and non-TILs features significantly improves NAC treatment prediction accuracy, surpassing recent comparable studies. This research presents an automated, accurate, interpretable, cost-effective, and reproducible method for extracting spatial information from the TME. Future work will focus on expanding external data validation and applying this methodology to other breast cancer molecular subtypes and cancer types for NAC treatment response prediction.

Abbreviations

NAC: Neoadjuvant chemotherapy
HER2+ BC: HER2-positive breast cancer
pCR: Pathological complete response
TME: Tumor microenvironment
TIME: Tumor immune microenvironment
TS-images: Tissue segmentation images
TILs: Tumor infiltrating lymphocytes
non-TILs: Non-tumor infiltrating lymphocytes
LASSO: Least absolute shrinkage and selection operator
AUC: Area under the curve
HER2+: Human epidermal growth factor receptor 2 positive
AI: Artificial intelligence
H&E: Hematoxylin and eosin
TNBC: Triple-negative breast cancer
IHC: Immunohistochemical
ER+: Estrogen receptor positive
PR+: Progesterone receptor positive
WSIs: Whole slide images
TI-CNN: CNN for tumor classification
N-CNN: CNN for necrosis classification
L-CNN: CNN for lymphocytes infiltrating classification
Sign.: Significance
GPL: Global pooling layer
HER2/CEP17 ratio: Ratio of HER2 expression to the expression of chromosome 17
Lym.: Lymphocytes
PPV: Positive predictive value
NPV: Negative predictive value
RIS: Residual infiltration size
Mor.: Morphological features
PTD: Pertuzumab + trastuzumab + docetaxel
ROC: Receiver operating characteristic
Declarations

Ethics Approval and Consent to Participate
Patient consent was not required because all samples were archival.

Consent for publication
Not applicable.

Availability of data and materials
The datasets used in this paper are publicly available and can be accessed through the literature index.

Competing interest
The authors declare no competing interests.

Acknowledgements and funding
This work was partially supported by grants from the National Natural Science Foundation of China (U21A20521 and 62271178), and the Natural Science Foundation of Zhejiang Province of China (LR23F010002).

Author Contributions
Wensheng Cui, Lihua Li, and Ming Fan performed the research. Wensheng Cui and Ming Fan designed the research study. Wensheng Cui collected the data, analyzed the data, and wrote the paper. Lihua Li and Ming Fan oversaw the project. All authors provided critical feedback on the manuscript and approved the final version of the paper.

References

1. Giaquinto AN, Sung H, Miller KD, Kramer JL, Newman LA, Minihan A, Jemal A, Siegel RL. Breast Cancer Statistics, 2022. CA Cancer J Clin. 2022;72(6):524–41.
2. Loibl S, Gianni L. HER2-positive breast cancer. Lancet. 2017;389(10087):2415–29.
3. Andrulis IL, Bull SB, Blackstein ME, Sutherland D, Mak C, Sidlofsky S, Pritzker KP, Hartwick RW, Hanna W, Lickley L, et al. neu/erbB-2 amplification identifies a poor-prognosis group of women with node-negative breast cancer. Toronto Breast Cancer Study Group. J Clin Oncol. 1998;16(4):1340–9.
4. Seshadri R, Firgaira FA, Horsfall DJ, McCaul K, Setlur V, Kitchen P. Clinical significance of HER-2/neu oncogene amplification in primary breast cancer. The South Australian Breast Cancer Study Group. J Clin Oncol. 1993;11(10):1936–42.
5. Tiwari RK, Borgen PI, Wong GY, Cordon-Cardo C, Osborne MP. HER-2/neu amplification and overexpression in primary human breast cancer is associated with early metastasis. Anticancer Res. 1992;12(2):419–25.
6. Slamon DJ, Clark GM, Wong SG, Levin WJ, Ullrich A, McGuire WL. Human breast cancer: correlation of relapse and survival with amplification of the HER-2/neu oncogene. Science. 1987;235(4785):177–82.
7. Derks MGM, van de Velde CJH. Neoadjuvant chemotherapy in breast cancer: more than just downsizing. Lancet Oncol. 2018;19(1):2–3.
8. Masood S. Neoadjuvant chemotherapy in breast cancers. Women's Health (London, England). 2016;12(5):480–91.
9. Charfare H, Limongelli S, Purushotham AD. Neoadjuvant chemotherapy in breast cancer. Br J Surg. 2005;92(1):14–23.
10. Haque W, Verma V, Hatch S, Klimberg VS, Butler EB, Teh BS. Response rates and pathologic complete response by breast cancer molecular subtype following neoadjuvant chemotherapy. Breast Cancer Res Treat. 2018;170(3):559–67.
11. Cortazar P, Zhang L, Untch M, Mehta K, Costantino JP, Wolmark N, Bonnefoi H, Cameron D, Gianni L, Valagussa P, et al. Pathological complete response and long-term clinical benefit in breast cancer: the CTNeoBC pooled analysis. Lancet. 2014;384(9938):164–72.
12. Schettini F, Pascual T, Conte B, Chic N, Braso-Maristany F, Galvan P, Martinez O, Adamo B, Vidal M, Munoz M, et al. HER2-enriched subtype and pathological complete response in HER2-positive breast cancer: A systematic review and meta-analysis. Cancer Treat Rev. 2020;84.
13. Goorts B, van Nijnatten TJA, de Munck L, Moossdorff M, Heuts EM, de Boer M, Lobbes MBI, Smidt ML. Clinical tumor stage is the most important predictor of pathological complete response rate after neoadjuvant chemotherapy in breast cancer patients. Breast Cancer Res Treat. 2017;163(1):83–91.
14. Lips EH, Mulder L, de Ronde JJ, Mandjes IAM, Koolen BB, Wessels LFA, Rodenhuis S, Wesseling J. Breast cancer subtyping by immunohistochemistry and histological grade outperforms breast cancer intrinsic subtypes in predicting neoadjuvant chemotherapy response. Breast Cancer Res Treat. 2013;140(1):63–71.
15. Alba E, Lluch A, Ribelles N, Anton-Torres A, Sanchez-Rovira P, Albanell J, Calvo L, Lopez Garcia-Asenjo JA, Palacios J, Ignacio Chacon J, et al. High Proliferation Predicts Pathological Complete Response to Neoadjuvant Chemotherapy in Early Breast Cancer. Oncologist. 2016;21(2):150–5.
16. Liefaard MC, van der Voort A, van Seijen M, Thijssen B, Sanders J, Vonk S, Mittempergher L, Bhaskaran R, de Munck L, van Leeuwen-Stok AE, et al. Tumor-infiltrating lymphocytes in HER2-positive breast cancer treated with neoadjuvant chemotherapy and dual HER2-blockade. NPJ Breast Cancer. 2024;10(1).
17. Luque M, Sanz-Alvarez M, Morales-Gallego M, Madoz-Gurpide J, Zazo S, Dominguez C, Cazorla A, Izarzugaza Y, Arranz JL, Cristobal I, et al. Tumor-Infiltrating Lymphocytes and Immune Response in HER2-Positive Breast Cancer. Cancers. 2022;14(24).
18. Denkert C, von Minckwitz G, Darb-Esfahani S, Lederer B, Heppner BI, Weber KE, Budczies J, Huober J, Klauschen F, Furlanetto J, et al. Tumour-infiltrating lymphocytes and prognosis in different subtypes of breast cancer: a pooled analysis of 3771 patients treated with neoadjuvant therapy. Lancet Oncol. 2018;19(1):40–50.
19. Carey LA, Berry DA, Cirrincione CT, Barry WT, Pitcher BN, Harris LN, Ollila DW, Krop IE, Henry NL, Weckstein DJ, et al. Molecular Heterogeneity and Response to Neoadjuvant Human Epidermal Growth Factor Receptor 2 Targeting in CALGB 40601, a Randomized Phase III Trial of Paclitaxel Plus Trastuzumab With or Without Lapatinib. J Clin Oncol. 2016;34(6):542–.
20. Abdel-Fatah TMA, Agarwal D, Liu D-X, Russell R, Rueda OM, Liu K, Xu B, Moseley PM, Green AR, Pockley AG, et al. [SPAG5] as a prognostic biomarker and chemotherapy sensitivity predictor in breast cancer: a retrospective, integrated genomic, transcriptomic, and protein analysis. Lancet Oncol. 2016;17(7):1004–18.
21. Wimberly H, Brown JR, Schalper K, Haack H, Silver MR, Nixon C, Bossuyt V, Pusztai L, Lannin DR, Rimm DL. PD-L1 Expression Correlates with Tumor-Infiltrating Lymphocytes and Response to Neoadjuvant Chemotherapy in Breast Cancer. Cancer Immunol Res. 2015;3(4):326–32.
22. Zhang B, Yu Y, Mao Y, Wang H, Lv M, Su X, Wang Y, Li Z, Zhang Z, Bian T, et al. Development of MRI-Based Deep Learning Signature for Prediction of Axillary Response After NAC in Breast Cancer. Acad Radiol. 2024;31(3):800–11.
23. Li Z, Gao J, Zhou H, Li X, Zheng T, Lin F, Wang X, Chu T, Wang Q, Wang S, et al. Multiregional dynamic contrast-enhanced MRI-based integrated system for predicting pathological complete response of axillary lymph node to neoadjuvant chemotherapy in breast cancer: multicentre study. EBioMedicine. 2024;107.
24. Seban R-D, Arnaud E, Loirat D, Cabel L, Cottu P, Djerroudi L, Hescot S, Loap P, Bonneau C, Bidard F-C, et al. 18F FDG PET/CT for predicting triple-negative breast cancer outcomes after neoadjuvant chemotherapy with or without pembrolizumab. Eur J Nucl Med Mol Imaging. 2023;50(13):4024–35.
25. Derouane F, van Marcke C, Berliere M, Gerday A, Fellah L, Leconte I, Van Bockstal MR, Galant C, Corbet C, Duhoux FP. Predictive Biomarkers of Response to Neoadjuvant Chemotherapy in Breast Cancer: Current and Future Perspectives for Precision Medicine. Cancers. 2022;14(16).
26. Mungenast F, Fernando A, Nica R, Boghiu B, Lungu B, Batra J, Ecker RC. Next-Generation Digital Histopathology of the Tumor Microenvironment. Genes. 2021;12(4).
27. Huang Z, Shao W, Han Z, Alkashash AM, de la Sancha C, Parwani AVV, Nitta H, Hou Y, Wang T, Salama P, et al. Artificial intelligence reveals features associated with breast cancer neoadjuvant chemotherapy responses from multi-stain histopathologic images. NPJ Precis Oncol. 2023;7(1).
28. Duanmu H, Bhattarai S, Li H, Shi Z, Wang F, Teodoro G, Gogineni K, Subhedar P, Kiraz U, Janssen EAM, et al. A spatial attention guided deep learning system for prediction of pathological complete response using breast cancer histopathology images. Bioinformatics. 2022;38(19):4605–12.
29. Fisher TB, Saini G, Rekha TS, Krishnamurthy J, Bhattarai S, Callagy G, Webber M, Janssen EAM, Kong J, Aneja R. Digital image analysis and machine learning-assisted prediction of neoadjuvant chemotherapy response in triple-negative breast cancer. Breast Cancer Res. 2024;26(1).
30. Shen B, Saito A, Ueda A, Fujita K, Nagamatsu Y, Hashimoto M, Kobayashi M, Mirza AH, Graf HP, Cosatto E, et al. Development of multiple AI pipelines that predict neoadjuvant chemotherapy response of breast cancer using H&E-stained tissues. J Pathol Clin Res. 2023;9(3):182–94.
31. Li F, Yang Y, Wei Y, Zhao Y, Fu J, Xiao X, Zheng Z, Bu H. Predicting neoadjuvant chemotherapy benefit using deep learning from stromal histology in breast cancer. NPJ Breast Cancer. 2022;8(1).
32. Rodriguez-Bejarano OH, Parra-Lopez C, Patarroyo MA. A review concerning the breast cancer-related tumour microenvironment. Crit Rev Oncol Hematol. 2024;199.
33. Vanguri RS, Fenn KM, Kearney MR, Wang Q, Guo H, Marks DK, Chin C, Alcus CF, Thompson JB, Leu C-S, et al. Tumor Immune Microenvironment and Response to Neoadjuvant Chemotherapy in Hormone Receptor/HER2+ Early Stage Breast Cancer. Clin Breast Cancer. 2022;22(6):538–46.
34. Farahmand S, Fernandez AI, Ahmed FS, Rimm DL, Chuang JH, Reisenbichler E, Zarringhalam K. Deep learning trained on hematoxylin and eosin tumor region of Interest predicts HER2 status and trastuzumab treatment response in HER2+ breast cancer. Mod Pathol. 2022;35(1):44–51.
35. Albusayli R, Graham JD, Pathmanathan N, Shaban M, Raza SEA, Minhas F, Armes JE, Rajpoot N. Artificial intelligence-based digital scores of stromal tumour-infiltrating lymphocytes and tumour-associated stroma predict disease-specific survival in triple-negative breast cancer. J Pathol. 2023;260(1):32–42.
36. Fu Y, Jung AW, Torne RV, Gonzalez S, Vohringer H, Shmatko A, Yates LR, Jimenez-Linan M, Moore L, Gerstung M. Pan-cancer computational histopathology reveals mutations, tumor composition and prognosis. Nat Cancer. 2020;1(8):800–.
37. Amgad M, Stovgaard ES, Balslev E, Thagaard J, Chen W, Dudgeon S, Sharma A, Kerner JK, Denkert C, Yuan Y, et al. Report on computational assessment of Tumor Infiltrating Lymphocytes from the International Immuno-Oncology Biomarker Working Group. NPJ Breast Cancer. 2020;6:16–16.
38. Goode A, Gilbert B, Harkes J, Jukic D, Satyanarayanan M. OpenSlide: A vendor-neutral software foundation for digital pathology. J Pathol Inform. 2013;4:27–27.
39. Saltz J, Gupta R, Hou L, Kurc T, Singh P, Vu N, Samaras D, Shroyer KR, Zhao T, Batiste R, et al. Spatial Organization and Molecular Correlation of Tumor-Infiltrating Lymphocytes Using Deep Learning on Pathology Images. Cell Rep. 2018;23(1):181–.
40. Diao JA, Wang JK, Chui WF, Mountain V, Gullapally SC, Srinivasan R, Mitchell RN, Glass B, Hoffman S, Rao SK, et al. Human-interpretable image features derived from densely mapped cancer pathology slides predict diverse molecular phenotypes. Nat Commun. 2021;12(1).
41. Wang S, Chen A, Yang L, Cai L, Xie Y, Fujimoto J, Gazdar A, Xiao G. Comprehensive analysis of lung cancer pathology images to discover tumor shape and boundary features that predict survival outcome. Sci Rep. 2018;8(1):10393.
42. Fan W, Chang J, Fu P. Endocrine therapy resistance in breast cancer: current status, possible mechanisms and overcoming strategies. Future Med Chem. 2015;7(12):1511–9.
43. Greenwell K, Hussain L, Lee D, Bramlage M, Bills G, Mehta A, Jackson A, Wexelman B. Complete pathologic response rate to neoadjuvant chemotherapy increases with increasing HER2/CEP17 ratio in HER2 overexpressing breast cancer: analysis of the National Cancer Database (NCDB). Breast Cancer Res Treat. 2020;181:249–54.
44. Singer CF, Tan YY, Fitzal F, Steger GG, Egle D, Reiner A, Rudas M, Moinfar F, Gruber C, Petru E. Pathological complete response to neoadjuvant trastuzumab is dependent on HER2/CEP17 ratio in HER2-amplified early breast cancer. Clin Cancer Res. 2017;23(14):3676–83.
45. Singer CF, Tan YY, Fitzal F, Steger GG, Egle D, Reiner A, Rudas M, Moinfar F, Gruber C, Petru E, et al. Pathological Complete Response to Neoadjuvant Trastuzumab Is Dependent on HER2/CEP17 Ratio in HER2-Amplified Early Breast Cancer. Clin Cancer Res. 2017;23(14):3676–83.
46. Wolff AC, Hammond MEH, Hicks DG, Dowsett M, McShane LM, Allison KH, Allred DC, Bartlett JMS, Bilous M, Fitzgibbons P, et al. Recommendations for Human Epidermal Growth Factor Receptor 2 Testing in Breast Cancer: American Society of Clinical Oncology/College of American Pathologists Clinical Practice Guideline Update. Arch Pathol Lab Med. 2014;138(2):241–56.
47. Aswolinskiy W, Munari E, Horlings HM, Mulder L, Bogina G, Sanders J, Liu Y-H, van den Belt-dusebout AW, Tessier L, Balkenhol M, et al. PROACTING: predicting pathological complete response to neoadjuvant chemotherapy in breast cancer from routine diagnostic histopathology biopsies with deep learning. Breast Cancer Res. 2023;25(1).
48. Srinidhi CL, Ciga O, Martel AL. Deep neural network models for computational histopathology: A survey. Med Image Anal. 2021;67.
49. van der Laak J, Litjens G, Ciompi F. Deep learning in histopathology: the path to the clinic. Nat Med. 2021;27(5):775–84.

Additional Declarations
No competing interests reported.

Supplementary Files
Additionalfile1Featurename.xlsx: Additional file 1: Feature name.
Additionalfile2.docx: Additional file 2: Table S1. Feature weight information for the non-tumor and non-TILs of the Yale Response Dataset. Figure S1. Spearman correlation analysis of the features of non-tumor and non-TILs.
Additionalfile3Twosamplettest.xlsx: Additional file 3: Two sample t-test results.
Additionalfile4SpearmantestRIS.xlsx: Additional file 4: Spearman-test_RIS.
Editorial history: First submitted to journal 20 Mar, 2025. Submission checks completed at journal 21 Mar, 2025. Reviewers invited by journal 01 Apr, 2025. Reviewers agreed at journal 03 Apr, 2025. Reviews received at journal 23 Apr, 2025. Editorial decision: Revision requested 24 Apr, 2025.
Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-5786592","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":437092280,"identity":"744e1a52-fe77-413a-b827-828591ed089b","order_by":0,"name":"Wensheng Cui","email":"","orcid":"","institution":"Hangzhou Dianzi University","correspondingAuthor":false,"prefix":"","firstName":"Wensheng","middleName":"","lastName":"Cui","suffix":""},{"id":437092282,"identity":"c2e03e00-0520-40ef-bc8f-309a583bebbe","order_by":1,"name":"Ming Fan","email":"","orcid":"","institution":"Hangzhou Dianzi University","correspondingAuthor":false,"prefix":"","firstName":"Ming","middleName":"","lastName":"Fan","suffix":""},{"id":437092284,"identity":"568d463b-71c5-4f6a-b800-3cb79bb0d83a","order_by":2,"name":"Lihua Li","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAAoklEQVRIiWNgGAWjYBACPmbmNoaPDcwgtgFxWtiYGdsYZ5KmhYGxjZmXNC3sjG2PbXdYJzawN2+TYKi5Q5TD2o1zz6QnNvAcK5NgOPaMKC1t0rlthxMbJHLMJBgbDhOpxRKkRf4NKVoYwbbwkKBFsrct3biNJ63YIuEYEVr4+Q8fk/jZZi3bz354440PNURoQVgHIhJI0DAKRsEoGAWjAA8AAFlOMHOOi6GpAAAAAElFTkSuQmCC","orcid":"","institution":"Hangzhou Dianzi University","correspondingAuthor":true,"prefix":"","firstName":"Lihua","middleName":"","lastName":"Li","suffix":""}],"badges":[],"createdAt":"2025-01-08 
07:23:19","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-5786592/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-5786592/v1","draftVersion":[],"editorialEvents":[{"content":"https://doi.org/10.1186/s13058-025-02139-x","type":"published","date":"2025-10-21T16:16:29+00:00"}],"editorialNote":"","failedWorkflow":false,"files":[{"id":79907177,"identity":"2c1acbea-22e3-44d3-946a-47ad89305959","added_by":"auto","created_at":"2025-04-04 11:10:13","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":398458,"visible":true,"origin":"","legend":"\u003cp\u003eOverview of our pipeline. \u003cstrong\u003ea\u003c/strong\u003e, \u003cstrong\u003eb\u003c/strong\u003eand \u003cstrong\u003ec\u003c/strong\u003e represent the data conversion from WSIs to tiles and the removal of worthless tiles. \u003cstrong\u003ed\u003c/strong\u003e, \u003cstrong\u003ee\u003c/strong\u003e, and \u003cstrong\u003ef\u003c/strong\u003e represent multiple tissue classifications, the generation of TS-images, based on which the morphological information of tumor, non-tumor, non-TILs, TILs, and Lym. is extracted. \u003cstrong\u003eg\u003c/strong\u003e, \u003cstrong\u003eh\u003c/strong\u003e and \u003cstrong\u003ei\u003c/strong\u003e represent the construction of the pCR prediction model, feature analysis and model performance evaluation. 
T-CNN: CNN for tumor classification, N-CNN: CNN for necrosis classification, L-CNN: CNN for lymphocytes classification, Sign.: significance, GPL: global maximum pooling layer.\u003c/p\u003e","description":"","filename":"1.png","url":"https://assets-eu.researchsquare.com/files/rs-5786592/v1/dd1224a023c862834e3a2ae1.png"},{"id":79907169,"identity":"cf3a7ff5-60ed-487a-a33c-806fea7e72dd","added_by":"auto","created_at":"2025-04-04 11:10:13","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":109766,"visible":true,"origin":"","legend":"\u003cp\u003eHistopathology image region classification flow (The patient ID is IMPRESS HER2+ dataset: 080_HE). TI-CNN: CNN for tumor classification, N-CNN: CNN for necrosis classification, L-CNN: CNN for lymphocytes infiltrating classification, TILs: tumor infiltrating lymphocytes, non-TILs: non-tumor infiltrating lymphocytes, Lym.: lymphocytes.\u003c/p\u003e","description":"","filename":"2.png","url":"https://assets-eu.researchsquare.com/files/rs-5786592/v1/71115629e601e3b2f16b8f10.png"},{"id":79907163,"identity":"aa0d8679-a484-4d65-b2b1-a74ade0fd6fe","added_by":"auto","created_at":"2025-04-04 11:10:12","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":106617,"visible":true,"origin":"","legend":"\u003cp\u003eThree angles of extracting features for non-tumor and non-TILs. (The patient ID is IMPRESS HER2+ dataset: 080_HE)\u003c/p\u003e","description":"","filename":"3.png","url":"https://assets-eu.researchsquare.com/files/rs-5786592/v1/aa2c44a2281aa8dcf7165f06.png"},{"id":79909839,"identity":"99b4fbcf-b2a4-4651-8424-4a954d4ecde0","added_by":"auto","created_at":"2025-04-04 11:26:13","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":136670,"visible":true,"origin":"","legend":"\u003cp\u003eModel generalization performance validation results. 
\u003cstrong\u003ea \u003c/strong\u003eReceiver Operating Characteristic (ROC) curves for the prediction of pCR combining morphological and clinical features. \u003cstrong\u003eb \u003c/strong\u003eOnly morphological features were used to predict ROC curves for pCR. \u003cstrong\u003ec\u003c/strong\u003e Percentage training ROC curves for non-TILs. \u003cstrong\u003ed\u003c/strong\u003e Percentage training ROC curves for non-tumor. \u003cstrong\u003ee.\u003c/strong\u003eResults of the evaluation of the indicators of the percentage training of non-TILs. \u003cstrong\u003ef.\u003c/strong\u003e Results of the evaluation of the indicators of the percentage training of non-tumor.\u003c/p\u003e","description":"","filename":"4.png","url":"https://assets-eu.researchsquare.com/files/rs-5786592/v1/52af6e91ca617f6a825dc561.png"},{"id":79908260,"identity":"ac121639-9fde-4f59-a165-abbdff878fd7","added_by":"auto","created_at":"2025-04-04 11:18:13","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":81651,"visible":true,"origin":"","legend":"\u003cp\u003eFeature importance selected by LASSO. a. The importance of non-TILs morphological and clinical combined features. b. 
The importance of non-tumor morphological and clinical combined features.\u003c/p\u003e","description":"","filename":"5.png","url":"https://assets-eu.researchsquare.com/files/rs-5786592/v1/68f38fc86714e74624958840.png"},{"id":94490462,"identity":"4d230e15-4266-4b57-a608-d0f5a2926ee7","added_by":"auto","created_at":"2025-10-27 17:10:25","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1926815,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-5786592/v1/850c9b8c-2b60-48d3-8837-d085ac462e9e.pdf"},{"id":79907164,"identity":"69795712-cb5f-4803-bea3-c3622575a4eb","added_by":"auto","created_at":"2025-04-04 11:10:13","extension":"xlsx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":10969,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eAdditional file 1:\u003c/strong\u003e Feature name.\u003c/p\u003e","description":"","filename":"Additionalfile1Featurename.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-5786592/v1/1b3e579b3f8b21602c096ddc.xlsx"},{"id":79907176,"identity":"942042c1-5d7b-4222-972c-cb0c9dfdd736","added_by":"auto","created_at":"2025-04-04 11:10:13","extension":"docx","order_by":2,"title":"","display":"","copyAsset":false,"role":"supplement","size":732066,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eAdditional file 2: Table S1. \u003c/strong\u003eFeature weight information for the non-tumor and non-TILs of the Yale Response Dataset. \u003cstrong\u003eFigure S1. 
\u003c/strong\u003eSpearman correlation analysis of the non-tumor and non-TILs features.\u003c/p\u003e","description":"","filename":"Additionalfile2.docx","url":"https://assets-eu.researchsquare.com/files/rs-5786592/v1/fdd1d24d7adf3ff067da58db.docx"},{"id":79908261,"identity":"07c367bc-d3e5-4c03-9f4f-ab3971e56239","added_by":"auto","created_at":"2025-04-04 11:18:13","extension":"xlsx","order_by":3,"title":"","display":"","copyAsset":false,"role":"supplement","size":24351,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eAdditional file 3:\u003c/strong\u003e Two-sample t-test results.\u003c/p\u003e","description":"","filename":"Additionalfile3Twosamplettest.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-5786592/v1/a76af961a1760b3e48ab1414.xlsx"},{"id":79907173,"identity":"2176bb2f-bd80-44a0-97eb-70ec0109b3c0","added_by":"auto","created_at":"2025-04-04 11:10:13","extension":"xlsx","order_by":4,"title":"","display":"","copyAsset":false,"role":"supplement","size":21325,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eAdditional file 4 \u003c/strong\u003eSpearman-test_RIS.\u003c/p\u003e","description":"","filename":"Additionalfile4SpearmantestRIS.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-5786592/v1/b8133efd9bb48c3b672193c9.xlsx"}],"financialInterests":"No competing interests reported.","formattedTitle":"Morphological Analysis of Tumor Microenvironment in HER2-Positive Breast Cancer: Predicting Response to Neoadjuvant Chemotherapy on Histopathological Images","fulltext":[{"header":"Background","content":"\u003cp\u003eHER2-positive breast cancer (HER2\u0026thinsp;+\u0026thinsp;BC) is a distinct subtype of breast cancer characterized by overexpression or amplification of human epidermal growth factor receptor 2, accounting for approximately 20\u0026ndash;25% of breast cancer cases[\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e, \u003cspan citationid=\"CR2\" 
class=\"CitationRef\"\u003e2\u003c/span\u003e]. Compared to HER2-negative breast cancer, HER2\u0026thinsp;+\u0026thinsp;BC exhibits a higher propensity for aggressiveness and recurrence[\u003cspan additionalcitationids=\"CR4 CR5\" citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]. Neoadjuvant chemotherapy (NAC) is currently considered an effective treatment option for patients with HER2\u0026thinsp;+\u0026thinsp;BC[\u003cspan additionalcitationids=\"CR8\" citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e]. Pathological complete response (pCR) serves as one of the evaluation criteria used to assess the effectiveness of NAC[\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e]. Patients achieving pCR are expected to have a more favorable outcome than those with pathological noncomplete response (non-pCR)[\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e]. However, approximately 30% of HER2\u0026thinsp;+\u0026thinsp;BC patients do not attain pCR[\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e]. 
Consequently, predicting pCR before NAC in HER2\u0026thinsp;+\u0026thinsp;BC is crucial for guiding personalized treatment strategies and avoiding ineffective interventions.\u003c/p\u003e \u003cp\u003eCurrently, conventional clinical indicators such as tumor size[\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e], pathologic grade[\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e], Ki-67 proliferation index[\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e], and tumor-infiltrating lymphocytes (TILs)[\u003cspan additionalcitationids=\"CR17\" citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e] lack precision in predicting pCR. To address this limitation, researchers have introduced molecular biomarker approaches, including CALGB 40601[\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e], SPAG5[\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e], and PD-L1[\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e]. However, these methods present significant challenges in terms of cost and time investment. Beyond clinical indicators and molecular biomarkers for pCR prediction, several studies have demonstrated the effectiveness of artificial intelligence (AI) in analyzing medical images to forecast responses to NAC. These applications extend to radiological images, encompassing MRI[\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e, \u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e] and PET/CT[\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e]. 
In comparison to radiological images, histopathological images, considered the gold standard for disease diagnosis, offer extensive information on the tumor microenvironment (TME), providing insights into disease progression[\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e, \u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eAI technology has been employed by researchers to characterize TME information in immunohistochemical (IHC)-stained breast cancer histopathological images[\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e, \u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e]. While non-routine IHC staining methods (e.g., Ki-67 and PHH3) can capture high-value biomarkers within the TME, enhancing the predictive accuracy of pCR, they present challenges such as increased medical costs, extended assay times, and dependence on manual processes. Consequently, an increasing number of researchers are utilizing routine (e.g., hematoxylin and eosin, H\u0026amp;E) stained histopathological images to identify biomarkers within the TME. Fisher et al. extracted TME information from triple-negative breast cancer histopathology images, discovering that the interaction between tumor and TILs held significant predictive value for pCR[\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e]. Shen et al. employed AI to analyze histopathological images, extracting cell nuclei features as input for machine learning to predict pCR to NAC[\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e]. Li et al. introduced the tumor-associated stroma score to forecast the response of breast cancer patients to NAC[\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e]. 
Studies have demonstrated that the morphology of tissue in the TME correlates with the response of breast cancer to NAC[\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e, \u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e]. However, the potential of TME morphology in histopathological images has not yet been fully explored.\u003c/p\u003e \u003cp\u003eIn this study, we used deep learning techniques to segment H\u0026amp;E-stained core needle biopsy histopathological images of HER2\u0026thinsp;+\u0026thinsp;BC, producing tissue segmentation images (TS-images). Utilizing these TS-images, we extracted morphological information from various tissue components in the TME, including tumor, non-tumor, TILs, non-tumor-infiltrating lymphocytes (non-TILs), and lymphocytes, to predict pCR following NAC. The primary aim of this research is to support clinicians in patient selection, improve NAC treatment response rates, and contribute to the advancement of precision tumor therapy.\u003c/p\u003e"},{"header":"Materials and Methods","content":"\u003cp\u003eDatasets\u003c/p\u003e \u003cp\u003eA cohort of 147 patients who underwent NAC at Yale University and Purdue University was retrospectively analyzed. The model was developed on the Yale Response dataset from Yale University, while external validation on the IMPRESS HER2\u0026thinsp;+\u0026thinsp;dataset from Purdue University was used to assess the model's generalization capability. 
Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e presents a summary of the clinical and histopathological characteristics of these patients.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eClinical and histopathological characteristics of HER2\u0026thinsp;+\u0026thinsp;patients in the two datasets.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"5\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCohort\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colspan=\"2\" nameend=\"c3\" namest=\"c2\"\u003e \u003cp\u003eCharacteristics\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eCase#/Median\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003e%/Range\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\" morerows=\"7\" rowspan=\"8\"\u003e \u003cp\u003eYale Response dataset\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colspan=\"2\" nameend=\"c3\" namest=\"c2\"\u003e \u003cp\u003eTotal case number\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e85\u003c/p\u003e 
\u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e-\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colspan=\"2\" nameend=\"c3\" namest=\"c2\"\u003e \u003cp\u003eCases with pCR\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e36\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e42.35%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colspan=\"2\" nameend=\"c3\" namest=\"c2\"\u003e \u003cp\u003eCases with residual tumor\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e49\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e57.65%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colspan=\"2\" nameend=\"c3\" namest=\"c2\"\u003e \u003cp\u003eEstrogen receptor (ER) positive\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e69\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e81.18%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colspan=\"2\" nameend=\"c3\" namest=\"c2\"\u003e \u003cp\u003eProgesterone receptor (PR) positive\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e66\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e77.65%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colspan=\"2\" nameend=\"c3\" namest=\"c2\"\u003e \u003cp\u003eHER2/CEP17 ratio\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e3.14\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e0.00-17.40\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colspan=\"2\" nameend=\"c3\" 
namest=\"c2\"\u003e \u003cp\u003eResidual tumor size (cm)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e1.35\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e0.02-11.00\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colspan=\"2\" nameend=\"c3\" namest=\"c2\"\u003e \u003cp\u003eHER2CN (signals/cell)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e11.0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e3.3\u0026ndash;6.7\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\" morerows=\"14\" rowspan=\"15\"\u003e \u003cp\u003eIMPRESS HER2\u0026thinsp;+\u0026thinsp;dataset\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colspan=\"2\" nameend=\"c3\" namest=\"c2\"\u003e \u003cp\u003eTotal case number\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e62\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e-\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colspan=\"2\" nameend=\"c3\" namest=\"c2\"\u003e \u003cp\u003eCases with pCR\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e38\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e61.29%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colspan=\"2\" nameend=\"c3\" namest=\"c2\"\u003e \u003cp\u003eCases with residual tumor\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e24\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e38.71%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colspan=\"2\" nameend=\"c3\" namest=\"c2\"\u003e \u003cp\u003eEstrogen 
receptor (ER) positive\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e30\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e48.39%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colspan=\"2\" nameend=\"c3\" namest=\"c2\"\u003e \u003cp\u003eProgesterone receptor (PR) positive\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e19\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e30.65%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colspan=\"2\" nameend=\"c3\" namest=\"c2\"\u003e \u003cp\u003eHER2/CEP17 ratio\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e6.73\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e1.23\u0026ndash;22.98\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colspan=\"2\" nameend=\"c3\" namest=\"c2\"\u003e \u003cp\u003eResidual tumor size (cm)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e0.80\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e0.10-7.00\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colspan=\"2\" nameend=\"c3\" namest=\"c2\"\u003e \u003cp\u003eResidual cancer burden\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e1.39\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e0.91\u0026ndash;4.14\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colspan=\"2\" nameend=\"c3\" namest=\"c2\"\u003e \u003cp\u003eAge (years)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e56\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" 
colname=\"c5\"\u003e \u003cp\u003e30\u0026ndash;76\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\" morerows=\"2\" rowspan=\"3\"\u003e \u003cp\u003eNottingham Grade\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eⅠ\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e1.61%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eⅡ\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e27\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e43.55%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eⅢ\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e34\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e54.84%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\" morerows=\"2\" rowspan=\"3\"\u003e \u003cp\u003eNuclear grade\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eⅠ\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e0.00%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eⅡ\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e10\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e16.13%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eⅢ\u003c/p\u003e \u003c/td\u003e 
\u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e52\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e83.87%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eThe Yale Response dataset comprises whole slide images (WSIs) of pretreatment core needle biopsies and associated clinical information from 85 female patients with HER2\u0026thinsp;+\u0026thinsp;BC[\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e]. All WSIs were scanned at 20\u0026times; magnification using the Aperio ScanScope Console (v10.2.0.2352). Prior to surgery, each patient received treatment with trastuzumab\u0026thinsp;\u0026plusmn;\u0026thinsp;pertuzumab. Pathologists evaluated the efficacy of NAC based on the pathology reports of the surgical resection specimens. They defined pCR as the absence of residual invasive carcinoma, lymphovascular invasion, and metastatic carcinoma (36 cases, 42.35%). Conversely, cases exhibiting any residual invasive carcinoma, lymphovascular invasion, or metastatic carcinoma were classified as non-pCR (49 cases, 57.65%).\u003c/p\u003e \u003cp\u003eThe IMPRESS HER2\u0026thinsp;+\u0026thinsp;dataset comprises H\u0026amp;E-stained WSIs of pretreatment core needle biopsies and associated clinical information from 62 patients with HER2\u0026thinsp;+\u0026thinsp;BC[\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e]. Histopathologic slides were scanned at 20\u0026times; magnification using a Hamamatsu scanner. The majority of patients underwent NAC with Taxol (paclitaxel/docetaxel) and trastuzumab. A subset of 7 cases followed an alternative 4-cycle regimen of pertuzumab, trastuzumab, and docetaxel, with 3 cases not achieving pCR and 4 cases achieving pCR. The definition of pCR followed the same criteria as in the Yale Response dataset. 
The dataset exhibited a pCR rate of 61.29% (38 cases), while non-pCR cases constituted 38.71% (24 cases).\u003c/p\u003e \u003cp\u003eOverview of the framework\u003c/p\u003e \u003cp\u003eOur methodology comprises three primary components: histopathological image pre-processing, generation of TS-images and extraction of morphological information, and modeling for pCR prediction and evaluation. Figure\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e illustrates this comprehensive pipeline.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eHistopathological image pre-processing\u003c/p\u003e \u003cp\u003eDigital pathology tissue images contain billions of pixels, making direct input into deep learning models computationally infeasible[\u003cspan additionalcitationids=\"CR36\" citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e]. To address this challenge, we utilized the OpenSlide toolkit to divide WSIs into smaller tiles of 175\u0026thinsp;\u0026times;\u0026thinsp;175 pixels with three color channels[\u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e]. WSIs often contain ink contamination, blank areas, regions with little tissue, and fat regions. To reduce noise, we applied Otsu\u0026rsquo;s thresholding method to segment tissue regions and quantify the proportion of tissue pixels in each tile. Tiles with a tissue pixel fraction below 20% were excluded from further analysis to ensure that only research-relevant regions were retained.\u003c/p\u003e \u003cp\u003eTissue classification and generating tissue segmentation image\u003c/p\u003e \u003cp\u003eWe developed a transfer learning model using TensorFlow to classify tumor, non-tumor, and necrosis regions. Subsequently, we reconstructed WSIs from tile predictions to generate TS-images for feature extraction. 
These TS-images categorize regions into tumor, non-tumor, TILs, non-TILs, and lymphocytes, facilitating analysis, enhancing model generalization, and offering a more comprehensive analytical perspective than pixel-level analysis. The non-tumor region encompasses all tissues not classified as tumor, including stroma, lymphocytes that are not tumor-infiltrating, vascular structures, and other supportive cells such as fibroblasts and mesenchymal cells. Fat tissue was excluded during preprocessing and is not considered part of the non-tumor region. Non-TILs refer to lymphocytes located in the non-tumor regions but still within the TME, playing a role in the immune response. These include lymphocytes present in the peritumoral stroma, inflammatory infiltration zones, and immune cell clusters outside the tumor boundary yet within the TME. However, Non-TILs do not include lymphocytes directly infiltrating tumor regions.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe datasets for the classification tasks of each tissue type exhibit significant sample size disparities, suggesting that multi-task classification may not necessarily yield improved performance. A sequential binary classification approach was implemented to effectively classify each region. Figure\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e illustrates this process. Initially, the classification distinguished between tumor and non-tumor areas. Given the morphological similarity between necrosis and TILs, a Convolutional Neural Network (CNN) was developed to identify necrotic regions[\u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e]. Subsequently, TILs were classified. Following this, non-TILs were classified based on the non-tumor classification results. Finally, TILs and non-TILs were combined to form the lymphocyte category. 
Since the primary contribution of this study lies in morphological analysis rather than the classification process, the technical details of tissue region classification are provided in Additional File 2.\u003c/p\u003e \u003cp\u003eFeature extraction and construction\u003c/p\u003e \u003cp\u003eWe utilized the regionprops function of the publicly available scikit-image measure module (skimage.measure.regionprops) to extract morphological features from different regions of TS-images, encompassing tumor, non-tumor, non-TILs, TILs, and lymphocytes. Instead of pixel-level analysis, we treated each tile as the fundamental unit for feature extraction and conducted a whole-slide-based evaluation of tissue morphology at the tile level. By applying eight-connectivity at the tile level, we identified connected components of each tissue type, facilitating a comprehensive whole-slide assessment of morphological structures across the WSI. Using methods provided by the library, we extracted 46 sets of morphological features for each tissue region, following an approach similar to those proposed by Diao et al. [\u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e] and Wang et al. [\u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e]. These morphological features capture a wide range of tissue characteristics, including size-based metrics such as the number of connected components and the major and minor axis lengths, as well as shape-based metrics such as Euler number and eccentricity.\u003c/p\u003e \u003cp\u003eTo comprehensively characterize the spatial properties of the TME, we adopted three feature extraction perspectives. The all-region perspective computes the morphological features of all connected tissue components, providing an overview of the overall tissue structure. The largest-region perspective extracts morphological features from the largest connected component within each tissue region, reflecting the predominant structural characteristics. 
The significant region focuses on components that exceed 5% of the largest region\u0026rsquo;s area, calculating the mean and standard deviation of their morphological features to analyze local morphological variations and TME heterogeneity.\u003c/p\u003e \u003cp\u003eThe configuration of the percentage threshold for significant regions plays a crucial role in extracting meaningful morphological features. A threshold that is too high may reduce the sensitivity of spatial feature extraction, leading to the omission of critical structural details, while a threshold that is too low may introduce excessive noise, compromising the model\u0026rsquo;s generalization ability. For details on the determination of the significant region threshold, please refer to the experimental section below. For each tissue region, we extracted 11 feature sets from the largest region, 12 feature sets from the all-region perspective, and 23 feature sets from the significant-region perspective. The feature names are detailed in Additional File 1. Figure\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e illustrates the three perspectives used for extracting morphological features from non-tumor and non-TIL regions.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eIn addition to morphological information, we integrated clinical features into our analysis. The primary motivation for incorporating clinical features is to complement morphological characteristics extracted from pathological data and enhance the predictive performance for pCR. Studies have demonstrated that clinical variables, such as ER/PR status and HER2/CEP17 ratio, are significantly associated with patient prognosis[\u003cspan additionalcitationids=\"CR43\" citationid=\"CR42\" class=\"CitationRef\"\u003e42\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e]. 
Therefore, integrating these variables is expected to improve the model\u0026rsquo;s predictive capability and provide a more comprehensive assessment of treatment response. We retained only the clinical features available in both the Yale Response and IMPRESS HER2\u0026thinsp;+\u0026thinsp;datasets. Specifically, we used estrogen receptor status (ER+/-), ER%, progesterone receptor status (PR+/-), PR%, and the ratio of HER2 gene signals to chromosome 17 centromere signals (HER2/CEP17)[\u003cspan citationid=\"CR45\" class=\"CitationRef\"\u003e45\u003c/span\u003e, \u003cspan citationid=\"CR46\" class=\"CitationRef\"\u003e46\u003c/span\u003e] as clinical features.\u003c/p\u003e \u003cp\u003eMachine learning framework for predicting pathological complete response to neoadjuvant chemotherapy\u003c/p\u003e \u003cp\u003eInitially, we standardized each feature to a mean of 0 and a standard deviation of 1. Subsequently, we utilized the least absolute shrinkage and selection operator (LASSO) algorithm for feature selection, eliminating redundant features to enhance the model's generalization performance. The alpha value of the LASSO was determined using a stratified 10-fold cross-validated grid search algorithm. Finally, we trained a multilayer perceptron (MLP) on the selected features and validated the model on an external test cohort. The hyperparameters of the MLP were selected with the same procedure used for the LASSO alpha.\u003c/p\u003e \u003cp\u003eStatistical analysis methodology\u003c/p\u003e \u003cp\u003eWe compared morphological and clinical features from the Yale Response and IMPRESS HER2\u0026thinsp;+\u0026thinsp;datasets using two-sample t-tests. Furthermore, we utilized Spearman's rank correlation coefficients to evaluate the relationships between morphological features and residual infiltration size (RIS), as well as among the morphological features themselves. 
Each Spearman analysis yields a correlation coefficient (R) and a two-sided P-value. For all statistical analyses, a P-value below 0.05 was considered statistically significant.\u003c/p\u003e \u003cp\u003eModel performance evaluation metrics\u003c/p\u003e \u003cp\u003eWe employed various metrics to evaluate model performance, including area under the ROC curve (AUC), F1 score, positive predictive value (PPV), recall, and negative predictive value (NPV). In validation on independent multicenter datasets, an AUC exceeding 0.800 was considered evidence of excellent model performance, while an AUC greater than 0.700 indicated acceptable discriminative ability.\u003c/p\u003e \u003cp\u003eExperimental environment\u003c/p\u003e \u003cp\u003eThe hardware configuration for all experiments consisted of a high-performance computing cluster, featuring two NVIDIA Quadro RTX 6000 GPUs for parallel computing and a 2.0 TB hard disk for storage. For software, the OpenSlide (version: 1.2.0) toolkit was employed for WSI tiling, while TensorFlow (version: 2.4.1) was utilized for data loading, deep model training, and testing. Python was used in conjunction with the Scikit-learn (version: 1.2.2) and SciPy (version: 1.8.1) libraries for machine learning and statistical analysis. Furthermore, the OpenCV (version: 4.7.0), Pandas (version: 2.0.2), and Scikit-image (version: 0.19.3) libraries were utilized for image morphology feature extraction and data processing.\u003c/p\u003e"},{"header":"Results","content":"\u003cdiv id=\"Sec4\" class=\"Section2\"\u003e \u003ch2\u003eEvaluation of significant region thresholds\u003c/h2\u003e \u003cp\u003e \u003cb\u003eThe appropriate selection of the significant region threshold is crucial for accurately capturing the spatial characteristics of the TME. 
To systematically evaluate the impact of threshold selection, we tested four threshold values (1%, 3%, 5%, and 10%) for significant region measurement, extracted the corresponding morphological features, and integrated them with clinical features to predict pCR following NAC in HER2\u0026thinsp;+\u0026thinsp;breast cancer. The experimental methodology is described in the Methods section and is not repeated here. The experimental results on the external validation dataset (IMPRESS HER2\u0026thinsp;+) are presented in\u003c/b\u003e Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e, \u003cb\u003ewhile the results of morphology-only modeling are provided in Additional File 2: Table \u003cspan refid=\"MOESM2\" class=\"InternalRef\"\u003eS2\u003c/span\u003e.\u003c/b\u003e\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eComparison of the generalization performance of morphological and clinical feature combinations for five tissue types under different significant region thresholds in the IMPRESS HER2\u0026thinsp;+\u0026thinsp;dataset.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"7\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c6\" 
colnum=\"6\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c7\" colnum=\"7\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRegion\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eSign. thresholds\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eAUC\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eF1 score\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003ePPV\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c6\"\u003e \u003cp\u003eRecall\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c7\"\u003e \u003cp\u003eNPV\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\" morerows=\"3\" rowspan=\"4\"\u003e \u003cp\u003eTumor\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e1%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.716\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.787\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.661\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e\u003cb\u003e0.974\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e\u003cb\u003e0.750\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e3%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.708\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e 
\u003cp\u003e0.764\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.667\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.895\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.615\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e5%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e\u003cb\u003e0.732\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.740\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.771\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.711\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.586\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e10%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.724\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e\u003cb\u003e0.805\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e\u003cb\u003e0.795\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.816\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.680\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\" morerows=\"3\" rowspan=\"4\"\u003e \u003cp\u003eNon-tumor\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e1%\u003c/p\u003e \u003c/td\u003e \u003ctd 
align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.717\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.784\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.644\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e\u003cb\u003e1.000\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.800\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e3%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.726\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.795\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.775\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.816\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.667\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e5%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e\u003cb\u003e0.779\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e\u003cb\u003e0.883\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e\u003cb\u003e0.745\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.946\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e\u003cb\u003e0.846\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" 
colname=\"c2\"\u003e \u003cp\u003e10%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.734\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.806\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.714\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.926\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.727\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\" morerows=\"3\" rowspan=\"4\"\u003e \u003cp\u003eTILs\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e1%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e\u003cb\u003e0.684\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e\u003cb\u003e0.780\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e\u003cb\u003e0.727\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.842\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e\u003cb\u003e0.650\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e3%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.651\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.714\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.694\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.735\u003c/p\u003e 
\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.583\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e5%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.594\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.654\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.708\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.607\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.571\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e10%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.620\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.692\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.529\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e\u003cb\u003e1.000\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.000\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\" morerows=\"3\" rowspan=\"4\"\u003e \u003cp\u003eNon-TILs\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e1%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.774\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.805\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.775\u003c/p\u003e 
\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.838\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.683\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e3%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.822\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.827\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.756\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.912\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.778\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e5%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e\u003cb\u003e0.873\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e\u003cb\u003e0.889\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e\u003cb\u003e0.821\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e\u003cb\u003e0.970\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e\u003cb\u003e0.933\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e10%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.719\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.684\u003c/p\u003e \u003c/td\u003e \u003ctd 
align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.650\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.722\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.700\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\" morerows=\"3\" rowspan=\"4\"\u003e \u003cp\u003eLym.\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e1%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.675\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e\u003cb\u003e0.778\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.673\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e\u003cb\u003e0.921\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e\u003cb\u003e0.667\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e3%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e\u003cb\u003e0.691\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.698\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e\u003cb\u003e0.759\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.647\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.581\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e5%\u003c/p\u003e \u003c/td\u003e \u003ctd 
align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.668\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.730\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.676\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.793\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.632\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e10%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.676\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.688\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.647\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.733\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e\u003cb\u003e0.667\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eBest-performing values are highlighted in boldface. Sign.: significance, TILs: tumor-infiltrating lymphocytes, non-TILs: non-tumor-infiltrating lymphocytes, Lym.: lymphocytes.\u003c/p\u003e \u003cp\u003eTable\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e compares model performance across five tissue types under different significant region thresholds. The results indicate that the 5% threshold achieves the best overall performance, particularly in the non-tumor and non-TILs regions. 
In the non-tumor region, the 5% threshold yields the highest AUC (0.779), F1 score (0.883), and NPV (0.846), suggesting that it captures spatial features relevant to pCR prediction while limiting interference from irrelevant tissue. In the non-TILs region, the 5% threshold achieves the best AUC (0.873), F1 score (0.889), PPV (0.821), recall (0.970), and NPV (0.933), indicating that it retains the most informative TME features. In contrast, the 1% threshold raises recall in the non-tumor region (1.000) but lowers AUC (0.717) and NPV (0.800), suggesting that excess noise at this threshold degrades predictive performance. The 10% threshold, on the other hand, may degrade performance by discarding too much critical information. Although the 3% threshold improves on the 1% threshold in some regions, it remains less stable than the 5% threshold. Therefore, the 5% threshold is identified as the optimal choice, striking the best balance between noise reduction and feature retention and thereby improving model generalization.\u003c/p\u003e \u003cp\u003eMachine learning utilizing morphological and clinical features to predict the pCR\u003c/p\u003e \u003cp\u003eWe employed deep convolutional neural networks to segment histopathological images into five distinct regions: tumor, non-tumor, non-TILs, TILs, and lymphocytes. Each region is characterized by 46 morphological features, which are combined with 5 clinical features (HER2/CEP17 ratio, ER, PR, ER%, and PR%). Feature selection was performed using LASSO, and the selected features were input into an MLP classifier to predict the pCR to NAC. 
Table\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e presents the performance evaluation of the model on the external validation set, along with a comparative analysis against baseline models and the latest state-of-the-art approaches.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab3\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 3\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eGeneralization performance validation of pCR predictions in the IMPRESS HER2\u0026thinsp;+\u0026thinsp;dataset.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"7\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c7\" colnum=\"7\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eType\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eMethod\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eAUC\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eF1 score\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003ePPV\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c6\"\u003e 
\u003cp\u003eRecall\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c7\"\u003e \u003cp\u003eNPV\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\" morerows=\"4\" rowspan=\"5\"\u003e \u003cp\u003eMor. + Clinical\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eTumor\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.732\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.740\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.771\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.711\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.586\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNon-tumor\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.779\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.883\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.745\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.946\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.846\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNon-TILs\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e\u003cb\u003e0.873\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e\u003cb\u003e0.889\u003c/b\u003e\u003c/p\u003e 
\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.821\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.970\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.933\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eTILs\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.594\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.654\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.708\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.607\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.571\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eLym.\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.668\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.730\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.676\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.793\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.632\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\" morerows=\"4\" rowspan=\"5\"\u003e \u003cp\u003eMor.\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eTumor\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.656\u003c/p\u003e \u003c/td\u003e \u003ctd 
align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.656\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.808\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.553\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.526\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNon-tumor\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.746\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.843\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.761\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.946\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.857\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNon-TILs Mor.\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.766\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.817\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.763\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.879\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.750\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eTILs Mor.\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.562\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" 
colname=\"c4\"\u003e \u003cp\u003e0.719\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.639\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.821\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.625\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eLym. Mor.\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.625\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.716\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.632\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.828\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.600\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\" morerows=\"5\" rowspan=\"6\"\u003e \u003cp\u003eClinical\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eER\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.658\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.765\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.721\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.816\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.632\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eER%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e 
\u003cp\u003e0.668\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.800\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.692\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.947\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.800\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003ePR\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.669\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.795\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.700\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.921\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.750\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003ePR%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.645\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.686\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.750\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.632\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.533\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eHER2/CEP17 ratio\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.668\u003c/p\u003e 
\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.760\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.613\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e\u003cb\u003e1.000\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.000\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eTotal Clinical\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.716\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.805\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.750\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.868\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.722\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\" morerows=\"4\" rowspan=\"5\"\u003e \u003cp\u003eTiles Count\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eTumor\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.664\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.760\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.613\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e\u003cb\u003e1.000\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.000\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" 
colname=\"c2\"\u003e \u003cp\u003eNon-tumor\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.680\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.787\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.661\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.974\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.833\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eTILs\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.552\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.768\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.623\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e\u003cb\u003e1.000\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.999\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNon-TILs\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.651\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.688\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.846\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.579\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.556\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" 
colname=\"c2\"\u003e \u003cp\u003eLym.\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.596\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.763\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.627\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.974\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.666\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\" morerows=\"4\" rowspan=\"5\"\u003e \u003cp\u003eRelative\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eTNR\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.576\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.769\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.660\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.921\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.667\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eILTR\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.562\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.784\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.644\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e\u003cb\u003e1.000\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e 
\u003cp\u003e\u003cb\u003e1.000\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNTLR\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.571\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.758\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.632\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.947\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.600\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eLNTR\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.587\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.768\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.623\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e\u003cb\u003e1.000\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.999\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eLD\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.518\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.761\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.648\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.921\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e 
\u003cp\u003e0.625\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\" morerows=\"2\" rowspan=\"3\"\u003e \u003cp\u003eLatest Work\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eIMPRESS (H\u0026amp;E only)[\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.812\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.827\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e\u003cb\u003e0.906\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.761\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.698\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003ePathologists\u0026rsquo; features[\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.788\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.782\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.870\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.711\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.645\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eLTR[\u003cspan citationid=\"CR47\" class=\"CitationRef\"\u003e47\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e 
\u003cp\u003e0.541\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.760\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.613\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e\u003cb\u003e1.000\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.000\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eMor.: Morphological. PPV: positive predictive value; NPV: negative predictive value. Best-performing values are highlighted in boldface.\u003c/p\u003e \u003cp\u003eThe results indicate that the most effective predictive model is a hybrid approach integrating morphological and clinical features. Among these, the non-TILs model achieved the highest AUC (0.873) and F1-score (0.889), demonstrating superior predictive performance. Within purely clinical features, the HER2/CEP17 ratio exhibited the highest recall (1.000), highlighting its critical role in pCR prediction.\u003c/p\u003e \u003cp\u003eRegarding Tiles Count features, the results show that the non-tumor tile count (AUC\u0026thinsp;=\u0026thinsp;0.680) outperformed the tumor tile count (AUC\u0026thinsp;=\u0026thinsp;0.664), suggesting that information from non-tumor regions may contribute significantly to pCR prediction. However, compared to models incorporating both morphological and clinical features, tile count-based models exhibit limited predictive power. Similarly, relative tissue metrics (e.g., TNR, ILTR) demonstrated lower predictive performance, with AUC values ranging from 0.518 to 0.587, indicating that relying solely on tissue proportion metrics provides limited discriminative ability. 
The definitions and computational formulas for these relative features are detailed in Additional File 2.\u003c/p\u003e \u003cp\u003eComprehensive external validation further confirms that features derived from \u0026ldquo;non-tumor\u0026rdquo; regions (including both morphological and clinical features) exhibit the most promising performance in pCR prediction, with AUC values ranging from 0.779 to 0.873. Although some models showed improvements in specific metrics, their overall predictive ability remained inferior to the best-performing hybrid model integrating both morphological and clinical features. Furthermore, our findings indicate that the top-performing model surpasses pathologists in pCR prediction, reinforcing that \u0026ldquo;non-tumor\u0026rdquo; morphological features and clinical data are highly effective indicators for predicting pCR in HER2\u0026thinsp;+\u0026thinsp;breast cancer patients undergoing NAC.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eTo validate the effectiveness of the method, we compared its performance with recent studies. Table\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e demonstrates that the model trained on morphological and clinical features of non-TILs exhibits the highest overall performance. Furthermore, the non-tumor model displayed superior performance relative to the LTR reference method[\u003cspan citationid=\"CR47\" class=\"CitationRef\"\u003e47\u003c/span\u003e]. 
In conclusion, based on the comprehensive evaluation presented in Table\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e, Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e.a, and Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e.b, the model trained on the combined features of the \"non-tumor\" regions demonstrates the highest generalization capability. Consequently, our subsequent analyses focused on the non-tumor and non-TILs components due to their superior performance.\u003c/p\u003e \u003cp\u003eThe generalization ability of the model was tested by training on randomly selected percentages of the training set. As illustrated in Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e, the model's performance exhibits a general upward trend as the percentage of the training set increases. Notably, the model performs effectively even with a small proportion of the training set. Using only the morphological features extracted from the non-TILs regions of 20% of the training set (Yale Response dataset) WSIs, the AUC on the external validation set reached 0.722 (as shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e.c).\u003c/p\u003e \u003cp\u003eFurthermore, the non-TILs consistently outperformed the non-tumor regions. These experimental results indicate that information derived from \"non-tumor\" regions possesses substantial generalization capability for predicting pCR to NAC.\u003c/p\u003e \u003cp\u003eFeature importance analysis\u003c/p\u003e \u003cp\u003eTo understand how the combined features of non-tumor and non-TILs contribute to the prediction, we analyzed the weights of the features selected by LASSO. 
Figure\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e illustrates the weights assigned to the LASSO-selected features, while Additional File 2: Table \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e presents the three most favorable and unfavorable features. Furthermore, we employed Spearman's test to examine the correlation between paired features, identifying potential dependencies. The results for the Yale Response dataset are displayed in Additional File 2: Figure \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eFigure \u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e.a illustrates spatial characteristics of the tumor immune microenvironment (TIME). The three most favorable features, associated with significance regions, are the significance region number, significance area filled mean, and significance area convex mean. A higher number of significant non-TIL regions may indicate a more robust immune response to cancer cells, potentially correlating with a positive response to NAC. Furthermore, a greater area filled mean of significant non-TILs suggests a larger distribution area of lymphocytes, possibly contributing to a favorable pCR. Similarly, a larger mean convex hull area of significant regions indicates a broader spatial distribution of lymphocytes or more complex margins, which may facilitate an improved treatment response. The most adverse factors include the HER2/CEP17 ratio, the largest eccentricity, and the significance eccentricity mean. While pCR and the HER2/CEP17 ratio typically exhibit a positive correlation, this feature set, comprising only immunomorphological features, demonstrates a negative correlation. This suggests that non-TIL morphological features may be more indicative for pCR prediction. 
Eccentricity, describing the shape of an ellipse, ranges from 0 to 1, with lower values indicating a more circular shape. Consequently, lymphocyte infiltration patterns closer to circular may be associated with a more favorable treatment response.\u003c/p\u003e \u003cp\u003eFigure \u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e.b provides spatial information for the TME. The three most favorable features are the largest area, the largest area convex, and the significance area convex mean. A larger area potentially indicates a higher proportion of relevant immune cells. Moreover, larger values for the largest area convex and significance area convex mean suggest a wider distribution of non-tumor cells. This spatial configuration positively correlates with the probability of pCR. Conversely, three adverse factors are identified: the significance euler number std, the largest axis minor length, and the HER2/CEP17 ratio. The Euler number, quantifying the number of voids in the cavity region, represents spatial completeness and serves as a valuable indicator of connectivity variability in the non-tumor spatial distribution. A greater standard deviation of the Euler number in significant regions indicates more substantial differences in connectivity between regions. This suggests a potential relationship between pCR and the variability of connectivity in the non-tumor spatial distribution. The minor axis length represents the short axis length of the ellipse, and a smaller short axis length in the largest region implies a flatter morphological distribution. Accordingly, a flatter distribution of non-tumor regions appears to be more conducive to pCR.\u003c/p\u003e \u003cp\u003eIn the \"non-tumor\" region, the TME demonstrates a correlation with TIME. Consequently, the features of these two regions, as illustrated in Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e, exhibit overlap. 
PR, significance area convex mean, and significance number emerge as common favorable features. Conversely, the largest eccentricity, significance eccentricity mean, and HER2/CEP17 ratio present as common unfavorable features.\u003c/p\u003e \u003cp\u003eUnivariate analysis\u003c/p\u003e \u003cp\u003eWe employed statistical tools to examine the relationship between the combined features of non-tumor and non-TILs and pCR. Initially, we utilized the normaltest function from the SciPy package to conduct a normality test on the features. The results indicate that all features conform to a normal distribution. Subsequently, we analyzed feature differences between the pCR and non-pCR groups using a two-sample t-test. We identified the most significant favorable and adverse features of non-tumor and non-TILs selected by LASSO. Figure\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e illustrates the statistical analysis results through box plots, while Additional File 3 presents the comprehensive findings. In the Yale Response dataset analysis (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e.a and Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e.b), we observed significant differences between the two groups for most features, with the exception of the HER2/CEP17 ratio. Furthermore, these features tend to carry large favorable or unfavorable weights, as depicted in Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e. Examples include the Yale Response dataset's significance area filled mean (p-value\u0026thinsp;=\u0026thinsp;0.026) and the IMPRESS HER2\u0026thinsp;+\u0026thinsp;dataset's significance number (p-value\u0026thinsp;=\u0026thinsp;0.003). However, some significant features carry small weights, such as PR (p-value\u0026thinsp;=\u0026thinsp;0.008) in the IMPRESS HER2\u0026thinsp;+\u0026thinsp;dataset. 
The experimental results demonstrate that individual features also possess the capacity to distinguish between pCR and non-pCR cases.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eIn addition to the two-sample t-test, we employed Spearman's test to analyze the correlation between non-tumor and non-TILs morphological features and residual invasion size (RIS). For patients with pCR, RIS was defined as 0. In the Yale Response and IMPRESS HER2\u0026thinsp;+\u0026thinsp;datasets, the median RIS values were 1.35 cm (range 0.02\u0026ndash;11.00 cm) and 0.80 cm (range 0.10\u0026ndash;7.00 cm), respectively. We identified the most significant favorable and adverse morphological features of non-tumor and non-TILs selected by LASSO. These features underwent statistical analysis using scatter plots, as illustrated in Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e. Additional File 4 presents the complete results. In Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e.f, the significance area convex mean (p-value\u0026thinsp;=\u0026thinsp;0.006, R = -0.299) of non-tumor in the Yale Response dataset exhibited a significant negative correlation with RIS. In Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e.g, the significance number (p-value\u0026thinsp;=\u0026thinsp;0.009, R\u0026thinsp;=\u0026thinsp;0.336) of non-TILs in the IMPRESS HER2\u0026thinsp;+\u0026thinsp;dataset demonstrated a significant positive correlation with RIS. In Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e.h, the all filled area number (p-value\u0026thinsp;=\u0026thinsp;0.012, R = -0.317) of non-tumor in the IMPRESS HER2\u0026thinsp;+\u0026thinsp;dataset showed a significant negative correlation with RIS. 
The significance area convex mean of non-tumor yielded different results in Figs.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e.f and 6.h. This discrepancy likely arose from the distinct distributions of the two datasets. Furthermore, additional features significantly correlated with RIS can be utilized for quantitative prediction of RIS values (e.g., the significance area number in Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e.g).\u003c/p\u003e \u003c/div\u003e"},{"header":"Discussion","content":"\u003cp\u003eComputational pathology methodologies have been extensively employed in clinical applications to support diagnostic, outcome, and predictive objectives[\u003cspan citationid=\"CR48\" class=\"CitationRef\"\u003e48\u003c/span\u003e, \u003cspan citationid=\"CR49\" class=\"CitationRef\"\u003e49\u003c/span\u003e]. Moreover, the analysis and assessment of TME biomarkers in histopathological images of breast cancer can contribute to improved patient outcomes. To our knowledge, limited research has focused on predicting pCR to NAC in HER2\u0026thinsp;+\u0026thinsp;BC patients through quantification of TME morphological information. This study integrates morphological features with clinical data to construct combined features, which are subsequently input into LASSO for feature selection. The selected features are then fed into an MLP to predict the response to NAC treatment.\u003c/p\u003e \u003cp\u003eThis research offers several advantages. Firstly, it addresses the challenge of analyzing large-scale digital histopathological images using machine learning algorithms. While researchers typically study WSIs by tiling, the size of the tiles can lead to information biases and unstable model performance. This study employed deep learning to segment the WSIs and generate TS-images, which were then used as a foundation to extract spatial features of TME, thereby replacing the WSI. 
This approach provides a more robust and repeatable method for feature extraction compared to tile-level analysis, effectively reducing data complexity and minimizing noise. Secondly, this study conducts a comprehensive analysis of WSI regions by segmenting TS-images into five areas: tumor, non-tumor, TILs, non-TILs, and lymphocytes. The research demonstrates that combined morphological and clinical features of non-tumor and non-TILs performed better in predicting the response to NAC. Thirdly, while AI is regarded as a potential tool for guiding clinical decisions, many deep learning-based studies lack interpretability and repeatability. This study utilizes deep learning methods for foundational image segmentation tasks on WSIs and employs image analysis methods to extract interpretable features from significant and largest regions. In external validation, the best-performing model achieved an AUC of 0.873, outperforming various baseline models (Clinical, Tiles Count, and Relative), pathologists, and the latest state-of-the-art approaches. Additionally, we evaluated the appropriateness of the significant region threshold selection and conducted a comprehensive analysis of interpretable morphological features from multiple perspectives, including the contribution of LASSO-selected features. Lastly, this paper addresses the lack of external validation in most studies due to limited public datasets on NAC. It utilizes the publicly available Yale University dataset and the IMPRESS HER2\u0026thinsp;+\u0026thinsp;dataset for multicenter validation, verifying the model's robustness by training on varying percentages of the training set. Experimental results demonstrate the model's good generalizability when trained on \"non-tumor\" morphological and clinical combined features.\u003c/p\u003e \u003cp\u003eThis study has several limitations. First, the relatively small dataset may impact the statistical significance of our findings. 
Our focus on utilizing a single molecular profile to predict the response in HER2\u0026thinsp;+\u0026thinsp;BC patients might limit the generalizability of our results to other contexts. Furthermore, our approach prioritizes morphological features due to their interpretability and computational efficiency, without incorporating additional vision-based features, such as Haralick texture features or deep learning-based embeddings. Although these features could potentially provide complementary information, they were not included in this study. Future work could explore hybrid feature representations, integrating deep learning-derived features with traditional features to further enhance predictive performance.\u003c/p\u003e"},{"header":"Conclusions","content":"\u003cp\u003eOur study introduces a novel approach for predicting NAC response in HER2\u0026thinsp;+\u0026thinsp;breast cancer by integrating morphological and clinical features from histopathological images. Compared to traditional molecular biomarker methods, our approach is cost-effective, interpretable, and highly generalizable across independent datasets. Initially, deep learning techniques are employed to segment the WSI into tumor, necrosis, and lymphocyte regions, generating TS-images. Subsequently, morphological features are extracted from tumor, non-tumor, TILs, non-TILs, and lymphocyte areas, enabling a thorough exploration of TME and TIME information. To enhance the model's pCR prediction performance, morphological features are combined with clinical data. Experimental results demonstrate that the integration of non-tumor and non-TILs features significantly improves NAC treatment prediction accuracy, surpassing recent comparable studies. This research presents an automated, accurate, interpretable, cost-effective, and reproducible method for extracting spatial information from the TME. 
Future work will focus on expanding external data validation and applying this methodology to other breast cancer molecular subtypes and cancer types for NAC treatment response prediction.\u003c/p\u003e"},{"header":"Abbreviations","content":"\u003cdiv class=\"DefinitionList\"\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eNAC\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eNeoadjuvant chemotherapy\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eHER2\u0026thinsp;+\u0026thinsp;BC\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eHER2-positive breast cancer\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003epCR\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003ePathological complete response\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eTME\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eTumor microenvironment\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eTIME\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eTumor immune microenvironment\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eTS-images\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eTissue segmentation images\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eTILs\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eTumor infiltrating lymphocytes\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003enon-TILs\u003c/div\u003e \u003cdiv 
class=\"Description\"\u003e \u003cp\u003enon-tumor infiltrating lymphocytes\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eLASSO\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eLeast absolute shrinkage and selection operator\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eAUC\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eArea under the curve\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eHER2+\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eHuman epidermal growth factor receptor 2 positive\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eAI\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eArtificial intelligence\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eH\u0026amp;E\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003ehematoxylin and eosin\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eTNBC\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eTriple-negative breast cancer\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eIHC\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eimmunohistochemical\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eER+\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eEstrogen receptor positive\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv 
class=\"Term\"\u003ePR+\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eProgesterone receptor positive\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eWSIs\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eWhole slide images\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eTI-CNN\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eCNN for tumor classification\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eN-CNN\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eCNN for necrosis classification\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eL-CNN\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eCNN for lymphocyte infiltration classification\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eSign.\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003esignificance\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eGPL\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eglobal pooling layer\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eHER2/CEP17 ratio\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eHER2 expression to the expression of chromosome 17 ratio\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eLym.\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003elymphocytes\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003ePPV\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003epositive predictive value\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eNPV\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003enegative predictive value\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eRIS\u003c/div\u003e \u003cdiv class=\"Description\"\u003e 
\u003cp\u003eResidual infiltration size\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eMor.\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003emorphological features\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003ePTD\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003ePertuzumab\u0026thinsp;+\u0026thinsp;trastuzumab\u0026thinsp;+\u0026thinsp;docetaxel\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eROC\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eReceiver Operating Characteristic.\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003c/div\u003e"},{"header":"Declarations","content":"\u003cp\u003eEthics Approval and Consent to Participate\u0026nbsp;\u003c/p\u003e\n\u003cp\u003ePatient consent was not required because all\u0026nbsp;samples were archival.\u003c/p\u003e\n\u003cp\u003eConsent for publication\u003c/p\u003e\n\u003cp\u003eNot applicable.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAvailability of data and materials\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe datasets used in this paper are publicly available and can be accessed through the literature index.\u003c/p\u003e\n\u003cp\u003eCompeting interest\u003c/p\u003e\n\u003cp\u003eThe authors declare no competing interests.\u003c/p\u003e\n\u003cp\u003eAcknowledgements and funding\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eThis work was partially supported by grants from the National Natural Science Foundation of China (U21A20521 and 62271178), and the Natural Science Foundation of Zhejiang Province of China (LR23F010002).\u003c/p\u003e\n\u003cp\u003eAuthor Contributions\u003c/p\u003e\n\u003cp\u003eWenSheng Cui, Lihua Li, and Fan Ming performed the research. WenSheng Cui and Fan Ming designed the research study. 
WenSheng Cui collected the data. WenSheng Cui analyzed the data. WenSheng Cui wrote the paper. Lihua Li and Fan Ming oversaw the project. All authors provided critical feedback on the manuscript and approved the final version of the paper.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eGiaquinto AN, Sung H, Miller KD, Kramer JL, Newman LA, Minihan A, Jemal A, Siegel RL. Breast Cancer Statistics, 2022. Ca-a Cancer J Clin. 2022;72(6):524\u0026ndash;41.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLoibl S, Gianni L. HER2-positive breast cancer. Lancet. 2017;389(10087):2415\u0026ndash;29.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAndrulis IL, Bull SB, Blackstein ME, Sutherland D, Mak C, Sidlofsky S, Pritzker KP, Hartwick RW, Hanna W, Lickley L, et al. neu/erbB-2 amplification identifies a poor-prognosis group of women with node-negative breast cancer. Toronto Breast Cancer Study Group. J Clin oncology: official J Am Soc Clin Oncol. 1998;16(4):1340\u0026ndash;9.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSeshadri R, Firgaira FA, Horsfall DJ, McCaul K, Setlur V, Kitchen P. Clinical significance of HER-2/neu oncogene amplification in primary breast cancer. The South Australian Breast Cancer Study Group. J Clin oncology: official J Am Soc Clin Oncol. 1993;11(10):1936\u0026ndash;42.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTiwari RK, Borgen PI, Wong GY, Cordon-Cardo C, Osborne MP. HER-2/neu amplification and overexpression in primary human breast cancer is associated with early metastasis. Anticancer Res. 1992;12(2):419\u0026ndash;25.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSlamon DJ, Clark GM, Wong SG, Levin WJ, Ullrich A, McGuire WL. Human breast cancer: correlation of relapse and survival with amplification of the HER-2/neu oncogene. Sci (New York NY). 
1987;235(4785):177\u0026ndash;82.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDerks MGM, van de Velde CJH. Neoadjuvant chemotherapy in breast cancer: more than just downsizing. Lancet Oncol. 2018;19(1):2\u0026ndash;3.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMasood S. Neoadjuvant chemotherapy in breast cancers. Women's health (London England). 2016;12(5):480\u0026ndash;91.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCharfare H, Limongelli S, Purushotham AD. Neoadjuvant chemotherapy in breast cancer. Br J Surg. 2005;92(1):14\u0026ndash;23.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHaque W, Verma V, Hatch S, Klimberg VS, Butler EB, Teh BS. Response rates and pathologic complete response by breast cancer molecular subtype following neoadjuvant chemotherapy. Breast Cancer Res Treat. 2018;170(3):559\u0026ndash;67.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCortazar P, Zhang L, Untch M, Mehta K, Costantino JP, Wolmark N, Bonnefoi H, Cameron D, Gianni L, Valagussa P, et al. Pathological complete response and long-term clinical benefit in breast cancer: the CTNeoBC pooled analysis. Lancet. 2014;384(9938):164\u0026ndash;72.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSchettini F, Pascual T, Conte B, Chic N, Braso-Maristany F, Galvan P, Martinez O, Adamo B, Vidal M, Munoz M et al. HER2-enriched subtype and pathological complete response in HER2-positive breast cancer: A systematic review and meta-analysis. Cancer Treat Rev 2020, 84.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGoorts B, van Nijnatten TJA, de Munck L, Moossdorff M, Heuts EM, de Boer M, Lobbes MBI, Smidt ML. Clinical tumor stage is the most important predictor of pathological complete response rate after neoadjuvant chemotherapy in breast cancer patients. Breast Cancer Res Treat. 
2017;163(1):83\u0026ndash;91.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLips EH, Mulder L, de Ronde JJ, Mandjes IAM, Koolen BB, Wessels LFA, Rodenhuis S, Wesseling J. Breast cancer subtyping by immunohistochemistry and histological grade outperforms breast cancer intrinsic subtypes in predicting neoadjuvant chemotherapy response. Breast Cancer Res Treat. 2013;140(1):63\u0026ndash;71.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAlba E, Lluch A, Ribelles N, Anton-Torres A, Sanchez-Rovira P, Albanell J, Calvo L, Lopez Garcia-Asenjo JA, Palacios J, Ignacio Chacon J, et al. High Proliferation Predicts Pathological Complete Response to Neoadjuvant Chemotherapy in Early Breast Cancer. Oncologist. 2016;21(2):150\u0026ndash;5.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLiefaard MC, van der Voort A, van Seijen M, Thijssen B, Sanders J, Vonk S, Mittempergher L, Bhaskaran R, de Munck L, van Leeuwen-Stok AE et al. Tumor-infiltrating lymphocytes in HER2-positive breast cancer treated with neoadjuvant chemotherapy and dual HER2-blockade. Npj Breast Cancer 2024, 10(1).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLuque M, Sanz-Alvarez M, Morales-Gallego M, Madoz-Gurpide J, Zazo S, Dominguez C, Cazorla A, Izarzugaza Y, Arranz JL, Cristobal I et al. Tumor-Infiltrating Lymphocytes and Immune Response in HER2-Positive Breast Cancer. \u003cem\u003eCancers\u003c/em\u003e 2022, 14(24).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDenkert C, von Minckwitz G, Darb-Esfahani S, Lederer B, Heppner BI, Weber KE, Budczies J, Huober J, Klauschen F, Furlanetto J, et al. Tumour-infiltrating lymphocytes and prognosis in different subtypes of breast cancer: a pooled analysis of 3771 patients treated with neoadjuvant therapy. Lancet Oncol. 
2018;19(1):40\u0026ndash;50.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCarey LA, Berry DA, Cirrincione CT, Barry WT, Pitcher BN, Harris LN, Ollila DW, Krop IE, Henry NL, Weckstein DJ, et al. Molecular Heterogeneity and Response to Neoadjuvant Human Epidermal Growth Factor Receptor 2 Targeting in CALGB 40601, a Randomized Phase III Trial of Paclitaxel Plus Trastuzumab With or Without Lapatinib. J Clin Oncol. 2016;34(6):542\u0026ndash;.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAbdel-Fatah TMA, Agarwal D, Liu D-X, Russell R, Rueda OM, Liu K, Xu B, Moseley PM, Green AR, Pockley AG, et al. [SPAG5] as a prognostic biomarker and chemotherapy sensitivity predictor in breast cancer: a retrospective, integrated genomic, transcriptomic, and protein analysis. Lancet Oncol. 2016;17(7):1004\u0026ndash;18.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWimberly H, Brown JR, Schalper K, Haack H, Silver MR, Nixon C, Bossuyt V, Pusztai L, Lannin DR, Rimm DL. PD-L1 Expression Correlates with Tumor-Infiltrating Lymphocytes and Response to Neoadjuvant Chemotherapy in Breast Cancer. Cancer Immunol Res. 2015;3(4):326\u0026ndash;32.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhang B, Yu Y, Mao Y, Wang H, Lv M, Su X, Wang Y, Li Z, Zhang Z, Bian T, et al. Development of MRI-Based Deep Learning Signature for Prediction of Axillary Response After NAC in Breast Cancer. Acad Radiol. 2024;31(3):800\u0026ndash;11.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLi Z, Gao J, Zhou H, Li X, Zheng T, Lin F, Wang X, Chu T, Wang Q, Wang S et al. Multiregional dynamic contrast-enhanced MRI-based integrated system for predicting pathological complete response of axillary lymph node to neoadjuvant chemotherapy in breast cancer: multicentre study. 
Ebiomedicine 2024, 107.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSeban R-D, Arnaud E, Loirat D, Cabel L, Cottu P, Djerroudi L, Hescot S, Loap P, Bonneau C, Bidard F-C, et al. 18F FDG PET/CT for predicting triple-negative breast cancer outcomes after neoadjuvant chemotherapy with or without pembrolizumab. Eur J Nucl Med Mol Imaging. 2023;50(13):4024\u0026ndash;35.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDerouane F, van Marcke C, Berliere M, Gerday A, Fellah L, Leconte I, Van Bockstal MR, Galant C, Corbet C, Duhoux FP. Predictive Biomarkers of Response to Neoadjuvant Chemotherapy in Breast Cancer: Current and Future Perspectives for Precision Medicine. Cancers 2022, 14(16).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMungenast F, Fernando A, Nica R, Boghiu B, Lungu B, Batra J, Ecker RC. Next-Generation Digital Histopathology of the Tumor Microenvironment. Genes 2021, 12(4).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHuang Z, Shao W, Han Z, Alkashash AM, de la Sancha C, Parwani AVV, Nitta H, Hou Y, Wang T, Salama P et al. Artificial intelligence reveals features associated with breast cancer neoadjuvant chemotherapy responses from multi-stain histopathologic images. Npj Precision Oncol 2023, 7(1).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDuanmu H, Bhattarai S, Li H, Shi Z, Wang F, Teodoro G, Gogineni K, Subhedar P, Kiraz U, Janssen EAM, et al. A spatial attention guided deep learning system for prediction of pathological complete response using breast cancer histopathology images. Bioinformatics. 2022;38(19):4605\u0026ndash;12.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFisher TB, Saini G, Rekha TS, Krishnamurthy J, Bhattarai S, Callagy G, Webber M, Janssen EAM, Kong J, Aneja R. Digital image analysis and machine learning-assisted prediction of neoadjuvant chemotherapy response in triple-negative breast cancer. 
Breast Cancer Res 2024, 26(1).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eShen B, Saito A, Ueda A, Fujita K, Nagamatsu Y, Hashimoto M, Kobayashi M, Mirza AH, Graf HP, Cosatto E, et al. Development of multiple AI pipelines that predict neoadjuvant chemotherapy response of breast cancer using H\u0026amp;E-stained tissues. J Pathol Clin Res. 2023;9(3):182\u0026ndash;94.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLi F, Yang Y, Wei Y, Zhao Y, Fu J, Xiao X, Zheng Z, Bu H. Predicting neoadjuvant chemotherapy benefit using deep learning from stromal histology in breast cancer. Npj Breast Cancer 2022, 8(1).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRodriguez-Bejarano OH, Parra-Lopez C, Patarroyo MA. A review concerning the breast cancer-related tumour microenvironment. Crit Rev Oncol Hematol 2024, 199.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eVanguri RS, Fenn KM, Kearney MR, Wang Q, Guo H, Marks DK, Chin C, Alcus CF, Thompson JB, Leu C-S, et al. Tumor Immune Microenvironment and Response to Neoadjuvant Chemotherapy in Hormone Receptor/HER2\u0026thinsp;+\u0026thinsp;Early Stage Breast Cancer. Clin Breast Cancer. 2022;22(6):538\u0026ndash;46.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFarahmand S, Fernandez AI, Ahmed FS, Rimm DL, Chuang JH, Reisenbichler E, Zarringhalam K. Deep learning trained on hematoxylin and eosin tumor region of Interest predicts HER2 status and trastuzumab treatment response in HER2\u0026thinsp;+\u0026thinsp;breast cancer. Mod Pathol. 2022;35(1):44\u0026ndash;51.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAlbusayli R, Graham JD, Pathmanathan N, Shaban M, Raza SEA, Minhas F, Armes JE, Rajpoot N. Artificial intelligence-based digital scores of stromal tumour-infiltrating lymphocytes and tumour-associated stroma predict disease-specific survival in triple-negative breast cancer. J Pathol. 
2023;260(1):32\u0026ndash;42.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFu Y, Jung AW, Torne RV, Gonzalez S, Vohringer H, Shmatko A, Yates LR, Jimenez-Linan M, Moore L, Gerstung M. Pan-cancer computational histopathology reveals mutations, tumor composition and prognosis. Nat Cancer. 2020;1(8):800\u0026ndash;.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAmgad M, Stovgaard ES, Balslev E, Thagaard J, Chen W, Dudgeon S, Sharma A, Kerner JK, Denkert C, Yuan Y, et al. Report on computational assessment of Tumor Infiltrating Lymphocytes from the International Immuno-Oncology Biomarker Working Group. NPJ breast cancer. 2020;6:16\u0026ndash;16.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGoode A, Gilbert B, Harkes J, Jukic D, Satyanarayanan M. OpenSlide: A vendor-neutral software foundation for digital pathology. J Pathol Inf. 2013;4:27\u0026ndash;27.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSaltz J, Gupta R, Hou L, Kurc T, Singh P, Vu N, Samaras D, Shroyer KR, Zhao T, Batiste R, et al. Spatial Organization and Molecular Correlation of Tumor-Infiltrating Lymphocytes Using Deep Learning on Pathology Images. Cell Rep. 2018;23(1):181\u0026ndash;.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDiao JA, Wang JK, Chui WF, Mountain V, Gullapally SC, Srinivasan R, Mitchell RN, Glass B, Hoffman S, Rao SK et al. Human-interpretable image features derived from densely mapped cancer pathology slides predict diverse molecular phenotypes. Nat Commun 2021, 12(1).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWang S, Chen A, Yang L, Cai L, Xie Y, Fujimoto J, Gazdar A, Xiao G. Comprehensive analysis of lung cancer pathology images to discover tumor shape and boundary features that predict survival outcome. Sci Rep. 2018;8(1):10393.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFan W, Chang J, Fu P. 
Endocrine therapy resistance in breast cancer: current status, possible mechanisms and overcoming strategies. Future Med Chem. 2015;7(12):1511\u0026ndash;9.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGreenwell K, Hussain L, Lee D, Bramlage M, Bills G, Mehta A, Jackson A, Wexelman B. Complete pathologic response rate to neoadjuvant chemotherapy increases with increasing HER2/CEP17 ratio in HER2 overexpressing breast cancer: analysis of the National Cancer Database (NCDB). Breast cancer Res Treat. 2020;181:249\u0026ndash;54.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSinger CF, Tan YY, Fitzal F, Steger GG, Egle D, Reiner A, Rudas M, Moinfar F, Gruber C, Petru E. Pathological complete response to neoadjuvant trastuzumab is dependent on HER2/CEP17 ratio in HER2-amplified early breast cancer. Clin Cancer Res. 2017;23(14):3676\u0026ndash;83.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSinger CF, Tan YY, Fitzal F, Steger GG, Egle D, Reiner A, Rudas M, Moinfar F, Gruber C, Petru E, et al. Pathological Complete Response to Neoadjuvant Trastuzumab Is Dependent on HER2/CEP17 Ratio in HER2-Amplified Early Breast Cancer. Clin Cancer Res. 2017;23(14):3676\u0026ndash;83.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWolff AC, Hammond MEH, Hicks DG, Dowsett M, McShane LM, Allison KH, Allred DC, Bartlett JMS, Bilous M, Fitzgibbons P, et al. Recommendations for Human Epidermal Growth Factor Receptor 2 Testing in Breast Cancer American Society of Clinical Oncology/College of American Pathologists Clinical Practice Guideline Update. Arch Pathol Lab Med. 2014;138(2):241\u0026ndash;56.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAswolinskiy W, Munari E, Horlings HM, Mulder L, Bogina G, Sanders J, Liu Y-H, van den Belt-dusebout AW, Tessier L, Balkenhol M et al. 
PROACTING: predicting pathological complete response to neoadjuvant chemotherapy in breast cancer from routine diagnostic histopathology biopsies with deep learning. Breast Cancer Res 2023, 25(1).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSrinidhi CL, Ciga O, Martel AL. Deep neural network models for computational histopathology: A survey. Med Image Anal 2021, 67.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003evan der Laak J, Litjens G, Ciompi F. Deep learning in histopathology: the path to the clinic. Nat Med. 2021;27(5):775\u0026ndash;84.\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":true,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"breast-cancer-research","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"brcr","sideBox":"Learn more about [Breast Cancer Research](http://breast-cancer-research.biomedcentral.com)","snPcode":"13058","submissionUrl":"https://submission.nature.com/new-submission/13058/3","title":"Breast Cancer Research","twitterHandle":"@BCRJournal","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"em","reportingPortfolio":"BMC/SO AJ","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"Breast cancer, Deep learning, Histopathological images, Neoadjuvant chemotherapy, Tumor microenvironment","lastPublishedDoi":"10.21203/rs.3.rs-5786592/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-5786592/v1","license":{"name":"CC BY 
4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003e\u003cstrong\u003eBackground \u003c/strong\u003eTumor microenvironment (TME) biomarkers derived from histopathological images of HER2+ breast cancer (HER2+BC) can effectively predict pathological complete response (pCR) following neoadjuvant chemotherapy (NAC), thereby enhancing patient prognosis. In this study, we quantitatively assessed the morphological information of critical regions in the TME and analyzed their predictive potential for pCR.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eMethods \u003c/strong\u003eThe retrospective analysis included 147 HER2+BC patients treated with NAC, comprising 85 from the Yale Response dataset for training and 62 from the IMPRESS HER2+ dataset for external validation. Initially, VGG-16 and Xception networks were utilized to segment hematoxylin and eosin-stained histopathology images, generating tissue segmentation images (TS-images). Tumor and non-tumor regions were identified based on the TS-images, from which tumor-infiltrating lymphocytes (TILs) and non-tumor-infiltrating lymphocytes (non-TILs) were extracted, respectively. Subsequently, the morphological information of these regions was quantified through the measurement of connected components. Feature selection was performed based on combined morphological and clinical information, employing the least absolute shrinkage and selection operator. Finally, selected features were input into a multilayer perceptron for training and validated on an external test cohort.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eResults\u003c/strong\u003e In external validation, models derived from non-TILs achieved an area under the curve (AUC) of 0.873 in predicting pCR, with F1 score, PPV, recall, and NPV of 0.889, 0.821, 0.970, and 0.933, respectively. 
This performance significantly surpassed models trained on non-tumor (AUC = 0.779), tumor (AUC = 0.732), TILs (AUC = 0.594), and lymphocytes (AUC = 0.668). Furthermore, despite using 20% of the samples for training, the model trained on non-TILs maintained its high performance (AUC = 0.722). Univariate analyses of pCR revealed significant morphological features, such as the significance area filled mean for non-TILs (p value = 0.026) and the significance number for non-tumor (p value = 0.003).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConclusion\u003c/strong\u003e The TME-based morphological information from histopathological images demonstrates accurate prediction of pCR, offering considerable potential for more precise patient stratification for NAC.\u003c/p\u003e","manuscriptTitle":"Morphological Analysis of Tumor Microenvironment in HER2-Positive Breast Cancer: Predicting Response to Neoadjuvant Chemotherapy on Histopathological Images","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-04-04 11:10:07","doi":"10.21203/rs.3.rs-5786592/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"decision","content":"Revision requested","date":"2025-04-24T19:17:48+00:00","index":"","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-04-23T19:56:24+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"98243707360379055931921382835602898949","date":"2025-04-03T17:32:50+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2025-04-01T22:24:14+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2025-03-21T05:29:37+00:00","index":"","fulltext":""},{"type":"submitted","content":"Breast Cancer Research","date":"2025-03-20T07:54:04+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"[email 
protected]","identity":"breast-cancer-research","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"brcr","sideBox":"Learn more about [Breast Cancer Research](http://breast-cancer-research.biomedcentral.com)","snPcode":"13058","submissionUrl":"https://submission.nature.com/new-submission/13058/3","title":"Breast Cancer Research","twitterHandle":"@BCRJournal","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"em","reportingPortfolio":"BMC/SO AJ","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"2d25e416-27ae-48a5-be9b-7cbccbc7c3ac","owner":[],"postedDate":"April 4th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"published-in-journal","subjectAreas":[],"tags":[],"updatedAt":"2025-10-27T16:33:40+00:00","versionOfRecord":{"articleIdentity":"rs-5786592","link":"https://doi.org/10.1186/s13058-025-02139-x","journal":{"identity":"breast-cancer-research","isVorOnly":false,"title":"Breast Cancer Research"},"publishedOn":"2025-10-21 16:16:29","publishedOnDateReadable":"October 21st, 2025"},"versionCreatedAt":"2025-04-04 11:10:07","video":"","vorDoi":"10.1186/s13058-025-02139-x","vorDoiUrl":"https://doi.org/10.1186/s13058-025-02139-x","workflowStages":[]},"version":"v1","identity":"rs-5786592","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-5786592","identity":"rs-5786592","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
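The Methods summary in the abstract says that morphological information was "quantified through the measurement of connected components". A minimal sketch of that idea follows, assuming a binary mask for one region type (for example, non-TILs) and 4-connectivity. The function names, the feature names (`number`, `area_mean`, `area_max`), and the toy mask are hypothetical illustrations, not the authors' implementation. The paper also reports a filled-area feature, which this sketch does not compute.

```python
from collections import deque
from statistics import mean


def connected_components(mask):
    """Return one list of (row, col) pixels per 4-connected foreground region."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


def morphology_features(mask):
    """Summarize a binary mask by component count and component areas (in pixels)."""
    areas = [len(pixels) for pixels in connected_components(mask)]
    return {
        "number": len(areas),
        "area_mean": mean(areas) if areas else 0.0,
        "area_max": max(areas, default=0),
    }


# Hypothetical toy mask: 1 marks pixels assigned to the region of interest.
toy_mask = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 1, 0],
]
features = morphology_features(toy_mask)
print(features)
```

On a real tissue segmentation image, a library routine for connected-component labeling would normally replace the hand-written flood fill. The pure-Python version is used here only to keep the sketch self-contained.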

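The Results summary reports, for the non-TILs model in external validation, an F1 score of 0.889, a PPV of 0.821, and a recall of 0.970. Since F1 is the harmonic mean of precision (PPV) and recall, the three figures can be cross-checked against each other. The sketch below does that arithmetic. The helper name `f1_from_ppv_recall` is hypothetical, and the inputs are the rounded values quoted in the abstract.

```python
def f1_from_ppv_recall(ppv, recall):
    """F1 score as the harmonic mean of precision (PPV) and recall."""
    if ppv + recall == 0:
        return 0.0
    return 2 * ppv * recall / (ppv + recall)


# Values quoted in the abstract for the non-TILs model (external validation).
reported = {"f1": 0.889, "ppv": 0.821, "recall": 0.970}

f1 = f1_from_ppv_recall(reported["ppv"], reported["recall"])
print(round(f1, 3), reported["f1"])
```

The recomputed value agrees with the reported F1 to three decimal places, so the three metrics are mutually consistent at the stated precision.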

