Predicting Immunotherapy Response in Patients with Hepatocellular Carcinoma from Clinical and Textual Features Using AI Techniques

doi:10.21203/rs.3.rs-9557672/v1

2026 · doi:10.21203/rs.3.rs-9557672/v1

preprint OA: closed

Research Article

Anwaar Saeed, Thant Hoe, Kilsi Kobani, Rachelle Atchina, Maren Buk, and 8 more

This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-9557672/v1. This work is licensed under a CC BY 4.0 License. Status: Under Review. Version 1 posted 9

Abstract

Immunotherapy (IO) improves survival in advanced hepatocellular carcinoma (HCC), yet under 30% of patients respond to treatment. Existing biomarkers have shown limited predictive accuracy. Machine learning (ML) techniques could be used to develop prediction models that support personalised treatment. We developed and evaluated machine learning models that predict response to IO in patients with HCC.
We retrospectively analyzed data from 298 patients with HCC treated with immunotherapy at the University of Pittsburgh Medical Center (UPMC) Hillman Cancer Center between December 2014 and December 2023. Of the 298 patients, 215 (71%) had stable disease and 87 (29%) had progression. The best-performing model was a late-fusion model with text and tabular features (AUC 0.78, precision 0.76, recall 0.70). Good performance was retained even when the model was restricted to tabular variables (AUC 0.71, precision 0.70, recall 0.70). Key predictors of response included lower alpha-fetoprotein (AFP), liver function tests within normal range (AST, ALT, ALP, albumin, bilirubin), higher total protein and lower grade of ECOG performance status. A total of 142 patients received first-line IO treatment with atezolizumab and bevacizumab (Atezo/Bev) and 57 patients received durvalumab and tremelimumab (Durva/Treme). The model performed well for both treatment groups (Atezo/Bev: AUC 0.78, Durva/Treme: AUC 0.69). This study demonstrates that ML models integrating both clinical and NLP-derived features can accurately predict IO response in patients with HCC. Future work will externally validate these results on larger datasets, with the aim of developing generalizable and clinically useful predictive models.

Keywords: hepatocellular carcinoma, immunotherapy, machine learning, natural language processing, predictive modeling, immune checkpoint inhibitors

Introduction

The incidence of liver cancer increased by 75% between 1990 and 2015 [ 1 ]. The World Health Organization predicts that more than 1.3 million people will die from liver cancer in 2040 [ 2 ]. Hepatocellular carcinoma (HCC) is the most common primary malignancy of the liver, accounting for approximately 85–90% of cases [ 3 ]. The main risk factors for HCC include chronic hepatitis B or C virus infection, alcohol misuse, obesity, type 2 diabetes, and smoking [ 4 ].
Most patients with HCC are diagnosed at an advanced stage, at which point surgical resection, ablation and liver transplantation are no longer feasible [ 5 ]. Concurrent underlying liver disease often limits treatment options because of deranged liver function [ 6 ]. Consequently, survival for advanced HCC remains poor, with a 5-year survival rate of less than 20% [ 7 ].

Immune checkpoint inhibitors (ICIs) have emerged as a transformative treatment modality in oncology, leading to improved survival outcomes in several malignancies [ 8 ]. In HCC, two combinations have demonstrated superiority over sorafenib in patients with unresectable HCC and have since become first-line standard-of-care treatments for advanced HCC (aHCC) [ 9 ] [ 10 ]: atezolizumab, a programmed death-ligand 1 (PD-L1) inhibitor, with bevacizumab, an anti-angiogenic agent (Atezo/Bev); and durvalumab, a PD-L1 inhibitor, with tremelimumab, a cytotoxic T-lymphocyte-associated antigen 4 (CTLA-4) inhibitor (Durva/Treme). Although treatment with Atezo/Bev has resulted in significant improvements in overall survival, 20% of patients are refractory to Atezo/Bev, and the radiological response rate is only 20–30% [ 11 ]. Treatment can also cause substantial adverse effects, including gastrointestinal bleeding, hypertension, and immune-related adverse events, which can be debilitating, irreversible and life-threatening [ 12 ].

Currently, the most widely used approach for predicting response to immune checkpoint blockade is PD-L1 expression assessed by immunohistochemistry (IHC) [ 13 ]. However, PD-L1 IHC has demonstrated limited predictive accuracy across tumor types, including HCC. Response rates among PD-L1–positive patients remain low (26%), while responses are occasionally observed in PD-L1–negative patients (18%) [ 14 ]. In addition, PD-L1 testing requires invasive tissue sampling, which may be challenging in patients with advanced disease and carries procedural risks.
Given the toxicity profile, high cost, and limited efficacy of immunotherapy in unselected patient populations, there is an urgent unmet need for accurate and robust predictive tools that identify the patients who are most likely to have a favorable treatment response. Machine learning (ML) offers a promising framework to address these challenges by enabling the integration of high-dimensional and heterogeneous clinical data. ML models are capable of capturing complex, non-linear relationships that are difficult to model using traditional statistical approaches [ 15 ]. In oncology, ML has been applied to outcome prediction, treatment response modeling, and patient stratification using diverse data sources, including clinical variables, imaging, pathology, and unstructured electronic health record data. Importantly, ML-based approaches may facilitate the development of non-invasive, data-driven predictors of immunotherapy response that extend beyond single biomarkers [ 16 ].

Our ML and explainable artificial intelligence (AI) models have been shown to predict ICI response, immune-related adverse events and subsequent admission in patients with melanoma [ 17 ]–[ 20 ]. A previous study demonstrated higher inter-model agreement among large language models (LLMs) compared with physician predictions in assessing immunotherapy response in HCC, albeit with moderate accuracy and sensitivity [ 21 ]. The aim of this study is to use our validated ML models to predict ICI response among patients with aHCC. It is hoped that identifying patient and clinical features associated with favorable responses to ICIs will support and inform clinical decision-making.

Methods

Patient Cohort

Electronic health record data were collected from 302 patients with a diagnosis of hepatocellular cancer who were being treated at the University of Pittsburgh Medical Center (UPMC) Hillman Cancer Center between December 2014 and December 2023.
For each patient, the data included demographic data (age, gender, occupation); clinical data such as laboratory measurements (e.g. liver enzymes, serum proteins, coagulation studies), tumour staging and genomic testing results; treatment plan (inclusive of different immunotherapy regimens, current and past adjuvant treatments and neoadjuvant treatments); biopsy status; and clinical text in imaging (e.g. magnetic resonance imaging, computed tomography) and histopathology reports at diagnosis and at the start of immunotherapy. Other relevant clinical data were also included, such as the presenting complaint at cancer diagnosis, past medical history, exposure to hepatotoxic chemicals such as alcohol, cocaine, heroin and marijuana, Child-Pugh scores and Eastern Cooperative Oncology Group (ECOG) performance status, giving a total of 77 data points. All patient data were de-identified and anonymized.

Outcome of Interest

The outcome of interest for this study was response to immunotherapy. A binary treatment response outcome was derived by grouping radiological response categories (complete response, partial response, stable disease, progressive disease) into responder and non-responder classes [ 22 ]. Patients who had complete or partial response were classified as responders and those who had stable disease or progressive disease were classified as non-responders.

Data Preprocessing

All data processing and machine learning model development was done in Google Colab (RRID:SCR_018009) using Python version 3.12.12 (RRID:SCR_008394). Columns with more than 80% missing values were removed [ 23 ], [ 24 ]. Irrelevant clinical identifiers, dates, free-text administrative fields, and post-treatment outcome variables that could imply response and hence introduce data leakage were excluded. Outliers were assessed visually using boxplots and removed if they were physiologically implausible and likely due to measurement or data entry errors (e.g., alpha-fetoprotein > 800,000 IU/mL).
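The two cleaning rules just described (dropping columns with more than 80% missing values and removing physiologically implausible outliers) can be illustrated with a minimal pandas sketch. This is not the authors' code: the column names (`afp`, `albumin`, `rare_marker`), the toy values, and the helper functions are hypothetical, and the only thresholds taken from the text are the 80% missingness cut-off and the 800,000 AFP example.

```python
import numpy as np
import pandas as pd


def drop_sparse_columns(df: pd.DataFrame, max_missing: float = 0.80) -> pd.DataFrame:
    """Drop columns whose fraction of missing values exceeds max_missing."""
    missing_fraction = df.isna().mean()
    keep = missing_fraction[missing_fraction <= max_missing].index
    return df[keep]


def drop_implausible_rows(df: pd.DataFrame, column: str, upper: float) -> pd.DataFrame:
    """Remove rows where `column` exceeds `upper`; rows with a missing value are kept."""
    implausible = df[column] > upper  # comparisons against NaN evaluate to False
    return df[~implausible]


# Hypothetical toy data, for illustration only.
toy = pd.DataFrame({
    "afp": [12.0, 950_000.0, np.nan, 4300.0, 88.0, 230.0],
    "albumin": [3.5, 3.1, 3.8, np.nan, 3.3, 2.9],
    "rare_marker": [np.nan, np.nan, np.nan, np.nan, np.nan, 1.2],  # 5/6 missing
})

cleaned = drop_implausible_rows(drop_sparse_columns(toy), column="afp", upper=800_000)
print(cleaned)
```

In the study the outliers were first inspected visually on boxplots; the fixed threshold here only mirrors the single example given in the text.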
Past medical history and comorbidities were transformed into binary features indicating the presence or absence of a condition using keyword-based extraction. This identified significant past medical history such as liver cirrhosis, viral hepatitis, metabolic liver disease, hypertension, diabetes and cardiovascular disease, which are risk factors for HCC. To reduce multicollinearity and model instability, correlation matrices and variance inflation factors (VIFs) were assessed, and liver function tests and cancer staging variables with high correlation were excluded. Variables with sparse events or low prevalence were removed to ensure stable coefficient estimation. Treatment variables, including local therapies and immunotherapy regimens, were standardized and one-hot encoded based on predefined drug and procedure groupings.

Of the remaining columns, those containing free text ('notes', 'presenting_complaints', 'physical_examination_findings', 'ct_report_at_diagnosis', 'mri_report_at_diagnosis', 'ct_report_around_the_start_of_immunotherapy', 'mri_report_at_start_of_immunotherapy', 'histopathological_features') were combined, with section markers, into one column named 'full_text'. After removal of duplicated variables, the final dataset contained 54 features. Some patient entries were also removed due to insufficient data, resulting in a final cohort of 298 patients.

The combined clinical free-text data was preprocessed using a custom lemmatization and tokenization pipeline. Text was first tokenized using a regular expression that captures alphanumeric terms and decimal numbers, then lemmatized with the WordNet lemmatizer. Stop words were removed using the standard scikit-learn English stop word list, with negation terms (e.g., no, not, never, none) explicitly retained to preserve clinically relevant polarity.
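The text-cleaning steps above could look roughly like the following sketch. The regular expression and the function name `preprocess` are assumptions, since the paper does not give the exact pattern. The lemmatizer is passed in as a callable that defaults to the identity function so the example runs without downloading the WordNet corpus; in the described pipeline it would be something like NLTK's `WordNetLemmatizer().lemmatize`.

```python
import re
from typing import Callable, List

from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS

# Negation terms named in the text are removed from the stop-word list so they survive.
NEGATIONS = {"no", "not", "never", "none"}
STOP_WORDS = frozenset(ENGLISH_STOP_WORDS) - NEGATIONS

# Assumed pattern: decimal numbers first, then runs of letters and digits.
TOKEN_PATTERN = re.compile(r"\d+\.\d+|[a-z0-9]+")


def preprocess(text: str, lemmatize: Callable[[str], str] = lambda tok: tok) -> List[str]:
    """Tokenize, lemmatize, and drop stop words while keeping negations."""
    tokens = TOKEN_PATTERN.findall(text.lower())
    lemmas = [lemmatize(tok) for tok in tokens]
    return [tok for tok in lemmas if tok not in STOP_WORDS]


example = "No arterial enhancement; the lesion is 2.5 cm and not cirrhotic."
tokens = preprocess(example)
print(tokens)
```

Keeping "no" and "not" matters for reports such as this one, where dropping them would turn "no arterial enhancement" into the opposite finding.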
Training and Validation Set

The dataset was randomly split into 80% for training and 20% for validation by number of patients, stratified by outcome. Within the training set, 5-fold cross-validation was used to compare models before and after hyperparameter tuning; the tuned models were then evaluated on the holdout validation set.

Machine Learning Prediction Pipeline

Feature selection was performed using SelectKBest with Analysis of Variance (ANOVA) F-tests to identify and retain the top 20 predictive non-text features, reducing model complexity and dimensionality [ 25 ]. Multiple classification algorithms, including Lasso logistic regression, random forest, XGBoost, and support vector machine, were implemented, with class imbalance addressed via weighting [ 26 ], [ 27 ], [ 28 ]. Hyperparameter tuning was conducted for all models; after tuning, models were evaluated using stratified train-test splits and cross-validation with ROC-AUC as the primary metric. The best-performing model was further interpreted using Shapley Additive Explanations (SHAP) values to identify feature importance across the overall cohort and within treatment-specific subgroups (Atezo/Bev and Durva/Treme) [ 8 ], [ 29 ], [ 30 ]. Statistical comparisons between treatment groups were conducted using Mann-Whitney U tests for continuous variables, Chi-squared tests for categorical variables, and t-tests for predicted response probabilities, providing interpretability and insight into subgroup treatment effects. The final optimized model was saved for downstream deployment and further analysis.

Natural Language Processing

Processed tokens were rejoined into normalized text strings and transformed into numerical feature representations using term frequency–inverse document frequency (TF-IDF) vectorization [ 31 ]. Both unigram and bigram representations were evaluated, and vocabulary size was controlled by varying the maximum number of retained features to reduce overfitting in small clinical datasets.
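As an illustration of the vectorization step, the sketch below builds TF-IDF matrices for unigram and unigram-plus-bigram settings and caps the vocabulary with `max_features`. The three toy reports and the helper `tfidf_matrix` are hypothetical; the text is assumed to be already tokenized and rejoined with spaces, so the vectorizer simply splits on whitespace.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical pre-tokenized reports (tokens rejoined with single spaces).
reports = [
    "no arterial enhancement hepatic lesion 2.5 cm",
    "arterial enhancement with washout cirrhotic liver",
    "hepatic carcinoma not cirrhotic no washout",
]


def tfidf_matrix(texts, ngram_range=(1, 1), max_features=None):
    """Fit a TF-IDF vectorizer on whitespace-split tokens and return (vectorizer, matrix)."""
    vectorizer = TfidfVectorizer(
        tokenizer=str.split,   # text is already tokenized upstream
        token_pattern=None,    # unused when a tokenizer is supplied
        lowercase=False,
        ngram_range=ngram_range,
        max_features=max_features,
    )
    return vectorizer, vectorizer.fit_transform(texts)


_, X_uni = tfidf_matrix(reports, ngram_range=(1, 1))
_, X_bi = tfidf_matrix(reports, ngram_range=(1, 2))
_, X_small = tfidf_matrix(reports, ngram_range=(1, 2), max_features=10)

print(X_uni.shape, X_bi.shape, X_small.shape)
```

Adding bigrams enlarges the vocabulary quickly, which is why a cap on retained features is a natural guard against overfitting when only a few hundred documents are available.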
Logistic regression, random forest, support vector machine, and XGBoost classifiers were trained on the TF-IDF features. Model selection and hyperparameter optimization were performed using five-fold cross-validated grid search. The final model was selected based on test-set ROC-AUC.

Late-Fusion Strategy

A late-fusion approach was used to combine predictions from the tabular and text-based models at the probability level. Final class probabilities were computed as a weighted average of the two model outputs, with the fusion weight (α) optimized by sweeping values from 0 to 1 and selecting the value that maximized ROC-AUC on the test set.

Logistic Regression

A multivariable logistic regression model was developed to evaluate clinical predictors of immunotherapy response in patients with aHCC. The cleaned dataset was imported into R version 4.5.2 using RStudio (RRID:SCR_000432). Variables were converted to factors or retained as numeric based on their distribution and number of unique values. A full logistic regression model (binomial family, logit link) was initially fitted, adjusting for potential confounders. Model refinement was performed by excluding non-significant and collinear predictors, followed by refitting the final model. Adjusted odds ratios (ORs) with 95% confidence intervals were calculated by exponentiating model coefficients. Statistical significance was defined as a two-sided p-value < 0.05. Model results were summarized using tidy model outputs, and significant predictors were visualized using forest plots.

Results

A total of 298 patients with HCC receiving immunotherapy were included in the final analysis after data cleaning. The mean age was similar between groups (69.0 ± 8.0 years in responders vs 68.4 ± 10.1 years in non-responders). The majority of patients were male, among both responders (85.0%) and non-responders (80.2%). Overall, 167 patients (56.0%) achieved a response and 131 (44.0%) were classified as non-responders.
Responders were more likely to have better baseline performance status, with 41.3% having an ECOG score of 0 compared with 21.4% among non-responders, while poor performance status (ECOG 2–3) was more frequent in non-responders (22.1% vs 7.2%). Liver function differed notably between groups. A greater proportion of responders had Child–Pugh A status (83.2% vs 66.4%), and responders, on average, had higher albumin levels (3.5 ± 0.6 vs 3.2 ± 0.7 g/dL) and lower total bilirubin (1.2 ± 1.1 vs 1.8 ± 2.7 mg/dL). Markers of tumor burden and cholestasis were also lower in responders. AFP levels were substantially lower in responders (mean = 5,160.7 ± 29,463.3 ng/mL) compared with non-responders (mean = 18,646.5 ± 56,254.8 ng/mL), as were ALP levels (mean = 153.9 ± 101.0 vs 211.4 ± 146.5 U/L) (Table 1). With respect to treatment, responders were more likely to have received Atezo/Bev (55.1% vs 38.2%), whereas the proportion receiving Durva/Treme was similar between groups (18.0% vs 20.6%).

Variable                            Overall (n = 298)    Non-response (n = 131)  Response (n = 167)   SMD*     P-value
Age, years, mean (SD†)              68.7 (8.9)           68.4 (10.1)             69.0 (8.0)           0.06     0.611
Sex, n (%)
  Female                            51 (17.1)            26 (19.8)               25 (15.0)            0.129    0.34
  Male                              247 (82.9)           105 (80.2)              142 (85.0)
ECOG‡ performance status, n (%)                                                                       0.553    < 0.001
  0                                 97 (32.6)            28 (21.4)               69 (41.3)
  1                                 160 (53.7)           74 (56.5)               86 (51.5)
  2–3                               41 (13.8)            29 (22.1)               12 (7.2)
Child–Pugh score, n (%)                                                                               0.428    0.001
  A                                 226 (75.8)           87 (66.4)               139 (83.2)
  B–C                               72 (24.2)            44 (33.6)               28 (16.8)
Albumin, g/dL, mean (SD)            3.4 (0.6)            3.2 (0.7)               3.5 (0.6)            0.421    < 0.001
Total bilirubin, mg/dL, mean (SD)   1.5 (2.0)            1.8 (2.7)               1.2 (1.1)            -0.334   0.007
AFP§, ng/mL, mean (SD)              11089.0 (43764.5)    18646.5 (56254.8)       5160.7 (29463.3)     -0.3     0.014
ALP¶, U/L, mean (SD)                179.2 (126.1)        211.4 (146.5)           153.9 (101.0)        -0.457   < 0.001
Treatment regimen, n (%)                                                                              0.344    0.005
  Atezo/Bev                         142 (47.7)           50 (38.2)               92 (55.1)
  Durva/Treme                       57 (19.1)            27 (20.6)               30 (18.0)

*Standardised mean difference, †Standard deviation, ‡Eastern Cooperative Oncology Group, §Alpha-fetoprotein, ¶Alkaline phosphatase

Table 1: Baseline Characteristics of Patients by Response Status

Multivariable Logistic Regression Analysis

The multivariable logistic regression model demonstrated improved fit compared with the null model (residual deviance = 224.6 vs. null deviance = 267.5; AIC = 270.6). Performance status was strongly associated with treatment response. Patients with poorer functional status (performance status 2–3) had significantly lower odds of achieving response compared with those with performance status 0 (OR = 0.21, 95% CI: 0.06–0.70, p = 0.011). A weaker trend was observed for performance status 1, though this did not reach statistical significance. Patients meeting the Milan criteria for liver transplantation had almost four-fold higher odds of responding to ICIs (OR = 3.90, 95% CI: 1.01–15.0, p = 0.049). AFP and disease stage at diagnosis showed borderline associations (AFP: p = 0.067; stage III vs. stage I: p = 0.053) but did not reach statistical significance. No statistically significant associations were observed for age, sex, liver function measures (albumin, bilirubin, INR), Child-Pugh class, viral hepatitis status, cirrhosis history, or treatment regimen (Durva/Treme vs. Atezo/Bev).

Machine Learning Pipeline

Among the tabular models, the random forest classifier achieved the highest discriminative performance on the held-out test set, with a ROC-AUC of 0.71, outperforming support vector machines (0.69), XGBoost (0.68), and logistic regression (0.57) (Fig. 2a). Feature importance analysis using SHAP values identified key clinical variables contributing to model predictions (Fig. 3). Key predictors of response included lower alpha-fetoprotein (AFP), liver function tests within normal range (AST, ALT, ALP, albumin, bilirubin), higher total protein and lower grade of ECOG performance status. A total of 142 patients received first-line IO treatment with Atezo/Bev and 57 patients received durvalumab and tremelimumab (Durva/Treme).
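The tabular workflow behind these numbers (stratified 80/20 split, ANOVA F-test selection of the top 20 features, a class-weighted random forest, and ROC-AUC scoring) can be sketched with scikit-learn as follows. The data are synthetic stand-ins generated with `make_classification`, and the hyperparameters (`n_estimators`, `random_state`) are placeholders, not the tuned values from the study. Wrapping feature selection in a `Pipeline` is a choice made for this sketch so that selection is refit inside each cross-validation fold; the paper does not state how this was handled.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.pipeline import Pipeline

# Synthetic stand-in for the cohort: 298 rows, 54 features, roughly 56% positive class.
X, y = make_classification(
    n_samples=298, n_features=54, n_informative=8,
    weights=[0.44, 0.56], random_state=0,
)

# 80/20 split stratified by outcome.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=0,
)

pipeline = Pipeline([
    ("select", SelectKBest(score_func=f_classif, k=20)),  # top 20 features by ANOVA F-test
    ("model", RandomForestClassifier(
        n_estimators=200, class_weight="balanced", random_state=0,
    )),
])

# 5-fold stratified cross-validation on the training portion only.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
cv_auc = cross_val_score(pipeline, X_train, y_train, cv=cv, scoring="roc_auc")

# Refit on the full training set and score the held-out 20%.
pipeline.fit(X_train, y_train)
test_auc = roc_auc_score(y_test, pipeline.predict_proba(X_test)[:, 1])

print(f"CV ROC-AUC: {cv_auc.mean():.3f}; held-out ROC-AUC: {test_auc:.3f}")
```

Because the data here are synthetic, the printed scores say nothing about the study's results; the sketch only shows how the pieces fit together.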
For the text-based TF-IDF models, the support vector machine achieved the best performance following tuning, with a test ROC-AUC of 0.77, followed by logistic regression (0.75), random forest (0.66), and XGBoost (0.60). The most informative textual features included numeric tokens and clinically relevant imaging and disease-related terms, such as ‘arterial’, ‘carcinoma’, ‘hepatic’, ‘enhancement’, and ‘cirrhotic’, reflecting the importance of radiologic and diagnostic language in model discrimination.

[Figure: models trained on a) tabular features only, and b) TF-IDF text features only]

Comparison of Model Performance on Treatment Regimens

Subgroup analysis showed that model performance for patients receiving Atezo/Bev (test AUC-ROC 0.78) was slightly superior to that for patients receiving Durva/Treme (test AUC-ROC 0.69). There was no statistically significant difference in categorical or continuous variables between the two treatment groups ( p > 0.05). The model's predicted risk scores did not differ significantly by treatment regimen: the mean predicted response probability was 0.548 for the Atezo/Bev group and 0.531 for the Durva/Treme group (t-test statistic 0.336, p = 0.739).

Late fusion of the best-performing tabular (random forest) and text-based (SVM) models further improved predictive performance. The optimal fusion weight favored the text modality, with 90% contribution from the text model and 10% from the tabular model, yielding a combined ROC-AUC of 0.78.

Discussion

In this study, we demonstrated that our machine learning models can predict immunotherapy response, distinguishing responders from non-responders among patients with aHCC treated with ICIs, with an AUC of 0.78 in our best-performing model. We also determined the key predictors of treatment efficacy and progression.
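For readers who want to see how a probability-level fusion producing the 0.78 figure might be computed, the following sketch averages two sets of predicted probabilities with a weight α and sweeps α over a grid. The probabilities and labels are made-up toy values, the function names are hypothetical, and the convention that α weights the text model is an assumption; the paper reports only the 90%/10% split.

```python
import numpy as np
from sklearn.metrics import roc_auc_score


def fuse(p_tab: np.ndarray, p_text: np.ndarray, alpha: float) -> np.ndarray:
    """Weighted average of the two models' probabilities; alpha is the text weight (assumed)."""
    return alpha * p_text + (1.0 - alpha) * p_tab


def sweep_alpha(y_true, p_tab, p_text, alphas=np.linspace(0.0, 1.0, 11)):
    """Return the alpha on the grid that maximizes ROC-AUC, together with that ROC-AUC."""
    scores = [roc_auc_score(y_true, fuse(p_tab, p_text, a)) for a in alphas]
    best = int(np.argmax(scores))
    return float(alphas[best]), float(scores[best])


# Toy labels and predicted probabilities, for illustration only.
y_true = np.array([0, 0, 1, 1, 0, 1, 1, 0])
p_tab = np.array([0.30, 0.55, 0.60, 0.45, 0.40, 0.70, 0.52, 0.65])
p_text = np.array([0.20, 0.35, 0.80, 0.75, 0.55, 0.50, 0.65, 0.30])

best_alpha, best_auc = sweep_alpha(y_true, p_tab, p_text)
print(f"best alpha = {best_alpha:.1f}, fused ROC-AUC = {best_auc:.3f}")
```

Because the grid includes α = 0 and α = 1, the fused score on the data used for the sweep can never fall below either single model. Choosing α on data kept separate from the final evaluation set would give a less optimistic estimate of the gain.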
Among all the models evaluated, the late-fusion model incorporating a random forest classifier for tabular clinical features and a support vector machine classifier for text-derived features achieved the best predictive performance. Key predictors of response included lower AFP levels, liver function tests within normal ranges (AST, ALT, ALP, albumin, and bilirubin), higher total protein levels, and better ECOG performance status. Our model may be especially valuable in supporting clinical decision-making for patients who are at the borderline of eligibility for immunotherapy. By identifying individuals most likely to benefit, these models may help minimize unnecessary exposure to potentially toxic treatments while optimizing therapeutic outcomes.

Alpha-fetoprotein (AFP) is a widely used serum biomarker for assessing treatment response in hepatocellular carcinoma (HCC). Our model confirmed the predictive value of AFP for ICI response. This is consistent with a meta-analysis of 131 studies showing that patients with an AFP response had significantly higher objective radiological response rates [ 32 ]. A Chinese study further demonstrated that an early decline in AFP from baseline to the first follow-up (i.e., 4–6 weeks after treatment initiation) was associated with improved overall survival (OS) and progression-free survival (PFS) [ 33 ]. Similarly, liver enzymes, which reflect hepatic functional reserve, were significant predictors of clinical outcomes. Tian et al. showed that impaired liver function, reflected by higher Child–Pugh or ALBI scores, was associated with poorer prognosis in patients receiving ICIs [ 34 ]. In contrast, the predictive value of PD-L1 expression remains unclear. In our model, PD-L1 levels were not associated with response to ICIs, consistent with findings from the KEYNOTE-224 and CheckMate-040 studies [ 35 ], [ 36 ].
By incorporating routinely available laboratory tests and demographic data as well as clinical text from clinician notes and investigation reports, this study demonstrates that immunotherapy response can be predicted using simple clinical inputs without the need for sophisticated biomolecular testing. This is particularly important given that these laboratory tests are inexpensive, widely available, and easily implemented in routine clinical practice, including in low-resource settings. Our findings further support the growing body of evidence demonstrating that machine learning approaches leveraging routinely collected clinical data can aid in predicting response to immunotherapy and in identifying patients at increased risk of disease progression. Artificial intelligence (AI) and machine learning (ML) are increasingly transforming patient care by enabling precision medicine and personalised oncology treatment strategies [ 37 ]. These methods can efficiently analyse and integrate large volumes of multimodal clinical data, providing timely and clinically actionable predictions [ 38 , 39 ]. In addition, AI and ML models are capable of identifying subtle or latent patterns within complex datasets that may be predictive of patient response to immune checkpoint inhibitors (ICIs) [ 40 ]. Importantly, AI and ML models can iteratively improve during the training process by adjusting model parameters in response to prediction errors. Through repeated exposure to diverse patient profiles and clinical outcomes, these models refine their representations of complex, non-linear relationships among clinical, molecular, and treatment variables, thereby enhancing model generalisability and predictive accuracy [ 41 ]. The main strength of our study lies in the systematic evaluation of five machine learning models to identify the optimal predictive approach. 
In addition, we identified the most relevant clinical features and used a multimodal late-fusion pipeline to draw on clinical information in unstructured text, thereby enhancing data input quality. To our knowledge, this study incorporates more patients and clinical data points than prior work, contributing to improved predictive accuracy.

We acknowledge that one limitation of this study is its single-center retrospective design, with inherent biases in patient selection. However, our study is the largest to specifically assess immunotherapy response in aHCC using patients' clinical characteristics. Additionally, the number of patients who received the Atezo/Bev and Durva/Treme combinations was insufficient to detect clinically meaningful differences between treatment groups or to support robust subgroup analyses for generalizable clinical inference. Larger, more diverse cohorts are required to validate these findings across immune checkpoint–based combinations, patient demographics, geographic regions, treatment paradigms, and healthcare systems.

This study lays a solid foundation for identifying patients who are likely to benefit from immunotherapy. The next step toward clinical implementation is validation in larger, more diverse populations with broader ethnic backgrounds and heterogeneous disease characteristics through multicenter retrospective studies. Subsequently, prospective real-world studies will be required to evaluate model performance and determine the most clinically effective predictive algorithms.

Conclusion

We have developed a well-validated machine learning–based predictive model capable of distinguishing responders from non-responders to immunotherapy in advanced HCC using clinical features in the form of structured tables and unstructured text. This model may support clinical decision-making in a setting where treatment response is notoriously difficult to predict.
With the inclusion of additional clinical features and external validation in future studies, this approach has the potential to evolve into a clinically useful tool for guiding immunotherapy selection in routine practice.

Declarations

Acknowledgements

The authors would like to acknowledge UPMC for the compilation of and provision of access to the patient datasets used in this research.

Ethics Approval and Consent to Participate

This study was reviewed and approved by the University of Pittsburgh Institutional Review Board (IRB) (protocol number: M0D23070075-002). This study used retrospectively collected, de-identified patient data, and was conducted in accordance with relevant guidelines and regulations. Informed consent to participate was granted by all participants in the study.

Data Availability Statement

The datasets generated and/or analyzed during the current study are not publicly available due to data use agreements and patient confidentiality but may be available from the UPMC subject to their approval and applicable ethical and legal requirements.

Funding

This study was funded by Curenetics Ltd.

Conflicts of Interest

The authors declare no potential conflicts of interest.

Statement of Significance

This study develops a multimodal AI model predicting HCC immunotherapy response (AUC 0.78) using routine clinical labs and text data, outperforming PD-L1 biomarkers. It identifies actionable predictors enabling precision selection for Atezo/Bev and Durva/Treme, addressing the critical 70% non-response rate.

References

1. Global Burden of Disease Liver Cancer Collaboration. The Burden of Primary Liver Cancer and Underlying Etiologies From 1990 to 2015 at the Global, Regional, and National Level: Results From the Global Burden of Disease Study 2015. JAMA Oncol. Dec. 2017;3(12):1683–1691. 10.1001/jamaoncol.2017.3055.
2. Rumgay H, et al. Global burden of primary liver cancer in 2020 and predictions to 2040. J Hepatol. Dec. 2022;77(6):1598–606. 10.1016/j.jhep.2022.08.021.
3. Sung H, et al. Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA Cancer J Clin. 2021;71(3):209–49. 10.3322/caac.21660.
4. McGlynn KA, Petrick JL, Groopman JD. Liver Cancer: Progress and Priorities. Cancer Epidemiol Biomarkers Prev. Oct. 2024;33(10):1261–72. 10.1158/1055-9965.EPI-24-0686.
5. Falette Puisieux M, et al. Therapeutic Management of Advanced Hepatocellular Carcinoma: An Updated Review. Cancers. Jan. 2022;14(10):2357. 10.3390/cancers14102357.
6. Shiani A, Narayanan S, Pena L, Friedman M. The Role of Diagnosis and Treatment of Underlying Liver Disease for the Prognosis of Primary Liver Cancer. Cancer Control. 2017;24(3). 10.1177/1073274817729240.
7. Erratum. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2020;70(4):313. 10.3322/caac.21609.
8. Garg P, Pareek S, Kulkarni P, Horne D, Salgia R, Singhal SS. Next-Generation Immunotherapy: Advancing Clinical Applications in Cancer Treatment. J Clin Med. Jan. 2024;13(21):6537. 10.3390/jcm13216537.
9. Finn RS, et al. Atezolizumab plus Bevacizumab in Unresectable Hepatocellular Carcinoma. N Engl J Med. May 2020;382(20):1894–905. 10.1056/NEJMoa1915745.
10. Abou-Alfa GK, et al. Tremelimumab plus Durvalumab in Unresectable Hepatocellular Carcinoma. NEJM Evidence. Jun. 2022;1(8). https://doi.org/10.1056/evidoa2100070.
11. Qin R, Jin T, Xu F. Biomarkers predicting the efficacy of immune checkpoint inhibitors in hepatocellular carcinoma. Front Immunol. 2023;14:1326097. 10.3389/fimmu.2023.1326097. PMID: 38187399; PMCID: PMC10770866.
12. Ventura I, Sanchiz L, Legidos-García ME, Murillo-Llorente MT, Pérez-Bermejo M. Atezolizumab and Bevacizumab Combination Therapy in the Treatment of Advanced Hepatocellular Cancer. Cancers. Jan. 2024;16(1):197. 10.3390/cancers16010197.
13. Doroshow DB, et al. PD-L1 as a biomarker of response to immune-checkpoint inhibitors. Nat Rev Clin Oncol. Jun. 2021;18(6):345–62. 10.1038/s41571-021-00473-5.
14. Yang Y, et al. The predictive value of PD-L1 expression in patients with advanced hepatocellular carcinoma treated with PD-1/PD-L1 inhibitors: A systematic review and meta-analysis. Cancer Med. 2023;12(8):9282–92. 10.1002/cam4.5676.
15. Adlung L, Cohen Y, Mor U, Elinav E. Machine learning in clinical decision making. Med. Jun. 2021;2(6):642–665. 10.1016/j.medj.2021.04.006.
16. Huang L, et al. Artificial Intelligence Can Predict Personalized Immunotherapy Outcomes in Cancer. Cancer Immunol Res. Feb. 2025;13(7):964–77. https://doi.org/10.1158/2326-6066.CIR-24-1270.
17. Sharma L, Balaji V, Katumba A, Mohan E, Adeleke S. Abstract A015: Precision medicine approach to melanoma immunotherapy: Predicting response, adverse events, and hospital admissions using machine learning and explainable artificial intelligence. Clin Cancer Res. Jul. 2025;31:A015. 10.1158/1557-3265.AIMACHINE-A015.
18. Sharma L, et al. 2P Using machine learning (ML) and explainable artificial intelligence (AI) to accurately predict immune-checkpoint inhibitor (ICI) response in small cell (SCLC) and non-small cell (NSCLC) lung cancer patients. ESMO Open. Mar. 2025;10:104157. https://doi.org/10.1016/j.esmoop.2025.104157.
19. Sharma L, et al. 3P Predicting immunotherapy-related adverse events in melanoma patients using machine learning algorithms and explainable artificial intelligence. ESMO Open. Mar. 2025;10:104158. https://doi.org/10.1016/j.esmoop.2025.104158.
20. Sharma L, et al. Immunotherapy toxicity prediction in patients with melanoma using machine learning algorithms. J Clin Oncol. Jun. 2024;42(16_suppl):e21567. https://doi.org/10.1200/jco.2024.42.16_suppl.e21567.
21. Xu J, et al. Predicting Immunotherapy Response in Unresectable Hepatocellular Carcinoma: A Comparative Study of Large Language Models and Human Experts. J Med Syst. May 2025;49(1):64. 10.1007/s10916-025-02192-1.
22. Eisenhauer EA, et al. New response evaluation criteria in solid tumours: revised RECIST guideline (version 1.1). Eur J Cancer. Jan. 2009;45(2):228–47. 10.1016/j.ejca.2008.10.026.
23. van Buuren S. Flexible Imputation of Missing Data. Second edition. Boca Raton, Florida: Chapman and Hall/CRC; 2018. https://doi.org/10.1201/9780429492259.
24. Kuhn M, Johnson K. Applied predictive modeling. New York: Springer; 2019.
25. Guyon I, Elisseeff A. An introduction to variable and feature selection. J Mach Learn Res. Mar. 2003;3:1157–1182.
26. Breiman L. Random Forests. Machine Learning. Oct. 2001;45(1):5–32. 10.1023/A:1010933404324.
27. Hearst MA, Dumais ST, Osuna E, Platt J, Scholkopf B. Support vector machines. IEEE Intelligent Systems and their Applications. Jul. 1998;13(4):18–28. 10.1109/5254.708428.
28. Chen T, Guestrin C. XGBoost: A Scalable Tree Boosting System. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Aug. 2016. pp. 785–794. 10.1145/2939672.2939785.
29. Lundberg S, Lee S. A Unified Approach to Interpreting Model Predictions. arXiv:1705.07874. Nov. 25, 2017. 10.48550/arXiv.1705.07874.
30. Abou-Alfa GK, et al. Tremelimumab plus Durvalumab in Unresectable Hepatocellular Carcinoma. NEJM Evidence. Jul. 2022;1(8):EVIDoa2100070. 10.1056/EVIDoa2100070.
31. Salton G, Buckley C. Term-weighting approaches in automatic text retrieval. Information Processing & Management. Jan. 1988;24(5):513–523. 10.1016/0306-4573(88)90021-0.
32. Tian, et al. The prognostic and predictive value of AFP in immune checkpoint inhibitor-treated hepatocellular carcinoma: a systematic review and meta-analysis. Front Immunol. 04 November 2025;16. https://doi.org/10.3389/fimmu.2025.1695861.
33. Hou G, Liu B, Fan ZQ, Li C, Zhang JP, Guo YH, Zhang RY, Zheng Y, Zhu H, Wang NY. Association between early response of alpha-fetoprotein and treatment efficacy of systemic therapy for advanced hepatocellular carcinoma: A multicenter cohort study from China. Front Oncol. 2023;12:1094104. 10.3389/fonc.2022.1094104. PMID: 36686731; PMCID: PMC9846773.
34. Tian BW, Yan LJ, Ding ZN, Liu H, Han CL, Meng GX, Xue JS, Dong ZR, Yan YC, Hong JG, Chen ZQ, Wang DX, Li T. Evaluating liver function and the impact of immune checkpoint inhibitors in the prognosis of hepatocellular carcinoma patients: A systemic review and meta-analysis. Int Immunopharmacol. 2023;114:109519. 10.1016/j.intimp.2022.109519. Epub 2022 Nov 30. PMID: 36459922.
35. Zhu AX, Finn RS, Edeline J, Cattan S, Ogasawara S, Palmer D, et al. Pembrolizumab in patients with advanced hepatocellular carcinoma previously treated with sorafenib (KEYNOTE-224): a non-randomised, open-label phase 2 trial. Lancet Oncol. 2018;19(7):940–52. 10.1016/S1470-2045(18)30351-6.
36. Yau T, Kang YK, Kim TY, El-Khoueiry AB, Santoro A, Sangro B, Melero I, Kudo M, Hou MM, Matilla A, Tovoli F, Knox JJ, Ruth He A, El-Rayes BF, Acosta-Rivera M, Lim HY, Neely J, Shen Y, Wisniewski T, Anderson J, Hsu C. Efficacy and Safety of Nivolumab Plus Ipilimumab in Patients With Advanced Hepatocellular Carcinoma Previously Treated With Sorafenib: The CheckMate 040 Randomized Clinical Trial. JAMA Oncol. 2020;6(11):e204564. 10.1001/jamaoncol.2020.4564. Epub 2020 Nov 12. Erratum in: JAMA Oncol. 2021 Jan 1;7(1):140. doi: 10.1001/jamaoncol.2020.6961. PMID: 33001135; PMCID: PMC7530824.
37. Hectors SJ, Lewis S, Besa C, King MJ, Said D, Putra J, Ward S, Higashi T, Thung S, Yao S, Laface I, Schwartz M, Gnjatic S, Merad M, Hoshida Y, Taouli B. MRI radiomics features predict immuno-oncological characteristics of hepatocellular carcinoma. Eur Radiol. 2020;30(7):3759–69.
10.1007/s00330-020-06675-2. . Epub 2020 Feb 21. PMID: 32086577; PMCID: PMC7869026. Rakaee M, Tafavvoghi M, Ricciuti B, Alessi JV, Cortellini A, Citarella F, Nibid L, Perrone G, Adib E, Fulgenzi CAM, Hidalgo Filho CM, Di Federico A, Jabar F, Hashemi S, Houda I, Richardsen E, Rasmussen Busund LT, Donnem T, Bahce I, Pinato DJ, Helland Å, Sholl LM, Awad MM, Kwiatkowski DJ. Deep Learning Model for Predicting Immunotherapy Response in Advanced Non-Small Cell Lung Cancer. JAMA Oncol. 2025;11(2):109–18. 10.1001/jamaoncol.2024.5356. . PMID: 39724105; PMCID: PMC11843371. Qin R, Jin T, Xu F. Biomarkers predicting the efficacy of immune checkpoint inhibitors in hepatocellular carcinoma. Front Immunol. 2023;14:1326097. 10.3389/fimmu.2023.1326097 . PMID: 38187399; PMCID: PMC10770866. Liu J, Fu R, Su Y, Li Z, Huang X, Wang Q, Shi Z, Wei S. Applications of artificial intelligence in cancer immunotherapy: a frontier review on enhancing treatment efficacy and safety. Front Immunol. 2025;16:1676112. 10.3389/fimmu.2025.1676112. . PMID: 41246300; PMCID: PMC12615365. Gao Q, Yang L, Lu M, Jin R, Ye H, Ma T. The artificial intelligence and machine learning in lung cancer immunotherapy. J Hematol Oncol. 2023;16(1):55. 10.1186/s13045-023-01456-y. . PMID: 37226190; PMCID: PMC10207827. Additional Declarations No competing interests reported. Cite Share Download PDF Status: Under Review Version 1 posted Reviewers agreed at journal 15 May, 2026 Reviewers agreed at journal 10 May, 2026 Reviewers agreed at journal 05 May, 2026 Reviewers agreed at journal 05 May, 2026 Reviewers invited by journal 05 May, 2026 Editor assigned by journal 05 May, 2026 Editor invited by journal 05 May, 2026 Submission checks completed at journal 04 May, 2026 First submitted to journal 04 May, 2026 You are reading this latest preprint version Research Square lets you share your work early, gain feedback from the community, and start making changes to your manuscript prior to peer review in a journal. 
© Research Square 2026 | ISSN 2693-5015 (online)
{"props":{"pageProps":{"initialData":{"identity":"rs-9557672","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":637751413,"identity":"2a6af9e2-e592-4187-905a-e47121171cea","order_by":0,"name":"Anwaar Saeed","email":"","orcid":"","institution":"UPMC Hillman Cancer Center","correspondingAuthor":false,"prefix":"","firstName":"Anwaar","middleName":"","lastName":"Saeed","suffix":""},{"id":637751414,"identity":"e623bdac-f8a1-463e-9bb5-4911974cb32d","order_by":1,"name":"Thant Hoe","email":"","orcid":"","institution":"Curenetics","correspondingAuthor":false,"prefix":"","firstName":"Thant","middleName":"","lastName":"Hoe","suffix":""},{"id":637751415,"identity":"8b5dcbbf-2058-4456-b6c3-afdfb3cf107c","order_by":2,"name":"Kilsi Kobani","email":"","orcid":"","institution":"Curenetics","correspondingAuthor":false,"prefix":"","firstName":"Kilsi","middleName":"","lastName":"Kobani","suffix":""},{"id":637751416,"identity":"94a74d68-13ed-4cb8-a4d1-ac14c12cb401","order_by":3,"name":"Rachelle
Atchina","email":"","orcid":"","institution":"Curenetics","correspondingAuthor":false,"prefix":"","firstName":"Rachelle","middleName":"","lastName":"Atchina","suffix":""},{"id":637751417,"identity":"1cb424c8-d852-46ef-8402-382ef97b8c80","order_by":4,"name":"Maren Buk","email":"","orcid":"","institution":"Curenetics","correspondingAuthor":false,"prefix":"","firstName":"Maren","middleName":"","lastName":"Buk","suffix":""},{"id":637751418,"identity":"08859c6c-b36a-4f27-9b07-28920414ffe0","order_by":5,"name":"Virginia Yuk-chun Lam","email":"","orcid":"","institution":"Curenetics","correspondingAuthor":false,"prefix":"","firstName":"Virginia","middleName":"Yuk-chun","lastName":"Lam","suffix":""},{"id":637751419,"identity":"3fb29bae-f5af-48a7-9c65-0ce3ce22a4a8","order_by":6,"name":"Meghana Singh","email":"","orcid":"","institution":"UPMC Hillman Cancer Center","correspondingAuthor":false,"prefix":"","firstName":"Meghana","middleName":"","lastName":"Singh","suffix":""},{"id":637751420,"identity":"1012ce29-ceb5-4372-b371-0ba63c9d2f34","order_by":7,"name":"Yuming Shi","email":"","orcid":"","institution":"UPMC Hillman Cancer Center","correspondingAuthor":false,"prefix":"","firstName":"Yuming","middleName":"","lastName":"Shi","suffix":""},{"id":637751421,"identity":"8d4440b6-5844-47df-82be-de707e1cd99e","order_by":8,"name":"Alireza Tojjari","email":"","orcid":"","institution":"UPMC Hillman Cancer Center","correspondingAuthor":false,"prefix":"","firstName":"Alireza","middleName":"","lastName":"Tojjari","suffix":""},{"id":637751422,"identity":"96bd9d60-14f3-4cfc-ae17-1508196f0a2e","order_by":9,"name":"Vaishnavi Balaji","email":"","orcid":"","institution":"Curenetics","correspondingAuthor":false,"prefix":"","firstName":"Vaishnavi","middleName":"","lastName":"Balaji","suffix":""},{"id":637751423,"identity":"c1c00aae-d201-4dbc-878e-e24dbc35c9f9","order_by":10,"name":"Lakshya 
Sharma","email":"","orcid":"","institution":"Curenetics","correspondingAuthor":false,"prefix":"","firstName":"Lakshya","middleName":"","lastName":"Sharma","suffix":""},{"id":637751424,"identity":"f7832853-bd9b-47ac-beb9-9a5ac7601686","order_by":11,"name":"Yuxi Zhang","email":"","orcid":"","institution":"Curenetics","correspondingAuthor":false,"prefix":"","firstName":"Yuxi","middleName":"","lastName":"Zhang","suffix":""},{"id":637751425,"identity":"76f70102-0f15-4ee1-9e6f-da31cdcaab6a","order_by":12,"name":"Sola Adeleke","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAAzElEQVRIiWNgGAWjYFAC5gaGDzB2AhCzEdbC2MA4A8YiWgszD4xFlLPk2xsbP9vUHM7nl24+/uBBTR0DHzsBnQZnDjZL5xw7bDlzzrHEhoRjhxnYeA4Q0CKR2CCd23DYwOBGjmFDYsMBBjaJBAIOm5HY/NsSqMX+Rv5HoJY6wloYbiS2STOCbJHIYQRqYSasBeiXNsueY+kGEjfSDGcA/cJD0C/y7c2Hb/yosTbgn5H84OOPmjo5+fYGAg5DBzwkqh8Fo2AUjIJRgA0AAEXVQsKPZofxAAAAAElFTkSuQmCC","orcid":"","institution":"Curenetics","correspondingAuthor":true,"prefix":"","firstName":"Sola","middleName":"","lastName":"Adeleke","suffix":""}],"badges":[],"createdAt":"2026-04-28 19:08:13","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-9557672/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-9557672/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":109216034,"identity":"8b1058e2-9b71-4747-9160-dd067491a178","added_by":"auto","created_at":"2026-05-13 17:59:14","extension":"jpeg","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":96488,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eReceiver Operating Characteristics for Machine Learning 
Models\u003c/strong\u003e\u003c/p\u003e","description":"","filename":"floatimage24.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-9557672/v1/e56864b240a02831775406b9.jpeg"},{"id":109252159,"identity":"a501d624-95d0-4b25-a9ac-b07858dc97cb","added_by":"auto","created_at":"2026-05-14 09:21:53","extension":"jpeg","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":110615,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eSHAP Beeswarm Summary Plot of Top Tabular Features\u003c/strong\u003e\u003c/p\u003e","description":"","filename":"floatimage3.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-9557672/v1/62c62f2e677d880aeb5e01c0.jpeg"},{"id":109216038,"identity":"072ce4f0-e2db-42e1-9a5c-f300c8c13788","added_by":"auto","created_at":"2026-05-13 17:59:15","extension":"jpeg","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":34965,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eBoxplots of Predicted Responses by Treatment Subgroup\u003c/strong\u003e\u003c/p\u003e","description":"","filename":"floatimage4.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-9557672/v1/1c33cdf1f63c614cf9059596.jpeg"},{"id":109252147,"identity":"37108c42-8abc-43cd-8674-1084a6604aef","added_by":"auto","created_at":"2026-05-14 09:21:48","extension":"jpeg","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":34831,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eResults of Late-Fusion a) Optimisation and b) Performance\u003c/strong\u003e\u003c/p\u003e","description":"","filename":"groupimage1.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-9557672/v1/59a923b23d9c01c2ac64b680.jpeg"},{"id":109249174,"identity":"7da289c6-54d7-4ce7-acd4-7a6bb2932252","added_by":"auto","created_at":"2026-05-14 
08:43:13","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":512478,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-9557672/v1/06ca1ca8-cf6b-4c40-b186-6599ef902723.pdf"}],"financialInterests":"No competing interests reported.","formattedTitle":"Predicting Immunotherapy Response in Patients with Hepatocellular Carcinoma from Clinical and Textual Features Using AI Techniques","fulltext":[{"header":"Introduction","content":"\u003cp\u003eThe incidence of liver cancer increased by 75% between 1990 and 2015 [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e]. The World Health Organization predicts that more than 1.3\u0026nbsp;million people will die from liver cancer in 2040 [\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e]. Hepatocellular carcinoma (HCC) is the most common primary malignancy of the liver, accounting for approximately 85\u0026ndash;90% of cases [\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e]. The main risk factors for HCC include chronic hepatitis B or C virus infection, alcohol misuse, obesity, type 2 diabetes, and smoking [\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e]. Most patients with HCC are diagnosed at an advanced stage, at which point surgical resection, ablation and liver transplantation are no longer feasible [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e]. 
Concurrent underlying liver disease often limits treatment options because of deranged liver function [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]. Consequently, survival for advanced HCC remains poor, with a 5-year survival rate of less than 20% [\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eImmune checkpoint inhibitors (ICIs) have emerged as a transformative treatment modality in oncology, leading to improved survival outcomes in several malignancies [\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e]. In HCC, two combinations have demonstrated superiority over sorafenib in patients with unresectable HCC and have since become first-line standard-of-care treatments for advanced HCC (aHCC): atezolizumab, a programmed death-ligand 1 (PD-L1) inhibitor, with bevacizumab, an anti-angiogenic agent (Atezo/Bev), and durvalumab, a PD-L1 inhibitor, with tremelimumab, a cytotoxic T-lymphocyte-associated antigen 4 (CTLA-4) inhibitor (Durva/Treme) [\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e], [\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e]. Although treatment with Atezo/Bev has resulted in significant improvements in overall survival, 20% of patients are refractory to it, and the radiological response rate is only 20\u0026ndash;30% [\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e]. Treatment can also cause substantial adverse effects, including gastrointestinal bleeding, hypertension and immune-related adverse events, which can be debilitating, irreversible and life-threatening [\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e]. 
Currently, the most widely used approach for predicting response to immune checkpoint blockade is PD-L1 expression assessed by immunohistochemistry (IHC) [\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e]. However, PD-L1 IHC has demonstrated limited predictive accuracy across tumor types, including HCC. Response rates among PD-L1\u0026ndash;positive patients remain low (26%), while responses still occur in PD-L1\u0026ndash;negative patients (18%) [\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e]. In addition, PD-L1 testing requires invasive tissue sampling, which may be challenging in patients with advanced disease and carries procedural risks. Given the toxicity profile, high cost, and limited efficacy of immunotherapy in unselected patient populations, there is an urgent unmet need for accurate and robust predictive tools to identify patients who are most likely to respond favorably to treatment.\u003c/p\u003e \u003cp\u003eMachine learning (ML) offers a promising framework to address these challenges by enabling the integration of high-dimensional and heterogeneous clinical data. ML models are capable of capturing complex, non-linear relationships that are difficult to model using traditional statistical approaches [\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e]. In oncology, ML has been applied to outcome prediction, treatment response modeling, and patient stratification using diverse data sources, including clinical variables, imaging, pathology, and unstructured electronic health record data. Importantly, ML-based approaches may facilitate the development of non-invasive, data-driven predictors of immunotherapy response that extend beyond single biomarkers [\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e]. 
Our ML and explainable artificial intelligence (AI) models have been shown to predict ICI response, immune-related adverse events and subsequent admission in patients with melanoma [\u003cspan additionalcitationids=\"CR18 CR19\" citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e]\u0026ndash;[\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e]. A previous study demonstrated higher agreement among large language models (LLMs) than among physicians when predicting immunotherapy response in HCC, albeit with moderate accuracy and sensitivity [\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e]. The aim of this study was to use our validated ML models to predict ICI response in patients with aHCC. It is hoped that identifying patient and clinical features associated with favorable responses to ICIs will support and inform clinical decision-making.\u003c/p\u003e"},{"header":"Methods","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003ePatient Cohort\u003c/h2\u003e \u003cp\u003eElectronic health record data were collected from 302 patients with a diagnosis of hepatocellular carcinoma who were treated at the University of Pittsburgh Medical Center (UPMC) Hillman Cancer Center between December 2014 and December 2023. For each patient, the data comprised: demographic data (age, gender, occupation); clinical data such as laboratory measurements (e.g. liver enzymes, serum proteins, coagulation studies), tumour staging and genomic testing results; the treatment plan (inclusive of different immunotherapy regimens, current and past adjuvant treatments and neoadjuvant treatments); biopsy status; clinical text from imaging (e.g. 
magnetic resonance imaging, computed tomography) and histopathology reports at diagnosis and at the start of immunotherapy; and other relevant clinical data, such as the presenting complaint at cancer diagnosis, past medical history, exposure to hepatotoxic substances (alcohol, cocaine, heroin and marijuana), Child-Pugh scores and Eastern Cooperative Oncology Group (ECOG) performance status, for a total of 77 data points. All patient data were de-identified and anonymized.\u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003eOutcome of Interest\u003c/h3\u003e\n\u003cp\u003eThe outcome of interest for this study was response to immunotherapy. A binary treatment response outcome was derived by grouping the radiological response categories (complete response, partial response, stable disease, progressive disease) into responder and non-responder classes [\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e]. Patients who had a complete or partial response were classified as responders, and those who had stable or progressive disease were classified as non-responders.\u003c/p\u003e\n\u003ch3\u003eData Preprocessing\u003c/h3\u003e\n\u003cp\u003eAll data processing and machine learning model development were performed in Google Colab (RRID:SCR_018009) using Python version 3.12.12 (RRID:SCR_008394). Columns with more than 80% missing values were removed [\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e], [\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e]. Irrelevant clinical identifiers, dates, free-text administrative fields, and post-treatment outcome variables that could imply response and hence introduce data leakage were excluded. 
Outliers were assessed visually using boxplots and removed if they were physiologically implausible and likely due to measurement or data entry errors (e.g., alpha-fetoprotein\u0026thinsp;\u0026gt;\u0026thinsp;800,000 IU/mL).\u003c/p\u003e \u003cp\u003ePast medical history and comorbidities were transformed, using keyword-based extraction, into binary features indicating the presence or absence of a condition. This captured clinically significant history such as liver cirrhosis, viral hepatitis, metabolic liver disease, hypertension, diabetes and cardiovascular disease, which are risk factors for HCC.\u003c/p\u003e \u003cp\u003eTo reduce multicollinearity and model instability, correlation matrices and variance inflation factors (VIFs) were assessed, and highly correlated liver function tests and cancer staging variables were excluded. Variables with sparse events or low prevalence were removed to ensure stable coefficient estimation. Treatment variables, including local therapies and immunotherapy regimens, were standardized and one-hot encoded based on predefined drug and procedure groupings.\u003c/p\u003e \u003cp\u003eOf the remaining columns, those containing free text ('notes', 'presenting_complaints', 'physical_examination_findings', 'ct_report_at_diagnosis', 'mri_report_at_diagnosis', 'ct_report_around_the_start_of_immunotherapy', 'mri_report_at_start_of_immunotherapy', 'histopathological_features') were combined, with section markers, into a single column named 'full_text'. After removal of duplicated variables, the final dataset contained 54 features. Patient entries with insufficient data were also removed, reducing the cohort from 302 to 298 patients.\u003c/p\u003e \u003cp\u003eThe combined clinical free-text data were preprocessed using a custom lemmatization and tokenization pipeline. 
Text was first tokenized using a regular expression that captures alphanumeric terms and decimal numbers, then lemmatized with the WordNet lemmatizer. Stop words were removed using the standard scikit-learn English stop word list, with negation terms (e.g., no, not, never, none) explicitly retained to preserve clinically relevant polarity.\u003c/p\u003e\n\u003ch3\u003eTraining and Validation set\u003c/h3\u003e\n\u003cp\u003eThe dataset was randomly split into 80% for training and 20% for validation by number of patients, and stratified by outcome. Within the training set, 5-fold cross validation was performed for comparison before and after hyperparameter tuning, then against the holdout validation set.\u003c/p\u003e\n\u003ch3\u003eMachine Learning Prediction Pipeline\u003c/h3\u003e\n\u003cp\u003eFeature selection was performed using SelectKBest with Analysis of Variance (ANOVA) F-tests to identify and retain the top 20 predictive non-text features to reduce model complexity and dimensionality [\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e]. Multiple classification algorithms such as Lasso logistic regression, random forest, XGBoost, and support vector machine were implemented, with class imbalance addressed via weighting [\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e], [\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e], [\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e]. Hyperparameter tuning was conducted for all models and after tuning, models were evaluated using stratified train-test splits and cross-validation with ROC-AUC as the primary metric. 
The best-performing model was further interpreted using Shapley Additive Explanations (SHAP) values to identify feature importance across the overall cohort and within treatment-specific subgroups (Atezo/Bev and Durva/Treme) [\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e], [\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e], [30].\u003c/p\u003e \u003cp\u003eStatistical comparisons between treatment groups were conducted using Mann-Whitney U tests for continuous variables, Chi-squared tests for categorical variables, and t-tests for predicted response probabilities, providing interpretability and insight into subgroup treatment effects. The final optimized model was saved for downstream deployment and further analysis.\u003c/p\u003e \u003cdiv id=\"Sec8\" class=\"Section2\"\u003e \u003ch2\u003eNatural Language Processing\u003c/h2\u003e \u003cp\u003eProcessed tokens were rejoined into normalized text strings and transformed into numerical feature representations using term frequency\u0026ndash;inverse document frequency (TF-IDF) vectorization [\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e]. Both unigram and bigram representations were evaluated, and vocabulary size was controlled by varying the maximum number of retained features to reduce overfitting in small clinical datasets.\u003c/p\u003e \u003cp\u003eLogistic regression, random forest, support vector machine, and XGBoost classifiers were trained. Model selection and hyperparameter optimization were performed using five-fold cross-validated grid search. The final model was selected based on the test set ROC-AUC performance.\u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003eLate-Fusion Strategy\u003c/h3\u003e\n\u003cp\u003eA late-fusion approach was used to combine predictions from the tabular and text-based models at the probability level. 
Final class probabilities were computed as a weighted average of the two model outputs, with the fusion weight (α) optimized by sweeping values from 0 to 1 and selecting the value that maximized ROC-AUC on the test set.\u003c/p\u003e\n\u003ch3\u003eLogistic Regression\u003c/h3\u003e\n\u003cp\u003eA multivariable logistic regression model was developed to evaluate clinical predictors of immunotherapy response in patients with aHCC. The cleaned dataset was imported into R version 4.5.2 using RStudio (RRID:SCR_000432). Variables were converted to factors or retained as numeric based on their distribution and number of unique values.\u003c/p\u003e \u003cp\u003eA full logistic regression model (binomial family with a logit link) was initially fitted, adjusting for potential confounders. The model was then refined by excluding non-significant and collinear predictors and refitted. Adjusted odds ratios (ORs) with 95% confidence intervals were calculated by exponentiating model coefficients. Statistical significance was defined as a two-sided p-value\u0026thinsp;\u0026lt;\u0026thinsp;0.05. Model results were summarized using tidy model outputs, and significant predictors were visualized using forest plots.\u003c/p\u003e"},{"header":"Results","content":"\u003cp\u003eA total of 298 patients with HCC receiving immunotherapy were included in the final analysis after data cleaning. Of these, 167 (56.0%) achieved a response and 131 (44.0%) were classified as non-responders. The mean age was similar between groups (69.0\u0026thinsp;\u0026plusmn;\u0026thinsp;8.0 years in responders vs 68.4\u0026thinsp;\u0026plusmn;\u0026thinsp;10.1 years in non-responders). The majority of patients were male in both responders (85.0%) and non-responders (80.2%). 
Responders were more likely to have better baseline performance status, with 41.3% having an ECOG score of 0 compared with 21.4% among non-responders, whereas poor performance status (ECOG 2\u0026ndash;3) was more frequent in non-responders (22.1% vs 7.2%). Liver function differed notably between groups. A greater proportion of responders had Child\u0026ndash;Pugh A status (83.2% vs 66.4%), and responders, on average, had higher albumin levels (3.5\u0026thinsp;\u0026plusmn;\u0026thinsp;0.6 vs 3.2\u0026thinsp;\u0026plusmn;\u0026thinsp;0.7 g/dL) and lower total bilirubin (1.2\u0026thinsp;\u0026plusmn;\u0026thinsp;1.1 vs 1.8\u0026thinsp;\u0026plusmn;\u0026thinsp;2.7 mg/dL). Markers of tumor burden and cholestasis were also lower in responders. AFP levels were substantially lower in responders (mean\u0026thinsp;=\u0026thinsp;5,160.7\u0026thinsp;\u0026plusmn;\u0026thinsp;29,463.3 ng/mL) compared with non-responders (mean\u0026thinsp;=\u0026thinsp;18,646.5\u0026thinsp;\u0026plusmn;\u0026thinsp;56,254.8 ng/mL), as were ALP levels (mean\u0026thinsp;=\u0026thinsp;153.9\u0026thinsp;\u0026plusmn;\u0026thinsp;101.0 vs 211.4\u0026thinsp;\u0026plusmn;\u0026thinsp;146.5 U/L) (Table\u0026nbsp;1).\u003c/p\u003e \u003cp\u003eWith respect to treatment, responders were more likely to have received Atezo/Bev (55.1% vs 38.2%), whereas the proportion receiving Durva/Treme was similar between groups (18.0% vs 20.6%).\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"No\" id=\"Taba\" border=\"1\"\u003e \u003ccolgroup cols=\"6\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" 
char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eVariable\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eOverall (n\u0026thinsp;=\u0026thinsp;298)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eNon-response (n\u0026thinsp;=\u0026thinsp;131)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eResponse (n\u0026thinsp;=\u0026thinsp;167)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eSMD*\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c6\"\u003e \u003cp\u003eP-value\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eAge, mean (SD\u003c/b\u003e\u003csup\u003e\u003cb\u003e\u0026dagger;\u003c/b\u003e\u003c/sup\u003e\u003cb\u003e)\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e68.7 (8.9)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e68.4 (10.1)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e69.0 (8.0)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.06\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.611\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eSex, n (%)\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" 
colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eFemale\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e51 (17.1)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e26 (19.8)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e25 (15.0)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.129\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.34\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eMale\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e247 (82.9)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e105 (80.2)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e142 (85.0)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eECOG\u003c/b\u003e\u003csup\u003e\u003cb\u003e\u0026Dagger;\u003c/b\u003e\u003c/sup\u003e \u003cb\u003eperformance\u003c/b\u003e\u003c/p\u003e \u003cp\u003e\u003cb\u003estatus, n (%)\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" 
colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.553\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e\u0026lt;\u0026thinsp;0.001\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003e0\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e97 (32.6)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e28 (21.4)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e69 (41.3)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003e1\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e160 (53.7)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e74 (56.5)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e86 (51.5)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003e2\u0026ndash;3\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e41 (13.8)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e29 (22.1)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" 
colname=\"c4\"\u003e \u003cp\u003e12 (7.2)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eChild\u0026ndash;Pugh score, n (%)\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.428\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.001\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eA\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e226 (75.8)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e87 (66.4)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e139 (83.2)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eB\u0026ndash;C\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e72 (24.2)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e44 (33.6)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e28 (16.8)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e 
\u003ctd align=\"left\" colname=\"c6\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eAlbumin, mean (SD)\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e3.4 (0.6)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e3.2 (0.7)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e3.5 (0.6)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.421\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e\u0026lt;\u0026thinsp;0.001\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eTotal bilirubin, mean (SD)\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e1.5 (2.0)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e1.8 (2.7)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e1.2 (1.1)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e-0.334\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.007\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eAFP\u003c/b\u003e\u003csup\u003e\u003cb\u003e\u0026sect;\u003c/b\u003e\u003c/sup\u003e, \u003cb\u003emean (SD)\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e11089.0\u003c/p\u003e \u003cp\u003e(43764.5)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e18646.5\u003c/p\u003e 
\u003cp\u003e(56254.8)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e5160.7\u003c/p\u003e \u003cp\u003e(29463.3)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e-0.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.014\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eALP\u003c/b\u003e\u003csup\u003e\u003cb\u003e\u0026para;\u003c/b\u003e\u003c/sup\u003e, \u003cb\u003emean (SD)\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e179.2 (126.1)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e211.4 (146.5)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e153.9 (101.0)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e-0.457\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e\u0026lt;\u0026thinsp;0.001\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eTreatment regimen, n (%)\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.344\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.005\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eAtezo/Bev\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd 
align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e142 (47.7)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e50 (38.2)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e92 (55.1)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eDurva/Treme\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e57 (19.1)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e27 (20.6)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e30 (18.0)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003ctfoot\u003e \u003ctr\u003e\u003ctd colspan=\"6\"\u003e*Standardised mean difference, \u003csup\u003e\u0026dagger;\u003c/sup\u003eStandard deviation, \u003csup\u003e\u0026Dagger;\u003c/sup\u003eEastern Cooperative Oncology Group, \u003csup\u003e\u0026sect;\u003c/sup\u003eAlpha-fetoprotein, \u003csup\u003e\u0026para;\u003c/sup\u003eAlkaline phosphatase\u003c/td\u003e\u003c/tr\u003e \u003c/tfoot\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003cb\u003eTable\u0026nbsp;1: Baseline Characteristics of Patients by Response Status\u003c/b\u003e \u003c/p\u003e \u003cdiv id=\"Sec12\" class=\"Section2\"\u003e \u003ch2\u003eMultivariable Logistic Regression Analysis\u003c/h2\u003e \u003cp\u003eThe multivariable logistic regression model demonstrated improved fit compared with the null model (residual deviance\u0026thinsp;=\u0026thinsp;224.6 vs. null deviance\u0026thinsp;=\u0026thinsp;267.5; AIC\u0026thinsp;=\u0026thinsp;270.6).\u003c/p\u003e \u003cp\u003ePerformance status was strongly associated with treatment response. Patients with poorer functional status (performance status 2\u0026ndash;3) had significantly lower odds of response than those with performance status 0 (OR\u0026thinsp;=\u0026thinsp;0.21, 95% CI: 0.06\u0026ndash;0.70, p\u0026thinsp;=\u0026thinsp;0.011). 
A weaker trend was observed for performance status 1, though this did not reach statistical significance. Patients meeting the Milan criteria for liver transplantation had almost four times the odds of responding to ICIs (OR\u0026thinsp;=\u0026thinsp;3.90, 95% CI: 1.01\u0026ndash;15.0, p\u0026thinsp;=\u0026thinsp;0.049).\u003c/p\u003e \u003cp\u003eAFP and disease stage at diagnosis showed borderline associations (AFP: \u003cem\u003ep\u003c/em\u003e\u0026thinsp;=\u0026thinsp;0.067; stage III vs. stage I: \u003cem\u003ep\u003c/em\u003e\u0026thinsp;=\u0026thinsp;0.053), but did not reach statistical significance. No statistically significant associations were observed for age, sex, liver function measures (albumin, bilirubin, INR), Child\u0026ndash;Pugh class, viral hepatitis status, cirrhosis history, or treatment regimen (Durva/Treme vs. Atezo/Bev).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec13\" class=\"Section2\"\u003e \u003ch2\u003eMachine Learning Pipeline\u003c/h2\u003e \u003cp\u003eAmong the tabular models, the random forest classifier achieved the highest discriminative performance on the held-out test set, with a ROC-AUC of 0.71, outperforming support vector machines (0.69), XGBoost (0.68), and logistic regression (0.57) (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003ea). Feature importance analysis using SHAP values identified key clinical variables contributing to model predictions (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e). Key predictors of response included lower alpha-fetoprotein (AFP), liver function tests within normal range (AST, ALT, ALP, albumin, bilirubin), higher total protein, and better ECOG performance status. 
A total of 142 patients received first-line IO treatment with atezolizumab plus bevacizumab (Atezo/Bev) and 57 received durvalumab plus tremelimumab (Durva/Treme).\u003c/p\u003e \u003cp\u003eFor the text-based TF-IDF models, the support vector machine achieved the best performance following tuning, with a test ROC-AUC of 0.77, followed by logistic regression (0.75), random forest (0.66), and XGBoost (0.60). The most informative textual features included numeric tokens and clinically relevant imaging and disease-related terms, such as \u0026lsquo;arterial\u0026rsquo;, \u0026lsquo;carcinoma\u0026rsquo;, \u0026lsquo;hepatic\u0026rsquo;, \u0026lsquo;enhancement\u0026rsquo;, and \u0026lsquo;cirrhotic\u0026rsquo;, reflecting the importance of radiologic and diagnostic language in model discrimination.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec14\" class=\"Section2\"\u003e \u003ch2\u003eTrained on a) Tabular Features Only, and b) TF-IDF Text Features Only\u003c/h2\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec15\" class=\"Section2\"\u003e \u003ch2\u003eComparison of Model Performance on Treatment Regimens\u003c/h2\u003e \u003cp\u003eSubgroup analysis showed that model performance for patients receiving Atezo/Bev (test AUC-ROC 0.78) was slightly higher than for those receiving Durva/Treme (test AUC-ROC 0.69). There were no statistically significant differences in categorical or continuous variables between the two treatment groups (\u003cem\u003ep\u003c/em\u003e\u0026thinsp;\u0026gt;\u0026thinsp;0.05).\u003c/p\u003e \u003cp\u003eThe model's predicted risk scores did not differ significantly by treatment regimen. 
The mean predicted response was 0.548 for the Atezo/Bev group and 0.531 for the Durva/Treme group (t-test statistic\u0026thinsp;=\u0026thinsp;0.336, \u003cem\u003ep\u003c/em\u003e\u0026thinsp;=\u0026thinsp;0.739).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eLate fusion of the best-performing tabular (random forest) and text-based (SVM) models further improved predictive performance. The optimal fusion weight favored the text modality, with 90% contribution from the text model and 10% from the tabular model, yielding a combined ROC-AUC of 0.78.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e"},{"header":"Discussion","content":"\u003cp\u003eIn this study, we demonstrated that machine learning models can distinguish responders from non-responders among patients with aHCC treated with ICIs, with an AUC of 0.78 for our best-performing model. We also determined the key predictors of treatment efficacy and progression. Among all the models evaluated, the late-fusion model incorporating a random forest classifier for tabular clinical features and a support vector machine classifier for text-derived features achieved the best predictive performance. Key predictors of response included lower AFP levels, liver function tests within normal ranges (AST, ALT, ALP, albumin, and bilirubin), higher total protein levels, and better ECOG performance status. Our model may be especially valuable in supporting clinical decision-making for patients who are at the borderline of eligibility for immunotherapy. By identifying individuals most likely to benefit, these models may help minimize unnecessary exposure to potentially toxic treatments while optimizing therapeutic outcomes.\u003c/p\u003e \u003cp\u003eAlpha-fetoprotein (AFP) is a widely used serum biomarker for assessing treatment response in hepatocellular carcinoma (HCC). 
Our model supported the predictive value of AFP for ICI response, consistent with a meta-analysis of 131 studies showing that patients with an AFP response had significantly higher objective radiological response rates [\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e]. A Chinese study further demonstrated that an early decline in AFP from baseline to the first follow-up (i.e., 4\u0026ndash;6 weeks after treatment initiation) was associated with improved overall survival (OS) and progression-free survival (PFS) [\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e]. Similarly, liver enzymes, which are indicators of hepatic functional reserve, were significant predictors of clinical outcomes. Tian et al. showed that impaired liver function, reflected by higher Child\u0026ndash;Pugh or ALBI scores, was associated with poorer prognosis in patients receiving ICIs [\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e]. In contrast, the predictive value of PD-L1 expression remains unclear. In our model, PD-L1 levels were not associated with response to ICIs, consistent with findings from the KEYNOTE-224 and CheckMate-040 studies [\u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e, \u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eBy incorporating routinely available laboratory tests and demographic data as well as clinical text from clinician notes and investigation reports, this study demonstrates that immunotherapy response can be predicted using simple clinical inputs without the need for sophisticated biomolecular testing. 
This is particularly important given that these laboratory tests are inexpensive, widely available, and easily implemented in routine clinical practice, including in low-resource settings.\u003c/p\u003e \u003cp\u003eOur findings further support the growing body of evidence demonstrating that machine learning approaches leveraging routinely collected clinical data can aid in predicting response to immunotherapy and in identifying patients at increased risk of disease progression. Artificial intelligence (AI) and machine learning (ML) are increasingly transforming patient care by enabling precision medicine and personalised oncology treatment strategies [\u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e]. These methods can efficiently analyse and integrate large volumes of multimodal clinical data, providing timely and clinically actionable predictions [\u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e, \u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e]. In addition, AI and ML models are capable of identifying subtle or latent patterns within complex datasets that may be predictive of patient response to immune checkpoint inhibitors (ICIs) [\u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e]. Importantly, AI and ML models can iteratively improve during the training process by adjusting model parameters in response to prediction errors. Through repeated exposure to diverse patient profiles and clinical outcomes, these models refine their representations of complex, non-linear relationships among clinical, molecular, and treatment variables, thereby enhancing model generalisability and predictive accuracy [\u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eThe main strength of our study lies in the systematic evaluation of five machine learning models to identify the optimal predictive approach. 
In addition, we identified the most relevant clinical features and used a multimodal late-fusion pipeline to extract meaningful clinical information from unstructured text, thereby enhancing data input quality. To our knowledge, this study includes more patients and clinical data points than prior work, contributing to improved predictive accuracy.\u003c/p\u003e \u003cp\u003eWe acknowledge that one limitation of this study is its single-center retrospective design, with inherent biases in patient selection. However, our study is the largest to specifically assess immunotherapy response in aHCC using patients' clinical characteristics. Additionally, the number of patients who received the Atezo/Bev and Durva/Treme combinations was insufficient to detect clinically meaningful differences between treatment groups or to support robust subgroup analyses for generalizable clinical inference. Larger, more diverse cohorts are required to validate these findings across immune checkpoint\u0026ndash;based combinations, patient demographics, geographic regions, treatment paradigms, and healthcare systems.\u003c/p\u003e \u003cp\u003eThis study lays a solid foundation for identifying patients who are likely to benefit from immunotherapy. The next step toward clinical implementation is validation in larger, more diverse populations with broader ethnic backgrounds and heterogeneous disease characteristics through multicenter retrospective studies. Subsequently, prospective real-world studies will be required to evaluate model performance and determine the most clinically effective predictive algorithms.\u003c/p\u003e"},{"header":"Conclusion","content":"\u003cp\u003eWe have developed a well-validated machine learning\u0026ndash;based predictive model capable of distinguishing responders from non-responders to immunotherapy in advanced HCC using clinical features in the form of structured tables and unstructured text. 
This model may support clinical decision-making in a setting where treatment response is notoriously difficult to predict. With the inclusion of additional clinical features and external validation in future studies, this approach has the potential to evolve into a clinically useful tool for guiding immunotherapy selection in routine practice.\u003c/p\u003e"},{"header":"Declarations","content":"\u003ch3\u003eAcknowledgements\u003c/h3\u003e\n\u003cp\u003eThe authors would like to acknowledge UPMC for the compilation of and provision of access to the patient datasets used in this research. \u0026nbsp;\u003c/p\u003e\n\u003ch3\u003eEthics Approval and Consent to Participate\u003c/h3\u003e\n\u003cp\u003eThis study was reviewed and approved by the University of Pittsburgh Institutional Review Board (IRB) (protocol number: M0D23070075-002). This study used retrospectively collected, de-identified patient data, and was conducted in accordance with relevant guidelines and regulations. Informed consent to participate was granted by \u0026nbsp;all participants in the study.\u0026nbsp;\u003c/p\u003e\n\u003ch3\u003eData Availability Statement\u003c/h3\u003e\n\u003cp\u003eThe datasets generated and/or analyzed during the current study are not publicly available due to data use agreements and patient confidentiality but may be available from the UPMC subject to their approval and applicable ethical and legal requirements.\u003c/p\u003e\n\u003ch3\u003eFunding\u003c/h3\u003e\n\u003cp\u003eThis study was funded by Curenetics Ltd.\u003c/p\u003e\n\u003ch3\u003eConflicts of Interest\u0026nbsp;\u003c/h3\u003e\n\u003cp\u003eThe authors declare no potential conflicts of interest.\u003cstrong\u003e\u0026nbsp;\u003c/strong\u003e\u0026nbsp; \u003cstrong\u003e\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003ch3\u003eStatement of Significance\u003c/h3\u003e\n\u003cp\u003eThis study develops a multimodal AI model predicting HCC immunotherapy response (AUC 0.78) using routine clinical labs and 
text data, outperforming PD-L1 biomarkers. It identifies actionable predictors enabling precision selection for Atezo/Bev and Durva/Treme, addressing the critical 70% non-response rate.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eGlobal Burden of Disease Liver Cancer Collaboration. The Burden of Primary Liver Cancer and Underlying Etiologies From 1990 to 2015 at the Global, Regional, and National Level: Results From the Global Burden of Disease Study 2015. JAMA Oncol. Dec. 2017;3(12):1683\u0026ndash;91. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1001/jamaoncol.2017.3055\u003c/span\u003e\u003cspan address=\"10.1001/jamaoncol.2017.3055\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRumgay H, et al. Global burden of primary liver cancer in 2020 and predictions to 2040. J Hepatol. Dec. 2022;77(6):1598\u0026ndash;606. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.jhep.2022.08.021\u003c/span\u003e\u003cspan address=\"10.1016/j.jhep.2022.08.021\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSung H, et al. Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA Cancer J Clin. 2021;71(3):209\u0026ndash;49. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.3322/caac.21660\u003c/span\u003e\u003cspan address=\"10.3322/caac.21660\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMcGlynn KA, Petrick JL, Groopman JD. Liver Cancer: Progress and Priorities. Cancer Epidemiol Biomarkers Prev. Oct. 
2024;33(10):1261\u0026ndash;72. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1158/1055-9965.EPI-24-0686\u003c/span\u003e\u003cspan address=\"10.1158/1055-9965.EPI-24-0686\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFalette Puisieux M et al. Jan., \u0026lsquo;Therapeutic Management of Advanced Hepatocellular Carcinoma: An Updated Review\u0026rsquo;, \u003cem\u003eCancers\u003c/em\u003e, vol. 14, no. 10, p. 2357, 2022, \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.3390/cancers14102357\u003c/span\u003e\u003cspan address=\"10.3390/cancers14102357\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eShiani A, Narayanan S, Pena L, Friedman M. The Role of Diagnosis and Treatment of Underlying Liver Disease for the Prognosis of Primary Liver Cancer. Cancer Control. 2017;24(3). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1177/1073274817729240\u003c/span\u003e\u003cspan address=\"10.1177/1073274817729240\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eErratum. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2020;70(4):313. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.3322/caac.21609\u003c/span\u003e\u003cspan address=\"10.3322/caac.21609\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGarg P, Pareek S, Kulkarni P, Horne D, Salgia R, Singhal SS. Next-Generation Immunotherapy: Advancing Clinical Applications in Cancer Treatment. J Clin Med. Jan. 2024;13(21):6537. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.3390/jcm13216537\u003c/span\u003e\u003cspan address=\"10.3390/jcm13216537\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFinn RS, et al. Atezolizumab plus Bevacizumab in Unresectable Hepatocellular Carcinoma. N Engl J Med. May 2020;382(20):1894\u0026ndash;905. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1056/NEJMoa1915745\u003c/span\u003e\u003cspan address=\"10.1056/NEJMoa1915745\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAbou-Alfa GK et al. Jun., Tremelimumab plus Durvalumab in Unresectable Hepatocellular Carcinoma, \u003cem\u003eNEJM Evidence\u003c/em\u003e, vol. 1, no. 8, 2022, \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1056/evidoa2100070\u003c/span\u003e\u003cspan address=\"10.1056/evidoa2100070\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eQin R, Jin T, Xu F. Biomarkers predicting the efficacy of immune checkpoint inhibitors in hepatocellular carcinoma. Front Immunol. 2023;14:1326097. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.3389/fimmu.2023.1326097\u003c/span\u003e\u003cspan address=\"10.3389/fimmu.2023.1326097\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. PMID: 38187399; PMCID: PMC10770866.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eVentura I, Sanchiz L, Legidos-Garc\u0026iacute;a ME, Murillo-Llorente MT, P\u0026eacute;rez-Bermejo M. \u0026lsquo;Atezolizumab and Bevacizumab Combination Therapy in the Treatment of Advanced Hepatocellular Cancer\u0026rsquo;, \u003cem\u003eCancers\u003c/em\u003e, vol. 16, no. 1, p. 197, Jan. 
2024, \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.3390/cancers16010197\u003c/span\u003e\u003cspan address=\"10.3390/cancers16010197\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDoroshow DB, et al. PD-L1 as a biomarker of response to immune-checkpoint inhibitors. Nat Rev Clin Oncol. Jun. 2021;18(6):345\u0026ndash;62. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1038/s41571-021-00473-5\u003c/span\u003e\u003cspan address=\"10.1038/s41571-021-00473-5\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYang Y, et al. The predictive value of PD-L1 expression in patients with advanced hepatocellular carcinoma treated with PD-1/PD-L1 inhibitors: A systematic review and meta-analysis. Cancer Med. 2023;12(8):9282\u0026ndash;92. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1002/cam4.5676\u003c/span\u003e\u003cspan address=\"10.1002/cam4.5676\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAdlung L, Cohen Y, Mor U, Elinav E. Machine learning in clinical decision making. Med. Jun. 2021;2(6):642\u0026ndash;65. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.medj.2021.04.006\u003c/span\u003e\u003cspan address=\"10.1016/j.medj.2021.04.006\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHuang L, et al. Artificial Intelligence Can Predict Personalized Immunotherapy Outcomes in Cancer. Cancer Immunol Res. Feb. 2025;13(7):964\u0026ndash;77. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1158/2326-6066.CIR-24-1270\u003c/span\u003e\u003cspan address=\"10.1158/2326-6066.CIR-24-1270\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSharma L, Balaji V, Katumba A, Mohan E, Adeleke S. Abstract A015: Precision medicine approach to melanoma immunotherapy: Predicting response, adverse events, and hospital admissions using machine learning and explainable artificial intelligence. Clin Cancer Res. Jul. 2025;31:A015\u0026ndash;015. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1158/1557-3265.AIMACHINE-A015\u003c/span\u003e\u003cspan address=\"10.1158/1557-3265.AIMACHINE-A015\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSharma L, et al. 2P Using machine learning (ML) and explainable artificial intelligence (AI) to accurately predict immune-checkpoint inhibitor (ICI) response in small cell (SCLC) and non-small cell (NSCLC) lung cancer patients. ESMO Open. Mar. 2025;10:104157\u0026ndash;104157. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.esmoop.2025.104157\u003c/span\u003e\u003cspan address=\"10.1016/j.esmoop.2025.104157\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSharma L, et al. 3P Predicting immunotherapy-related adverse events in melanoma patients using machine learning algorithms and explainable artificial intelligence. ESMO Open. Mar. 2025;10:104158. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.esmoop.2025.104158\u003c/span\u003e\u003cspan address=\"10.1016/j.esmoop.2025.104158\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSharma L et al. Immunotherapy toxicity prediction in patients with melanoma using machine learning algorithms. J Clin Oncol, 42, no. 16_suppl, pp. e21567\u0026ndash;e21567, Jun. 2024, \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1200/jco.2024.42.16_suppl.e21567\u003c/span\u003e\u003cspan address=\"10.1200/jco.2024.42.16_suppl.e21567\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eXu J, et al. Predicting Immunotherapy Response in Unresectable Hepatocellular Carcinoma: A Comparative Study of Large Language Models and Human Experts. J Med Syst. May 2025;49(1):64. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1007/s10916-025-02192-1\u003c/span\u003e\u003cspan address=\"10.1007/s10916-025-02192-1\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eEisenhauer EA, et al. New response evaluation criteria in solid tumours: revised RECIST guideline (version 1.1). Eur J Cancer. Jan. 2009;45(2):228\u0026ndash;47. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.ejca.2008.10.026\u003c/span\u003e\u003cspan address=\"10.1016/j.ejca.2008.10.026\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003evan Buuren S, Flexible Imputation of Missing Data., Second Edition. Second edition. | Boca Raton, Florida: CRC Press, [2019] |: Chapman and Hall/CRC, 2018. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1201/9780429492259\u003c/span\u003e\u003cspan address=\"10.1201/9780429492259\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKuhn M, Johnson K. Applied predictive modeling. New York: Springer; 2019.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGuyon I, Elisseeff A. \u0026lsquo;An introduction to variable and feature selection\u0026rsquo;, \u003cem\u003eJ. Mach. Learn. Res.\u003c/em\u003e, vol. 3, no. null, pp. 1157\u0026ndash;1182, Mar. 2003.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBreiman L, Forests\u0026rsquo; \u0026lsquo;Random. \u003cem\u003eMachine Learning\u003c/em\u003e, vol. 45, no. 1, pp. 5\u0026ndash;32, Oct. 2001, \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1023/A:1010933404324\u003c/span\u003e\u003cspan address=\"10.1023/A:1010933404324\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHearst MA, Dumais ST, Osuna E, Platt J, Scholkopf B. \u0026lsquo;Support vector machines\u0026rsquo;, \u003cem\u003eIEEE Intelligent Systems and their Applications\u003c/em\u003e, vol. 13, no. 4, pp. 18\u0026ndash;28, Jul. 1998, \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1109/5254.708428\u003c/span\u003e\u003cspan address=\"10.1109/5254.708428\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChen T, Guestrin C. \u0026lsquo;XGBoost: A Scalable Tree Boosting System\u0026rsquo;, in \u003cem\u003eProceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining\u003c/em\u003e, Aug. 2016, pp. 785\u0026ndash;794. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1145/2939672.2939785\u003c/span\u003e\u003cspan address=\"10.1145/2939672.2939785\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLundberg S, Lee S. \u0026lsquo;A Unified Approach to Interpreting Model Predictions\u0026rsquo;, Nov. 25, 2017, \u003cem\u003earXiv\u003c/em\u003e: arXiv:1705.07874. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.48550/arXiv.1705.07874\u003c/span\u003e\u003cspan address=\"10.48550/arXiv.1705.07874\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAbou-Alfa GK et al. Jul., \u0026lsquo;Tremelimumab plus Durvalumab in Unresectable Hepatocellular Carcinoma\u0026rsquo;, \u003cem\u003eNEJM Evidence\u003c/em\u003e, vol. 1, no. 8, p. EVIDoa2100070, 2022, \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1056/EVIDoa2100070\u003c/span\u003e\u003cspan address=\"10.1056/EVIDoa2100070\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSalton G, Buckley C. \u0026lsquo;Term-weighting approaches in automatic text retrieval\u0026rsquo;, \u003cem\u003eInformation Processing \u0026amp; Management\u003c/em\u003e, vol. 24, no. 5, pp. 513\u0026ndash;523, Jan. 1988, \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/0306-4573(88)90021-0\u003c/span\u003e\u003cspan address=\"10.1016/0306-4573(88)90021-0\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTian et al. The prognostic and predictive value of AFP in immune checkpoint inhibitor-treated hepatocellular carcinoma: a systematic review and metaanalysisFront. 
Immunol., 04 November 2025 Sec. Cancer Immunity and Immunotherapy Volume 16\u0026ndash;2025 | \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3389/fimmu.2025.1695861\u003c/span\u003e\u003cspan address=\"10.3389/fimmu.2025.1695861\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHou G, Liu B, Fan ZQ, Li C, Zhang JP, Guo YH, Zhang RY, Zheng Y, Zhu H, Wang NY. Association between early response of alpha-fetoprotein and treatment efficacy of systemic therapy for advanced hepatocellular carcinoma: A multicenter cohort study from China. Front Oncol. 2023;12:1094104. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.3389/fonc.2022.1094104.\u003c/span\u003e\u003cspan address=\"10.3389/fonc.2022.1094104.\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e . PMID: 36686731; PMCID: PMC9846773.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTian BW, Yan LJ, Ding ZN, Liu H, Han CL, Meng GX, Xue JS, Dong ZR, Yan YC, Hong JG, Chen ZQ, Wang DX, Li T. Evaluating liver function and the impact of immune checkpoint inhibitors in the prognosis of hepatocellular carcinoma patients: A systemic review and meta-analysis. Int Immunopharmacol. 2023;114:109519. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.intimp.2022.109519.\u003c/span\u003e\u003cspan address=\"10.1016/j.intimp.2022.109519.\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e . Epub 2022 Nov 30. PMID: 36459922.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhu AX, Finn RS, Edeline J, Cattan S, Ogasawara S, Palmer D, et al. Pembrolizumab in patients with advanced hepatocellular carcinoma previously treated with sorafenib (KEYNOTE-224): a non-randomised, open-label phase 2 trial. Lancet Oncol. 2018;19(7):940\u0026ndash;52. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/S1470-2045(18)30351-6\u003c/span\u003e\u003cspan address=\"10.1016/S1470-2045(18)30351-6\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYau T, Kang YK, Kim TY, El-Khoueiry AB, Santoro A, Sangro B, Melero I, Kudo M, Hou MM, Matilla A, Tovoli F, Knox JJ, Ruth He A, El-Rayes BF, Acosta-Rivera M, Lim HY, Neely J, Shen Y, Wisniewski T, Anderson J, Hsu C. Efficacy and Safety of Nivolumab Plus Ipilimumab in Patients With Advanced Hepatocellular Carcinoma Previously Treated With Sorafenib: The CheckMate 040 Randomized Clinical Trial. JAMA Oncol. 2020;6(11):e204564. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1001/jamaoncol.2020.4564\u003c/span\u003e\u003cspan address=\"10.1001/jamaoncol.2020.4564\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. Epub 2020 Nov 12. Erratum in: JAMA Oncol. 2021 Jan 1;7(1):140. doi: 10.1001/jamaoncol.2020.6961. PMID: 33001135; PMCID: PMC7530824.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHectors SJ, Lewis S, Besa C, King MJ, Said D, Putra J, Ward S, Higashi T, Thung S, Yao S, Laface I, Schwartz M, Gnjatic S, Merad M, Hoshida Y, Taouli B. MRI radiomics features predict immuno-oncological characteristics of hepatocellular carcinoma. Eur Radiol. 2020;30(7):3759\u0026ndash;69. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1007/s00330-020-06675-2.\u003c/span\u003e\u003cspan address=\"10.1007/s00330-020-06675-2.\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e . Epub 2020 Feb 21. 
PMID: 32086577; PMCID: PMC7869026.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRakaee M, Tafavvoghi M, Ricciuti B, Alessi JV, Cortellini A, Citarella F, Nibid L, Perrone G, Adib E, Fulgenzi CAM, Hidalgo Filho CM, Di Federico A, Jabar F, Hashemi S, Houda I, Richardsen E, Rasmussen Busund LT, Donnem T, Bahce I, Pinato DJ, Helland \u0026Aring;, Sholl LM, Awad MM, Kwiatkowski DJ. Deep Learning Model for Predicting Immunotherapy Response in Advanced Non-Small Cell Lung Cancer. JAMA Oncol. 2025;11(2):109\u0026ndash;18. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1001/jamaoncol.2024.5356.\u003c/span\u003e\u003cspan address=\"10.1001/jamaoncol.2024.5356.\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e . PMID: 39724105; PMCID: PMC11843371.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eQin R, Jin T, Xu F. Biomarkers predicting the efficacy of immune checkpoint inhibitors in hepatocellular carcinoma. Front Immunol. 2023;14:1326097. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.3389/fimmu.2023.1326097\u003c/span\u003e\u003cspan address=\"10.3389/fimmu.2023.1326097\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. PMID: 38187399; PMCID: PMC10770866.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLiu J, Fu R, Su Y, Li Z, Huang X, Wang Q, Shi Z, Wei S. Applications of artificial intelligence in cancer immunotherapy: a frontier review on enhancing treatment efficacy and safety. Front Immunol. 2025;16:1676112. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.3389/fimmu.2025.1676112.\u003c/span\u003e\u003cspan address=\"10.3389/fimmu.2025.1676112.\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e . PMID: 41246300; PMCID: PMC12615365.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGao Q, Yang L, Lu M, Jin R, Ye H, Ma T. 
The artificial intelligence and machine learning in lung cancer immunotherapy. J Hematol Oncol. 2023;16(1):55. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1186/s13045-023-01456-y.\u003c/span\u003e\u003cspan address=\"10.1186/s13045-023-01456-y.\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e . PMID: 37226190; PMCID: PMC10207827.\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"bmc-cancer","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"bcan","sideBox":"Learn more about [BMC Cancer](http://bmccancer.biomedcentral.com/)","snPcode":"","submissionUrl":"https://www.editorialmanager.com/bcan/default.aspx","title":"BMC Cancer","twitterHandle":"BMC_series","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"em","reportingPortfolio":"BMC Series","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"hepatocellular carcinoma, immunotherapy, machine learning, natural language processing, predictive modeling, immune checkpoint inhibitors","lastPublishedDoi":"10.21203/rs.3.rs-9557672/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-9557672/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eImmunotherapy (IO) improves survival in advanced hepatocellular carcinoma (HCC), yet under 30% of patients respond to treatment. Existing biomarkers have shown limited predictive accuracy. 
Machine learning (ML) techniques could be used to develop prediction models that support personalised treatment. We developed and evaluated machine learning models that predict response to IO in patients with HCC. We retrospectively analyzed data from 298 patients with HCC treated with immunotherapy at the University of Pittsburgh Medical Center (UPMC) Hillman Cancer Center between December 2014 and December 2023. Of the 298 patients, 215 (71%) had stable disease and 87 (29%) had progression. The best-performing model was a late-fusion model with text and tabular features (AUC 0.78, Precision: 0.76, Recall 0.70). Good performance was retained even when restricted to tabular variables (AUC 0.71, Precision: 0.70, Recall 0.70). Key predictors of response included lower alpha-fetoprotein (AFP), liver function tests within normal range (AST, ALT, ALP, albumin, bilirubin), higher total protein and lower grade of ECOG performance status. 142 patients had first-line IO treatment with atezolizumab and bevacizumab (Atezo/Bev) and 57 patients had durvalumab and tremelimumab (Durva/Treme). The model performed well for both treatment groups (Atezo/Bev: AUC 0.78, Durva/Treme: AUC 0.69). This study demonstrates that ML models integrating both clinical and NLP-derived features can accurately predict IO response in patients with HCC. 
Future work will externally validate these results on larger datasets, with the aim of developing generalizable and clinically useful predictive models.\u003c/p\u003e","manuscriptTitle":"Predicting Immunotherapy Response in Patients with Hepatocellular Carcinoma from Clinical and Textual Features Using AI Techniques","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2026-05-13 17:59:10","doi":"10.21203/rs.3.rs-9557672/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"reviewerAgreed","content":"236719708370313634857011945602630069696","date":"2026-05-15T12:28:37+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"70909351855058161374910232210409400406","date":"2026-05-10T19:35:10+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"210861707247250685009493855989693859804","date":"2026-05-06T01:11:01+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"203662641259642569984132073988258019101","date":"2026-05-05T15:50:50+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2026-05-05T15:18:16+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2026-05-05T15:12:56+00:00","index":"","fulltext":""},{"type":"editorInvited","content":"","date":"2026-05-05T15:04:01+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2026-05-05T00:23:39+00:00","index":"","fulltext":""},{"type":"submitted","content":"BMC Cancer","date":"2026-05-05T00:20:31+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"bmc-cancer","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"bcan","sideBox":"Learn more about [BMC Cancer](http://bmccancer.biomedcentral.com/)","snPcode":"","submissionUrl":"https://www.editorialmanager.com/bcan/default.aspx","title":"BMC 
Cancer","twitterHandle":"BMC_series","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"em","reportingPortfolio":"BMC Series","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"beb16a72-69c4-4efc-a3c3-634103930d7d","owner":[],"postedDate":"May 13th, 2026","published":true,"recentEditorialEvents":[{"type":"reviewerAgreed","content":"236719708370313634857011945602630069696","date":"2026-05-15T12:28:37+00:00","index":34,"fulltext":""},{"type":"reviewerAgreed","content":"70909351855058161374910232210409400406","date":"2026-05-10T19:35:10+00:00","index":33,"fulltext":""},{"type":"reviewerAgreed","content":"210861707247250685009493855989693859804","date":"2026-05-06T01:11:01+00:00","index":30,"fulltext":""},{"type":"reviewerAgreed","content":"203662641259642569984132073988258019101","date":"2026-05-05T15:50:50+00:00","index":29,"fulltext":""},{"type":"reviewersInvited","content":"12","date":"2026-05-05T15:18:16+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2026-05-05T15:12:56+00:00","index":"","fulltext":""},{"type":"editorInvited","content":"","date":"2026-05-05T15:04:01+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2026-05-05T00:23:39+00:00","index":"","fulltext":""},{"type":"submitted","content":"BMC Cancer","date":"2026-05-05T00:20:31+00:00","index":"","fulltext":""}],"rejectedJournal":[],"revision":"","amendment":"","status":"under-review","subjectAreas":[],"tags":[],"updatedAt":"2026-05-13T17:59:11+00:00","versionOfRecord":[],"versionCreatedAt":"2026-05-13 
17:59:10","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-9557672","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-9557672","identity":"rs-9557672","version":["v1"]},"buildId":"XKTyCvWXoU3ODBz1xrDgd","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}

