Artificial Intelligence and Machine Learning for De Novo Cancer Drug Discovery: A Systematic Review of Generative Design and Validation Gaps

doi:10.21203/rs.3.rs-8408084/v1

2025 · doi:10.21203/rs.3.rs-8408084/v1

Hashim Hashim, Fahad Abubakr, Mohamed Elhassadi, Ali Hasnain

Research Square · Systematic Review · Version 1 (posted)
This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-8408084/v1
This work is licensed under a CC BY 4.0 License.

Abstract

Background
Generative artificial intelligence (AI) and machine learning (ML) are emerging as powerful tools for de novo drug discovery. Oncology, which faces arduous and lengthy development timelines, could gain considerably from these approaches. Previous reviews have generally described generative models, but none have provided a systematic and quantitative synthesis of their application to cancer drug discovery.
Methods
A PRISMA-guided systematic review of PubMed was carried out, covering January 2015 to June 20, 2025. Eligible studies applied generative AI or ML architectures to design new molecules with cancer relevance. Extracted data included study targets, model families, docking scores, binding free energies, in vitro potency (IC₅₀/EC₅₀), in vivo validation, ADME(T) assessments, code availability, and comparator performance. Analyses were descriptive and aimed at mapping the coverage and distribution of reported outcomes.

Results
From 1,130 records screened, 57 studies met the eligibility criteria. Kinases were the most frequent targets (49%), followed by enzymes, GPCRs, and immune proteins. Publications rose sharply after 2021. Fewer than half of the studies reported docking scores or in vitro potency values, and 14% described in vivo testing. Binding free energy values appeared in 26% of papers, and ADME(T) assessments in 37%. Code availability was inconsistent, with public release in 54% of papers, highlighting reproducibility gaps.

Conclusion
Generative AI demonstrates potential to design biologically active anticancer compounds. However, the evidence consists predominantly of computational results with limited experimental validation. Future work should prioritize consistent reporting, benchmarking frameworks, open code and data, and prospective in vitro and in vivo testing.

Subject areas: Drug Discovery, Design, & Development; Oncology; Artificial Intelligence and Machine Learning

Keywords: De novo drug design; Cancer; Artificial intelligence; Machine learning; Generative models; Drug discovery; Molecular design

1. Introduction

Cancer is among the leading causes of illness and death worldwide. In 2020, it accounted for an estimated 19.3 million new cases and 10 million deaths [ 1 ], and by 2022 the global burden was reported to have risen to nearly 20 million new cases [ 2 ].
The economic costs are no less striking: between 2020 and 2050, cancer care and lost productivity are projected to cost around 25.2 trillion US dollars [ 3 ]. Even with significant ongoing investment, oncology drug discovery remains slow and highly inefficient [ 4 ]. Developing a drug often costs several billion dollars, and only about 3.4% to 5.1% of oncology drugs that reach clinical testing ever gain regulatory approval [ 5 , 6 ]. These statistics show the need to identify and prioritize promising drug candidates much earlier in the pipeline.

The traditional drug discovery process is a multi-stage endeavor. It begins with target identification and validation, where researchers determine whether a biological molecule is causally linked to cancer progression and assess its suitability as a druggable target. Once a target is established, hit discovery is conducted, often through high-throughput experimental screening or virtual screening of chemical libraries, to identify initial “hit” compounds. This is followed by hit-to-lead optimization, where chemical structures are modified to improve potency, selectivity, and physicochemical properties. In the lead optimization stage, medicinal chemistry and computational modelling are used to enhance pharmacokinetic and pharmacodynamic characteristics, focusing on absorption, distribution, metabolism, excretion, and toxicity (ADME-Tox). Promising leads progress into preclinical development, which includes in vitro assays and in vivo animal studies to evaluate efficacy and safety. Only a very small fraction of candidates advance into clinical trials, which are divided into Phase I (safety and dosage), Phase II (efficacy and side effects), and Phase III (comparative effectiveness) before any regulatory approval is possible [ 7 ].
Each stage is characterized by high attrition rates, lengthy timelines, and escalating costs, which together explain why oncology remains one of the most resource-intensive areas of drug development [ 7 , 8 ].

Generative artificial intelligence (AI) and machine learning (ML) have recently gained attention as potential ways to address these inefficiencies. Unlike traditional computational methods, which mainly screen compounds from existing chemical libraries, generative models learn patterns from chemical and biological data and propose entirely new molecules [ 9 ]. Several families of models are now being applied in drug discovery, including recurrent neural networks, generative adversarial networks, and transformer-based architectures [ 9 , 10 ]. These systems can suggest compounds that balance potency, selectivity, and drug-like properties, which, in theory, reduces both the time and the cost of searching chemical space [ 11 ]. Proof-of-concept studies have even shown that generative AI can reproduce known drugs or design new scaffolds within days or weeks [ 9 ]. Such demonstrations have raised expectations that AI-driven design could play a major role in accelerating the discovery of anticancer agents.

In this article, we present a decade-long (January 2015 to June 20, 2025) systematic review that maps and quantitatively synthesizes the application of generative AI and ML to de novo cancer drug discovery. Unlike previous reviews, which were either method-oriented or disease-agnostic, this work specifically targets oncology. Our aim is to establish an evidence base that highlights both the opportunities and the limitations of these models in producing compounds with potential clinical relevance.

Our findings show that, of 1,130 records screened, 57 studies met the inclusion criteria. Kinases were by far the most common drug targets (49%), while enzymes, GPCRs, and immune proteins were less frequently studied.
Publication activity rose sharply after 2021, mirroring the broader surge in generative AI research. Fewer than half of the studies reported docking or in vitro potency data, and only 14% described in vivo validation. Binding free energy values appeared in just over one-quarter of studies, while ADME-Tox assessments were reported in about one-third. About two in five of the included studies (24 of 57) claimed that AI-generated compounds outperformed reference drugs, but most of these claims were based only on computational evidence. Reproducibility was inconsistent, with public code available in just over half of the papers. Taken together, these findings suggest that while generative AI can design molecules with promising anticancer potential, experimental validation and reproducibility remain major gaps.

Our review adopts a PRISMA-guided, cancer-specific, quantitative synthesis of de novo generative AI and ML studies. Using a defined corpus, we (i) map targets, model families, and time trends; (ii) quantify how often studies report numeric docking and binding free energy values and summarize their distributions overall and by target type and model class; (iii) track progression to biochemical, cell, and in vivo endpoints; (iv) examine claims that AI-generated compounds outperform reference drugs alongside the numerical evidence provided; and (v) assess openness and reproducibility indicators such as code availability and dataset disclosure. Where prior reviews explain how models work, our goal is to put oncology-specific numbers on what is being reported and to highlight practical steps that would make studies more comparable and more informative for translation.

The remainder of this paper is organized as follows: Section 2 presents the state of the art, Section 3 outlines methods and eligibility criteria, Section 4 presents the results, and Section 5 discusses implications, limitations, and future directions.

2. State of the Art

A systematic attempt to summarise the use of generative AI in drug discovery was made in Martinelli’s 2022 review [ 12 ]. That publication classified six major model families: recurrent neural networks, variational autoencoders, generative adversarial networks, adversarial autoencoders, evolutionary algorithms, and reinforcement learning. It also described recurring challenges, such as limited molecular diversity, uncertain synthetic accessibility, and the lack of prospective experimental validation. Importantly, Martinelli’s review was disease-agnostic and appeared before the surge of oncology-focused applications that began to emerge after 2023.

In the past three years, generative AI has been applied with increasing frequency to cancer-relevant targets. Many studies now focus on kinases and oncogenic proteins such as EGFR and KRAS, and some extend beyond docking scores to report in vitro potency. The variety of architectures has also expanded. Diffusion models have demonstrated the capacity to generate drug-like molecules and, in some cases, three-dimensional structures conditioned on binding pockets [ 13 ]. Transformer-based chemical language models, trained on large compound libraries, use natural language processing methods to explore chemical space [ 14 ]. Deep generative models have been applied to learn chemical patterns from large datasets and generate novel molecular structures with desired properties, reshaping how candidate compounds are explored in anticancer drug discovery [ 15 ].

Recent systematic reviews extend this picture. One review described how AI has been applied across the discovery pipeline, including target identification, molecular design, synthesis planning, and ADMET prediction [ 16 ].
Another highlighted recurring challenges in the field, namely data quality and bias and model interpretability, while stressing the need to integrate AI methods with traditional experimental workflows to enhance reliability and translational relevance [ 17 ]. An oncology-focused review showed how AI supports cancer drug discovery and development, including hit identification, compound generation, and biomarker-informed design [ 18 ]. A second oncology-focused review described how AI supports both therapeutic development and clinical cancer management, illustrating applications from predictive diagnostics to personalized treatment planning and improved patient care [ 19 ].

More specialized publications provide further detail. A decade-long review catalogued advances in AI-based anticancer drug design, showing how data-driven and AI-driven strategies are being deployed across diverse computational targets and screening tasks [ 20 ]. A scoping review focused on glioblastoma described how AI is used for both drug discovery and biomarker discovery in this indication of high unmet need [ 21 ]. A comprehensive oncology review demonstrated how AI is transforming clinical cancer care, from improving diagnosis and biomarker discovery to enabling individualized treatment planning and more effective patient management [ 22 ].

Unlike earlier reviews, which were method-oriented or disease-agnostic, the present work is cancer-specific. Previous surveys described model families and outlined opportunities but did not quantify translational outcomes. They rarely reported how often docking scores, binding free energies, in vitro potency, in vivo validation, or ADME(T) assessments were included. They also gave little attention to reproducibility indicators such as code and data availability. The present review provides the first systematic synthesis focused on oncology.
By mapping computational and experimental endpoints and assessing comparator claims, it shows that AI-generated molecules can reach levels of activity relevant to cancer treatment and, in some cases, show potential to outperform reference drugs.

3. Methods

3.1 Study Design

A PRISMA-guided systematic review of PubMed was carried out, covering January 2015 to June 20, 2025. Eligible studies applied generative AI or ML architectures to design new molecules with cancer relevance. Extracted data included study targets, model families, docking scores, binding free energies, in vitro potency (IC₅₀/EC₅₀), in vivo validation, ADME(T) assessments, code availability, and comparator performance. Analyses were descriptive and aimed at mapping the coverage and distribution of reported outcomes.

3.2 Data Search and Collection

3.2.1 Literature search

We searched PubMed on June 20, 2025, to find studies that applied AI or ML to de novo anticancer drug design. The search combined three concepts: AI/ML, oncology, and drug discovery or design. Both MeSH terms and free text were used. To keep the focus on original work, we left out reviews, systematic reviews, meta-analyses, editorials, commentaries, and preprints. We limited the time frame to the past ten years. The full Boolean string is given in Supplementary Appendix 1. No other databases or grey literature were checked.

3.2.2 Eligibility criteria

Studies were included if they:
- were published in English;
- had clear relevance to cancer (any type), with the cancer type noted where specified;
- applied AI or ML, specifically generative models such as a variational autoencoder, generative adversarial network, or graph-based neural network, to design molecules de novo;
- had the goal of discovering or designing novel anticancer drugs (not repurposing, screening, synergy, or response prediction);
- used AI/ML to generate new molecular entities, i.e. performed de novo drug design targeting cancer.

We excluded studies that only applied predictive or QSAR models, as well as non-original formats.

Exclusion criteria

Publications were excluded if they met any of the following conditions:
- were published only as conference abstracts without a full-text article;
- were published as opinion pieces, editorials, or letters;
- did not report empirical data;
- focused outside oncology;
- did not involve an application of artificial intelligence.

3.3 Data Pre-processing

3.3.1 Screening and selection

All records were imported into a reference manager and screened in two steps, following PRISMA 2020. First, titles and abstracts were screened by two reviewers. Disagreements were settled by consensus. Full texts of potentially eligible papers were then reviewed in the same way. Reasons for exclusion at this stage were recorded. A PRISMA flow chart was created to show the process.

3.3.2 Handling of duplicates and reviewer disagreements

Because a single database was searched, no duplicate records were identified. Disagreements were resolved by discussion.

3.4 Data / Information Extraction

3.4.1 Data extraction

We collected general details (title, first author, year, journal). Code availability was marked “yes” if a repository or link was provided and “no” if not. Targets were taken as reported, then grouped into categories (kinases, enzymes, GPCRs, immune proteins, transcription factors, epigenetic regulators, or other). All generative models were noted (for example, MORLD, REINVENT, MTMol-GPT). If more than one was used, all were listed. The model considered best was taken from author statements or from the most favorable results.

3.4.2 Computational endpoints

Docking scores and binding free energy values were recorded for AI- or ML-generated compounds only. Human-optimized analogues were excluded. If a lead compound was given, its value was used. If several were given, the most favorable was taken.
If no lead was specified, the best value across compounds was used. Where several tools were applied to the same compound, values were averaged. If tools disagreed on the top compound, the most favorable overall result was recorded. This meant that each study contributed one representative value per target. Because docking and free energy methods vary widely, absolute values are not comparable across studies. Analyses were descriptive and meant to show the spread of values rather than provide effect sizes.

ADME(T) reporting was coded as present if a study included any in silico assessment of absorption, distribution, metabolism, or excretion, with toxicity optional. Methods were not harmonized. The aim was only to capture how often pharmacokinetic properties were considered.

3.4.3 Experimental endpoints

In vitro potency (IC₅₀/EC₅₀) values were included only for AI- or ML-generated compounds. If a “best” compound was named, that value was taken. If not, the lowest reported value was used. Multiple assays for the same compound were kept as separate entries. All values were converted to nM. Predicted values and human-modified analogues were excluded. Values reported with inequality signs (for example, < 5 nM) were counted toward coverage but excluded from statistics. Because assays varied, results are shown descriptively to illustrate ranges.

In vivo work was noted at the study level if any animal experiment was reported. Species were grouped as mouse, rat, dog, C. elegans, or other. Each study was counted once regardless of how many experiments it contained. The aim was to show which animal models were used and whether efficacy was reported.

3.4.4 Comparator analysis

We noted whether AI-generated compounds were compared to a reference drug or compound. Studies were classified as showing superiority in at least one endpoint, showing no superiority, or giving no comparator.
When superiority was claimed, we noted the type of evidence (docking, free energy, IC₅₀/EC₅₀, or therapeutic index). An “outperformed reference” field was created to record overall claims.

3.5 Data Analysis

3.5.1 Data synthesis and analysis

Docking scores and free energy values were summarized using minimum, quartiles, median, and maximum. These show distributions, not cross-study effect sizes. In vitro potency values were summarized separately for biochemical and cell-based assays. Numeric values went into descriptive statistics. Censored values counted toward coverage only. Probe likeness was assessed using common thresholds: < 100 nM for biochemical assays and < 1 µM for cell-based assays, consistent with Structural Genomics Consortium and EUbOPEN guidance. In vivo data were summarized as the proportion of studies reporting animal experiments, grouped by species. ADME(T) reporting was expressed as the share of studies including pharmacokinetic evaluation. Temporal patterns were described by calculating, for each year, the proportion of studies reporting docking, free energy, in vitro, in vivo, or ADME(T) data. For multi-model studies, all models were listed, and the best performer was taken from author statements or results. Comparator results were summarized as the frequency of superiority claims and the type of evidence used.

3.5.2 Reproducibility and transparency

The study followed PRISMA 2020. Title and abstract screening, full-text review, and data extraction were done by two reviewers independently. Conflicts were resolved by discussion. Extraction rules were written in advance and applied consistently. The complete dataset is given in the Supplementary Tables. The full search strategy is also provided to support transparency and reproducibility.

4. Results

4.1 Study Selection

A single-source search of PubMed conducted on June 20, 2025 identified 1,130 records.
No additional records were obtained from other sources or databases, and no duplicates were removed, leaving 1,130 records for title and abstract screening. Of these, 1,046 were excluded because they did not meet the inclusion criteria. We sought 84 full-text reports for retrieval; 11 could not be obtained. We assessed 73 full-text reports for eligibility and excluded 16 for the following primary reasons: wrong publication type (n = 1), not AI/ML (n = 4), not drug discovery (n = 1), and not de novo design (n = 10). The remaining 57 studies were included in the review.

4.2 Study Characteristics

The 57 included studies are described in terms of target type, year of publication, code availability, and data completeness. Each of these features is reported in the following subsections.

4.2.1 Target types

The categories were grouped to reflect standard divisions in cancer drug discovery. Kinases were separated as the most widely targeted signalling proteins, while enzymes were grouped to cover metabolic and catalytic regulators distinct from kinases. GPCRs and immune proteins were kept separate for their roles in receptor signalling and tumour-immune interactions. Transcription factors and epigenetic regulators were noted independently as classes that are harder to drug but of growing interest. Less common types such as GTPases, kinesins, and structural proteins were retained as individual groups to capture exploratory efforts.

The most common category was kinases (28 studies; 49.1%), followed by enzymes (11; 19.3%), cell/omics context (8; 14.0%), and GPCRs (8; 14.0%). Immune proteins appeared in 4 studies (7.0%), and transcription factors in 3 (5.3%). Less common were epigenetic regulators (2; 3.5%), GTPases (1; 1.8%), kinesins (1; 1.8%), and structural proteins (1; 1.8%).

4.2.2 Publication years

Included studies were published between 2016 and 2025. Output was sparse before 2021, with one study each in 2016, 2018, 2019, and 2020.
Annual publications increased to 5 in each of 2021 and 2022, rose to 9 in 2023, and peaked at 22 in 2024. As of June 20, 2025, output remained high, with 12 studies published that year.

4.2.3 Code availability

Public code was available for 31 studies (54.4%), while 26 studies (45.6%) had no publicly available code. Availability was 0% from 2016 to 2020. It increased to 60% in 2021 but dropped to 40% in 2022. It then rose slightly to 44.4% in 2023, peaked at 68.2% in 2024, and was 58.3% in 2025.

4.2.4 Data completeness

Numeric docking scores were reported in 21 studies (36.8%), and binding free energy (BFE) values in 15 studies (26.3%). In vitro values were available for 21 studies (36.8%). In vivo experiments were reported in 8 studies (14.0%), of which 7 (12.3% of all studies) specified an animal model. The “outperformed reference” field contained a substantive entry in 29 studies (50.9%).

4.3 Docking Score Coverage and Distribution

4.3.1 Overall distribution of docking scores

Numeric docking scores were reported in 21 of the 57 included studies (36.8%), yielding 28 individual values. Reported scores ranged from −54.38 kcal·mol⁻¹ to −7.49 kcal·mol⁻¹, with a median of −11.91 kcal·mol⁻¹ [IQR −14.17 to −10.21]. In several cases, a single study contributed more than one docking score, reflecting the evaluation of multiple targets.

4.3.2 Temporal trend

The proportion of included studies reporting a numeric docking score is shown in the supplementary tables. No docking scores were reported prior to 2020. In 2020, the only included study contained a docking score (100%). Subsequent years showed variable reporting rates, with 20.0% of studies in 2021 (1 of 5), 40.0% in 2022 (2 of 5), and 22.2% in 2023 (2 of 9). Reporting increased to 36.4% in 2024 (8 of 22) and 58.3% in 2025 (7 of 12). While recent years suggest greater uptake in reporting docking scores, coverage remains inconsistent across publication years.
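The yearly reporting rates above follow directly from the per-year counts given in Sections 4.2.2 and 4.3.2. As a minimal sketch (the function and variable names are illustrative assumptions, not part of the review's own analysis code), the calculation described in Section 3.5.1 can be reproduced as:

```python
# Counts transcribed from Sections 4.2.2 and 4.3.2 of this review.
# Names below are illustrative; the review does not publish this code.
STUDIES_PER_YEAR = {
    2016: 1, 2018: 1, 2019: 1, 2020: 1,
    2021: 5, 2022: 5, 2023: 9, 2024: 22, 2025: 12,
}
DOCKING_PER_YEAR = {2020: 1, 2021: 1, 2022: 2, 2023: 2, 2024: 8, 2025: 7}


def reporting_rate_by_year(reported, totals):
    """Return {year: percent of that year's studies reporting the endpoint}."""
    rates = {}
    for year, total in sorted(totals.items()):
        n_reported = reported.get(year, 0)  # years with no reports count as 0
        rates[year] = round(100 * n_reported / total, 1)
    return rates


rates = reporting_rate_by_year(DOCKING_PER_YEAR, STUDIES_PER_YEAR)
for year, pct in rates.items():
    print(f"{year}: {pct:.1f}%")
```

The same function applies unchanged to the other endpoints (binding free energy, in vitro, in vivo, ADME(T)) once their per-year counts are substituted.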
4.3.3 Docking scores by target type

Median docking scores by target type are summarised in the supplementary tables. Kinases were the most frequently assessed, with 11 studies reporting docking scores (median −13.72 kcal·mol⁻¹, IQR −23.16 to −11.12). GPCRs were the second most common (n = 4, median −10.20 kcal·mol⁻¹, IQR −10.70 to −9.83), followed by enzymes (n = 3, median −11.10 kcal·mol⁻¹, IQR −11.10 to −10.30). Other target types, including immune proteins, epigenetic regulators, ion channels, transcription factors, and transporters, were represented by only one or two studies each, limiting the interpretability of their docking score distributions. Across all categories, more negative docking scores indicate more favourable predicted binding affinity.

4.4 Binding Free Energy Coverage and Distribution

4.4.1 Overall distribution of binding free energies

Numeric binding free energy (BFE) values were reported in 15 of the 57 included studies (26.3%), yielding 17 individual values. Reported values ranged from −255.50 kcal·mol⁻¹ to −8.44 kcal·mol⁻¹, with a median of −40.02 kcal·mol⁻¹ [IQR −69.41 to −14.06]. In two cases, a single study contributed more than one BFE value, reflecting the evaluation of multiple targets.

4.4.2 Temporal trend

The proportion of included studies reporting a numeric BFE value is shown in Fig. 8. No BFEs were reported prior to 2022. In 2022, 20.0% of studies (1 of 5) reported a BFE value. Reporting remained similar in 2023 (22.2%, 2 of 9), then increased to 36.4% in 2024 (8 of 22). In 2025, coverage was slightly lower at 33.3% (4 of 12). While recent years indicate greater uptake in reporting BFE values compared with earlier periods, coverage remains inconsistent across publication years.

4.4.3 BFE values by target type

Median BFE values by target type are summarised in the supplementary table.
Kinases were the most frequently assessed, with six studies reporting BFE values (median −12.99 kcal·mol⁻¹, IQR −33.71 to −9.13). Enzymes were the next most common (n = 4, median −30.72 kcal·mol⁻¹, IQR −66.55 to −24.76). Immune proteins and GPCRs were represented by only two studies each, limiting the interpretability of their BFE distributions. The remaining target classes in the dataset had no BFE values reported. Across all categories, more negative BFE values indicate more favourable predicted binding affinity.

4.5 In vitro potency assays

4.5.1 In vitro potency coverage and distribution

Experimental in vitro potency values were reported in 21 unique studies. Within these, 19 studies contributed 22 biochemical assay results, and 6 studies contributed 6 cell-based assay results. Among the 19 numeric biochemical values, potencies ranged from 0.09 nM to 764.0 nM, with a median of 13.5 nM [IQR 1.45–37.8 nM]. These values indicate that most AI-generated compounds with reported biochemical data demonstrated low-nanomolar activity, although a small number extended into the high-nanomolar range. Cell-based assay data were limited (n = 6) and are presented descriptively without distributional statistics.

4.5.2 Temporal trends in in vitro reporting

Across the extraction period, 18 unique studies reported experimental in vitro potency values. Reporting was sporadic in the early years, with one study each in 2016, 2018, 2019, and 2020. In 2021, only one of the five included studies reported in vitro data. More consistent reporting appeared from 2022 onward, with 2 studies in 2022, 4 in 2023, 5 in 2024, and 2 in 2025 including experimental in vitro results. Despite this increase, studies with in vitro validation remain a minority of each year's publications. For example, in 2024, only 22.7% of studies reported experimental IC₅₀/EC₅₀ values, and in 2025 the share was 16.7%.
These findings indicate that although in vitro validation of AI-generated compounds has become more frequent in recent years, it is still inconsistently reported.

4.5.3 Probe likeness assessment

Across all experimental in vitro assays, 14 biochemical entries and 6 cell-based entries were available for probe-likeness evaluation. Using conventional thresholds (< 100 nM for biochemical assays and < 1,000 nM for cellular assays), 11 of 14 biochemical entries (78.6%) and 5 of 6 cellular entries (83.3%) qualified as probe-like. These findings suggest that most AI-generated compounds with experimental data achieve potencies within ranges considered suitable for high-quality chemical probes. However, because assay conditions varied across studies, these results should be regarded as descriptive only and not directly comparable between assay families.

Our thresholds are grounded in established chemical biology standards. Structural Genomics Consortium (SGC) guidelines define a high-quality chemical probe as having biochemical in vitro potency < 100 nM and cellular activity at ≤ 1 µM [ 23 ]. EUbOPEN, a large probe consortium, similarly specifies potency ≤ 100 nM (biochemical) and evidence of cellular engagement at ≤ 1 µM [ 24 ]. These values are widely cited in the chemical probe literature and consistently used as benchmarks for qualifying probe-like behaviour.

4.6 In vivo and animal models

Eight of the 57 included studies (14.0%) reported in vivo evaluation of AI/ML-generated anticancer compounds. The majority employed mice (6 studies, 75%), while additional models included dogs (1 study, 12.5%), rats (1 study, 12.5%), and C. elegans (1 study, 12.5%). One study (12.5%) reported an animal model that could not be mapped to these categories. Notably, all reported in vivo experiments demonstrated efficacy of at least one generated compound against the intended cancer model.
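The probe-likeness rule in Section 4.5.3, combined with the unit conversion to nM described in Section 3.4.3, can be sketched as follows. This is a minimal illustration under stated assumptions: the `PotencyEntry` type, the function names, and the example entries are hypothetical and are not taken from the review's dataset, and excluding censored values from the classification is an assumption carried over from how the review treats them in its descriptive statistics.

```python
from dataclasses import dataclass

# Strict "<" thresholds in nM, as stated in Section 4.5.3.
THRESHOLD_NM = {"biochemical": 100.0, "cell": 1000.0}

# Conversion factors to nM (Section 3.4.3 converts all potencies to nM).
TO_NM = {"pM": 1e-3, "nM": 1.0, "uM": 1e3, "mM": 1e6}


@dataclass
class PotencyEntry:
    assay: str              # "biochemical" or "cell"
    value: float            # reported IC50/EC50
    unit: str               # one of the keys in TO_NM
    censored: bool = False  # True for values such as "< 5 nM"


def to_nm(value, unit):
    """Convert a potency value to nanomolar."""
    return value * TO_NM[unit]


def is_probe_like(entry):
    """Apply the assay-specific threshold to a single numeric entry."""
    return to_nm(entry.value, entry.unit) < THRESHOLD_NM[entry.assay]


def probe_like_share(entries, assay):
    """Return (hits, n, percent) for one assay family, or None if empty.

    Assumption: censored entries are skipped here, mirroring their
    exclusion from the review's descriptive statistics.
    """
    numeric = [e for e in entries if e.assay == assay and not e.censored]
    if not numeric:
        return None
    hits = sum(is_probe_like(e) for e in numeric)
    return hits, len(numeric), round(100 * hits / len(numeric), 1)


# Hypothetical entries for illustration only.
entries = [
    PotencyEntry("biochemical", 0.09, "nM"),
    PotencyEntry("biochemical", 0.0135, "uM"),
    PotencyEntry("biochemical", 764.0, "nM"),
    PotencyEntry("biochemical", 5.0, "nM", censored=True),
    PotencyEntry("cell", 0.5, "uM"),
    PotencyEntry("cell", 2.0, "uM"),
]
print(probe_like_share(entries, "biochemical"))
print(probe_like_share(entries, "cell"))
```

Keeping the thresholds in a single dictionary makes it easy to switch to the "≤" form used in the EUbOPEN wording if that reading is preferred; only the comparison in `is_probe_like` would change.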
4.7 In silico ADME(T) Coverage 4.7.1 Coverage of ADME(T) reporting Out of 57 included studies, 21 (36.8%) fulfilled the ADMET criterion, indicating that just over one-third of AI/ML drug design papers integrated systematic pharmacokinetic evaluation. Toxicity features were occasionally included but were not required for classification under this criterion. 4.7.2 Co-reporting with other data Among the 21 ADMET-reporting studies, 9 (43%) also presented docking scores, 10 (48%) reported BFE estimates, and 7 (33%) included experimental in-vitro assays. These findings show that ADMET evaluation was frequently co-reported alongside other forms of validation, although only about one-third of studies paired ADMET with experimental assays. 4.7.3 Temporal trend of ADME(T) reporting ADMET reporting was absent in 2018–2020 and 2022 but has steadily increased since 2021. The share of ADMET-reporting studies rose from 40% in 2021 to 58.3% in 2025, indicating growing emphasis on pharmacokinetic evaluation in recent years. 4.8 Multi-model evaluations Eleven studies directly compared more than one modelling approach. Wang et al. (2025) tested MOFF, FREED, REINVENT, LIMO, and MORLD, with MOFF giving the best docking results. Qin et al. (2025) compared ChEMBL, DRD2, and A2B-60, where A2B-60 was superior. Another study by Wang et al. (2025) evaluated Mamba and several GPT variants, with Mamba and T5MolGen outperforming the others. Ai et al. (2024) introduced MTMol-GPT and showed that MTMol-GPT and SF-MTMol-GPT were stronger than DLGN, RationaleRL, and CMolRNN. Aksamit et al. (2024) found ReLSO better than FragNet. Chandraghatgi et al. (2024) reported FDSL-DD outperforming AutoGrow4 and DeepFrag. Chen et al. (2024) identified AIxFuse as the best among REINVENT 2.0, RationaleRL, and MARS. Mukaidaisi et al. (2024) found the Adversarial Autoencoder superior to Junction Tree VAE and Fragment VAE. Xu et al. 
(2024) showed that Graph GA and HELM-GPT performed best among GA, LSTM, and transformer variants. Yang et al. (2024) reported that a sequence of RHSH then MARS-QSAR outperformed MARS, RationaleRL, and MolEvol. Finally, Yang L. et al. (2022) showed that the Wasserstein autoencoder outperformed Beta VAE and GAN in anticancer peptide generation. Across these studies, fragment-based methods (MOFF, FDSL-DD), task-tuned architectures (A2B-60, AIxFuse, ReLSO), and advanced generative approaches (Mamba with T5MolGen, MTMol-GPT, WAE) most often produced the best candidates, although superiority was target-specific.
4.9 Comparative performance vs reference drugs/compounds
4.9.1 Overall frequency of outperformance
Across the included studies, twenty-four reported that at least one AI/ML-generated compound outperformed a reference drug or benchmark compound. Five studies explicitly stated that the generated molecules did not exceed the reference, while twenty-eight provided no clear comparative statement. This indicates that roughly two-fifths of the literature (24 of 57, 42.1%) made explicit claims of superiority over established comparators, although reporting practices varied.
4.9.2 Type of evidence provided
Among the 24 studies that reported superiority of AI/ML-designed compounds over reference drugs or compounds, most claims were supported by in silico or in vitro metrics. Docking score comparisons were the most frequent, with ten papers (41.7%) showing more favourable binding scores than the reference compound. Eight studies (33.3%) reported superiority in IC50 values, while eight (33.3%) provided more favourable binding free energy estimates. A smaller number of studies supported superiority through EC50 (n = 1, 4.2%) or TI50 (n = 1, 4.2%). These findings highlight that docking and potency assays were the dominant forms of evidence used to demonstrate comparative advantages.
5.
Discussion
5.1 Summary of findings
In this PRISMA-guided systematic review, we analysed 57 studies that applied generative AI and machine learning to the de novo design of cancer drugs. Kinases emerged as the predominant targets, with enzymes, GPCRs, and immune-related proteins investigated less frequently. Publication activity rose sharply after 2021, a trend that parallels both the growing enthusiasm for AI in society at large and its accelerating adoption within oncology research. Yet, fewer than 40% of studies reported docking scores or in-vitro potency values, and only 14% incorporated in-vivo experiments. Binding free energy calculations were included in roughly one-quarter of the papers, while just over one-third provided ADME(T) assessments. About two-fifths of the studies (24 of 57) suggested that AI-generated compounds outperformed reference drugs, although these claims were usually based only on in-silico evidence. Together, these findings illustrate both the considerable potential of generative models to yield active anticancer molecules and the uneven progress toward endpoints of clear translational relevance. At this stage, the field can be seen as proof-of-concept: generative AI is able to design molecules with measurable anticancer activity, but validation remains inconsistent and fragmented. By mapping computational and experimental endpoints and assessing comparator claims, this review shows that AI-generated compounds can reach levels of activity relevant to cancer treatment and in some cases demonstrate potential to outperform reference drugs. The result is the first systematic evidence map specific to oncology, which not only charts the progress to date but also makes visible the gaps that must be closed for translation into practice.
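The proportions summarised above can be recomputed from the counts given in Sections 4.6, 4.7.1, and 4.9. The following Python sketch does only that arithmetic. The dictionary labels and function names are illustrative choices rather than anything defined in the review, and the note that one study may supply several evidence types is an inference from the evidence counts summing to more than 24.

```python
TOTAL_STUDIES = 57

# Counts as reported in Sections 4.6, 4.7.1 and 4.9.1 (denominator: all 57 studies).
COUNTS_OF_ALL = {
    "in-vivo evaluation": 8,
    "ADME(T) reporting": 21,
    "claimed outperformance": 24,
    "did not exceed reference": 5,
    "no clear comparative statement": 28,
}

# Evidence types reported in Section 4.9.2 (denominator: the 24 outperformance claims).
# The counts sum to more than 24, presumably because one study can supply
# more than one type of evidence.
EVIDENCE_OF_CLAIMS = {
    "docking score": 10,
    "IC50": 8,
    "binding free energy": 8,
    "EC50": 1,
    "TI50": 1,
}


def percent(count: int, total: int) -> float:
    """Percentage rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * count / total, 1)


def summarise(counts: dict, total: int) -> dict:
    """Map each label to its percentage of the given total."""
    return {label: percent(n, total) for label, n in counts.items()}


if __name__ == "__main__":
    for label, pct in summarise(COUNTS_OF_ALL, TOTAL_STUDIES).items():
        print(f"{label}: {pct}% of {TOTAL_STUDIES} studies")
    n_claims = COUNTS_OF_ALL["claimed outperformance"]
    for label, pct in summarise(EVIDENCE_OF_CLAIMS, n_claims).items():
        print(f"{label}: {pct}% of {n_claims} claims")
```

Keeping the two denominators (57 studies and 24 claims) in separate tables avoids mixing study-level and claim-level percentages.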
5.2 Context with existing literature
Previous reviews of generative models in drug discovery have been mostly method-driven or disease-agnostic, with an emphasis on algorithmic innovation rather than on translational benchmarks. Our work extends this literature by concentrating on oncology and by quantifying outcomes that are directly relevant to translation. We report how often docking scores, binding free energy estimates, potency assays, and in-vivo studies are included, and how frequently AI-generated compounds are compared with reference drugs. This oncology focus brings to light patterns that broader reviews have overlooked: the predominance of kinase targets, the inconsistent reporting of docking and potency data, and the limited extent of experimental validation. By situating generative AI within the cancer domain, our synthesis reflects both the enthusiasm propelling the field and the critical gaps that continue to limit reproducibility and meaningful comparison.
5.3 Interpretation and implications
The results point to the capacity of generative AI to yield compounds with genuine biological activity. In several studies, reported biochemical potencies reached the low nanomolar range, and a fair number of compounds passed probe-likeness thresholds, suggesting that these designs are not merely theoretical but can serve as practical starting points. More recently, ADME(T) reporting has become more common, hinting at growing awareness of pharmacokinetic issues in early discovery. Yet in-vivo validation is still rare, which shows how seldom AI-generated molecules are examined in conditions that resemble real disease settings. Because of this, claims that new compounds outperform reference drugs need to be interpreted carefully, especially when they rely only on docking or free-energy predictions. In practice, generative AI is changing how leads are conceived, yet the slower movement toward experimental confirmation remains a major challenge.
5.4 Strengths and limitations
Strengths: The main strengths of this review lie in the use of strict eligibility criteria, which restricted the corpus to genuinely de novo generative studies. By covering a ten-year time frame, the review captured the evolution of the field, showing both early proof-of-concept studies and the surge of work after 2021. Comparator claims were also analysed, documenting how often AI-generated compounds were claimed to outperform reference drugs or compounds and what type of evidence supported those claims. We also applied a structured template that allowed us to capture a wide set of endpoints. Unlike earlier reviews, this work quantifies both computational and experimental outcomes in a cancer-specific context.
Limitations: Our analyses are descriptive, as the heterogeneity of methods made direct comparison or meta-analysis unrealistic. Docking scores, binding free energy estimates, and potency data were reported as they appeared, without harmonising across different software, so absolute values should not be read as effect sizes. The search strategy was restricted to PubMed, which means relevant work in other databases or unpublished sources may not have been included. Limited experimental validation was available overall, with very few in-vivo studies and most superiority claims based only on computational metrics. Finally, selective reporting within individual studies remains a concern, since in many cases only the most favourable compounds or assays were described, introducing a positivity bias that may overestimate apparent success.
5.5 Recommendations
Future studies should report docking scores, binding free energy values, and potency data more consistently, with clear description of the methods used. Stronger experimental validation is required, particularly through in-vitro and in-vivo testing, to establish biological relevance beyond computational outputs.
Comparator analysis should be applied in a standardised way, with transparent reporting of both favourable and unfavourable results to limit positivity bias. ADME(T) evaluation should be incorporated more routinely, as pharmacokinetic properties are essential for assessing translational potential. Code and dataset availability should be prioritised, since open resources are necessary for reproducibility and fair comparison across models. Benchmarking frameworks should be developed to evaluate generative approaches consistently across different targets and endpoints.
5.6 Future Work
The evidence map created here can be extended by linking docking and binding free energy distributions with experimental potency data to explore correlations between computational predictions and biological outcomes. Future work should examine in greater depth the subset of AI-generated compounds that reached in-vivo validation to understand which modelling approaches are most likely to translate into efficacy. Probe-likeness assessments could be expanded into prospective evaluations, testing whether AI-generated molecules consistently meet chemical biology standards when taken forward experimentally. Finally, updating this evidence synthesis at regular intervals will allow tracking of progress over time, particularly as newer generative families such as diffusion and transformer-based models expand in oncology.
6. Conclusion
This review, conducted under PRISMA guidance, is, to our knowledge, the first quantitative synthesis focused on cancer studies using de novo generative artificial intelligence and machine learning. Out of 57 included papers, kinase targets were by far the most common. Enzymes, GPCRs and immune proteins appeared much less frequently. The number of publications grew rapidly after 2021, but fewer than half of the studies reported docking scores or in vitro potency values, and only a small group described in vivo testing.
Binding free energy results were reported in about one quarter of the papers, and ADME(T) assessments in roughly one third. About two-fifths of the studies (24 of 57) suggested that AI-generated molecules were superior to reference drugs, but in most cases the evidence came only from computational comparisons. Taken together, these observations show that generative methods can design active anticancer compounds, yet they also reveal major shortcomings in validation and reproducibility. At present, the evidence base remains largely computational, with only limited movement into biological systems. Looking ahead, several priorities are clear. Reporting of docking and potency data should be more consistent. Benchmarking frameworks are needed so that models can be compared on a fair basis. The release of code and datasets would support transparency and reproducibility. Most of all, stronger experimental testing, both in vitro and in vivo, will be essential to determine the true value of generative AI for oncology. If these steps are taken, the field can move from early proof of concept toward practical tools that accelerate the development of new cancer treatments.
Declarations
Ethics statement
This study did not involve human participants or animal experimentation. Ethical approval and informed consent were therefore not required.
Declaration of competing interests
The authors declare no competing interests.
Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Author contributions (CRediT taxonomy)
Conceptualization: Hashim Hashim, Ali Hasnain
Methodology: Hashim Hashim, Ali Hasnain
Data curation: Hashim Hashim, Fahad Abubakr
Formal analysis: Hashim Hashim, Fahad Abubakr, Mohamed Elhassadi
Writing – original draft: Hashim Hashim
Writing – review & editing: All authors
Supervision: Ali Hasnain
Acknowledgements
The authors thank colleagues at the Royal College of Surgeons in Ireland for academic support and constructive discussion. The authors thank Nisreen Abdulsalam for her careful reading of the manuscript and constructive feedback on clarity and presentation.
References
Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, et al. Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA: A Cancer Journal for Clinicians. 2021;71(3):209-49. Bray F, Laversanne M, Weiderpass E, Soerjomataram I. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide. CA Cancer J Clin. 2024;74(3):229-63. Bray F, Laversanne M, Sung H, Ferlay J, Siegel RL, Soerjomataram I, et al. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA: A Cancer Journal for Clinicians. 2024;74(3):229-63. Scannell JW, Blanckley A, Boldon H, Warrington B. Diagnosing the decline in pharmaceutical R&D efficiency. Nature Reviews Drug Discovery. 2012;11(3):191-200. Wouters OJ, McKee M, Luyten J. Estimated Research and Development Investment Needed to Bring a New Medicine to Market, 2009-2018. JAMA. 2020;323(9):844-53. Wong CH, Siah KW, Lo AW. Estimation of clinical trial success rates and related parameters. Biostatistics. 2018;20(2):273-86. Hughes JP, Rees S, Kalindjian SB, Philpott KL. Principles of early drug discovery. Br J Pharmacol. 2011;162(6):1239-49. DiMasi JA, Grabowski HG, Hansen RW. Innovation in the pharmaceutical industry: New estimates of R&D costs.
Journal of Health Economics. 2016;47:20-33. Sanchez-Lengeling B, Aspuru-Guzik A. Inverse molecular design using machine learning: Generative models for matter engineering. Science. 2018;361(6400):360-5. Walters WP, Barzilay R. Applications of Deep Learning in Molecule Generation and Molecular Property Prediction. Accounts of Chemical Research. 2021;54(2):263-70. Zeng X, Wang F, Luo Y, Kang S-g, Tang J, Lightstone FC, et al. Deep generative molecular design reshapes drug discovery. Cell Reports Medicine. 2022;3(12):100794. Martinelli DD. Generative machine learning for de novo drug discovery: A systematic review. Computers in Biology and Medicine. 2022;145:105403. Yim J, Stärk H, Corso G, Jing B, Barzilay R, Jaakkola TS. Diffusion models in protein structure and docking. WIREs Computational Molecular Science. 2024;14(2):e1711. Alakhdar A, Poczos B, Washburn N. Diffusion Models in De Novo Drug Design. Journal of Chemical Information and Modeling. 2024;64(19):7238-56. Romanelli V, Cerchia C, Lavecchia A. Deep generative models in the quest for anticancer drugs: ways forward. Frontiers in Drug Discovery. 2024;Volume 4 - 2024. Odah M. Artificial Intelligence Meets Drug Discovery: A Systematic Review on AI-Powered Target Identification and Molecular Design. Preprints: Preprints; 2025. Abbas MKG, Rassam A, Karamshahi F, Abunora R, Abouseada M. The Role of AI in Drug Discovery. ChemBioChem. 2024;25(14):e202300816. Sarvepalli S, Vadarevu S. Role of artificial intelligence in cancer drug discovery and development. Cancer Letters. 2025;627:217821. F. A. Artificial Intelligence in Oncology: Applications, Challenges and Future Frontiers. International Journal of Pharmaceutical Investigation. 2024 Jul 1;14(3):647–56. Wang L, Song Y, Wang H, Zhang X, Wang M, He J, et al. Advances of Artificial Intelligence in Anti-Cancer Drug Design: A Review of the Past Decade. Pharmaceuticals. 2023;16(2):253. Conte L, Caruso G, Philip AK, Cucci F, De Nunzio G, Cascio D, et al. 
Artificial Intelligence-Assisted Drug and Biomarker Discovery for Glioblastoma: A Scoping Review of the Literature. Cancers. 2025;17(4):571. Tiwari A, Mishra S, Kuo T-R. Current AI technologies in cancer diagnostics and treatment. Molecular Cancer. 2025;24(1):159. Structural Genomics Consortium (SGC-UNC). Chemical Probes [Internet]. Available from: https://www.sgc-unc.org/main-st; Accessed August 2025. EUbOPEN. Chemical Probes [Internet]. Available from: https://www.eubopen.org/chemical-probes; Accessed August 2025. A Y, F M, J B. Approach for the Design of Covalent Protein Kinase Inhibitors via Focused Deep Generative Modeling. Molecules (Basel, Switzerland). 2022;27(2). A Y, H H, J B. Adapting the DeepSARM approach for dual-target ligand design. Journal of computer-aided molecular design. 2021;35(5):587-600. A Y, Y A, E K, T T, S M, T S, et al. Design and Synthesis of DDR1 Inhibitors with a Desired Pharmacophore Using Deep Generative Models. ChemMedChem. 2021;16(6):955-8. AK P. AI-assisted generation and in-depth in-silico evaluation of potential inhibitor targeting aurora kinase A (AURKA): An anticancer discovery exploiting synthetic lethality approach. Archives of biochemistry and biophysics. 2024;762:110209. B O, Z F, W M, R S. A Deep-Learning Proteomic-Scale Approach for Drug Design. Pharmaceuticals (Basel, Switzerland). 2021;14(12). C A, H Y, X L, R D, Y D, F G. MTMol-GPT: De novo multi-target molecular generation with transformer-based generative adversarial imitation learning. PLoS computational biology. 2024;20(6):e1012229. C P, H C, H Y, X C, M Z, L Q, et al. Oral ENPP1 inhibitor designed using generative AI as next generation STING modulator for solid tumors. Nature communications. 2025;16(1):4793. C Y, S U, K K, M I, Y Y. De novo drug design based on patient gene expression profiles via deep learning. Molecular informatics. 2023;42(8):e2300064. CT F. PD-1 Targeted Antibody Discovery Using AI Protein Diffusion. Technology in cancer research & treatment. 
2024;23:15330338241275947. D D, B C, R S, A R. Gex2SGen: Designing Drug-like Molecules from Desired Gene Expression Signatures. Journal of chemical information and modeling. 2023;63(7):1882-93. D H, Q L, Y M, Q M, L X, C H, et al. De Novo Generation and Identification of Novel Compounds with Drug Efficacy Based on Machine Learning. Advanced science (Weinheim, Baden-Wurttemberg, Germany). 2024;11(11):e2307245. D N, A H, RA B, M H. Machine Learning Application for Medicinal Chemistry: Colchicine Case, New Structures, and Anticancer Activity Prediction. Pharmaceuticals (Basel, Switzerland). 2024;17(2). D R, S Y, N B, A S, S M, C T. TumFlow: An AI Model for Predicting New Anticancer Molecules. International journal of molecular sciences. 2024;25(11). D S, L X, M T, Z W, W Z, J L, et al. De novo design of mIDH1 inhibitors by integrating deep learning and molecular modeling. Frontiers in pharmacology. 2024;15:1491699. DS B, DA O, VF O, AO A, A M, UC O, et al. In-silico-based lead optimization of hit compounds targeting mitotic kinesin Eg5 for cancer management. In silico pharmacology. 2025;13(1):9. EG G, P V, P G-N, E U, G M-A, C P, et al. AI-Driven De Novo Design and Development of Nontoxic DYRK1A Inhibitors. Journal of medicinal chemistry. 2025;68(10):10346-64. F G, CS N, G G, AT M, JA H, G S. Designing Anticancer Peptides by Constructive Machine Learning. ChemMedChem. 2018;13(13):1300-2. F G, CS N, M H, G G, JA H, M K, et al. De novo design of anticancer peptides by ensemble artificial neural networks. Journal of molecular modeling. 2019;25(5):112. F R, X D, M Z, M K, X C, W Z, et al. AlphaFold accelerates artificial intelligence powered drug discovery: efficient discovery of a novel CDK20 small molecule inhibitor. Chemical science. 2023;14(6):1443-52. G C, D K, J O. AI-Based Drug Discovery of TKIs Targeting L858R/T790M/C797S-Mutant EGFR in Non-small Cell Lung Cancer. Frontiers in pharmacology. 2021;12:660313. G KG. 
Accelerating drug discovery targeting dihydroorotate dehydrogenase using machine learning and generative AI approaches. Computational biology and chemistry. 2025;118:108443. HW vdM, J dMvO, JGC vH, PH vdG, GJP vW. Integrating Pharmacokinetics and Quantitative Systems Pharmacology Approaches in Generative Drug Design. Journal of chemical information and modeling. 2025;65(10):4783-96. J B, M M, A O, J C, G M, M RM. PaccMann(RL): De novo generation of hit-like anticancer molecules from transcriptomic data via reinforcement learning. iScience. 2021;24(4):102269. J C, D G, LV S, IX S, V SK, B M, et al. Integrated AI and machine learning pipeline identifies novel WEE1 kinase inhibitors for targeted cancer therapy. Molecular diversity. 2025. KY S, S MV, WR D, J K. COTI-2, a novel small molecule that is active against multiple human cancer cell lines in vitro and in vivo. Oncotarget. 2016;7(27):41363-79. L H, J L, HL Z, LC Z, RL Y, CM K. De novo design of dual-target JAK2, SMO inhibitors based on deep reinforcement learning, molecular docking and molecular dynamics simulations. Biochemical and biophysical research communications. 2023;638:23-7. L L, X Z, X H. Generating Potential RET-Specific Inhibitors Using a Novel LSTM Encoder-Decoder Model. International journal of molecular sciences. 2024;25(4). L W, M X, Z L, C J, X L, Y H, et al. Hit Identification Driven by Combining Artificial Intelligence and Computational Chemistry Methods: A PI5P4K-β Case Study. Journal of chemical information and modeling. 2023;63(16):5341-55. L Y, G Y, Z B, Y T, L H, Y N, et al. Accelerating the discovery of anticancer peptides targeting lung and breast cancers with the Wasserstein autoencoder model and PSO algorithm. Briefings in bioinformatics. 2022;23(5). M M, M A, K G, A A-J, S D, M O, et al. "Several birds with one stone": exploring the potential of AI methods for multi-target drug design. Molecular diversity. 2024. M N, NU A, T A, K J, MA S, M A, et al. 
Artificial Intelligence Assisted Pharmacophore Design for Philadelphia Chromosome-Positive Leukemia with Gamma-Tocotrienol: A Toxicity Comparison Approach with Asciminib. Biomedicines. 2023;11(4). M W, S L, J W, O Z, H D, D J, et al. ClickGen: Directed exploration of synthesizable chemical space via modular reactions and reinforcement learning. Nature communications. 2024;15(1):10127. N A, J H, Y L, B O-B. Integrating transformers and many-objective optimization for drug design. BMC bioinformatics. 2024;25(1):208. OA D, AA Y, MO I, NO O, AS O, BC N. Investigation of the MDM2-binding potential of de novo designed peptides using enhanced sampling simulations. International journal of biological macromolecules. 2024;269:131840. OJ G, A N, T K, NZ R, B K. In silico evolution of autoinhibitory domains for a PD-L1 antagonist using deep learning models. Proceedings of the National Academy of Sciences of the United States of America. 2023;120(49):e2307371120. R C, HF J, GL R, BA S. Streamlining Computational Fragment-Based Drug Discovery through Evolutionary Optimization Informed by Ligand-Based Virtual Prescreening. Journal of chemical information and modeling. 2024;64(9):3826-40. R Q, H Z, W H, Z S, J L. Deep learning-based design and screening of benzimidazole-pyrazine derivatives as adenosine A(2B) receptor antagonists. Journal of biomolecular structure & dynamics. 2025;43(7):3225-41. S C, J X, R Y, DD X, Y Y. Structure-aware dual-target drug design through collaborative learning of pharmacophore combination and molecular simulation. Chemical science. 2024;15(27):10366-80. SH J, D S, SK M, J C, S L, M J, et al. PCW-A1001, AI-assisted de novo design approach to design a selective inhibitor for FLT-3(D835Y) in acute myeloid leukemia. Frontiers in molecular biosciences. 2022;9:1072028. T P, M A, RI O, RA G, JAR S, JP A. Deep generative model for therapeutic targets using transcriptomic disease-associated data-USP7 case study. Briefings in bioinformatics. 2022;23(4). 
T Q, Y W, M K, H Z, T W, Z X, et al. Identification of potential PIM-2 inhibitors via ligand-based generative models, molecular docking and molecular dynamics simulations. Molecular diversity. 2024;28(4):2245-62. T V, A D, O H, MR M, CY E. AI-Predicted mTOR Inhibitor Reduces Cancer Cell Proliferation and Extends the Lifespan of C. elegans. International journal of molecular sciences. 2023;24(9). TM C, G L, P D, M C, N C, M S, et al. DeLA-Drug: A Deep Learning Algorithm for Automated Design of Druglike Analogues. Journal of chemical information and modeling. 2022;62(6):1411-24. TR Q, KA G, C T, JA B, G B, E B, et al. Accelerated Discovery of Carbamate Cbl-b Inhibitors Using Generative AI Models and Structure-Based Drug Design. Journal of medicinal chemistry. 2024;67(16):14210-33. W C, B J, Y Y, L M, T L, J C. Identification of STAT3 phosphorylation inhibitors using generative deep learning, virtual screening, molecular dynamics simulations, and biological evaluation for non-small cell lung cancer therapy. Molecular diversity. 2024. W Z, P Z, T L, F G, Q L, X C, et al. Discovery of Novel SIK2/3 Inhibitors for the Potential Treatment of MEF2C+ Acute Myeloid Leukemia (AML). Journal of medicinal chemistry. 2025;68(7):7518-38. X L, Q L, X Y, L W, J Q, X Y, et al. A Specialized and Enhanced Deep Generation Model for Active Molecular Design Targeting Kinases Guided by Affinity Prediction Models and Reinforcement Learning. Journal of chemical information and modeling. 2025;65(7):3294-308. X W, J L, L M, B W, B L, Y J. Towards novel small-molecule inhibitors blocking PD-1/PD-L1 pathway: From explainable machine learning models to molecular dynamics simulation. International journal of biological macromolecules. 2024;282:136325. X X, C X, W H, L W, H L, J Z, et al. HELM-GPT: de novo macrocyclic peptide design using generative pre-trained transformer. Bioinformatics (Oxford, England). 2024;40(6). Y W, C W, J L, D S, F M, M Z, et al. 
Discovery of 3-hydroxymethyl-azetidine derivatives as potent polymerase theta inhibitors. Bioorganic & medicinal chemistry. 2024;103:117662. Y W, M G, X C, D A. Screening of multi deep learning-based de novo molecular generation models and their application for specific target molecular generation. Scientific reports. 2025;15(1):4419. Y W, Z W, Y L, P Y, X L. Improving Covalent and Noncovalent Molecule Generation via Reinforcement Learning with Functional Fragments. Journal of chemical information and modeling. 2025. Y Y, D A, Y W, W Z, G C, J T, et al. Wee1 inhibitor optimization through deep-learning-driven decision making. European journal of medicinal chemistry. 2024;280:116912. Y Y, J H, H H, J H, G Y, T X, et al. Accelerated Discovery of Macrocyclic CDK2 Inhibitor QR-6401 by Generative Models and Structure-Based Drug Design. ACS medicinal chemistry letters. 2023;14(3):297-304. Y Y, R Z, Z L, L M, S W, H D, et al. Discovery of Highly Potent, Selective, and Orally Efficacious p300/CBP Histone Acetyltransferases Inhibitors. Journal of medicinal chemistry. 2020;63(3):1337-60. Y Z, J H, X L, W S, N Z, J Z, et al. Self-awareness of retrosynthesis via chemically inspired contrastive learning for reinforced molecule generation. Briefings in bioinformatics. 2025;26(2). Z L, J H, Y J, B L, A Z. Computational design of CDK1 inhibitors with enhanced target affinity and drug-likeness using deep-learning framework. Heliyon. 2024;10(22):e40345.
Supplementary Files
Supplementary Appendix: SupplementaryAppendixSearchStrategy.docx
Supplementary Tables: SupplementaryTables.xlsx
16:08:47","extension":"png","order_by":15,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":54923,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage1.png","url":"https://assets-eu.researchsquare.com/files/rs-8408084/v1/919afd0d75bb6a25b1159f16.png"},{"id":98861310,"identity":"78f4d30f-c824-46a3-9cf9-3b8032bf5054","added_by":"auto","created_at":"2025-12-23 09:27:55","extension":"xml","order_by":16,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":143496,"visible":true,"origin":"","legend":"","description":"","filename":"rs84080840structuring.xml","url":"https://assets-eu.researchsquare.com/files/rs-8408084/v1/028f3c4bff79bdeff8c40a67.xml"},{"id":98861307,"identity":"020203e3-b2bd-44d8-aab4-66372dfd4989","added_by":"auto","created_at":"2025-12-23 09:27:55","extension":"html","order_by":17,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":172300,"visible":true,"origin":"","legend":"","description":"","filename":"earlyproof.html","url":"https://assets-eu.researchsquare.com/files/rs-8408084/v1/941237b7b0492ec0c32cc96e.html"},{"id":99308615,"identity":"4d589250-3bef-4328-8c26-120b3e771d88","added_by":"auto","created_at":"2025-12-31 16:08:50","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":27473,"visible":true,"origin":"","legend":"\u003cp\u003eOverview of the study workflow, illustrating study design, data search and collection, data preprocessing, information extraction, and results presentation.\u003c/p\u003e","description":"","filename":"1.png","url":"https://assets-eu.researchsquare.com/files/rs-8408084/v1/ba9e25d327b612d3682167ce.png"},{"id":99309102,"identity":"f687fc85-09f5-49d9-a54e-a0495e54aeec","added_by":"auto","created_at":"2025-12-31 16:09:44","extension":"png","order_by":2,"title":"Figure 
2","display":"","copyAsset":false,"role":"figure","size":240465,"visible":true,"origin":"","legend":"\u003cp\u003ePRISMA 2020 flow diagram summarizing identification, screening, eligibility, and inclusion with counts and full-text exclusion reasons.\u003c/p\u003e","description":"","filename":"2.png","url":"https://assets-eu.researchsquare.com/files/rs-8408084/v1/dabb8570bab287d065ca0a34.png"},{"id":98861290,"identity":"127d118f-4649-418b-9857-3f42c7cbf124","added_by":"auto","created_at":"2025-12-23 09:27:54","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":32708,"visible":true,"origin":"","legend":"\u003cp\u003eDistribution of studies by target type.\u003c/p\u003e","description":"","filename":"3.png","url":"https://assets-eu.researchsquare.com/files/rs-8408084/v1/f7bbf3055e2d577e1f749f98.png"},{"id":98861299,"identity":"b22da861-67c9-497d-b6e0-121c40614f63","added_by":"auto","created_at":"2025-12-23 09:27:55","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":13579,"visible":true,"origin":"","legend":"\u003cp\u003eAnnual number of published studies using generative AI for cancer drug discovery from 2016 to 2025.\u003c/p\u003e","description":"","filename":"4.png","url":"https://assets-eu.researchsquare.com/files/rs-8408084/v1/602696a97d0350e06b9497a8.png"},{"id":98861295,"identity":"e64cc836-a129-4319-a656-06c6f649e46b","added_by":"auto","created_at":"2025-12-23 09:27:54","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":16329,"visible":true,"origin":"","legend":"\u003cp\u003eTrends in public release of code from 2016 to 2025.\u003c/p\u003e","description":"","filename":"5.png","url":"https://assets-eu.researchsquare.com/files/rs-8408084/v1/9c7b0ff44b52503f6b7825b8.png"},{"id":98861305,"identity":"bf96ca50-c27c-4f1c-99c5-9efb2491b2fe","added_by":"auto","created_at":"2025-12-23 
09:27:55","extension":"png","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":19538,"visible":true,"origin":"","legend":"\u003cp\u003eData completeness across reporting fields. The figure illustrates variation in how comprehensively studies reported key endpoints.\u003c/p\u003e","description":"","filename":"6.png","url":"https://assets-eu.researchsquare.com/files/rs-8408084/v1/0716ab3562d6af4cd26e8933.png"},{"id":98861308,"identity":"59dca543-c873-4989-97df-95b700ffc5ee","added_by":"auto","created_at":"2025-12-23 09:27:55","extension":"png","order_by":7,"title":"Figure 7","display":"","copyAsset":false,"role":"figure","size":18072,"visible":true,"origin":"","legend":"\u003cp\u003eNumber of studies reporting docking score values by year. This figure depicts the temporal adoption of docking score calculations in AI-driven cancer drug discovery research.\u003c/p\u003e","description":"","filename":"7.png","url":"https://assets-eu.researchsquare.com/files/rs-8408084/v1/b08d3d3ec46062841cad9c11.png"},{"id":99308648,"identity":"678f1fd3-bbf0-44eb-84f4-c52fc0d831e8","added_by":"auto","created_at":"2025-12-31 16:08:53","extension":"png","order_by":8,"title":"Figure 8","display":"","copyAsset":false,"role":"figure","size":18519,"visible":true,"origin":"","legend":"\u003cp\u003eNumber of studies reporting binding free energy (BFE) values by year. This figure depicts the temporal adoption of BFE calculations in AI-driven cancer drug discovery research.\u003c/p\u003e","description":"","filename":"8.png","url":"https://assets-eu.researchsquare.com/files/rs-8408084/v1/6dd57632fab94d0560d3eaf6.png"},{"id":98861301,"identity":"15369585-414c-4e09-ab74-cd84fbaa0692","added_by":"auto","created_at":"2025-12-23 09:27:55","extension":"png","order_by":9,"title":"Figure 9","display":"","copyAsset":false,"role":"figure","size":15043,"visible":true,"origin":"","legend":"\u003cp\u003eAssay breakdown by entry type. 
The figure distinguishes between biochemical and cell-based assays across total and probe-like entries.\u003c/p\u003e","description":"","filename":"9.png","url":"https://assets-eu.researchsquare.com/files/rs-8408084/v1/b5ad93d50b746a1f847dcb5c.png"},{"id":98861306,"identity":"2d154065-8da9-4381-94a5-07530ed8a912","added_by":"auto","created_at":"2025-12-23 09:27:55","extension":"png","order_by":10,"title":"Figure 10","display":"","copyAsset":false,"role":"figure","size":21941,"visible":true,"origin":"","legend":"\u003cp\u003eDistribution of in-vivo species. The pie chart summarizes the animal models used across studies conducting in-vivo validation.\u003c/p\u003e","description":"","filename":"10.png","url":"https://assets-eu.researchsquare.com/files/rs-8408084/v1/3d21acf03cfc2be269cd110a.png"},{"id":98861309,"identity":"07ef3463-202f-482a-aaf3-21741f7a4723","added_by":"auto","created_at":"2025-12-23 09:27:55","extension":"png","order_by":11,"title":"Figure 11","display":"","copyAsset":false,"role":"figure","size":16859,"visible":true,"origin":"","legend":"\u003cp\u003eStudies using ADME(T) by year. This figure illustrates the temporal trend in reporting absorption, distribution, metabolism, excretion, and toxicity (ADME[T]) analyses within AI-generated cancer drug discovery studies.\u003c/p\u003e","description":"","filename":"11.png","url":"https://assets-eu.researchsquare.com/files/rs-8408084/v1/51bc8fb1691777153e326c97.png"},{"id":99308562,"identity":"4ebd975b-7dbe-4e5f-b4ff-63effc7ee04f","added_by":"auto","created_at":"2025-12-31 16:08:47","extension":"png","order_by":12,"title":"Figure 12","display":"","copyAsset":false,"role":"figure","size":19590,"visible":true,"origin":"","legend":"\u003cp\u003eReference compound performance outcomes. 
The figure categorizes studies based on whether AI-generated compounds were reported to outperform established reference drugs, providing an overview of how comparative claims were represented across the reviewed literature.\u003c/p\u003e","description":"","filename":"12.png","url":"https://assets-eu.researchsquare.com/files/rs-8408084/v1/0d019a64fafbfd9c104f17ad.png"},{"id":99788040,"identity":"f9a76e00-6ca4-441b-8522-6b6a921a3ce9","added_by":"auto","created_at":"2026-01-08 12:43:42","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1607706,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-8408084/v1/d2c59e17-a35c-4e1d-92a5-101658c5c5a4.pdf"},{"id":99308754,"identity":"35bddb76-51f1-4747-af92-fdf3e91ef01b","added_by":"auto","created_at":"2025-12-31 16:09:06","extension":"docx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":15357,"visible":true,"origin":"","legend":"\u003cp\u003eSupplementary Appendix\u003c/p\u003e","description":"","filename":"SupplementaryAppendixSearchStrategy.docx","url":"https://assets-eu.researchsquare.com/files/rs-8408084/v1/af5ad378f3f86e6cc42fd771.docx"},{"id":99308879,"identity":"c9c18987-fce9-4fc5-bc2e-9899a61abc8d","added_by":"auto","created_at":"2025-12-31 16:09:25","extension":"xlsx","order_by":2,"title":"","display":"","copyAsset":false,"role":"supplement","size":70248,"visible":true,"origin":"","legend":"\u003cp\u003eSupplementary Tables\u003c/p\u003e","description":"","filename":"SupplementaryTables.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-8408084/v1/b02b1a68d6d2c05962e4e6c1.xlsx"}],"financialInterests":"The authors declare no competing interests.","formattedTitle":"\u003cp\u003eArtificial Intelligence and Machine Learning for De Novo Cancer Drug Discovery: A Systematic Review of Generative Design and Validation 
Gaps\u003c/p\u003e","fulltext":[{"header":"1. Introduction","content":"\u003cp\u003eCancer is among the leading causes of illness and death across the world. In 2020, it accounted for an estimated 19.3\u0026nbsp;million new cases and 10\u0026nbsp;million deaths [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e], and in 2022 the global burden was reported to have risen to nearly 20\u0026nbsp;million new cases [\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e]. The economic costs are no less striking: between 2020 and 2050, cancer care and the loss of productivity are projected to amount to around 25.2 trillion US dollars [\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e]. Even with significant ongoing investments, the path of oncology drug discovery remains slow and highly inefficient [\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e]. Drug development often carries costs of several billion dollars, and only about 3.4% to 5.1% of oncology drugs that reach clinical testing ever gain regulatory approval [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e, \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]. These statistics show the need to prioritize promising drug candidates much earlier in the pipeline.\u003c/p\u003e \u003cp\u003eThe traditional drug discovery process is a multi-stage endeavor that begins with target identification and validation, where researchers determine whether a biological molecule is causally linked to cancer progression and assess its suitability as a druggable target. Once a target is established, hit discovery is conducted, often through high-throughput experimental screening or virtual screening of chemical libraries, to identify initial \u0026ldquo;hit\u0026rdquo; compounds. 
This is followed by hit-to-lead optimization, where chemical structures are modified to improve potency, selectivity, and physicochemical properties. In the lead optimization stage, medicinal chemistry and computational modelling are used to enhance pharmacokinetic and pharmacodynamic characteristics, focusing on absorption, distribution, metabolism, excretion, and toxicity (ADME-Tox). Promising leads progress into preclinical development, which includes in vitro assays and in vivo animal studies to evaluate efficacy and safety. Only a very small fraction of candidates advances into clinical trials, which are divided into Phase I (safety and dosage), Phase II (efficacy and side effects), and Phase III (comparative effectiveness) before any regulatory approval is possible [\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e]. Each stage is characterized by high attrition rates, lengthy timelines, and escalating costs, which collectively explain why oncology remains one of the most resource-intensive areas in drug development [\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e, \u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eGenerative artificial intelligence (AI) and machine learning (ML) have recently gained attention as potential ways to address this gap. Unlike traditional computational methods, which mainly screen compounds from existing chemical libraries, generative models are designed to learn patterns from chemical and biological data and propose entirely new molecules [\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e]. 
Several families of models are now being applied in drug discovery, including recurrent neural networks, generative adversarial networks, and transformer-based architectures [\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e, \u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e]. These systems can suggest compounds that balance potency, selectivity, and drug-like properties, which, in theory, reduces both the time and cost of searching chemical space [\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e]. Proof-of-concept studies have even shown that generative AI can reproduce known drugs or design new scaffolds within days or weeks [\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e]. Such demonstrations have raised expectations that AI-driven design could play a major role in accelerating the discovery of anticancer agents.\u003c/p\u003e \u003cp\u003eIn this article, we present a decade-long (January 2015 to June 20, 2025) systematic review that maps and quantitatively synthesizes the application of generative AI and ML to de novo cancer drug discovery. Unlike previous reviews, which were either method-oriented or disease-agnostic, this work specifically targets oncology. Our aim is to establish an evidence base that highlights both opportunities and limitations of these models in producing compounds with potential clinical relevance.\u003c/p\u003e \u003cp\u003eOur initial findings show that out of 1,130 records screened, 57 studies met inclusion criteria. Kinases were by far the most common drug targets (49%), while enzymes, GPCRs, and immune proteins were less frequently studied. Publication activity rose sharply after 2021, mirroring the broader surge in generative AI research. Fewer than half of the studies reported docking or in vitro potency data, and only 14% described in vivo validation. 
Binding free energy values appeared in just over one-quarter of studies, while ADME-Tox assessments were reported in about one-third. Roughly half of the included studies claimed that AI-generated compounds outperformed reference drugs, but most of these claims were based only on computational evidence. Reproducibility was inconsistent, with public code available in just over half of the papers. Taken together, these findings suggest that while generative AI can design molecules with promising anticancer potential, experimental validation and reproducibility remain major gaps.\u003c/p\u003e \u003cp\u003eOur approach is a PRISMA-guided, cancer-specific, quantitative synthesis of de novo generative AI and ML studies. Using a defined corpus, we (i) map targets, model families, and time trends; (ii) quantify how often studies report numeric docking and binding-free-energy values and summarize their distributions overall and by target type and model class; (iii) track progression to biochemical, cell, and in vivo end points; (iv) examine claims that AI-generated compounds outperform reference drugs alongside the numerical evidence provided; and (v) assess openness and reproducibility indicators such as code availability and dataset disclosure. Where prior reviews explain how models work, our goal is to put oncology-specific numbers on what is being reported and to highlight practical steps that would make studies more comparable and more informative for translation.\u003c/p\u003e \u003cp\u003eThe remainder of this paper is organized as follows: Section 2 presents the state of the art, Section 3 outlines methods and eligibility criteria, Section 4 presents the results, and Section 5 discusses implications, limitations, and future directions.\u003c/p\u003e"},{"header":"2. 
State of the Art","content":"\u003cp\u003eA systematic attempt to summarise the use of generative AI in drug discovery was made in Martinelli\u0026rsquo;s 2022 review [\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e]. This publication classified six major model families: recurrent neural networks, variational autoencoders, generative adversarial networks, adversarial autoencoders, evolutionary algorithms, and reinforcement learning. It also described recurring challenges, such as limited molecular diversity, uncertain synthetic accessibility, and the lack of prospective experimental validation. Importantly, Martinelli\u0026rsquo;s review was disease-agnostic and appeared before the surge of oncology-focused applications that began to emerge after 2023.\u003c/p\u003e \u003cp\u003eIn the past three years, generative AI has been applied with increasing frequency to cancer-relevant targets. Many studies now focus on kinases and oncogenic proteins such as EGFR and KRAS, and some extend beyond docking scores to report in vitro potency. The variety of architectures has also expanded. Diffusion models have demonstrated the capacity to generate drug-like molecules, and in some cases three-dimensional structures conditioned on binding pockets [\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e]. Transformer-based chemical language models, trained on large compound libraries, use natural language processing methods to explore chemical space [\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e]. Deep generative models have been applied to learn chemical patterns from large datasets and generate novel molecular structures with desired properties, reshaping how candidate compounds are explored in anticancer drug discovery [\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eRecent systematic reviews extend this picture. 
One review described how AI has been applied across the discovery pipeline, including target identification, molecular design, synthesis planning, and ADMET prediction [\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e]. Another highlighted recurring challenges in the field, including data quality and bias as well as model interpretability, while stressing the need to integrate AI methods with traditional experimental workflows to enhance reliability and translational relevance [\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e]. An oncology-focused review showed how AI supports cancer drug discovery and development, including hit identification, compound generation, and biomarker-informed design [\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e]. A second oncology-focused review described how artificial intelligence supports both therapeutic development and clinical cancer management, illustrating applications from predictive diagnostics to personalized treatment planning and improved patient care [\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eMore specialized publications provide further details. A decade-long review catalogued advances in AI-based anti-cancer drug design, showing how data-driven and AI-driven strategies are being deployed across diverse computational targets and screening tasks [\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e]. A scoping review focused on glioblastoma described how AI is used for both drug discovery and biomarker discovery in this high-unmet-need indication [\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e]. 
A comprehensive oncology review demonstrated how artificial intelligence is transforming clinical cancer care, from improving diagnosis and biomarker discovery to enabling individualized treatment planning and more effective patient management [\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eUnlike earlier reviews, which were method-oriented or disease-agnostic, the present work is cancer-specific. Previous surveys described model families and outlined opportunities but did not quantify translational outcomes. They rarely reported how often docking scores, binding free energies, in vitro potency, in vivo validation, or ADME(T) assessments were included. They also gave little attention to reproducibility indicators such as code and data availability. The present review provides the first systematic synthesis focused on oncology. By mapping computational and experimental endpoints and assessing comparator claims, it shows that AI-generated molecules can reach levels of activity relevant to cancer treatment and in some cases demonstrate potential to outperform reference drugs.\u003c/p\u003e"},{"header":"3. Methods","content":"\u003cp\u003e \u003c/p\u003e \u003cdiv id=\"Sec4\" class=\"Section2\"\u003e \u003ch2\u003e3.1 Study Design\u003c/h2\u003e \u003cp\u003eA PRISMA-guided systematic review of PubMed was carried out from 2015 to June 20, 2025. Eligible studies applied generative AI or ML architectures to design new molecules with cancer relevance. Extracted data included study targets, model families, docking scores, binding free energies, in vitro potency (IC₅₀/EC₅₀), in vivo validation, ADME(T) assessments, code availability, and comparator performance. 
Analyses were descriptive and aimed at mapping the coverage and distribution of reported outcomes.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec5\" class=\"Section2\"\u003e \u003ch2\u003e3.2 Data Search and Collection\u003c/h2\u003e \u003cdiv id=\"Sec6\" class=\"Section3\"\u003e \u003ch2\u003e3.2.1 Literature search\u003c/h2\u003e \u003cp\u003eWe searched PubMed on June 20, 2025, to find studies that applied AI or ML to de novo anticancer drug design. The search combined three concepts: AI/ML, oncology, and drug discovery or design. Both MeSH terms and free text were used. To keep the focus on original work, we left out reviews, systematic reviews, meta-analyses, editorials, commentaries, and preprints. We limited the time frame to the past ten years. The full Boolean string is given in Supplementary Appendix 1. No other databases or grey literature were checked.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec7\" class=\"Section3\"\u003e \u003ch2\u003e3.2.2 Eligibility criteria\u003c/h2\u003e \u003cp\u003eStudies were included if they:\u003c/p\u003e \u003cp\u003e \u003cul\u003e \u003cli\u003e \u003cp\u003ewere published in English.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003ehad clear relevance to cancer (any type), with the cancer type noted where specified.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eapplied artificial intelligence (AI) or machine learning (ML), specifically generative models such as a variational autoencoder, generative adversarial network, or graph-based neural network to design molecules de novo.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003ehad the goal of discovering or designing novel anti-cancer drugs (not repurposing, screening, synergy, or response prediction).\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eused AI/ML to generate new molecular entities, i.e. 
performed de novo drug design targeting cancer.\u003c/p\u003e \u003c/li\u003e \u003c/ul\u003e \u003c/p\u003e \u003cp\u003eWe excluded non-original formats and studies that applied only predictive or QSAR models.\u003c/p\u003e \u003cp\u003e \u003cstrong\u003eExclusion Criteria\u003c/strong\u003e \u003cp\u003ePublications were excluded if they met any of the following conditions:\u003c/p\u003e \u003c/p\u003e \u003cp\u003e \u003cul\u003e \u003cli\u003e \u003cp\u003ePublished only as conference abstracts without a full-text article.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003ePublished as opinion pieces, editorials, or letters.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eDid not report empirical data.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eFocused outside oncology.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eDid not involve an application of artificial intelligence.\u003c/p\u003e \u003c/li\u003e \u003c/ul\u003e \u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv id=\"Sec8\" class=\"Section2\"\u003e \u003ch2\u003e3.3 Data Pre-processing\u003c/h2\u003e \u003cdiv id=\"Sec9\" class=\"Section3\"\u003e \u003ch2\u003e3.3.1 Screening and selection\u003c/h2\u003e \u003cp\u003eAll records were moved into a reference manager and screened in two steps, following PRISMA 2020. First, titles and abstracts were screened by two reviewers. Disagreements were settled by consensus. Full texts of potentially eligible papers were then reviewed in the same way. Reasons for exclusion at this stage were recorded. A PRISMA flow chart was created to show the process.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec10\" class=\"Section3\"\u003e \u003ch2\u003e3.3.2 Handling of duplicates and reviewer disagreements\u003c/h2\u003e \u003cp\u003eBecause a single database was searched, no duplicate records were identified. 
Disagreements were resolved by discussion.\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv id=\"Sec11\" class=\"Section2\"\u003e \u003ch2\u003e3.4 Data / Information Extraction\u003c/h2\u003e \u003cdiv id=\"Sec12\" class=\"Section3\"\u003e \u003ch2\u003e3.4.1 Data extraction\u003c/h2\u003e \u003cp\u003eWe collected general details (title, first author, year, journal). Code availability was marked \u0026ldquo;yes\u0026rdquo; if a repository or link was provided and \u0026ldquo;no\u0026rdquo; if not. Targets were taken as reported, then grouped into categories (kinases, enzymes, GPCRs, immune proteins, transcription factors, epigenetic regulators, or other).\u003c/p\u003e \u003cp\u003eAll generative models were noted (for example, MORLD, REINVENT, MTMol-GPT). If more than one was used, all were listed. The model considered best was taken from author statements or from the most favorable results.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec13\" class=\"Section3\"\u003e \u003ch2\u003e3.4.2 Computational endpoints\u003c/h2\u003e \u003cp\u003eDocking scores and binding free energy values were recorded for AI- or ML-generated compounds only. Human-optimized analogues were excluded. If a lead compound was given, its value was used. If several were given, the most favorable was taken. If no lead was specified, the best value across compounds was used. Where several tools were applied to the same compound, values were averaged. If tools disagreed on the top compound, values were recorded as the most favorable overall result.\u003c/p\u003e \u003cp\u003eThis meant that each study contributed one representative value. Because docking and free energy methods vary widely, absolute values are not comparable across studies. 
Analyses were descriptive and meant to show the spread of values rather than provide effect sizes.\u003c/p\u003e \u003cp\u003eADME(T) reporting was coded as present if a study included any in silico assessment of absorption, distribution, metabolism, or excretion, with toxicity optional. Methods were not harmonized. The aim was only to capture how often pharmacokinetic properties were considered.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec14\" class=\"Section3\"\u003e \u003ch2\u003e3.4.3 Experimental endpoints\u003c/h2\u003e \u003cp\u003eIn vitro potency (IC₅₀/EC₅₀) values were included only for AI- or ML-generated compounds. If a \u0026ldquo;best\u0026rdquo; compound was named, that value was taken. If not, the lowest reported value was used. Multiple assays for the same compound were kept as separate entries. All values were converted to nM. Predicted values and human modified analogues were excluded. Values reported with inequality signs (for example\u0026thinsp;\u0026lt;\u0026thinsp;5 nM) were counted toward coverage but excluded from statistics. Because assays varied, results are shown descriptively to illustrate ranges.\u003c/p\u003e \u003cp\u003eIn vivo work was noted at the study level if any animal experiment was reported. Species were grouped as mouse, rat, dog, C. elegans, or others. Each study was counted once regardless of how many experiments it contained. The aim was to show which animal models were used and whether efficacy was reported.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec15\" class=\"Section3\"\u003e \u003ch2\u003e3.4.4 Comparator analysis\u003c/h2\u003e \u003cp\u003eWe noted whether AI-generated compounds were compared to a reference drug or compound. Studies were classified as showing superiority in at least 1 end point, showing no superiority, or giving no comparator. When superiority was claimed, we noted the type of evidence (docking, free energy, IC₅₀/EC₅₀, or therapeutic index). 
An \u0026ldquo;outperformed reference\u0026rdquo; field was created to record overall claims.\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv id=\"Sec16\" class=\"Section2\"\u003e \u003ch2\u003e3.5 Data Analysis\u003c/h2\u003e \u003cdiv id=\"Sec17\" class=\"Section3\"\u003e \u003ch2\u003e3.5.1 Data synthesis and analysis\u003c/h2\u003e \u003cp\u003e \u003cul\u003e \u003cli\u003e \u003cp\u003eDocking scores and free energy values were summarized using minimum, quartiles, median, and maximum. These show distributions, not cross-study effect sizes.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eIn vitro potency values were summarized separately for biochemical and cell-based assays. Numeric values went into descriptive statistics. Censored values counted toward coverage only.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eProbe likeness was assessed using common thresholds: \u0026lt;100 nM for biochemical assays and \u0026lt;\u0026thinsp;1 \u0026micro;M for cell-based assays, consistent with Structural Genomics Consortium and EUbOPEN guidance.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eIn vivo data were summarized as the proportion of studies reporting animal experiments, grouped by species.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eADME(T) reporting was expressed as the share of studies including pharmacokinetic evaluation.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eTemporal patterns were described by calculating, for each year, the proportion of studies reporting docking, free energy, in vitro, in vivo, or ADME(T) data.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eFor multi-model studies, all models were listed, and the best performer was taken from author statements or results.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eComparator results were summarized as the frequency of superiority claims and the type of evidence used.\u003c/p\u003e \u003c/li\u003e 
\u003c/ul\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec18\" class=\"Section3\"\u003e \u003ch2\u003e3.5.2 Reproducibility and transparency\u003c/h2\u003e \u003cp\u003eThe study followed PRISMA 2020. Title and abstract screening, full-text review, and data extraction were done by two reviewers independently. Conflicts were resolved by discussion. Extraction rules were written in advance and applied consistently. The complete dataset is given in Supplementary Tables. The full search strategy is also provided to support transparency and reproducibility.\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e"},{"header":"4. Results","content":"\u003cdiv id=\"Sec20\" class=\"Section2\"\u003e \u003ch2\u003e4.1 Study Selection\u003c/h2\u003e \u003cp\u003eA single-source search of PubMed conducted on June 20, 2025 identified 1,130 records. No additional records were obtained from other sources or databases, and no duplicates were removed (0), yielding 1,130 records for title/abstract screening. Of these, 1,046 were excluded due to failure to meet the inclusion criteria. We sought 84 full‐text reports for retrieval; 11 could not be obtained. We assessed 73 full‐text reports for eligibility and excluded 16 for the following primary reasons: wrong publication type (n\u0026thinsp;=\u0026thinsp;1), not AI/ML (n\u0026thinsp;=\u0026thinsp;4), not drug discovery (n\u0026thinsp;=\u0026thinsp;1), and not de novo design (n\u0026thinsp;=\u0026thinsp;10). The remaining 57 studies were included in the review.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec21\" class=\"Section2\"\u003e \u003ch2\u003e4.2 Study Characteristics\u003c/h2\u003e \u003cp\u003eThe 57 included studies are described in terms of target type, year of publication, code availability, and data completeness. 
Each of these features is reported in the following subsections.\u003c/p\u003e \u003cdiv id=\"Sec22\" class=\"Section3\"\u003e \u003ch2\u003e4.2.1 Target types\u003c/h2\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe categories were grouped to reflect standard divisions in cancer drug discovery. Kinases were separated as the most widely targeted signalling proteins, while enzymes were grouped to cover metabolic and catalytic regulators distinct from kinases. GPCRs and immune proteins were kept separate for their roles in receptor signalling and tumour immune interactions. Transcription factors and epigenetic regulators were recorded separately as classes that are harder to drug but of growing interest. Less common types such as GTPases, kinesins, and structural proteins were retained as individual groups to capture exploratory efforts.\u003c/p\u003e \u003cp\u003eThe most common category was kinases (28 studies; 49.1%), followed by enzymes (11; 19.3%), cell/omics context (8; 14.0%), and GPCRs (8; 14.0%). Immune proteins appeared in 4 studies (7.0%), while transcription factors were present in 3 (5.3%). Less common were epigenetic regulators (2; 3.5%), GTPases (1; 1.8%), kinesins (1; 1.8%), and structural proteins (1; 1.8%).\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec23\" class=\"Section3\"\u003e \u003ch2\u003e4.2.2 Publication years\u003c/h2\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eIncluded studies were published between 2016 and 2025. Output was sparse before 2021, with one study each in 2016, 2018, 2019, and 2020. Annual publications increased to 5 in each of 2021 and 2022, rose to 9 in 2023, and peaked at 22 in 2024. 
As of the search date (June 20, 2025), output remained high, with 12 studies already published in 2025.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec24\" class=\"Section3\"\u003e \u003ch2\u003e4.2.3 Code availability\u003c/h2\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003ePublic code was available for 31 studies (54.4%), while 26 studies (45.6%) had no publicly available code. 
Availability was 0% from 2016 to 2020. It increased to 60% in 2021 but dropped to 40% in 2022. It then rose slightly to 44.4% in 2023, peaked at 68.2% in 2024, and was 58.3% in 2025.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec25\" class=\"Section3\"\u003e \u003ch2\u003e4.2.4 Data completeness\u003c/h2\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eNumeric docking scores were reported in 21 studies (36.8%), and BFE values in 15 studies (26.3%). In-vitro values were available for 21 studies (36.8%). In-vivo experiments were reported in 8 studies (14.0%), while 7 studies (12.3%) specified an animal model. The \u0026ldquo;Outperformed reference\u0026rdquo; field contained a substantive entry in 29 studies (50.9%).\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv id=\"Sec26\" class=\"Section2\"\u003e \u003ch2\u003e4.3 Docking Score Coverage and Distribution\u003c/h2\u003e \u003cdiv id=\"Sec27\" class=\"Section3\"\u003e \u003ch2\u003e4.3.1 Overall distribution of Docking Scores\u003c/h2\u003e \u003cp\u003eNumeric docking scores were reported in 21 of the 57 included studies (36.8%), yielding 28 individual values. Reported scores ranged from \u0026minus;\u0026thinsp;54.38 kcal\u0026middot;mol⁻\u0026sup1; to \u0026minus;\u0026thinsp;7.49 kcal\u0026middot;mol⁻\u0026sup1;, with a median of \u0026minus;\u0026thinsp;11.91 kcal\u0026middot;mol⁻\u0026sup1; [IQR \u0026minus;\u0026thinsp;14.17 to \u0026minus;\u0026thinsp;10.21]. In several cases, a single study contributed more than one docking score, reflecting the evaluation of multiple targets.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec28\" class=\"Section3\"\u003e \u003ch2\u003e4.3.2 Temporal Trend\u003c/h2\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe proportion of included studies reporting a numeric docking score is shown in the supplementary tables. No docking scores were reported prior to 2020. In 2020, the only included study contained a docking score (100%). 
Subsequent years showed variable reporting rates, with 20.0% of studies in 2021 (1 of 5), 40.0% in 2022 (2 of 5), and 22.2% in 2023 (2 of 9). Reporting increased to 36.4% in 2024 (8 of 22) and 58.3% in 2025 (7 of 12). While recent years suggest greater uptake in reporting docking scores, coverage remains inconsistent across publication years.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec29\" class=\"Section3\"\u003e \u003ch2\u003e4.3.3 Docking scores by target type\u003c/h2\u003e \u003cp\u003eMedian docking scores by target type are summarised in supplementary tables. Kinases were the most frequently assessed, with 11 studies reporting docking scores (median \u0026minus;\u0026thinsp;13.72 kcal\u0026middot;mol⁻\u0026sup1;, IQR \u0026minus;\u0026thinsp;23.16 to \u0026minus;\u0026thinsp;11.12). GPCRs were the second most common (n\u0026thinsp;=\u0026thinsp;4, median \u0026minus;\u0026thinsp;10.20 kcal\u0026middot;mol⁻\u0026sup1;, IQR \u0026minus;\u0026thinsp;10.70 to \u0026minus;\u0026thinsp;9.83), followed by enzymes (n\u0026thinsp;=\u0026thinsp;3, median \u0026minus;\u0026thinsp;11.10 kcal\u0026middot;mol⁻\u0026sup1;, IQR \u0026minus;\u0026thinsp;11.10 to \u0026minus;\u0026thinsp;10.30). Other target types including immune proteins, epigenetic regulators, ion channels, transcription factors, and transporters were represented by only one or two studies each, limiting the interpretability of their docking score distributions. Across all categories, more negative docking scores indicate more favourable predicted binding affinity.\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv id=\"Sec30\" class=\"Section2\"\u003e \u003ch2\u003e4.4 Binding Free Energy Coverage and Distribution\u003c/h2\u003e \u003cdiv id=\"Sec31\" class=\"Section3\"\u003e \u003ch2\u003e4.4.1 Overall Distribution of Binding Free Energies\u003c/h2\u003e \u003cp\u003eNumeric binding free energy (BFE) values were reported in 15 of the included studies, yielding 17 individual values. 
Reported values ranged from \u0026minus;\u0026thinsp;255.50 kcal\u0026middot;mol⁻\u0026sup1; to \u0026minus;\u0026thinsp;8.44 kcal\u0026middot;mol⁻\u0026sup1;, with a median of \u0026minus;\u0026thinsp;40.02 kcal\u0026middot;mol⁻\u0026sup1; [IQR \u0026minus;\u0026thinsp;69.41 to \u0026minus;\u0026thinsp;14.06]. In two cases, a single study contributed more than one BFE value, reflecting the evaluation of multiple targets.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec32\" class=\"Section3\"\u003e \u003ch2\u003e4.4.2 Temporal Trend\u003c/h2\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe proportion of included studies reporting a numeric binding free energy (BFE) value is shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e8\u003c/span\u003e. No BFEs were reported prior to 2022. In 2022, 20.0% of studies (1 of 5) reported a BFE value. Reporting remained similar in 2023 (22.2%, 2 of 9), then increased to 36.4% in 2024 (8 of 22). In 2025, coverage was slightly lower at 33.3% (4 of 12). While recent years indicate greater uptake in reporting BFE values compared with earlier periods, coverage remains inconsistent across publication years.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec33\" class=\"Section3\"\u003e \u003ch2\u003e4.4.3 BFE values by target type\u003c/h2\u003e \u003cp\u003eMedian BFE values by target type are summarised in the supplementary table. Kinases were the most frequently assessed, with six studies reporting BFE values (median \u0026minus;\u0026thinsp;12.99 kcal\u0026middot;mol⁻\u0026sup1;, IQR \u0026minus;\u0026thinsp;33.71 to \u0026minus;\u0026thinsp;9.13). Enzymes were the next most common (n\u0026thinsp;=\u0026thinsp;4, median \u0026minus;\u0026thinsp;30.72 kcal\u0026middot;mol⁻\u0026sup1;, IQR \u0026minus;\u0026thinsp;66.55 to \u0026minus;\u0026thinsp;24.76). 
Other target types, including immune proteins and GPCRs, were represented by only two studies each, limiting the interpretability of their BFE distributions. Other target classes in the dataset had no BFE values reported. Across all categories, more negative BFE values indicate more favourable predicted binding affinity.\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv id=\"Sec34\" class=\"Section2\"\u003e \u003ch2\u003e4.5 In-vitro potency assays\u003c/h2\u003e \u003cdiv id=\"Sec35\" class=\"Section3\"\u003e \u003ch2\u003e4.5.1 In-vitro potency coverage and distribution\u003c/h2\u003e \u003cp\u003eExperimental in-vitro potency values were reported in 21 unique studies. Within these, 19 studies contributed 22 biochemical assay results, and 6 studies contributed 6 cell-based assay results.\u003c/p\u003e \u003cp\u003eAmong the 19 numeric biochemical values, potencies ranged from 0.09 nM to 764.0 nM, with a median of 13.5 nM [IQR 1.45\u0026ndash;37.8 nM]. These values indicate that most AI-generated compounds with reported biochemical data demonstrated low-nanomolar activity, although a small number extended into the high-nanomolar range. Cell-based assay data were limited (n\u0026thinsp;=\u0026thinsp;6) and are presented descriptively without distributional statistics.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec36\" class=\"Section3\"\u003e \u003ch2\u003e4.5.2 Temporal trends in in-vitro reporting\u003c/h2\u003e \u003cp\u003eAcross the extraction period, 18 unique studies reported experimental in-vitro potency values. Reporting was sporadic in the early years, with one study each in 2016, 2018, 2019, and 2020. A modest rise was observed in 2021 (five studies, of which only one included in-vitro data). More consistent reporting appeared from 2022 onward, with 2 studies in 2022, 4 studies in 2023, 5 studies in 2024, and 2 studies in 2025 including experimental in-vitro results. 
Despite this increase, in-vitro validation remains a minority of total publications each year. For example, in 2024, only 22.7% of studies reported experimental IC₅₀/EC₅₀ values, and in 2025 the share was 16.7%. These findings indicate that although in-vitro validation of AI-generated compounds has become more frequent in recent years, it is still inconsistently reported.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec37\" class=\"Section3\"\u003e \u003ch2\u003e4.5.3 Probe likeness assessment\u003c/h2\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eAcross all experimental in-vitro assays, 14 biochemical entries and 6 cell-based entries were available for probe-likeness evaluation. Using conventional thresholds (\u0026lt;\u0026thinsp;100 nM for biochemical assays and \u0026lt;\u0026thinsp;1,000 nM for cellular assays), 11 of 14 biochemical entries (78.6%) and 5 of 6 cellular entries (83.3%) qualified as probe-like. These findings suggest that most AI-generated compounds with experimental data achieve potencies within ranges considered suitable for high-quality chemical probes. However, because assay conditions varied across studies, these results should be regarded as descriptive only and not directly comparable between assay families.\u003c/p\u003e \u003cp\u003eTo ensure rigor, our thresholds are grounded in authoritative chemical biology standards:\u003c/p\u003e \u003cp\u003e \u003col\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eSGC (Structural Genomics Consortium) guidelines define a high-quality chemical probe as having biochemical in‑vitro potency\u0026thinsp;\u0026lt;\u0026thinsp;100 nM and cellular activity at \u0026le;\u0026thinsp;1 \u0026micro;M. 
[\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e]\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eEUbOPEN, a large probe consortium, similarly states: potency\u0026thinsp;\u0026le;\u0026thinsp;100 nM (biochemical) and evidence of cellular engagement\u0026thinsp;\u0026le;\u0026thinsp;1 \u0026micro;M. [\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e]\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003c/ol\u003e \u003c/p\u003e \u003cp\u003eThese values are widely cited in the chemical probe literature and consistently used as benchmarks for qualifying probe-like behaviour.\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv id=\"Sec38\" class=\"Section2\"\u003e \u003ch2\u003e4.6 In-vivo and animal models\u003c/h2\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eEight of the 57 included studies (14.0%) reported in-vivo evaluation of AI/ML-generated anticancer compounds. The majority employed mice (6 studies, 75%), while additional models included dogs (1 study, 12.5%), rats (1 study, 12.5%), and C. elegans (1 study, 12.5%). One study reported an unmapped animal model (12.5%). 
Importantly, all reported in-vivo experiments demonstrated efficacy of at least one generated compound against the intended cancer model.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec39\" class=\"Section2\"\u003e \u003ch2\u003e4.7 In silico ADME(T) Coverage\u003c/h2\u003e \u003cp\u003e \u003c/p\u003e \u003cdiv id=\"Sec40\" class=\"Section3\"\u003e \u003ch2\u003e4.7.1 Coverage of ADME(T) reporting\u003c/h2\u003e \u003cp\u003eOut of 57 included studies, 21 (36.8%) fulfilled the ADMET criterion, indicating that just over one-third of AI/ML drug design papers integrated systematic pharmacokinetic evaluation. 
Toxicity features were occasionally included but were not required for classification under this criterion.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec41\" class=\"Section3\"\u003e \u003ch2\u003e4.7.2 Co-reporting with other data\u003c/h2\u003e \u003cp\u003eAmong the 21 ADMET-reporting studies, 9 (43%) also presented docking scores, 10 (48%) reported BFE estimates, and 7 (33%) included experimental in-vitro assays. These findings show that ADMET evaluation was frequently co-reported alongside other forms of validation, although only about one-third of studies paired ADMET with experimental assays.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec42\" class=\"Section3\"\u003e \u003ch2\u003e4.7.3 Temporal trend of ADME(T) reporting\u003c/h2\u003e \u003cp\u003eADMET reporting was absent in 2018\u0026ndash;2020 and again in 2022, but has otherwise increased since 2021. The share of ADMET-reporting studies rose from 40% in 2021 to 58.3% in 2025, indicating growing emphasis on pharmacokinetic evaluation in recent years.\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv id=\"Sec43\" class=\"Section2\"\u003e \u003ch2\u003e4.8 Multi-model evaluations\u003c/h2\u003e \u003cp\u003eEleven studies directly compared more than one modelling approach. Wang et al. (2025) tested MOFF, FREED, REINVENT, LIMO, and MORLD, with MOFF giving the best docking results. Qin et al. (2025) compared ChEMBL, DRD2, and A2B-60, where A2B-60 was superior. Another study by Wang et al. (2025) evaluated Mamba and several GPT variants, with Mamba and T5MolGen outperforming the others. Ai et al. (2024) introduced MTMol-GPT and showed that MTMol-GPT and SF-MTMol-GPT were stronger than DLGN, RationaleRL, and CMolRNN. Aksamit et al. (2024) found ReLSO better than FragNet. Chandraghatgi et al. (2024) reported FDSL-DD outperforming AutoGrow4 and DeepFrag. Chen et al. (2024) identified AIxFuse as the best among REINVENT 2.0, RationaleRL, and MARS. Mukaidaisi et al. 
(2024) found the Adversarial Autoencoder superior to Junction Tree VAE and Fragment VAE. Xu et al. (2024) showed that Graph GA and HELM-GPT performed best among GA, LSTM, and transformer variants. Yang et al. (2024) reported that a sequence of RHSH then MARS-QSAR outperformed MARS, RationaleRL, and MolEvol. Finally, Yang L. et al. (2022) showed that the Wasserstein autoencoder outperformed Beta VAE and GAN in anticancer peptide generation.\u003c/p\u003e \u003cp\u003eAcross these studies, fragment-based methods (MOFF, FDSL-DD), task-tuned architectures (A2B-60, AIxFuse, ReLSO), and advanced generative approaches (Mamba with T5MolGen, MTMol-GPT, WAE) most often produced the best candidates, although superiority was target-specific.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec44\" class=\"Section2\"\u003e \u003ch2\u003e4.9 Comparative performance vs reference drugs/compounds\u003c/h2\u003e \u003cp\u003e \u003c/p\u003e \u003cdiv id=\"Sec45\" class=\"Section3\"\u003e \u003ch2\u003e4.9.1 Overall frequency of outperformance\u003c/h2\u003e \u003cp\u003eAcross the included studies, twenty-four reported that at least one AI/ML-generated compound outperformed a reference drug or benchmark compound. Five studies explicitly stated that the generated molecules did not exceed the reference, while twenty-eight provided no clear comparative statement. This indicates that approximately two-fifths of the literature (24 of 57; 42.1%) made explicit claims of superiority over established comparators, although reporting practices varied.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec46\" class=\"Section3\"\u003e \u003ch2\u003e4.9.2 Type of evidence provided\u003c/h2\u003e \u003cp\u003eAmong the 24 studies that reported superiority of AI/ML-designed compounds over reference drugs or compounds, most claims were supported by in silico or in vitro metrics. 
Docking score comparisons were the most frequent, with ten papers (41.7%) showing more favourable binding scores than the reference compound. 
Eight studies (33.3%) reported superiority in IC50 values, while eight (33.3%) provided more favourable binding free energy estimates. A smaller number of studies supported superiority through EC50 (n\u0026thinsp;=\u0026thinsp;1, 4.2%) or TI50 (n\u0026thinsp;=\u0026thinsp;1, 4.2%). These findings highlight that docking and potency assays were the dominant forms of evidence used to demonstrate comparative advantages.\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e"},{"header":"5. Discussion","content":"\u003cdiv id=\"Sec48\" class=\"Section2\"\u003e \u003ch2\u003e5.1 Summary of findings\u003c/h2\u003e \u003cp\u003eIn this PRISMA-guided systematic review, we analysed 57 studies that applied generative AI and machine learning to the de novo design of cancer drugs. Kinases emerged as the predominant targets, with enzymes, GPCRs, and immune-related proteins investigated less frequently. Publication activity rose sharply after 2021, a trend that parallels both the growing enthusiasm for AI in society at large and its accelerating adoption within oncology research. Yet, fewer than 40% of studies reported docking scores or in-vitro potency values, and only 14% incorporated in-vivo experiments. Binding free energy calculations were included in roughly one-quarter of the papers, while just over one-third provided ADME(T) assessments. About two-fifths of the studies (24 of 57) claimed that AI-generated compounds outperformed reference drugs, although these claims were usually based only on in-silico evidence. 
Together, these findings illustrate both the considerable potential of generative models to yield active anticancer molecules and the uneven progress toward endpoints of clear translational relevance.\u003c/p\u003e \u003cp\u003eAt this stage, the field can be seen as proof-of-concept: generative AI is able to design molecules with measurable anticancer activity, but validation remains inconsistent and fragmented. 
By mapping computational and experimental endpoints and assessing comparator claims, this review shows that AI-generated compounds can reach levels of activity relevant to cancer treatment and in some cases demonstrate potential to outperform reference drugs. What has been achieved here is the first systematic evidence map specific to oncology, which not only charts the progress to date but also makes visible the gaps that must be closed for translation into practice.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec49\" class=\"Section2\"\u003e \u003ch2\u003e5.2 Context with existing literature\u003c/h2\u003e \u003cp\u003ePrevious reviews of generative models in drug discovery have been mostly method-driven or disease-agnostic, with an emphasis on algorithmic innovation rather than on translational benchmarks. Our work extends this literature by concentrating on oncology and by quantifying outcomes that are directly relevant to translation. We report how often docking scores, binding free energy estimates, potency assays, and in-vivo studies are included, and how frequently AI-generated compounds are compared with reference drugs. This oncology focus brings to light patterns that broader reviews have overlooked: the predominance of kinase targets, the inconsistent reporting of docking and potency data, and the limited extent of experimental validation. By situating generative AI within the cancer domain, our synthesis reflects both the enthusiasm propelling the field and the critical gaps that continue to limit reproducibility and meaningful comparison.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec50\" class=\"Section2\"\u003e \u003ch2\u003e5.3 Interpretation and implications\u003c/h2\u003e \u003cp\u003eThe results point to the capacity of generative AI to yield compounds with genuine biological activity. 
In several studies, reported biochemical potencies reached the low nanomolar range, and most entries assessed for probe-likeness (11 of 14 biochemical and 5 of 6 cellular) passed the thresholds, suggesting that these designs are not merely theoretical but can serve as practical starting points. More recently, ADME(T) reporting has become more common, hinting at growing awareness of pharmacokinetic issues in early discovery. Yet in-vivo validation is still rare, which shows how seldom AI-generated molecules are examined in conditions that resemble real disease settings. Because of this, claims that new compounds outperform reference drugs need to be interpreted carefully, especially when they rely only on docking or free-energy predictions. 
In practice, generative AI is changing how leads are conceived, yet the slower movement toward experimental confirmation remains a major challenge.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec51\" class=\"Section2\"\u003e \u003ch2\u003e5.4 Strengths and limitations\u003c/h2\u003e \u003cp\u003eStrengths:\u003c/p\u003e \u003cp\u003e \u003cul\u003e \u003cli\u003e \u003cp\u003eThe main strengths of this review lie in the use of strict eligibility criteria, which restricted the corpus to genuinely de novo generative studies.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eBy covering a ten-year time frame, the review captured the evolution of the field, showing both early proof-of-concept studies and the surge of work after 2021.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eComparator analysis was captured, documenting how often AI-generated compounds were claimed to outperform reference drugs or compounds and what type of evidence supported those claims.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eWe also applied a structured template that allowed us to capture a wide set of endpoints.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eUnlike earlier reviews, this work quantifies both computational and experimental outcomes in a 
cancer-specific context.\u003c/p\u003e \u003c/li\u003e \u003c/ul\u003e \u003c/p\u003e \u003cp\u003eLimitations:\u003c/p\u003e \u003cp\u003e \u003cul\u003e \u003cli\u003e \u003cp\u003eOur analyses are descriptive, as the heterogeneity of methods made direct comparison or meta-analysis unrealistic.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eDocking scores, binding free energy estimates, and potency data were reported as they appeared, without harmonising across different software, so absolute values should not be read as effect sizes.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eThe search strategy was restricted to PubMed, which means relevant work in other databases or unpublished sources may not have been included.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eLimited experimental validation was available overall, with very few in-vivo studies and most superiority claims based only on computational metrics.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eFinally, selective reporting within individual studies remains a concern, since in many cases only the most favourable compounds or assays were described, introducing a positivity bias that may overestimate apparent success.\u003c/p\u003e \u003c/li\u003e \u003c/ul\u003e \u003c/p\u003e \u003cp\u003e5.5 Recommendations\u003c/p\u003e \u003cp\u003e \u003cul\u003e \u003cli\u003e \u003cp\u003eFuture studies should report docking scores, binding free energy values, and potency data more consistently, with clear description of the methods used.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eStronger experimental validation is required, particularly through in-vitro and in-vivo testing, to establish biological relevance beyond computational outputs.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eComparator analysis should be applied in a standardised way, with transparent reporting of both favourable and unfavourable results to limit positivity 
bias.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eADME(T) evaluation should be incorporated more routinely, as pharmacokinetic properties are essential for assessing translational potential.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eCode and dataset availability should be prioritised, since open resources are necessary for reproducibility and fair comparison across models.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eBenchmarking frameworks should be developed to evaluate generative approaches consistently across different targets and endpoints.\u003c/p\u003e \u003c/li\u003e \u003c/ul\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec52\" class=\"Section2\"\u003e \u003ch2\u003e5.6 Future Work\u003c/h2\u003e \u003cp\u003eThe evidence map created here can be extended by linking docking and binding free energy distributions with experimental potency data to explore correlations between computational predictions and biological outcomes.\u003c/p\u003e \u003cp\u003eFuture work should examine in greater depth the subset of AI-generated compounds that reached in-vivo validation to understand which modelling approaches are most likely to translate into efficacy.\u003c/p\u003e \u003cp\u003eProbe-likeness assessments could be expanded into prospective evaluations, testing whether AI-generated molecules consistently meet chemical biology standards when taken forward experimentally.\u003c/p\u003e \u003cp\u003eFinally, updating this evidence synthesis at regular intervals will allow tracking of progress over time, particularly as newer generative families such as diffusion and transformer-based models expand in oncology.\u003c/p\u003e \u003c/div\u003e"},{"header":"6. Conclusion","content":"\u003cp\u003eThis review, conducted under PRISMA guidance, is, to our knowledge, the first quantitative synthesis focused on cancer studies using de novo generative artificial intelligence and machine learning. 
Out of 57 included papers, kinase targets were by far the most common. Enzymes, GPCRs and immune proteins appeared much less frequently. The number of publications grew rapidly after 2021, but fewer than half of the studies reported docking scores or in vitro potency values, and only a small group described in vivo testing. Binding free energy results were reported in about one quarter of the papers, and ADME(T) assessments in roughly one third. About two-fifths of the studies claimed that AI-generated molecules were superior to reference drugs, but in most cases the evidence came only from computational comparisons.\u003c/p\u003e\n\u003cp\u003eTaken together, these observations show that generative methods can design active anticancer compounds, yet they also reveal major shortcomings in validation and reproducibility. At present, the evidence base remains largely computational, with only limited movement into biological systems.\u003c/p\u003e\n\u003cp\u003eLooking ahead, several priorities are clear. Reporting of docking and potency data should be more consistent. Benchmarking frameworks are needed so that models can be compared on a fair basis. The release of code and datasets would support transparency and reproducibility. Most of all, stronger experimental testing, both in vitro and in vivo, will be essential to determine the true value of generative AI for oncology. If these steps are taken, the field can move from early proof of concept toward practical tools that accelerate the development of new cancer treatments.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eEthics statement\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis study did not involve human participants or animal experimentation. 
Ethical approval and informed consent were therefore not required.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eDeclaration of competing interests\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare no competing interests.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthor contributions (CRediT taxonomy)\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eConceptualization: Hashim Hashim, Ali Hasnain\u003cbr\u003e\u0026nbsp;Methodology: Hashim Hashim, Ali Hasnain\u003cbr\u003e\u0026nbsp;Data curation: Hashim Hashim, Fahad Abubakr\u0026nbsp;\u003cbr\u003e\u0026nbsp;Formal analysis: Hashim Hashim, Fahad Abubakr, Mohamed Elhassadi\u003cbr\u003e\u0026nbsp;Writing \u0026ndash; original draft: Hashim Hashim\u003cbr\u003e\u0026nbsp;Writing \u0026ndash; review \u0026amp; editing: All authors\u003cbr\u003e\u0026nbsp;Supervision: Ali Hasnain\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAcknowledgements\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors thank colleagues at the Royal College of Surgeons in Ireland for academic support and constructive discussion. The authors thank Nisreen Abdulsalam for her careful reading of the manuscript and constructive feedback on clarity and presentation.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eSung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, et al. Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA: A Cancer Journal for Clinicians. 2021;71(3):209-49.\u003c/li\u003e\n\u003cli\u003eBray F, Laversanne M, Weiderpass E, Soerjomataram I. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide. 
\u003cem\u003eCA Cancer J Clin.\u003c/em\u003e 2024;74(3):229-63.\u003c/li\u003e\n\u003cli\u003eBray F, Laversanne M, Sung H, Ferlay J, Siegel RL, Soerjomataram I, et al. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA: A Cancer Journal for Clinicians. 2024;74(3):229-63.\u003c/li\u003e\n\u003cli\u003eScannell JW, Blanckley A, Boldon H, Warrington B. Diagnosing the decline in pharmaceutical R\u0026amp;D efficiency. Nature Reviews Drug Discovery. 2012;11(3):191-200.\u003c/li\u003e\n\u003cli\u003eWouters OJ, McKee M, Luyten J. Estimated Research and Development Investment Needed to Bring a New Medicine to Market, 2009-2018. JAMA. 2020;323(9):844-53.\u003c/li\u003e\n\u003cli\u003eWong CH, Siah KW, Lo AW. Estimation of clinical trial success rates and related parameters. Biostatistics. 2018;20(2):273-86.\u003c/li\u003e\n\u003cli\u003eHughes JP, Rees S, Kalindjian SB, Philpott KL. Principles of early drug discovery. Br J Pharmacol. 2011;162(6):1239-49.\u003c/li\u003e\n\u003cli\u003eDiMasi JA, Grabowski HG, Hansen RW. Innovation in the pharmaceutical industry: New estimates of R\u0026amp;D costs. Journal of Health Economics. 2016;47:20-33.\u003c/li\u003e\n\u003cli\u003eSanchez-Lengeling B, Aspuru-Guzik A. Inverse molecular design using machine learning: Generative models for matter engineering. Science. 2018;361(6400):360-5.\u003c/li\u003e\n\u003cli\u003eWalters WP, Barzilay R. Applications of Deep Learning in Molecule Generation and Molecular Property Prediction. Accounts of Chemical Research. 2021;54(2):263-70.\u003c/li\u003e\n\u003cli\u003eZeng X, Wang F, Luo Y, Kang S-g, Tang J, Lightstone FC, et al. Deep generative molecular design reshapes drug discovery. Cell Reports Medicine. 2022;3(12):100794.\u003c/li\u003e\n\u003cli\u003eMartinelli DD. Generative machine learning for de novo drug discovery: A systematic review. Computers in Biology and Medicine. 
2022;145:105403.\u003c/li\u003e\n\u003cli\u003eYim J, St\u0026auml;rk H, Corso G, Jing B, Barzilay R, Jaakkola TS. Diffusion models in protein structure and docking. WIREs Computational Molecular Science. 2024;14(2):e1711.\u003c/li\u003e\n\u003cli\u003eAlakhdar A, Poczos B, Washburn N. Diffusion Models in De Novo Drug Design. Journal of Chemical Information and Modeling. 2024;64(19):7238-56.\u003c/li\u003e\n\u003cli\u003eRomanelli V, Cerchia C, Lavecchia A. Deep generative models in the quest for anticancer drugs: ways forward. Frontiers in Drug Discovery. 2024;4.\u003c/li\u003e\n\u003cli\u003eOdah M. Artificial Intelligence Meets Drug Discovery: A Systematic Review on AI-Powered Target Identification and Molecular Design. Preprints: Preprints; 2025.\u003c/li\u003e\n\u003cli\u003eAbbas MKG, Rassam A, Karamshahi F, Abunora R, Abouseada M. The Role of AI in Drug Discovery. ChemBioChem. 2024;25(14):e202300816.\u003c/li\u003e\n\u003cli\u003eSarvepalli S, Vadarevu S. Role of artificial intelligence in cancer drug discovery and development. Cancer Letters. 2025;627:217821.\u003c/li\u003e\n\u003cli\u003eF. A. Artificial Intelligence in Oncology: Applications, Challenges and Future Frontiers. International Journal of Pharmaceutical Investigation. 2024;14(3):647-56.\u003c/li\u003e\n\u003cli\u003eWang L, Song Y, Wang H, Zhang X, Wang M, He J, et al. Advances of Artificial Intelligence in Anti-Cancer Drug Design: A Review of the Past Decade. Pharmaceuticals. 2023;16(2):253.\u003c/li\u003e\n\u003cli\u003eConte L, Caruso G, Philip AK, Cucci F, De Nunzio G, Cascio D, et al. Artificial Intelligence-Assisted Drug and Biomarker Discovery for Glioblastoma: A Scoping Review of the Literature. Cancers. 2025;17(4):571.\u003c/li\u003e\n\u003cli\u003eTiwari A, Mishra S, Kuo T-R. Current AI technologies in cancer diagnostics and treatment. Molecular Cancer. 2025;24(1):159.\u003c/li\u003e\n\u003cli\u003eStructural Genomics Consortium (SGC-UNC). 
Chemical Probes [Internet]. Available from: https://www.sgc-unc.org/main-st; Accessed August 2025.\u003c/li\u003e\n\u003cli\u003eEUbOPEN. Chemical Probes [Internet]. Available from: https://www.eubopen.org/chemical-probes; Accessed August 2025.\u003c/li\u003e\n\u003cli\u003eA Y, F M, J B. Approach for the Design of Covalent Protein Kinase Inhibitors via Focused Deep Generative Modeling. Molecules (Basel, Switzerland). 2022;27(2).\u003c/li\u003e\n\u003cli\u003eA Y, H H, J B. Adapting the DeepSARM approach for dual-target ligand design. Journal of computer-aided molecular design. 2021;35(5):587-600.\u003c/li\u003e\n\u003cli\u003eA Y, Y A, E K, T T, S M, T S, et al. Design and Synthesis of DDR1 Inhibitors with a Desired Pharmacophore Using Deep Generative Models. ChemMedChem. 2021;16(6):955-8.\u003c/li\u003e\n\u003cli\u003eAK P. AI-assisted generation and in-depth in-silico evaluation of potential inhibitor targeting aurora kinase A (AURKA): An anticancer discovery exploiting synthetic lethality approach. Archives of biochemistry and biophysics. 2024;762:110209.\u003c/li\u003e\n\u003cli\u003eB O, Z F, W M, R S. A Deep-Learning Proteomic-Scale Approach for Drug Design. Pharmaceuticals (Basel, Switzerland). 2021;14(12).\u003c/li\u003e\n\u003cli\u003eC A, H Y, X L, R D, Y D, F G. MTMol-GPT: De novo multi-target molecular generation with transformer-based generative adversarial imitation learning. PLoS computational biology. 2024;20(6):e1012229.\u003c/li\u003e\n\u003cli\u003eC P, H C, H Y, X C, M Z, L Q, et al. Oral ENPP1 inhibitor designed using generative AI as next generation STING modulator for solid tumors. Nature communications. 2025;16(1):4793.\u003c/li\u003e\n\u003cli\u003eC Y, S U, K K, M I, Y Y. De novo drug design based on patient gene expression profiles via deep learning. Molecular informatics. 2023;42(8):e2300064.\u003c/li\u003e\n\u003cli\u003eCT F. PD-1 Targeted Antibody Discovery Using AI Protein Diffusion. 
Technology in cancer research \u0026amp; treatment. 2024;23:15330338241275947.\u003c/li\u003e\n\u003cli\u003eD D, B C, R S, A R. Gex2SGen: Designing Drug-like Molecules from Desired Gene Expression Signatures. Journal of chemical information and modeling. 2023;63(7):1882-93.\u003c/li\u003e\n\u003cli\u003eD H, Q L, Y M, Q M, L X, C H, et al. De Novo Generation and Identification of Novel Compounds with Drug Efficacy Based on Machine Learning. Advanced science (Weinheim, Baden-Wurttemberg, Germany). 2024;11(11):e2307245.\u003c/li\u003e\n\u003cli\u003eD N, A H, RA B, M H. Machine Learning Application for Medicinal Chemistry: Colchicine Case, New Structures, and Anticancer Activity Prediction. Pharmaceuticals (Basel, Switzerland). 2024;17(2).\u003c/li\u003e\n\u003cli\u003eD R, S Y, N B, A S, S M, C T. TumFlow: An AI Model for Predicting New Anticancer Molecules. International journal of molecular sciences. 2024;25(11).\u003c/li\u003e\n\u003cli\u003eD S, L X, M T, Z W, W Z, J L, et al. De novo design of mIDH1 inhibitors by integrating deep learning and molecular modeling. Frontiers in pharmacology. 2024;15:1491699.\u003c/li\u003e\n\u003cli\u003eDS B, DA O, VF O, AO A, A M, UC O, et al. In-silico-based lead optimization of hit compounds targeting mitotic kinesin Eg5 for cancer management. In silico pharmacology. 2025;13(1):9.\u003c/li\u003e\n\u003cli\u003eEG G, P V, P G-N, E U, G M-A, C P, et al. AI-Driven De Novo Design and Development of Nontoxic DYRK1A Inhibitors. Journal of medicinal chemistry. 2025;68(10):10346-64.\u003c/li\u003e\n\u003cli\u003eF G, CS N, G G, AT M, JA H, G S. Designing Anticancer Peptides by Constructive Machine Learning. ChemMedChem. 2018;13(13):1300-2.\u003c/li\u003e\n\u003cli\u003eF G, CS N, M H, G G, JA H, M K, et al. De novo design of anticancer peptides by ensemble artificial neural networks. Journal of molecular modeling. 2019;25(5):112.\u003c/li\u003e\n\u003cli\u003eF R, X D, M Z, M K, X C, W Z, et al. 
AlphaFold accelerates artificial intelligence powered drug discovery: efficient discovery of a novel CDK20 small molecule inhibitor. Chemical science. 2023;14(6):1443-52.\u003c/li\u003e\n\u003cli\u003eG C, D K, J O. AI-Based Drug Discovery of TKIs Targeting L858R/T790M/C797S-Mutant EGFR in Non-small Cell Lung Cancer. Frontiers in pharmacology. 2021;12:660313.\u003c/li\u003e\n\u003cli\u003eG KG. Accelerating drug discovery targeting dihydroorotate dehydrogenase using machine learning and generative AI approaches. Computational biology and chemistry. 2025;118:108443.\u003c/li\u003e\n\u003cli\u003eHW vdM, J dMvO, JGC vH, PH vdG, GJP vW. Integrating Pharmacokinetics and Quantitative Systems Pharmacology Approaches in Generative Drug Design. Journal of chemical information and modeling. 2025;65(10):4783-96.\u003c/li\u003e\n\u003cli\u003eJ B, M M, A O, J C, G M, M RM. PaccMann(RL): De novo generation of hit-like anticancer molecules from transcriptomic data via reinforcement learning. iScience. 2021;24(4):102269.\u003c/li\u003e\n\u003cli\u003eJ C, D G, LV S, IX S, V SK, B M, et al. Integrated AI and machine learning pipeline identifies novel WEE1 kinase inhibitors for targeted cancer therapy. Molecular diversity. 2025.\u003c/li\u003e\n\u003cli\u003eKY S, S MV, WR D, J K. COTI-2, a novel small molecule that is active against multiple human cancer cell lines in vitro and in vivo. Oncotarget. 2016;7(27):41363-79.\u003c/li\u003e\n\u003cli\u003eL H, J L, HL Z, LC Z, RL Y, CM K. De novo design of dual-target JAK2, SMO inhibitors based on deep reinforcement learning, molecular docking and molecular dynamics simulations. Biochemical and biophysical research communications. 2023;638:23-7.\u003c/li\u003e\n\u003cli\u003eL L, X Z, X H. Generating Potential RET-Specific Inhibitors Using a Novel LSTM Encoder-Decoder Model. International journal of molecular sciences. 2024;25(4).\u003c/li\u003e\n\u003cli\u003eL W, M X, Z L, C J, X L, Y H, et al. 
Hit Identification Driven by Combining Artificial Intelligence and Computational Chemistry Methods: A PI5P4K-\u0026beta; Case Study. Journal of chemical information and modeling. 2023;63(16):5341-55.\u003c/li\u003e\n\u003cli\u003eL Y, G Y, Z B, Y T, L H, Y N, et al. Accelerating the discovery of anticancer peptides targeting lung and breast cancers with the Wasserstein autoencoder model and PSO algorithm. Briefings in bioinformatics. 2022;23(5).\u003c/li\u003e\n\u003cli\u003eM M, M A, K G, A A-J, S D, M O, et al. \u0026quot;Several birds with one stone\u0026quot;: exploring the potential of AI methods for multi-target drug design. Molecular diversity. 2024.\u003c/li\u003e\n\u003cli\u003eM N, NU A, T A, K J, MA S, M A, et al. Artificial Intelligence Assisted Pharmacophore Design for Philadelphia Chromosome-Positive Leukemia with Gamma-Tocotrienol: A Toxicity Comparison Approach with Asciminib. Biomedicines. 2023;11(4).\u003c/li\u003e\n\u003cli\u003eM W, S L, J W, O Z, H D, D J, et al. ClickGen: Directed exploration of synthesizable chemical space via modular reactions and reinforcement learning. Nature communications. 2024;15(1):10127.\u003c/li\u003e\n\u003cli\u003eN A, J H, Y L, B O-B. Integrating transformers and many-objective optimization for drug design. BMC bioinformatics. 2024;25(1):208.\u003c/li\u003e\n\u003cli\u003eOA D, AA Y, MO I, NO O, AS O, BC N. Investigation of the MDM2-binding potential of de novo designed peptides using enhanced sampling simulations. International journal of biological macromolecules. 2024;269:131840.\u003c/li\u003e\n\u003cli\u003eOJ G, A N, T K, NZ R, B K. In silico evolution of autoinhibitory domains for a PD-L1 antagonist using deep learning models. Proceedings of the National Academy of Sciences of the United States of America. 2023;120(49):e2307371120.\u003c/li\u003e\n\u003cli\u003eR C, HF J, GL R, BA S. 
Streamlining Computational Fragment-Based Drug Discovery through Evolutionary Optimization Informed by Ligand-Based Virtual Prescreening. Journal of chemical information and modeling. 2024;64(9):3826-40.\u003c/li\u003e\n\u003cli\u003eR Q, H Z, W H, Z S, J L. Deep learning-based design and screening of benzimidazole-pyrazine derivatives as adenosine A(2B) receptor antagonists. Journal of biomolecular structure \u0026amp; dynamics. 2025;43(7):3225-41.\u003c/li\u003e\n\u003cli\u003eS C, J X, R Y, DD X, Y Y. Structure-aware dual-target drug design through collaborative learning of pharmacophore combination and molecular simulation. Chemical science. 2024;15(27):10366-80.\u003c/li\u003e\n\u003cli\u003eSH J, D S, SK M, J C, S L, M J, et al. PCW-A1001, AI-assisted de novo design approach to design a selective inhibitor for FLT-3(D835Y) in acute myeloid leukemia. Frontiers in molecular biosciences. 2022;9:1072028.\u003c/li\u003e\n\u003cli\u003eT P, M A, RI O, RA G, JAR S, JP A. Deep generative model for therapeutic targets using transcriptomic disease-associated data-USP7 case study. Briefings in bioinformatics. 2022;23(4).\u003c/li\u003e\n\u003cli\u003eT Q, Y W, M K, H Z, T W, Z X, et al. Identification of potential PIM-2 inhibitors via ligand-based generative models, molecular docking and molecular dynamics simulations. Molecular diversity. 2024;28(4):2245-62.\u003c/li\u003e\n\u003cli\u003eT V, A D, O H, MR M, CY E. AI-Predicted mTOR Inhibitor Reduces Cancer Cell Proliferation and Extends the Lifespan of C. elegans. International journal of molecular sciences. 2023;24(9).\u003c/li\u003e\n\u003cli\u003eTM C, G L, P D, M C, N C, M S, et al. DeLA-Drug: A Deep Learning Algorithm for Automated Design of Druglike Analogues. Journal of chemical information and modeling. 2022;62(6):1411-24.\u003c/li\u003e\n\u003cli\u003eTR Q, KA G, C T, JA B, G B, E B, et al. Accelerated Discovery of Carbamate Cbl-b Inhibitors Using Generative AI Models and Structure-Based Drug Design. 
Journal of medicinal chemistry. 2024;67(16):14210-33.\u003c/li\u003e\n\u003cli\u003eW C, B J, Y Y, L M, T L, J C. Identification of STAT3 phosphorylation inhibitors using generative deep learning, virtual screening, molecular dynamics simulations, and biological evaluation for non-small cell lung cancer therapy. Molecular diversity. 2024.\u003c/li\u003e\n\u003cli\u003eW Z, P Z, T L, F G, Q L, X C, et al. Discovery of Novel SIK2/3 Inhibitors for the Potential Treatment of MEF2C+ Acute Myeloid Leukemia (AML). Journal of medicinal chemistry. 2025;68(7):7518-38.\u003c/li\u003e\n\u003cli\u003eX L, Q L, X Y, L W, J Q, X Y, et al. A Specialized and Enhanced Deep Generation Model for Active Molecular Design Targeting Kinases Guided by Affinity Prediction Models and Reinforcement Learning. Journal of chemical information and modeling. 2025;65(7):3294-308.\u003c/li\u003e\n\u003cli\u003eX W, J L, L M, B W, B L, Y J. Towards novel small-molecule inhibitors blocking PD-1/PD-L1 pathway: From explainable machine learning models to molecular dynamics simulation. International journal of biological macromolecules. 2024;282:136325.\u003c/li\u003e\n\u003cli\u003eX X, C X, W H, L W, H L, J Z, et al. HELM-GPT: de novo macrocyclic peptide design using generative pre-trained transformer. Bioinformatics (Oxford, England). 2024;40(6).\u003c/li\u003e\n\u003cli\u003eY W, C W, J L, D S, F M, M Z, et al. Discovery of 3-hydroxymethyl-azetidine derivatives as potent polymerase theta inhibitors. Bioorganic \u0026amp; medicinal chemistry. 2024;103:117662.\u003c/li\u003e\n\u003cli\u003eY W, M G, X C, D A. Screening of multi deep learning-based de novo molecular generation models and their application for specific target molecular generation. Scientific reports. 2025;15(1):4419.\u003c/li\u003e\n\u003cli\u003eY W, Z W, Y L, P Y, X L. Improving Covalent and Noncovalent Molecule Generation via Reinforcement Learning with Functional Fragments. Journal of chemical information and modeling. 
2025.\u003c/li\u003e\n\u003cli\u003eY Y, D A, Y W, W Z, G C, J T, et al. Wee1 inhibitor optimization through deep-learning-driven decision making. European journal of medicinal chemistry. 2024;280:116912.\u003c/li\u003e\n\u003cli\u003eY Y, J H, H H, J H, G Y, T X, et al. Accelerated Discovery of Macrocyclic CDK2 Inhibitor QR-6401 by Generative Models and Structure-Based Drug Design. ACS medicinal chemistry letters. 2023;14(3):297-304.\u003c/li\u003e\n\u003cli\u003eY Y, R Z, Z L, L M, S W, H D, et al. Discovery of Highly Potent, Selective, and Orally Efficacious p300/CBP Histone Acetyltransferases Inhibitors. Journal of medicinal chemistry. 2020;63(3):1337-60.\u003c/li\u003e\n\u003cli\u003eY Z, J H, X L, W S, N Z, J Z, et al. Self-awareness of retrosynthesis via chemically inspired contrastive learning for reinforced molecule generation. Briefings in bioinformatics. 2025;26(2).\u003c/li\u003e\n\u003cli\u003eZ L, J H, Y J, B L, A Z. Computational design of CDK1 inhibitors with enhanced target affinity and drug-likeness using deep-learning framework. Heliyon. 
2024;10(22):e40345.\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":true,"hideJournal":true,"highlight":"","institution":"Royal College of Surgeons in Ireland","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"De novo drug design, Cancer, Artificial intelligence, Machine learning, Generative models, Drug discovery, Molecular design","lastPublishedDoi":"10.21203/rs.3.rs-8408084/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-8408084/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003ch2\u003eBackground\u003c/h2\u003e \u003cp\u003eGenerative artificial intelligence (AI) and machine learning (ML) are emerging as powerful tools for de novo drug discovery. Oncology, which faces arduous and lengthy development timelines, could gain considerably from these approaches. Previous reviews have generally described generative models, but none have provided a systematic and quantitative synthesis of their application to cancer drug discovery.\u003c/p\u003e\u003ch2\u003eMethods\u003c/h2\u003e \u003cp\u003eA PRISMA-guided systematic review of PubMed was carried out from January 2015 to June 20, 2025. 
Eligible studies applied generative AI or ML architectures to design new molecules with cancer relevance. Extracted data included study targets, model families, docking scores, binding free energies, in vitro potency (IC₅₀/EC₅₀), in vivo validation, ADME(T) assessments, code availability, and comparator performance. Analyses were descriptive and aimed at mapping the coverage and distribution of reported outcomes.\u003c/p\u003e\u003ch2\u003eResults\u003c/h2\u003e \u003cp\u003eOf 1,130 records screened, 57 studies met the eligibility criteria. Kinases were the most frequent targets (49%), followed by enzymes, GPCRs, and immune proteins. Publications rose sharply after 2021. Fewer than half of the studies reported docking scores or in vitro potency values, and 14% described in vivo testing. Binding free energy values appeared in 26% of papers, and ADME(T) assessments in 37%. Code availability was inconsistent, with public release in 54% of papers, highlighting reproducibility gaps.\u003c/p\u003e\u003ch2\u003eConclusion\u003c/h2\u003e \u003cp\u003eGenerative AI demonstrates potential to design biologically active anticancer compounds. However, the evidence consists predominantly of computational results, with limited experimental validation. 
Future work should give priority to consistent reporting, benchmarking frameworks, open code and data, and prospective in vitro and in vivo testing.\u003c/p\u003e","manuscriptTitle":"Artificial Intelligence and Machine Learning for De Novo Cancer Drug Discovery: A Systematic Review of Generative Design and Validation Gaps","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-12-23 09:27:50","doi":"10.21203/rs.3.rs-8408084/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"782025c4-11a8-4875-9edc-cc351bbb0f7c","owner":[],"postedDate":"December 23rd, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[{"id":59977529,"name":"Drug Discovery, Design, \u0026 Development"},{"id":59977530,"name":"Oncology"},{"id":59977531,"name":"Artificial Intelligence and Machine Learning"}],"tags":[],"updatedAt":"2025-12-23T09:27:50+00:00","versionOfRecord":[],"versionCreatedAt":"2025-12-23 09:27:50","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-8408084","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-8408084","identity":"rs-8408084","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
