Single base focal hypermutation cooccurs with structural variation as an early event in advanced prostate tumourigenesis with ancestry specific independence: a multi-ancestral observational study | Research Square

Research Article

Jue Jiang, Avraam Tapinos, Ruotian Huang, M.S. Riana Bornman, and 5 more

This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-7624142/v1 This work is licensed under a CC BY 4.0 License. Status: Under Review. Version 1 posted 11

Abstract

Background

Kataegis, the focal hypermutation of single base positions in tumour genomes, has received little attention with regard to prostate cancer (PCa) molecular features, tumour evolution and associated clinical presentation.
Most notably, the impact of this phenomenon is yet to be explored across ancestral lineages representing the extremities of PCa presentation and outcomes, with men of African ancestry disproportionately disadvantaged. The purpose of this study is to address the knowledge gap through African-inclusive multi-ancestral interrogation.

Methods

We assessed ancestrally shared and unique molecular, evolutionary and clinical features of kataegis in 669 multi-ancestral whole PCa genomes. Access to raw whole-genome sequenced data allowed for direct single-pipeline comparative analysis between 109 southern African and 57 European derived treatment-naïve high-risk-biased primary tumours (74% and 88%) with paired blood samples, further assessed against publicly available 207 Asian high-risk-leaning comparative (65%) and 296 European low-risk-biased alternative (79%) resources. Comparisons between ancestries and risk groups used Wilcoxon's rank sum test and Fisher's exact test, with P values adjusted by false discovery rate.

Results

Confirming relatively low burdens, we found kataegis to be significantly associated with genomic instability, cancer drivers, and clinical adversity across ancestries (false discovery rate = \(7\times 10^{-6}\)–0.04). Notably, kataegis-positive tumours were associated with elevated prostate-specific antigen levels at presentation in African patients (false discovery rate = 0.002) and with higher risk for metastatic progression in European patients (Kaplan-Meier estimator, P = 0.03). Enrichment of APOBEC context preferences showed more attribution from APOBEC3B than APOBEC3A. Further analyses of evolution and structural variant (SV) cooccurrence showed that the more common, ancestry-agnostic SV-associated kataegis predominated in the clonal evolutionary state, while the less common SV-independent kataegis (P = 0.002) and subclonal kataegis (P = 0.03) showed African specificity.

Conclusions

We found kataegis-positivity to be associated with poor PCa presentation and prognosis, irrespective of patient ancestry. Kataegis-related genomic instability, occurring both early and late during African-derived tumourigenesis, may partly explain the heightened tumour and clinical heterogeneity observed for patients of African ancestry.

Keywords: kataegis, prostate cancer, ancestral disparity, APOBEC, cancer evolution

Introduction

Prostate cancer (PCa) is the most frequently diagnosed male cancer in most regions of the world, disproportionately affecting men of African ancestry and particularly those from Sub-Saharan Africa [1]. Mortality rates from PCa are highest in Sub-Saharan Africa and the Caribbean, with southern Africa ranking first globally at 29.7 (age-standardised rate per 100,000 males) [1]. Notably, the incidence rate of southern Africa is lower than that of economically stable regions, such as Australia and New Zealand (59.9 vs. 78.1, age-standardised rate per 100,000) [1]. Conversely, both incidence and mortality rates are lowest across the Asian diaspora of nations. While the disparities may be attributed to diminished access to PCa screening and medical resources or exposure to yet unknown geographic risk factors, studies from the United States have shown that African American men are at greatest risk for aggressive disease presentation and associated lethality after accounting for non-genetic factors [2, 3]. Additional studies of the biological and genomic contributions alluded to by these findings are needed for a better understanding of the disparities across different ancestral populations. Kataegis, meaning thunderstorm in Greek, describes the focal hypermutation phenomenon in cancer genomes [4].
A kataegis event is defined as a cluster of closely distributed single nucleotide variants (SNVs) and results from a single mutational action of APOBEC3A (A3A) or APOBEC3B (A3B) cytidine deaminases on exposed single-stranded DNA (ssDNA) [4–6]. This mutational process has been linked to the single base substitution (SBS) signatures SBS2 and SBS13 [7]. Despite Pan-Cancer Analysis of Whole Genomes (PCAWG) and organ-specific studies suggesting kataegis to be frequent in cancers of the breast, bladder, lung, and skin (melanoma) [4, 5, 8, 9], the evolution of kataegis and its clinical implications remain elusive for PCa, and unclear for African patients due to a lack of African-derived whole tumour genome data [10, 11]. Controversially, breast cancer (BRCA) research has associated kataegis with a favourable prognosis [12] and lower genomic instability [10, 12], while other studies have shown a link to aggressive disease [7]. An early origin of kataegis, arising with chromothripsis during telomere crisis, has been suggested by modified cell line experiments [13, 14], while late kataegis development has been observed in PCa [15] and hepatocellular carcinoma [8]. However, the potential contribution and association of kataegis in PCa ancestral disparities are yet to be determined. This study aims to characterise kataegis mutational processes in PCa genomes from patients of different ancestries and to assess the potential clinical implications, with a particular focus on aggressive disease in African men. We processed samples from 109 African men (Black South Africans) and 57 European men (predominantly Australians) through the same pipeline, providing a direct comparative analysis based on genetic ancestry. Across ancestries, our findings linked kataegis events with more aggressive PCa manifestations and adverse clinical outcomes.
The investigation of the aetiology primarily attributed kataegis to APOBEC enzymes, with variation by cancer aggressiveness among African patients. We observed ancestral disparities in the evolutionary timing of kataegis and the distribution of distances between kataegis and structural variants (SVs). These findings highlight the unique genetic factors contributing to PCa in African men and underscore the importance of including diverse ancestral populations in cancer research.

Materials and Methods

Subjects and whole genome sequencing data

Treatment-naive paired blood and tumour samples were collected from 166 patients diagnosed with PCa and recruited from South Africa (n = 113) and Australia (n = 53; Table 1). Patient ancestry was determined using whole genome interrogation for subpopulation fraction analyses, as previously described [16]. In short, 109 patients were categorised as African (all South African) with greater than 85% African ancestral fraction; 57 were categorised as European (53 Australian and 4 South African), allowing up to 3% African ancestral and 26% Asian contributions. Tumour aggressiveness was defined from histopathological Gleason Scores as the International Society of Urological Pathology (ISUP) Grade Group (GG) either at diagnosis (South Africans) or surgery (Australians). Patients presented either as low-risk (LR, GG1 and GG2) or high-risk (HR, GG3–5), with the African-derived HR group biased towards very HR PCa (89%, 72/81 ISUP GG4/5). For comparison, we intentionally selected untreated biobanked samples with advanced disease for our European cohort (98%, 49/50 ISUP GG4/5). As previously reported for South-East Africa [17], both prostate specific antigen (PSA) levels (median 82.60 vs 8.15) and age at presentation/surgery (median 69 vs 63 in HR groups) are elevated in the HR group of our African cohort compared with that of the European cohort. The latter cohort allows for extensive follow-up data defined as biochemical relapse (BCR) and/or metastasis.
All samples underwent deep WGS using the Illumina NovaSeq and HiSeq platforms (median coverages: tumour 88.64X and blood 44.19X), GRCh38-referenced variant calling and annotation, and evolutionary timing pipelines, as previously described [16, 18].

Table 1 Demographic and clinical information of the current study

Ancestry | Cohort size | Cohort size per country (%) | Cohort size of low-risk (GG1–2, %) | Cohort size of high-risk (GG3–5, %) | Median age (range)
The study cohort
Total | 166 | 113 (68%) South Africa, 53 (32%) Australia | 35 (21%) | 131 (79%) | 65 (45–99) a
African | 109 | 109 (100%) South Africa | 28 (26%) | 81 (74%) | 68 (45–99) a
European | 57 | 4 (7%) South Africa, 53 (93%) Australia | 7 (12%) | 50 (88%) | 63 (46–72)
Public validation cohorts
European | 296 | 296 (100%) Canada | 234 (79%) | 62 (21%) | 64 (42–81)
Asian | 207 | 207 (100%) China | 73 (35%) | 134 (65%) | 69 (50–88) a
a One patient with missing age excluded.

Public validation cohorts

Somatic SNVs were downloaded from published deep WGS primary tumour-normal data derived from 296 European and 207 Asian PCa donors, with available clinical data (Table 1). European data were derived from the Prostate Adenocarcinoma Canada project via the International Cancer Genome Consortium (ICGC) Data Portal [19, 20]. Asian data were obtained from the Chinese Prostate Cancer Genome and Epigenome Atlas (CPGEA) with accession number PRJCA001124 [21]. The European data are biased towards LR PCa, with no age differences between LR and HR cases for either the European data (79%, n = 234 vs. 21%, n = 62; median age, 64 vs. 63.5 years; Wilcoxon's rank sum test, P = 0.58) or the Asian data (35%, n = 73 vs. 65%, n = 134; the same median age of 69 years).

Kataegis identification and evolution

Kataegis identification followed the methods of the PCAWG study, using an adjusted threshold for candidate calling, followed by two criteria (detailed in Additional file 1: Supplementary methods) [5].
Briefly, inter-mutational distances of SNVs were adjusted with the piecewise constant fitting (PCF) model using the core algorithms of the kataegis package [22] with default parameters [9]. The threshold, which required a minimum of four SNVs with PCF-adjusted distances of less than 1 kb, was derived from the total number of SNVs per patient and applied identically to all patients. Kataegis events were further refined with evolutionary timing (detailed in Additional file 1: Supplementary methods). As kataegis SNVs arise together from a single mutational process [5], we refined kataegis with evolution by examining each subset of SNVs that occurred during the same evolutionary epoch, including clonal (early, late, and unspecified) and subclonal epochs. This step was applied only to the current study cohort and identified a total of 249 evolutionary kataegis events in 65 patients. Evolutionary kataegis was unavailable for public cohorts due to the lack of available copy number variants (CNVs).

Statistical Analysis

Statistical tests included Fisher's exact test for categorical variables using the stats package [23], and Wilcoxon's rank sum test for continuous data comparisons between two ancestries or risk groups using the ggpubr package (v0.6.0) [24] in R (v4.2.2) [23]. P values from multiple hypothesis testing were adjusted using the false discovery rate (FDR) when specified. Four outliers with extreme kataegis burdens were excluded, including one European patient (42 kataegis events) in the study cohort, and three patients whose z-scores were greater than three in the public European cohort. For genomic features significantly associated with the presence of kataegis, we further analysed their associations with kataegis burden using a negative binomial regression model. The negative binomial regression model was suitable for describing the kataegis burden, which had many zero values and a variance greater than its mean (4.03 vs. 1.03).
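The overdispersion argument above can be checked with a few lines of code. The sketch below is illustrative only: the function names and the toy counts are hypothetical, and the method-of-moments dispersion estimate assumes the common NB2 parameterisation (variance = mean + alpha × mean²), which the text does not specify. Under that assumption, the reported mean of 1.03 and variance of 4.03 correspond to an alpha of roughly 2.8, whereas a Poisson model would require alpha near zero.

```python
from statistics import mean, variance


def nb2_alpha(mean_value, variance_value):
    """Method-of-moments dispersion under the (assumed) NB2 form:
    variance = mean + alpha * mean**2."""
    return (variance_value - mean_value) / mean_value ** 2


def dispersion_summary(counts):
    """Summarise a vector of per-tumour event counts."""
    m = mean(counts)
    v = variance(counts)  # sample variance (n - 1 denominator)
    return {
        "mean": m,
        "variance": v,
        "zero_fraction": sum(c == 0 for c in counts) / len(counts),
        "alpha": nb2_alpha(m, v) if m > 0 else float("nan"),
    }


# Hypothetical toy counts, not data from the study.
toy_counts = [0, 0, 0, 0, 1, 1, 2, 6]
summary = dispersion_summary(toy_counts)
print(summary)

# Dispersion implied by the mean and variance reported in the text.
print(nb2_alpha(1.03, 4.03))
```

A variance well above the mean together with a large share of zeros is the pattern for which a negative binomial model is usually preferred over a Poisson model.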
The analysis excluded the aforementioned outlier, and three African patients with PSA or age unavailable. In addition to the genomic features associated with kataegis, the analysis also included ancestry, patient risk levels and age at diagnosis. Log-transformation was applied to adjust for the skewness found in SV burden, tumour mutational burden (TMB), chromothripsis burden, percentage of genome alteration (PGA), and copy number (CN) gain.

Co-occurrence of kataegis with cancer driver mutations

We examined associations between kataegis and point mutations of 58 selected genes using Fisher's exact test (P-values and FDRs in Additional file 2: Table S1). We examined previously reported top cancer drivers for PCa [16, 25] and/or genes potentially related to kataegis development, such as cell-cycle checkpoint-related genes [26], APOBEC3A, and APOBEC3B.

Survival analysis

We performed survival analyses using Kaplan-Meier estimates from the survival package (v3.5-5) [27] and log-rank tests from the survminer package (v0.4.9) [28]. To assess clinical progression, we compared (i) patients with BCR and/or metastasis to those with neither, and (ii) patients with metastasis only to those without metastasis or BCR. The survival distribution was compared by kataegis state (positive or negative), and by kataegis burden (elevated burden with a kataegis count > 1, or ≤ 1). The analysis was performed for LR and HR groups both jointly and separately for the European patients with available follow-up data from our study cohort, and for the public European and Asian cohorts. From our study cohort, we excluded the small LR group of European patients (n = 7), a hyper-kataegic outlier, and three patients from the HR group who were not cured by radical prostatectomy (n = 42 remaining).
From the validation cohorts, we excluded three outliers defined by z-scores greater than three, and patients with missing clinical follow-up from the public European cohort (n = 281 remaining). We also filtered out 21 patients with missing clinical information from the Asian cohort (n = 186 remaining).

SBS and SV signatures

Kataegic SNVs, genome-wide SBS, and SV signatures were decomposed and assigned using SigProfilerExtractor (v1.1.24) [29]. The analysis processed kataegic SNVs from 283 kataegis-positive tumours from this study and the validation cohorts. The aforementioned outliers, one from the study cohort and three from the public European cohort, were excluded from the analysis. Kataegic SNVs from the public European data were lifted to the GRCh38 reference using liftOver (last modified 2022-01-31) [30]. The signature identification steps included de novo signature discovery using nonnegative matrix factorisation (NMF) and the assignment of conventional Catalogue Of Somatic Mutations In Cancer (COSMIC) signatures (v3.4, Oct. 2023). We used default settings with some modifications, including a maximum of 15 signatures, 500 NMF replicates, one million maximal NMF iterations, and the GRCh38 reference. The assignment of SBS signatures was challenging for kataegic SNVs because of the small number of SNVs compared to genome-wide SNVs. To maintain accuracy, we retained only samples with a cosine similarity greater than 0.5, which filtered out 33 samples. The retained samples had a median cosine similarity of 0.851 (range, 0.508–0.988). In addition, genome-wide SBS and SV signatures were identified from 165 samples, excluding a European outlier. The SigProfilerExtractor parameters and version of the COSMIC database were the same as those used for the kataegic SBS signatures. The NMF extraction methods were based on the frequency matrix of 32 SV types [31].
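The cosine-similarity cut-off used for the kataegic SBS assignment can be illustrated with a minimal sketch. It assumes that the similarity is computed between each sample's observed mutation-count profile and the profile reconstructed from the assigned signatures; the text does not state which vectors were compared, and the sample names and toy profiles below are hypothetical. This is not the SigProfilerExtractor implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty profile cannot be matched
    return dot / (norm_a * norm_b)


def filter_samples(profiles, cutoff=0.5):
    """Keep samples whose observed and reconstructed profiles have a
    cosine similarity strictly greater than the cut-off."""
    kept = {}
    for name, (observed, reconstructed) in profiles.items():
        similarity = cosine_similarity(observed, reconstructed)
        if similarity > cutoff:
            kept[name] = similarity
    return kept


# Hypothetical four-channel toy profiles (real SBS profiles have 96 channels).
toy_profiles = {
    "sample_A": ([4, 0, 1, 0], [3.5, 0.2, 1.1, 0.1]),  # well reconstructed
    "sample_B": ([1, 0, 0, 0], [0.0, 1.0, 0.0, 0.0]),  # poorly reconstructed
}
print(filter_samples(toy_profiles))
```

With few SNVs per sample the observed profile is sparse, so a single mismatched channel can lower the similarity sharply, which is why a filter of this kind is useful for kataegic SNVs in particular.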

APOBEC attribution to kataegis

We used Fisher's exact test to identify APOBEC-enriched kataegis, which were further tested for A3A or A3B enrichment according to the context preference of APOBEC enzymes. The identification mainly followed the method previously used for genome-wide enrichment [32]. For the APOBEC enrichment, kataegis events were compared with other non-clustering SNVs from the sample for the count of mutated cytosines in each motif (C and TCW), adjusted by the accessible rate of the motif (±20 bp context). Here, we used TCW to represent the APOBEC enzyme preference motif, because we observed comparable amounts of cytosine mutations in TCA and TCT, rather than the skewness toward TCA reported previously [32]. Throughout, TCW denotes a mutation of the cytosine within the TCW motif; more details of the Fisher's exact test are in Additional file 1: Supplementary methods. Further, for each APOBEC-enriched kataegis, we identified A3A-enriched kataegis with the YTCW motif and A3B-enriched kataegis with the RTCW motif [32, 33], where the cytosine is the mutated base. P-values were adjusted with FDR.

Distribution of kataegis and proximal SVs

The enrichment or sparsity of SVs proximal to kataegis events was tested by comparing kataegis with simulated kataegis events. For each kataegis event (n = 831) identified in this study and the validation cohorts, excluding four outliers, we simulated 1,000 pseudo kataegis events with the same event interval by randomly assigning the central position to 1,000 non-clustering SNVs from the sample. For both identified and simulated kataegis, the distances to proximal SVs were compared using log-spaced bins (0–1 kb, 1 kb–10 kb, 10 kb–0.1 Mb, 0.1 Mb–1 Mb, 1 Mb–10 Mb, 10 Mb–100 Mb, and beyond 100 Mb). For each patient group defined by ancestry and risk level, we tested enrichment or sparsity of SVs by calculating P-values based on the rank of the identified kataegis in the 1,000 simulated kataegis events.
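The binning and rank-based test described above can be sketched as follows. The bin edges come from the text, but the handling of values that fall exactly on an edge, the function names, and the specific empirical P-value formula, (1 + number of simulations at least as extreme) / (1 + number of simulations), are assumptions made for illustration; the exact rank statistic used in the study is not given in this section.

```python
import bisect

# Upper edges of the log-spaced distance bins, in base pairs.
BIN_EDGES = [1e3, 1e4, 1e5, 1e6, 1e7, 1e8]
BIN_LABELS = [
    "0-1 kb", "1-10 kb", "10 kb-0.1 Mb", "0.1-1 Mb",
    "1-10 Mb", "10-100 Mb", ">100 Mb",
]


def distance_bin(distance_bp):
    """Assign a kataegis-to-SV distance to a log-spaced bin.
    Assumption: a distance equal to an edge goes to the upper bin."""
    return BIN_LABELS[bisect.bisect_right(BIN_EDGES, distance_bp)]


def empirical_p(observed, simulated, alternative="enriched"):
    """Rank-based P-value of an observed per-bin count against the
    counts obtained from simulated (pseudo) kataegis events."""
    if alternative == "enriched":
        extreme = sum(s >= observed for s in simulated)
    elif alternative == "sparse":
        extreme = sum(s <= observed for s in simulated)
    else:
        raise ValueError("alternative must be 'enriched' or 'sparse'")
    return (1 + extreme) / (1 + len(simulated))


# Hypothetical example: 40 observed events fall in the 0-1 kb bin, while
# the simulated per-bin counts are much smaller.
simulated_counts = [3, 5, 4, 6, 2, 5, 4, 3, 7, 4]
print(distance_bin(750))
print(empirical_p(40, simulated_counts, "enriched"))
```

With 1,000 simulations, the smallest P-value attainable under this formula is 1/1001, which bounds how small the reported values can be before multiple-testing adjustment.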
P-values were adjusted with FDR.

Results

Ancestrally independent low prevalence and burden for prostate tumour kataegis

In the study cohort (109 African and 57 European patients, recruited from South Africa, n = 113, and Australia, n = 53) and the validation cohorts (296 European patients from Canada and 207 Asian patients from China; Table 1), we identified kataegis using a TMB-derived threshold and criteria based on known kataegis characteristics [5]. For the study cohort, 260 kataegis events were identified in 41% (68/166) of tumours (Fig. 1A, Additional file 2: Table S2), consistent with a previous report for European patients [5]. Within the validation cohorts, we identified 321 kataegis events in 39.2% (116/296) of European and 297 events in 49.8% (103/207) of Asian patients (Additional file 2: Table S3, S4). Overall, we observed a low kataegis burden (median: two events, range: 1 to 13, Fig. 1B), excluding one hyper-kataegic outlier (47 events) derived from a single European patient. A kataegis event comprised a median of six SNVs spanning a narrow range of 2.67 kb, with the number of SNVs differing between HR groups by ancestry (African 5 SNVs vs. European 7 SNVs, Wilcoxon's rank-sum test, FDR = \(5\times 10^{-4}\)). Kataegic regions were unique to each patient, as previously described [34], and only a few were within functional genomic regions (Additional file 2: Table S5).

Kataegis is associated with genomic instability and co-occurs with cancer drivers

Kataegis-positive tumours exhibited increased genomic instability, marked by various genomic features observed in one or more groups of risk levels and genetic ancestries (Wilcoxon's rank sum test; Additional file 1: Fig. S1). Elevated TMB, SVs, and chromothripsis were observed in kataegis-positive tumours across ancestries and cancer aggressiveness (FDRs = \(3\times 10^{-5}\)–0.02, \(7\times 10^{-6}\)–0.002, and \(2\times 10^{-5}\)–0.002, respectively).
Notably, the SV burden was the most significant factor, showing a further association with kataegis burden in both ancestral groups (negative binomial model, P = 0.001), consistent with the previous study of European patients [5]. CNVs were significantly correlated with kataegis exclusively in the HR groups of African and European patients, characterised by gains (FDRs = 0.01–0.04), losses (FDRs = \(2\times 10^{-4}\)–0.002) or both as measured by PGA (FDRs = \(2\times 10^{-4}\)–0.002). Significantly shorter telomere lengths were observed in kataegis-positive tumours derived from African patients within the LR group (FDR = 0.03). Significant co-occurrences of kataegis and cancer driver point mutations were observed using Fisher's exact test (Additional file 2: Table S1). The significant co-occurrence of kataegis and RBFOX1 in HR groups was found for both ancestries (FDRs = \(7\times 10^{-4}\)–0.003) and validated in the LR group of public European patients (n = 234, FDR = \(3\times 10^{-6}\)). Additionally, significant co-occurrence of kataegis with PDE4D, TP53, and ZFHX3 was observed in the HR group of European patients of the study cohort (FDRs = \(5\times 10^{-4}\), 0.04, and 0.005, respectively), and with ATM, ATRX, and CHEK2 in the LR group of the public European patients (FDRs = 0.002, 0.03, and 0.03, respectively). However, no significant co-occurrence was found in the public Asian cohort (n = 207, FDR > 0.3).

Kataegis correlates with adverse PCa clinical outcomes

To study the clinical implication of kataegis, we examined the PSA level of patients, a widely used clinical measurement for PCa detection [35] and post-treatment recurrence [36]. Higher PSA levels were observed in patients with kataegis-positive tumours compared to those with kataegis-negative tumours in the HR group of African patients (median: 100 vs. 43.0 ng/mL; Wilcoxon's rank-sum test, FDR = 0.002; Fig. 2A).
Although high PSA levels may be an indicator of metastasis risk [35], the lack of associated clinical follow-up data, including magnetic resonance imaging (MRI) data, limited further investigation. For the HR group of our European patients sampled at surgery, neither prevalence nor burden of kataegis was a significant predictor of BCR (Kaplan-Meier estimates, log-rank test), likely due to the small cohort size. Leveraging a larger LR-biased European PCa data resource, we showed LR patients with elevated kataegis burden (more than one event) to be significantly more susceptible to metastasis (log-rank test, P = 0.03), while observing no association for BCR (Fig. 2B).

APOBEC3B is the main aetiology for kataegis in prostate tumours

Our analysis of kataegis aetiology identified APOBEC as the primary contributing factor to kataegis across ancestries, except for the LR group of African patients. In particular, SBS2 and SBS13 signatures accounted for approximately 80% (median: 79.1–93.6% across eight subgroups; Fig. 3A; all SBS proportions in Additional file 1: Fig. S2), consistent with a previous report (81.7%) [5]. Consistently, more than 50% (51.6–71.8%) of kataegis events were identified as APOBEC-enriched based on the motif preferences (Fig. 3B). Between ancestries, the LR group of Asian patients exhibited significantly more APOBEC-enriched kataegis than other ancestries (Fisher's exact test, P = 0.02 and 0.04 for the LR groups of African and public European data, respectively). Between risk levels, the LR group of African patients showed significantly less APOBEC-enriched kataegis than the HR group (Fisher's exact test, P = 0.04). After observing APOBEC as the main contributor to kataegis events, we conducted a focused comparison between the two APOBEC-related signatures SBS2 and SBS13.
The predominance of SBS13 (median, 40–62% across eight subgroups) over SBS2 was observed with significance in the HR group of African patients, and in both LR and HR groups of public European and Asian data (Wilcoxon's rank sum test, FDR = \(4\times 10^{-10}\)–0.005; Fig. 3A). Unlike the other groups, the LR group of African patients showed the lowest proportion of APOBEC-related SBS2, significantly lower than the HR group (median, 0% vs. 25.4%; Wilcoxon's rank sum test, P = 0.048). We further attributed kataegis to the two APOBEC enzymes A3A and A3B. We observed higher proportions of A3B enrichment in all groups except for the LR group of African patients, with significance observed for the larger public European LR and Asian LR/HR data (Wilcoxon's rank sum test, FDR = \(1\times 10^{-4}\)–0.02; Fig. 3B). This differs from the previous observation in hypermutated samples where A3A was strongly associated [32], probably because our samples exhibit lower levels of APOBEC activity. This argument is supported by the observation that APOBEC-related signatures were found exclusively in kataegic SNVs and not in genome-wide SNVs (Additional file 1: Fig. S3). Also, our PCa patients showed no APOBEC3A and APOBEC3B germline predispositions of the kind previously reported in other cancers [5, 37, 38], such as rs12628403 [5] and rs1014971 [38]. Kataegis status was not associated with somatic CNVs in the APOBEC3A and APOBEC3B genes or in regions within and between the genes. These findings align with the low frequency and burden of kataegis observed in PCa.

Genomic rearrangement processes of ancestry predominant kataegis

Having observed a close association between SV and kataegis abundance, we sought to further determine their distributions across the tumour genome.
Kataegis events observed across ancestries and risk levels were significantly enriched around SV breakpoints, with 50% (413/831) within a 10 kb distance, 40% (335/831) within a 1 kb distance, and 13% (109/831) spanning SV breakpoints. Comparing kataegis to simulated kataegis events (1,000 times) with randomly selected non-clustering SNVs, we defined the ranges where kataegis events were significantly enriched or sparse relative to an SV. Kataegis events were significantly enriched around SV regions with varying ranges (0–10 kb to 0–1 Mb) and sparse at distances beyond 10 Mb or 100 Mb, depending on the group of ancestry and risk level (simulation tests on log-spaced bins, FDR = 0.003–0.01; Fig. 4, Additional file 1: Fig. S4). We categorised kataegis as SV-associated or SV-independent for events located within enriched and sparse regions, respectively. The two types of kataegis varied in proportions between risk levels and between African and European ancestries. More SV-associated kataegis was observed in HR over LR groups (Fisher's exact test, public European data, FDR = 0.004), and in the HR groups of European over African patients (one-sided Fisher's exact test, P = 0.03). Focusing on SV types, chromothripsis was significantly enriched around kataegis (Fisher's exact test, FDR = 0.04; Fig. 4), in line with previous findings [6]. Conversely, kataegis did not occur close to translocations (FDR = \(1\times 10^{-4}\)), as shown in the European public data analysed in this study (Additional file 1: Fig. S5, S6). The analysis of genome-wide SV signatures for HR groups of the study cohort revealed an association between translocation SV type and kataegis.
Compared to kataegis-negative tumours, kataegis-positive tumours from both ancestries exhibited a significantly lower proportion and less frequent presence of the predominant SV2 signature, and higher proportions and/or more frequent presence of SV4 and SV10 (one-sided Fisher's exact test, FDR = \(1\times 10^{-4}\)–0.01; Wilcoxon's rank sum test, FDR = \(1\times 10^{-3}\)–\(9\times 10^{-3}\); Fig. 5; all SV signatures identified in Additional file 1: Fig. S7). According to the COSMIC SV signature database [31], simple translocations and clustered translocations are the primary components of SV2 and SV4, respectively, while SV10 encompasses simple rearrangements of other types. These results suggest that kataegis-positive prostate tumours are characterised by an increase in clustered translocations alongside non-clustered SVs of other types.

Differential evolution of prostate tumour kataegis events between ancestries

We revealed the uneven rise of kataegis across different evolutionary timeframes by assigning kataegis to clonal epochs (early, late, and unspecified) and the subclonal epoch for the study cohort (Additional file 1: Supplementary methods). Both ancestries showed a bias towards clonal origins (65.0% clonal kataegis, 128/197; Fig. 6). The clonal proportion of kataegis was significantly higher than that of genome-wide SNVs (median, 100% vs 68.3%; paired Wilcoxon's rank sum test, P = 0.01), aligning with the clonal origin of chromothripsis [5], which could arise along with kataegis during telomere crisis [14]. This clonal bias of kataegis appears to be unreported in previous PCa studies [5, 15], while subclonal bias was reported for cancers with high kataegis burdens, excluding PCa [5]. Between ancestries, early clonal kataegis events were more frequent in the European patients studied (EUR = 17.2% vs. AFR = 6.9%, Fisher's exact test on HR groups, P = 0.04).
In contrast, African-derived tumours exhibited an increased proportion of subclonal kataegis in both LR and HR groups; the latter showed significance when compared to the European patients (19%, Fisher's exact test on HR groups, P = 0.002). These findings suggest ancestry-specific dynamics during carcinogenesis.

Discussion

Kataegis is largely overlooked in PCa research due to its low frequency and burden compared to other cancer types, such as bladder and lung cancer [5]. To the best of our knowledge, no study has investigated the potential contribution of kataegis to ancestrally associated PCa health disparities. Here, using a unique multi-ancestral PCa resource, including southern African men representing the highest global region for PCa-associated mortality, complemented with published data [19–21], we present a detailed characterisation of kataegis features in prostate tumours, highlighting its implications for worse clinical outcomes and ancestrally different mutational processes. We observed that prostate tumours exhibiting kataegis, often accompanied by cancer driver mutations and elevated genomic instability, are linked to adverse clinical outcomes. Tumours derived from African patients exhibited a higher proportion of kataegis independent of SVs and later occurrence in subclones. Among African patients, the proportion of kataegis attributed to APOBEC varied between cancer risk levels. These findings refine the earlier findings of ancestry-related cancer progression trajectories [16] by emphasising disparities in hypermutations, further underscoring the importance of African-inclusive investigations. Furthermore, we propose kataegis as an indicator of adverse PCa that is independent of both ancestry and risk level.
The similar prevalence of kataegis between risk levels highlights a limitation of current cancer grading, which fails to detect any morphological or physical changes resulting from the interplay of kataegis, cancer drivers and genomic instability. Patients with kataegis-positive tumours may be recommended for more frequent monitoring during the remission period, given a potentially higher metastatic risk. The heightened metastatic risk may be driven by genomic instability [39] and the two cancer driver genes co-occurring with kataegis, RBFOX1 and TP53 [40, 41]. Also, PSA levels, known to be implicated in bone metastasis via the stimulation of osteoprotegerin [42], were found to rise in African patients with kataegis-positive aggressive PCa. However, our observation challenges a previous statement that kataegis is a marker of good prognosis for BRCA [12]. The BRCA study showed significantly shorter survival time for patients with kataegis, but proposed that aging might be the driving force. Therefore, further follow-up data from African patients are required for investigation, ideally in a large cohort to exclude potential confounding by age. In addition, we propose that the implication of kataegis in prostate tumour progression is different from that in BRCA and other cancer types with high kataegis burden, despite shared features including elevated genomic instability, close association with SVs, and attribution to APOBEC enzyme activity [5, 11]. Specific to prostate tumours, only the clustered mutations are attributed to APOBEC enzyme activity, mostly to A3B, and no germline predisposition effects of the kind previously reported in BRCA [37] have been identified through variation in the A3A and A3B genes. Our African-inclusive study design has revealed ancestral disparities in kataegis development through evolutionary timing and mutational processes. Our evolutionary analyses have shown that clonal kataegis predominated irrespective of patient ancestry.
In particular, tumours from patients of European ancestry exhibited a high proportion of early clonal kataegis, suggesting a role in cancer initiation. In contrast, the subclonal kataegis identified in this study was notably biased towards African patients, regardless of clinicopathological presentation, suggesting a high level of genomic instability and, therefore, marked tumour heterogeneity and associated chemoresistance [43]. However, we acknowledge that our computational estimation of subclonal kataegis is a simplified model; further investigation with additional sequencing techniques may help discern subclones and multiclonal origins, the prevalence of which is unknown in African patients. Additionally, we describe two kataegis mutational mechanisms, SV-associated and SV-independent, whose proportions varied by ancestry, with the former significantly more frequent in European than in African patients. While kataegis is attributed to APOBEC deamination of cytosines in exposed ssDNA, mostly by APOBEC3B in this study, the deamination may take place through different processes for the two kataegis types (Fig. 7). We speculate that SV-associated kataegis could arise during telomere crisis [14] and during double-strand break (DSB) repair, through mechanisms such as break-induced replication (BIR) [44–46], as well as non-homologous end-joining (NHEJ) and alternative end-joining (A-EJ), given the close association with chromothripsis [47]. The concurrence of driver mutations in TP53, a cell-cycle checkpoint gene, observed in European patients in this study supports the hypothesis that, under checkpoint deficiency, telomere crisis may result in chromothripsis-associated kataegis that bypasses a cell-cycle checkpoint [14]. In addition, significantly shorter telomere length was observed in kataegis-positive tumours derived from the low-risk group of African patients, and has previously been reported for aggressive tumours derived from African men [48].
Conversely, we hypothesise that SV-independent kataegis, which we found to be more common in tumours derived from patients of African ancestry, may arise on R-loops in transcription bubbles or on the lagging strand at the DNA replication fork [49, 50]. Transcription and replication may interact, as R-loops are one of the sources of replication stress, leading to elevated exposure of ssDNA at the replication fork [51]. We acknowledge, however, that our proposed hypotheses require further validation through cell-based experiments, such as DNA/RNA immunoprecipitation sequencing (DRIP) of R-loops. Altogether, these findings suggest that tumour pathways diverge to some extent between ancestries.

While this study provides novel insights into kataegis in relation to ancestry and cancer aggressiveness, several limitations must be acknowledged. A lack of relevant data has hindered further validation and investigation, although this has been partly mitigated by integrating public cohorts. The clinical implications of kataegis for African patients require future research, owing to the lack of African follow-up and validation data. More LR data derived from African patients are also required to differentiate features by cancer aggressiveness. To scrutinise the ancestrally shared and distinctive features of kataegis, we integrated publicly available PCa data from patients of European and Asian ancestry. However, our study cohort and the public cohorts differ in their composition by cancer aggressiveness and in their variant identification pipelines. While our study cohort is biased towards very HR disease (ISUP GG4/5), the public European dataset focused on intermediate-risk disease (82%, ISUP GG2/3) [20], and the public Asian data almost entirely lack ISUP GG5 (0.5%, 1/207) [21]. Additionally, although we applied consistent methods for kataegis identification and downstream analyses, our somatic variant identification is more stringent owing to filtering against a panel of normal samples.
These limitations not only highlight areas for future research but, importantly, underscore the need for tailored data collection and analysis.

Conclusions

The available PCa whole-genome cohort remains one of the largest of its kind for the African continent and benefits from the inclusion of clinically, technically, and analytically matched non-African data, allowing for direct, unbiased comparative analyses. This African-inclusive resource, supported by published non-African data, enabled us to discern both universal (or shared) and ancestrally unique features of kataegis-positive prostate tumours, particularly with regard to advanced disease. By demonstrating heightened African-specific kataegis-associated heterogeneity, our study emphasises the need for further African inclusion, specifically to elucidate the potential of kataegis and APOBEC3 enzymes as biomarkers for targeted cancer therapy. Collectively, by elucidating the occurrence of kataegis from tumourigenesis to the later subclonal stage in African and European patients, we highlight the significance of different underlying mutational processes between ancestries. This provides a valuable resource for targeted therapeutic interventions and emphasises the need for continued exploration of biological behaviours and environmental exposures in African patients.
Abbreviations

A-EJ: alternative end-joining
A3A: APOBEC3A
A3B: APOBEC3B
AFR: African ancestry
ASI: Asian ancestry
BCR: biochemical relapse
BIR: break-induced replication
BRCA: breast cancer
CN: copy number
CNVs: copy number variants
COSMIC: Catalogue Of Somatic Mutations In Cancer
CPGEA: Chinese Prostate Cancer Genome and Epigenome Atlas
DRIP: DNA/RNA immunoprecipitation sequencing
DSBs: double-strand breaks
EGA: European Genome-Phenome Archive
EUR: European ancestry
FDR: false discovery rate
GG: Grade Group
HR: high-risk, Grade Group 3–5
ICGC: International Cancer Genome Consortium
ISUP: International Society of Urological Pathology
LR: low-risk, Grade Group 1/2
MRI: magnetic resonance imaging
NHEJ: non-homologous end-joining
NMF: nonnegative matrix factorisation
PCa: prostate cancer
PCAWG: Pan-Cancer Analysis of Whole Genomes
PCF: piecewise constant fitting
PGA: percentage of genome alteration
PSA: prostate specific antigen
SAPCS: Southern African Prostate Cancer Study
SBS: single base substitution
SNVs: single nucleotide variants
ssDNAs: single-strand DNAs
SV: structural variants
TMB: tumour mutational burden

Declarations

Ethics approval and consent to participate

Conforming to the principles of the Helsinki Declaration, South African patients were recruited as part of the Southern African Prostate Cancer Study (SAPCS) with approval granted by the University of Pretoria Faculty of Health Research Ethics Committee (HRECs, with US Federal wide assurance FWA00002567 and IRB00002235 IORG0001762; #43/2010), while in Australia participant recruitment was approved by the St Vincent’s HREC (#SVH/12/231). Samples were shipped to the Garvan Institute of Medical Research and/or the University of Sydney in accordance with institutional Material Transfer Agreements (MTAs) and appropriate Republic of South Africa Department of Health Export Permit (National Health Act 2003; J1/2/4/2 #1/12). Genomic interrogation required for this study was approved by the St.
Vincent’s HREC (#SVH/15/227), with additional IRB review and approval granted by the Human Research Protection Office of the US Army Medical Research and Development Command E02371 (TARGET Africa) and E03280 (HEROIC PCaPH Africa1K).

Consent to publication

All data used in this study have been previously published; the Southern African Prostate Cancer Study (SAPCS) data were accessed through Data Access Committee approval.

Competing Interests

Hayes is a Member of the Active Surveillance Movember Committee and received an honorarium from The Korean Urological Oncology Society as a guest speaker at its 2024 Annual Conference.

Funding

Genomic sequencing was supported by the National Health and Medical Research Council (NHMRC) of Australia through a Project Grant (2018/GNT1165762 to V.M.H.) and Ideas Grants (2020/GNT2001098 and 2021/GNT2010551 to V.M.H.). Further analytics were supported by the U.S.A. Congressionally Directed Medical Research Programs (CDMRP) Prostate Cancer Research Program (PCRP) Idea Development Award (PC200390, TARGET Africa to V.M.H.), HEROIC Consortium Award (PC210168 and PC23067, HEROIC PCaPH Africa1K to V.M.H. and M.S.R.B., with co-Principal Investigators Professors Gail Prins, University of Illinois at Chicago, U.S.A. and Mungai Peter Ngugi, University of Nairobi, Kenya), U.S.A. National Institutes of Health (NIH) National Cancer Institute (NCI) Award (1R01CA285772-01 to V.M.H.), U.S.A. Prostate Cancer Foundation (PCF) 2023 Challenge Award (2023CHAL4150 to V.M.H.) and NHMRC Ideas grant (2024/GNT2037298 to W.J.). V.M.H. is supported by the Petre Foundation via the University of Sydney Foundation, while J.J. is supported by a U.S.A. Prostate Cancer Foundation (PCF) Scholarship as part of the 2023 Challenge Award.

Author Contribution

JJ analysed and interpreted the data, and wrote and reviewed the manuscript. AT elaborated the methods regarding signature analyses, and reviewed and edited the draft. RH curated the analysed data regarding telomere lengths.
MSRB collected data, investigated and conceptualised the study. PDS collected the data and investigated. DCW elaborated the methods regarding enrichment of structural variants, supervised the study, and reviewed and edited the draft. WJ supervised and conceptualised the study, curated the data regarding evolution of kataegis and clustering of patients, and wrote and reviewed the draft. VMH supervised and conceptualised the study, collected and curated data, wrote and reviewed the manuscript, and provided resources and funding acquisition. All authors reviewed the manuscript.

Acknowledgement

We are forever grateful to the patients who contributed their time and samples to make this study possible, as well as the clinical staff who have participated in patient recruitment and maintenance of the SAPCS (South Africa) and the Garvan/St Vincent’s Hospital (Australia) Bioresources, with specific acknowledgement for Bioresource Managers Ms Tumisang Mbeke (University of Pretoria, South Africa) and Sr Anne-Maree Haynes (Garvan Institute of Medical Research, Australia), respectively. We are thankful to Dr Pamela X.Y. Soh (University of Sydney, Australia) for providing ancestral clarifications for the study participants. We also acknowledge the use of the National Computational Infrastructure (NCI), which is supported by the Australian Government and accessed through the National Computational Merit Allocation Scheme (V.M.H., J.J. and W.J.), as well as the Sydney Informatics Hub, Core Research Facility at the University of Sydney. This work will form part of a Ph.D. thesis for J.J.

Data Availability

The analysed sequence data of the study cohort are available through the European Genome-Phenome Archive (EGA; https://ega-archive.org) under overarching accession EGAS00001006425, available from the authors upon reasonable request with the permission of Southern African Prostate Cancer Study (SAPCS) (EGAD00001009067) and Garvan/St Vincent’s Dataset (EGAD00001009066).
The analysed variant data of the two public cohorts are available from the ICGC Data Portal (http://dcc.icgc.org/) for the European cohort and from the Genome Sequence Archive for Human (http://bigd.big.ac.cn/gsa-human/) under accession number PRJCA001124 for the Asian cohort.

References

1. Bray, F., et al., Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin, 2024. 74(3): p. 229–263.
2. Lee, K.M., et al., Association between prediagnostic prostate-specific antigen and prostate cancer probability in Black and non-Hispanic White men. Cancer, 2024. 130(2): p. 224–231.
3. Nair, S.S., et al., Why do African-American men face higher risks for lethal prostate cancer? Curr Opin Urol, 2022. 32(1): p. 96–101.
4. Nik-Zainal, S., et al., The life history of 21 breast cancers. Cell, 2012. 149(5): p. 994–1007.
5. Aaltonen, L.A., et al., Pan-cancer analysis of whole genomes. Nature, 2020. 578(7793): p. 82–93.
6. Taylor, B.J., et al., DNA deaminases induce break-associated mutation showers with implication of APOBEC3B and 3A in breast cancer kataegis. eLife, 2013. 2: p. e00534.
7. Veerla, S. and J. Staaf, Kataegis in clinical and molecular subgroups of primary breast cancer. npj Breast Cancer, 2024. 10(1): p. 32.
8. Chen, L., et al., Deep whole-genome analysis of 494 hepatocellular carcinomas. Nature, 2024. 627(8004): p. 586–593.
9. Alexandrov, L.B., et al., Signatures of mutational processes in human cancer. Nature, 2013. 500(7463): p. 415–421.
10. Ansari-Pour, N., et al., Whole-genome analysis of Nigerian patients with breast cancer reveals ethnic-driven somatic evolution and distinct genomic subtypes. Nature Communications, 2021. 12(1): p. 6946.
11. Jakobsdottir, G.M., et al., APOBEC3 mutational signatures are associated with extensive and diverse genomic instability across multiple tumour types. BMC Biology, 2022. 20(1): p. 117.
12. D’Antonio, M., et al., Kataegis expression signature in breast cancer is associated with late onset, better prognosis, and higher HER2 levels. Cell Reports, 2016. 16(3): p. 672–683.
13. Maciejowski, J., et al., APOBEC3-dependent kataegis and TREX1-driven chromothripsis during telomere crisis. Nature Genetics, 2020. 52(9): p. 884–890.
14. Maciejowski, J., et al., Chromothripsis and kataegis induced by telomere crisis. Cell, 2015. 163(7): p. 1641–1654.
15. Cooper, C.S., et al., Analysis of the genetic phylogeny of multifocal prostate cancer identifies multiple independent clonal expansions in neoplastic and morphologically normal prostate tissue. Nature Genetics, 2015. 47(4): p. 367–372.
16. Jaratlerdsiri, W., et al., African-specific molecular taxonomy of prostate cancer. Nature, 2022. 609: p. 552–559.
17. Patrick, S.M., et al., Prostate cancer clinicopathological presentation in South-East Africa during the 2010 decade. JNCI: Journal of the National Cancer Institute, 2025: p. djaf117.
18. Jiang, J., et al., Scaling for African Inclusion in High-Throughput Whole Cancer Genome Bioinformatic Workflows. Cancers, 2025. 17(15): p. 2481.
19. Zhang, J., et al., The international cancer genome consortium data portal. Nature Biotechnology, 2019. 37(4): p. 367–369.
20. Fraser, M., et al., Genomic hallmarks of localized, non-indolent prostate cancer. Nature, 2017. 541(7637): p. 359–364.
21. Li, J., et al., A genomic and epigenomic atlas of prostate cancer in Asian populations. Nature, 2020. 580(7801): p. 93–99.
22. Lin, X., et al., kataegis: an R package for identification and visualization of the genomic localized hypermutation regions using high-throughput sequencing. BMC Genomics, 2021. 22(1): p. 440.
23. R Core Team, R: A language and environment for statistical computing. R Foundation for Statistical Computing, 2013.
24. Kassambara, A., ggpubr: 'ggplot2' based publication ready plots. R package version, 2018: p. 2.
25. Armenia, J., et al., The long tail of oncogenic drivers in prostate cancer. Nat Genet, 2018. 50(5): p. 645–651.
26. Ding, L., et al., The roles of cyclin-dependent kinases in cell-cycle progression and therapeutic strategies in human breast cancer. International Journal of Molecular Sciences, 2020. 21(6): p. 1960.
27. Therneau, T.M. and T. Lumley, Package ‘survival’. R Top Doc, 2015. 128(10): p. 28–33.
28. Kassambara, A., et al., survminer: Drawing Survival Curves using ‘ggplot2’. R package version 0.4, 2021. 9: p. 2021.
29. Islam, S.A., et al., Uncovering novel mutational signatures by de novo extraction with SigProfilerExtractor. Cell Genomics, 2022. 2(11).
30. Kuhn, R.M., D. Haussler, and W.J. Kent, The UCSC genome browser and associated tools. Briefings in Bioinformatics, 2013. 14(2): p. 144–161.
31. Everall, A., et al., Comprehensive repertoire of the chromosomal alteration and mutational signatures across 16 cancer types from 10,983 cancer patients. medRxiv, 2023: p. 2023.06.07.23290970.
32. Chan, K., et al., An APOBEC3A hypermutation signature is distinguishable from the signature of background mutagenesis by APOBEC3B in human cancers. Nature Genetics, 2015. 47(9): p. 1067–1072.
33. Roberts, S.A., et al., An APOBEC cytidine deaminase mutagenesis pattern is widespread in human cancers. Nature Genetics, 2013. 45(9): p. 970–976.
34. Law, E.K., et al., APOBEC3A catalyzes mutation and drives carcinogenesis in vivo. J Exp Med, 2020. 217(12).
35. Merriel, S.W.D., et al., Systematic review and meta-analysis of the diagnostic accuracy of prostate-specific antigen (PSA) for the detection of prostate cancer in symptomatic patients. BMC Medicine, 2022. 20(1): p. 54.
36. Milonas, D., et al., The significance of prostate specific antigen persistence in prostate cancer risk groups on long-term oncological outcomes. Cancers, 2021. 13(10): p. 2453.
37. Nik-Zainal, S., et al., Association of a germline copy number polymorphism of APOBEC3A and APOBEC3B with burden of putative APOBEC-dependent mutations in breast cancer. Nature Genetics, 2014. 46(5): p. 487–491.
38. Middlebrooks, C.D., et al., Association of germline variants in the APOBEC3 region with cancer risk and enrichment with APOBEC-signature mutations in tumors. Nature Genetics, 2016. 48(11): p. 1330–1338.
39. Fares, J., et al., Molecular principles of metastasis: a hallmark of cancer revisited. Signal Transduction and Targeted Therapy, 2020. 5(1): p. 28.
40. Perron, G., et al., Pan-cancer analysis of mRNA stability for decoding tumour post-transcriptional programs. Communications Biology, 2022. 5(1): p. 851.
41. De Laere, B., et al., TP53 outperforms other androgen receptor biomarkers to predict abiraterone or enzalutamide outcome in metastatic castration-resistant prostate cancer. Clinical Cancer Research, 2019. 25(6): p. 1766–1773.
42. Wong, S.K., et al., Prostate cancer and bone metastases: the underlying mechanisms. International Journal of Molecular Sciences, 2019. 20(10): p. 2587.
43. Ashrafizadeh, M., et al., Molecular panorama of therapy resistance in prostate cancer: a pre-clinical and bioinformatics analysis for clinical translation. Cancer and Metastasis Reviews, 2024. 43(1): p. 229–260.
44. Elango, R., et al., Repair of base damage within break-induced replication intermediates promotes kataegis associated with chromosome rearrangements. Nucleic Acids Research, 2019. 47(18): p. 9666–9684.
45. Sakofsky, C.J., et al., Break-induced replication is a source of mutation clusters underlying kataegis. Cell Reports, 2014. 7(5): p. 1640–1648.
46. Green, A.M. and M.D. Weitzman, The spectrum of APOBEC3 activity: From anti-viral agents to anti-cancer opportunities. DNA Repair, 2019. 83: p. 102700.
47. Gelot, C., I. Magdalou, and B.S. Lopez, Replication stress in Mammalian cells and its consequences for mitosis. Genes, 2015. 6(2): p. 267–298.
48. Huang, R., et al., The impact of telomere length on prostate cancer aggressiveness, genomic instability and health disparities. Sci Rep, 2024. 14(1): p. 7706.
49. McCann, J.L., et al., APOBEC3B regulates R-loops and promotes transcription-associated mutagenesis in cancer. Nature Genetics, 2023. 55(10): p. 1721–1734.
50. Seplyarskiy, V.B., et al., APOBEC-induced mutations in human cancers are strongly enriched on the lagging DNA strand during replication. Genome Research, 2016. 26(2): p. 174–182.
51. Saxena, S. and L. Zou, Hallmarks of DNA replication stress. Mol Cell, 2022. 82(12): p. 2298–2314.

Supplementary Files

KataegisGenomeMedAdditionalfile1.pdf
KataegisGenomeMedAdditionalfile2.xlsx

Status: Under Review. Version 1 posted.
Editorial decision: Revision requested, 09 Dec, 2025
Reviews received at journal: 11 Nov, 2025
Reviews received at journal: 05 Nov, 2025
Reviewers agreed at journal: 26 Oct, 2025
Reviewers agreed at journal: 23 Oct, 2025
Reviews received at journal: 23 Oct, 2025
Reviewers agreed at journal: 09 Oct, 2025
Reviewers invited by journal: 08 Oct, 2025
Editor assigned by journal: 26 Sep, 2025
Submission checks completed at journal: 16 Sep, 2025
First submitted to journal: 15 Sep, 2025
© Research Square 2026 | ISSN 2693-5015 (online)

Authors: Jue Jiang (The University of Sydney); Avraam Tapinos (University of Manchester); Ruotian Huang (The University of Sydney); M.S. Riana Bornman (University of Pretoria); Phillip Stricker (St Vincent's Hospital Sydney); Shingai Mutambirwa (Sefako Makgatho Health Sciences University); David Wedge (University of Manchester); Weerachai Jaratlerdsiri (The University of Sydney); Vanessa Hayes (The University of Sydney, corresponding author).
19:36:26","extension":"png","order_by":18,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":6819,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage7.png","url":"https://assets-eu.researchsquare.com/files/rs-7624142/v1/5cd5fa1411dfb55d4f12c90e.png"},{"id":94136697,"identity":"c9130c73-4b68-4141-9d8b-e0c39c8860d0","added_by":"auto","created_at":"2025-10-22 19:20:26","extension":"xml","order_by":19,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":133587,"visible":true,"origin":"","legend":"","description":"","filename":"311539460b4f4628a0bf0bfd6ba739021structuring.xml","url":"https://assets-eu.researchsquare.com/files/rs-7624142/v1/3cffd31432edb707ec890e69.xml"},{"id":94136698,"identity":"ca8c81a3-007b-43f1-9b41-9fdc4db1749d","added_by":"auto","created_at":"2025-10-22 19:20:26","extension":"html","order_by":20,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":154943,"visible":true,"origin":"","legend":"","description":"","filename":"earlyproof.html","url":"https://assets-eu.researchsquare.com/files/rs-7624142/v1/496dc79686a54730d10728c3.html"},{"id":94136670,"identity":"783f8289-bf76-419f-ac93-d78c7d5c658b","added_by":"auto","created_at":"2025-10-22 19:20:25","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":112388,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eDistribution and prevalence of kataegis in prostate tumours.\u003c/strong\u003e \u003cstrong\u003e(A)\u003c/strong\u003e Distributions of clustered kataegic SNVs identified among 44 African and 24 European patients from this study cohort (n = 166). The extensive kataegis burden of chromosome 12 in European patients is driven by a single outlier. \u003cstrong\u003e(B)\u003c/strong\u003ePrevalence of kataegis and kataegis burden by patient ancestry. 
The presence of kataegis is defined as negative (kataegis -, dark blue) and positive (kataegis +), which are further indicated by a low to high kataegis burden (yellow to red gradient), with hyper-kataegis outliers excluded from the analysis. Patient ancestries are labelled as African (AFR), European (EUR), and Asian (ASI), with the prefix ‘Pub’ added for public data. Cancer risk levels are defined as low-risk (LR, ISUP GG1–2) and high-risk (HR, ISUP GG3–5) clinicopathological presentation. Numbers underneath define the number of kataegis + tumours \u003cem\u003evs\u003c/em\u003e the total number of tumours and the prevalence, with hyper-kataegis outliers excluded.\u003c/p\u003e","description":"","filename":"image1.png","url":"https://assets-eu.researchsquare.com/files/rs-7624142/v1/6fe816548fb7e5eeb9d869f7.png"},{"id":94136673,"identity":"58eadc0d-1e54-4c5e-aa8a-7f346648e5a7","added_by":"auto","created_at":"2025-10-22 19:20:26","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":27977,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eKataegis implication in clinical measurements and outcomes. (A) \u003c/strong\u003eProstate-specific antigen (PSA) present in 165 PCa patients and distinguished by clinical risk and genetic ancestry.\u003cstrong\u003e \u003c/strong\u003eA hyper-kataegis outlier is excluded. PSA values are compared between kataegis positive (+) and\u003cem\u003e \u003c/em\u003enegative (-) within particular risk levels, defined as low-risk (LR, ISUP GG1–2) and high-risk (HR, ISUP GG3–5) clinicopathological presentation, and by patient ancestry (AFR, African and EUR, European). Significance is defined by false discovery rate (FDR); ns, not significant; and **, FDR = 0.002. \u003cstrong\u003e(B)\u003c/strong\u003e Kaplan-Meier survival estimates correlating kataegis abundance within the low-risk (LR) group of public European data with clinical follow-up (time in days, n = 172). 
The comparisons are between patients having multiple kataegis (kataegis = multiple, n = 35) against having no or one kataegis (kataegis = 0–1, n = 137). Clinical outcomes analysed are defined as metastasis. Two outliers with z-scores greater than three and one with missing metastasis data were excluded (see Materials and methods). Patients with biochemical relapse and no metastasis were excluded.\u003c/p\u003e","description":"","filename":"image2.png","url":"https://assets-eu.researchsquare.com/files/rs-7624142/v1/3ab1549a37b488c48e00f04b.png"},{"id":94139831,"identity":"646edbd5-1df7-4193-9c2f-fac05198afee","added_by":"auto","created_at":"2025-10-22 19:36:26","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":34842,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eAttribution of kataegis to APOBEC enzyme activities. (A)\u003c/strong\u003e Proportion of\u003cstrong\u003e \u003c/strong\u003eAPOBEC-related\u003cstrong\u003e \u003c/strong\u003esingle-base substitution (SBS) signatures per subgroup, including total APOBEC signatures (dark blue), SBS2 (gold) and SBS13 (red), with hyper-kataegic tumours excluded from the signature analysis. \u003cstrong\u003e(B)\u003c/strong\u003e Proportion of APOBEC-enriched kataegis per group, including APOBEC (blue-purple) determined by T\u003cu\u003eC\u003c/u\u003eW motifs, and further the APOBEC3A (A3A, green) and APOBEC3B (A3B, orange) with YT\u003cu\u003eC\u003c/u\u003eW and RT\u003cu\u003eC\u003c/u\u003eW motifs, respectively. Patient ancestries are labelled as African (AFR), European (EUR), and Asian (ASI), with the prefix ‘Pub’ added for public data. Cancer risk levels are defined as low-risk (LR, ISUP GG1–2) and high-risk (HR, ISUP GG3–5) clinicopathological presentation. 
The number of patients per group is labelled underneath, excluding outliers.\u003c/p\u003e","description":"","filename":"image3.png","url":"https://assets-eu.researchsquare.com/files/rs-7624142/v1/ee806b80bec0b0802bf91904.png"},{"id":94138479,"identity":"587fcadf-79b2-4b6b-af85-45eb7494cb12","added_by":"auto","created_at":"2025-10-22 19:28:26","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":35718,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eDistances between kataegis and proximal SVs.\u003c/strong\u003e The top bar charts show proportion of SVs per type in kataegis enriched regions and sparse regions per patient group, while the bottom lines represent the density of SVs along the distance to proximal kataegis events. Colours of the line show whether kataegis is significantly enriched (blue) or sparse (orange) within a region compared to simulations. Patient ancestries are labelled as African (AFR) and European (EUR). Cancer risk levels are defined as low-risk (LR, ISUP GG1–2) and high-risk (HR, ISUP GG3–5) clinicopathological presentation. The number of patients per group is labelled underneath excluding outliers.\u003c/p\u003e","description":"","filename":"image4.png","url":"https://assets-eu.researchsquare.com/files/rs-7624142/v1/22e29dd16c818e7b9551ee4f.png"},{"id":94136671,"identity":"1ea59b70-46b6-40a6-b955-e61e7db8e549","added_by":"auto","created_at":"2025-10-22 19:20:25","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":18294,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eProportion of structural variants (SV) signatures.\u003c/strong\u003e Genome-wide SV signatures are identified from kataegis positive (+) and negative (-) prostate tumours of high-risk PCa (ISUP GG3–5) derived from Africans (AFR, n = 77) and Europeans (EUR, n = 48). 
Proportion of SV signatures per tumour (column) is defined as SV2 (blue), SV4 (orange), SV10 (green) and the others (grey). The number of patients per group is labelled underneath, excluding outliers.\u003c/p\u003e","description":"","filename":"image5.png","url":"https://assets-eu.researchsquare.com/files/rs-7624142/v1/3c97e7d5ed1258371fff3db3.png"},{"id":94136677,"identity":"7258804c-585b-4192-9074-f562cef7da71","added_by":"auto","created_at":"2025-10-22 19:20:26","extension":"png","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":32454,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eEvolution of kataegis events. \u003c/strong\u003eEvolutionary kataegis were identified between cancer patients by ancestry, African (AFR) or European (EUR), and cancer risk, low-risk (LR, ISUP GG1–2) or high-risk (HR, ISUP GG3–5). The evolution of kataegis events is shown by their proportion along the development of cancer. The number of patients per group is labelled underneath, excluding outliers.\u003c/p\u003e","description":"","filename":"image6.png","url":"https://assets-eu.researchsquare.com/files/rs-7624142/v1/1eb3e72fb704b34bc4e5e282.png"},{"id":94138483,"identity":"b6b186af-4a53-406d-bf42-c89cbc363422","added_by":"auto","created_at":"2025-10-22 19:28:26","extension":"png","order_by":7,"title":"Figure 7","display":"","copyAsset":false,"role":"figure","size":28446,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eAncestrally distinct kataegis development proposed in prostate cancer.\u003c/strong\u003e Proposing two types of kataegis: SV-associated kataegis (orange) arises during double-strand breaks (DSBs) repair, break induced replication (BIR) and telomere crisis, while independent kataegis (blue) arises from dispersed APOBEC3 activity, could happen at R-loop during transcription and lagging strand of DNA replication. 
We propose that these two types of kataegis occur at different rates (indicated by bar plots) during the tumour evolution of African (AFR) vs. European (EUR) derived prostate tumours. Cancer risk levels are defined as low-risk (LR, ISUP GG1–2) and high-risk (HR, ISUP GG3–5) clinicopathological presentation.\u003c/p\u003e","description":"","filename":"image7.png","url":"https://assets-eu.researchsquare.com/files/rs-7624142/v1/ab4051df6fa532fe368f7fdd.png"},{"id":94140616,"identity":"fb91ee15-cc13-4010-9d44-4cf2feba16a8","added_by":"auto","created_at":"2025-10-22 19:44:27","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1637357,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7624142/v1/d016df9c-18e7-4034-a2ad-f16cc94a9d48.pdf"},{"id":94136675,"identity":"30247d2c-87dc-4a7f-97a1-2d6673b02d11","added_by":"auto","created_at":"2025-10-22 19:20:26","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"supplement","size":1040541,"visible":true,"origin":"","legend":"","description":"","filename":"KataegisGenomeMedAdditionalfile1.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7624142/v1/c7029f01a642f7ffb088896f.pdf"},{"id":94136681,"identity":"d7af0da7-a70f-4466-9915-9584b7e42771","added_by":"auto","created_at":"2025-10-22 19:20:26","extension":"xlsx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":161089,"visible":true,"origin":"","legend":"","description":"","filename":"KataegisGenomeMedAdditionalfile2.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-7624142/v1/61447bd6f2780c7db1004bbc.xlsx"}],"financialInterests":"Competing interest reported. 
Hayes is a Member of Active Surveillance Movember Committee and received an honorarium from The Korean Urological Oncology Society for 2024 Annual Conference as a guest speaker.","formattedTitle":"Single base focal hypermutation cooccurs with structural variation as an early event in advanced prostate tumourigenesis with ancestry specific independence: a multi-ancestral observational study","fulltext":[{"header":"Introduction","content":"\u003cp\u003eProstate cancer (PCa) is the most frequently diagnosed male cancer in most regions of the world, disproportionately affecting men of African ancestry and particularly from Sub-Saharan Africa [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e]. Mortality rates from PCa are highest in Sub-Saharan Africa and the Caribbean, with southern Africa ranking the first globally at 29.7 (age-standardised rate per 100,000 males) [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e]. Notably, the incidence rate of southern Africa is lower than that of economically stable regions, such as Australia and New Zealand (59.9 vs. 78.1, age-standardised rate per 100,000) [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e]. Conversely, both incidence and mortality rates are lowest across the Asian diaspora of nations. While the disparities may be attributed to diminished access to PCa screening and medical resources or exposure to yet unknown geographic risk factors, studies from the United States have shown that African American men are at greatest risk for aggressive disease presentation and associated lethality after accounting for non-genetic factors [\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e, \u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e]. 
Additional studies that alluded to biological and genomic contributions are needed for a better understanding of the disparities across different ancestral populations.\u003c/p\u003e\u003cp\u003eKataegis, meaning thunderstorm in Greek, describes the focal hypermutation phenomenon in cancer genomes [\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e]. A kataegis event is defined as a cluster of closely distributed single nucleotide variants (SNVs) and results from a single mutational action of APOBEC3A (A3A) or APOBEC3B (A3B) cytidine deaminases on exposed single-strand DNAs (ssDNAs) [\u003cspan additionalcitationids=\"CR5\" citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]. This mutational process has been linked to single base substitution (SBS) signatures, SBS2 and SBS13 [\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e]. Despite Pan-Cancer Analysis of Whole Genomes (PCAWG) and organ-specific studies suggesting kataegis to be frequent in cancers of the breast, bladder, lung, and skin (melanoma) [\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e, \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e, \u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e, \u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e], the evolution of kataegis and its clinical implications remain elusive for PCa, and unclear for African patients due to a lack of African-derived whole tumour genome data [\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e, \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e]. 
Controversially, breast cancer (BRCA) research reported kataegis with a favourable prognosis [\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e] and lower genomic instability [\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e, \u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e], with others showing a link to aggressive disease [\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e]. The early event of kataegis arising with chromothripsis during telomere crisis has been suggested by modified cell line experiments [\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e, \u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e], while late kataegis development is observed in PCa [\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e] and hepatocellular carcinoma [\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e]. However, the potential contribution and association of kataegis in PCa ancestral disparities are yet to be determined.\u003c/p\u003e\u003cp\u003eThis study aims to characterise kataegis mutational processes in PCa genomes from patients of different ancestries and to assess the potential clinical implication, with a particular focus on aggressive disease in African men. We processed samples from 109 African men (Black South Africans) and 57 European men (predominantly Australians) through the same pipeline, providing a direct comparative analysis based on genetic ancestry. Across ancestries, our findings linked kataegis events with more aggressive PCa manifestations and adverse clinical outcomes. The investigation of the aetiology primarily attributed kataegis to APOBEC enzymes with variation between cancer aggressiveness among African patients. 
We observed ancestral disparities in the evolutionary timing of kataegis and the distribution of distances between kataegis and structural variants (SVs). These findings highlight the unique genetic factors contributing to PCa in African men and underscore the importance of including diverse ancestral populations in cancer research.\u003c/p\u003e"},{"header":"Materials and Methods","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e\u003ch2\u003eSubjects and whole genome sequencing data\u003c/h2\u003e\u003cp\u003eTreatment-naive paired blood and tumour samples were collected from 166 patients diagnosed with PCa recruited from South Africa (n\u0026thinsp;=\u0026thinsp;113) and Australia (n\u0026thinsp;=\u0026thinsp;53; Table 1). Patient ancestry was determined using whole genome interrogation for subpopulation fraction analyses, as previously described [\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e]. In short, 109 patients were categorised as African (all South African), with greater than 85% African ancestral fraction; 57 were categorised as European (53 Australian and 4 South African), allowing up to 3% African and 26% Asian ancestral contributions. Tumour aggressiveness was defined from histopathological Gleason Scores as the International Society of Urological Pathology (ISUP) Grade Group (GG) either at diagnosis (South Africans) or surgery (Australians). Patients presented either as low-risk (LR, GG1 and GG2) or high-risk (HR, GG3\u0026ndash;5), with the African derived HR group biased towards very HR PCa (89%, 72/81 ISUP GG4/5). For comparison, we intentionally selected untreated biobanked samples with advanced disease for our European cohort (98%, 49/50 ISUP GG4/5). 
As previously reported for South-East Africa [\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e], both prostate specific antigen (PSA) levels (median 82.60 vs 8.15) and age at presentation/surgery (median 69 vs 63 in HR groups) are elevated for our African over the European cohort of HR groups. The latter cohort allows for extensive follow-up data defined as biochemical relapse (BCR) and/or metastasis. All samples underwent deep WGS using the Illumina NovaSeq and Hiseq platforms (median coverages tumour 88.64 X and blood 44.19X), GRCh38 referenced variant calling and annotation, and evolutionary timing pipelines, as previously described [\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e, \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e].\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eDemographic and clinical information of the current study\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"6\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eAncestry\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCohort 
size\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003eCohort size per country (%)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003eCohort size of low-risk (GG1\u0026ndash;2, %)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003eCohort size of high-risk (GG3\u0026ndash;5, %)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c6\"\u003e\u003cp\u003eMedian age (range)\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colspan=\"6\" nameend=\"c6\" namest=\"c1\"\u003e\u003cp\u003eThe study cohort\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eTotal\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e166\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e113 (68%) South Africa, 53 (32%) Australia\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e35 (21%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e131 (79%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e65 (45\u0026ndash;99) \u003csup\u003ea\u003c/sup\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eAfrican\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e109\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e109 (100%) South Africa\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e28 (26%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e81 (74%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e68 (45\u0026ndash;99) 
\u003csup\u003ea\u003c/sup\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eEuropean\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e57\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e4 (7%) South Africa, 53 (93%) Australia\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e7 (12%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e50 (88%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e63 (46\u0026ndash;72)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colspan=\"6\" nameend=\"c6\" namest=\"c1\"\u003e\u003cp\u003ePublic validation cohorts\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eEuropean\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e296\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e296 (100%) Canada\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e234 (79%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e62 (21%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e64 (42\u0026ndash;81)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eAsian\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e207\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e207 (100%) China\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e73 (35%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e134 (65%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c6\"\u003e\u003cp\u003e69 (50\u0026ndash;88) \u003csup\u003ea\u003c/sup\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003e\u003csup\u003ea\u003c/sup\u003e One patient with missing age excluded.\u003c/p\u003e\u003c/div\u003e\n\u003ch3\u003ePublic validation cohorts\u003c/h3\u003e\n\u003cp\u003eSomatic SNVs were downloaded from published deep WGS primary tumour-normal data derived from 296 European and 207 Asian PCa donors, with available clinical data (Table. 1). European data were derived from the Prostate Adenocarcinoma Canada project via the International Cancer Genome Consortium (ICGC) Data Portal [\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e, \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e]. Asian data were obtained from the Chinese Prostate Cancer Genome and Epigenome Atlas (CPGEA) with accession number PRJCA001124 [\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e]. The European data are biased towards the LR PCa, with no age differences between LR and HR cases for either European data (79%, n\u0026thinsp;=\u0026thinsp;234 vs. 21%, n\u0026thinsp;=\u0026thinsp;62; median of age, 64 vs. 63.5 years; Wilcoxon\u0026rsquo;s rank sum test, \u003cem\u003eP\u003c/em\u003e\u0026thinsp;=\u0026thinsp;0.58) or Asian data (35%, n\u0026thinsp;=\u0026thinsp;73 vs. 65%, n\u0026thinsp;=\u0026thinsp;134; the same median of age at 69 years).\u003c/p\u003e\n\u003ch3\u003eKataegis identification and evolution\u003c/h3\u003e\n\u003cp\u003eKataegis identification followed the methods of the PCAWG study, using an adjusted threshold for candidate calling, followed by two criteria (detailed in Additional file1: Supplementary methods) [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e]. 
Briefly, inter-mutational distances of SNVs were adjusted with the piecewise constant fitting (PCF) model using the core algorithms of the kataegis package [\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e] with default parameters [\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e]. The threshold, derived from the total number of SNVs per patient and applied identically to all patients, required a minimum of four SNVs with PCF-adjusted distances of less than one kb.\u003c/p\u003e\u003cp\u003eKataegis events were further refined with evolutionary timing (detailed in Additional file1: Supplementary methods). As kataegis SNVs arise together from a single mutational process [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e], we refined kataegis with evolution by examining each subset of SNVs that occurred during the same evolutionary epochs, including clonal (early, late, and unspecified) and subclonal epochs. This step was applied only to the current study cohort and identified a total of 249 evolutionary kataegis events in 65 patients. Evolutionary kataegis was unavailable for public cohorts due to the lack of available copy number variants (CNVs).\u003c/p\u003e\u003cdiv id=\"Sec6\" class=\"Section2\"\u003e\u003ch2\u003eStatistical Analysis\u003c/h2\u003e\u003cp\u003eStatistical tests included Fisher\u0026rsquo;s exact test for categorical variables using the stats package [\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e], and Wilcoxon\u0026rsquo;s rank sum test for continuous data comparisons between two ancestries or risk groups using the ggpubr package (v0.6.0) [\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e] in R (v 4.2.2) [\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e]. 
\u003cem\u003eP\u003c/em\u003e-values from multiple hypothesis testing were adjusted using the false discovery rate (FDR) when specified. Four outliers with extreme kataegis burdens were excluded, including one European patient (42 kataegis events) in the study cohort, and three patients whose z-scores were greater than three in the public European cohort.\u003c/p\u003e\u003cp\u003eFor genomic features significantly associated with the presence of kataegis, we further analysed their associations with kataegis burden using a negative binomial regression model. This model suits the kataegis burden, which had many zero values and a variance greater than its mean (4.03 vs. 1.03). The analysis excluded the aforementioned study-cohort outlier and three African patients with unavailable PSA or age. Besides all the genomic features associated with kataegis, the analysis also included ancestry, patient risk levels and age at diagnosis. Log-transformation was applied to adjust data skewness found in SV burden, tumour mutational burden (TMB), chromothripsis burden, percentage of genome alteration (PGA), and copy number (CN) gain.\u003c/p\u003e\u003c/div\u003e\n\u003ch3\u003eCo-occurrence of kataegis with cancer driver mutations\u003c/h3\u003e\n\u003cp\u003eWe examined associations between kataegis and point mutations of 58 selected genes using Fisher\u0026rsquo;s exact test (\u003cem\u003eP\u003c/em\u003e-values and FDRs in Additional file2: Table \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e). 
We examined previously reported top cancer drivers for PCa [\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e, \u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e] and/or genes potentially related to kataegis development, such as cell-cycle checkpoint-related genes [\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e], \u003cem\u003eAPOBEC3A\u003c/em\u003e, and \u003cem\u003eAPOBEC3B\u003c/em\u003e.\u003c/p\u003e\u003cdiv id=\"Sec8\" class=\"Section2\"\u003e\u003ch2\u003eSurvival analysis\u003c/h2\u003e\u003cp\u003eWe performed survival analyses using Kaplan-Meier estimates from the survival package (v 3.5-5) [\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e] and log-rank tests from the survminer package (v 0.4.9) [\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e]. To assess clinical progression, we compared (i) patients with BCR and/or metastasis to those with neither, and (ii) patients with metastasis only to those without metastasis or BCR. The survival distribution was compared by kataegis state (positive or negative), and by kataegis burden (kataegis count\u0026thinsp;\u0026gt;\u0026thinsp;1 vs. \u0026le;\u0026thinsp;1). The analysis was performed for LR and HR groups jointly and separately for the European patients with available follow-up data from our study cohort, and for public European and Asian cohorts. From our study cohort, we excluded the small LR group of European patients (n\u0026thinsp;=\u0026thinsp;7), a hyper-kataegic outlier, and three patients from the HR group for whom radical prostatectomy was not curative (n\u0026thinsp;=\u0026thinsp;42 remaining). From the validation cohorts, we excluded three outliers defined by z-scores greater than three, and patients with missing clinical follow-up from the public European cohort (n\u0026thinsp;=\u0026thinsp;281 remaining). 
We also filtered out 21 patients with missing clinical information from the Asian cohort (n\u0026thinsp;=\u0026thinsp;186 remaining).\u003c/p\u003e\u003c/div\u003e\n\u003ch3\u003eSBS and SV signatures\u003c/h3\u003e\n\u003cp\u003eKataegic SNVs, genome-wide SBS, and SV signatures were decomposed and assigned using SigProfilerExtractor (v.1.1.24) [\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e]. The analysis processed kataegic SNVs from 283 kataegis positive tumours from this study and validation cohorts. The aforementioned outliers, one from the study cohort and three from the public European cohort, were excluded from the analysis. Kataegic SNVs from the public European data were lifted to GRCh38 reference using liftOver (last modified 2022-01-31) [\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e]. The signature identification steps included de novo signature discovery using nonnegative matrix factorisation (NMF) and the assignment of conventional Catalogue Of Somatic Mutations In Cancer (COSMIC) signatures (v3.4, Oct. 2023). We used default settings with some modifications, including a maximum of 15 signatures, 500 NMF replicates, one million maximal NMF iterations, and the GRCh38 reference. The assignment of SBS signatures was challenging for kataegic SNVs due to a small number of SNVs compared to genome-wide SNVs. To maintain accuracy, we retained only samples with a cosine similarity greater than 0.5, which filtered out 33 samples. The retained samples had a median cosine similarity of 0.851 (range, 0.508\u0026ndash;0.988). In addition, genome-wide SBS and SV signatures were identified from 165 samples, excluding a European outlier. The SigProfilerExtractor parameters and version of the COSMIC database were the same as those used for the kataegic SBS signatures. 
The NMF extraction methods were based on the frequency matrix of 32 SV types [\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e].\u003c/p\u003e\n\u003ch3\u003eAPOBEC attribution to kataegis\u003c/h3\u003e\n\u003cp\u003eWe used Fisher\u0026rsquo;s exact test to identify APOBEC-enriched kataegis, which were further tested for A3A or A3B enrichment according to the context preference of APOBEC enzymes. The identification mainly followed the method previously used for genome-wide enrichment [\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e]. For the APOBEC enrichment, kataegis events were compared with other non-clustering SNVs from the same sample for the count of mutated cytosines in each motif (C and TCW), adjusted by the accessible rate of the motif (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\pm\\:\\)\u003c/span\u003e\u003c/span\u003e20 bp context). Here, we used TCW to represent the APOBEC enzyme preference motif because we observed comparable numbers of cytosine mutations in TCA and TCT, rather than the skew toward TCA reported previously [\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e]. TCW denotes a cytosine mutation in the TCW motif; more details of the Fisher\u0026rsquo;s exact test are in Additional file1: Supplementary methods. Further, for each APOBEC-enriched kataegis, we identified A3A-enriched kataegis with the YTCW motif and A3B-enriched kataegis with the RTCW motif [\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e, \u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e], where the cytosine is the mutated base. 
\u003cem\u003eP\u003c/em\u003e-values were adjusted with FDR.\u003c/p\u003e\u003cdiv id=\"Sec11\" class=\"Section2\"\u003e\u003ch2\u003eDistribution of kataegis and proximal SVs\u003c/h2\u003e\u003cp\u003eThe enrichment or sparsity of SVs proximal to kataegis events was tested by comparing kataegis with simulated kataegis events. For each kataegis event (n\u0026thinsp;=\u0026thinsp;831) identified in this study and validation cohorts, excluding four outliers, we simulated 1,000 pseudo kataegis events with the same event interval, each centred on one of 1,000 randomly selected non-clustering SNVs from the same sample. For both identified kataegis and simulated kataegis, their distances to proximal SVs were compared using log-spaced bins (0\u0026ndash;1 kb, 1 kb \u0026ndash; 10 kb, 10 kb \u0026ndash; 0.1 Mb, 0.1 Mb \u0026ndash; 1 Mb, 1 Mb \u0026ndash; 10 Mb, 10 Mb \u0026ndash; 100 Mb, and beyond 100 Mb). For each patient group defined by ancestry and risk level, we tested enrichment or sparsity of SVs by calculating \u003cem\u003eP\u003c/em\u003e-values based on the rank of the identified kataegis among the 1,000 simulated kataegis events. \u003cem\u003eP\u003c/em\u003e-values were adjusted with FDR.\u003c/p\u003e\u003c/div\u003e"},{"header":"Results","content":"\u003cdiv id=\"Sec13\" class=\"Section2\"\u003e\u003ch2\u003eAncestrally independent low prevalence and burden for prostate tumour kataegis\u003c/h2\u003e\u003cp\u003eFrom the study cohort, comprising 113 Africans from South Africa and 53 Europeans from Australia, and the validation cohorts, comprising 296 Europeans from Canada and 207 Asians from China (Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e), we identified kataegis using a TMB-derived threshold and criteria based on known kataegis characteristics [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e]. 
For the study cohort, 260 kataegis events were identified in 41% (68/166) of tumours (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eA, Additional file2: Table \u003cspan refid=\"MOESM2\" class=\"InternalRef\"\u003eS2\u003c/span\u003e), consistent with a previous report for European patients [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e]. Within the validation cohorts, we identified 321 kataegis events in 39.2% (116/296) of European and 297 events in 49.8% (103/207) of Asian patients (Additional file2: Table S3, S4). Overall, we observed a low kataegis burden (median: two events, range: 1 to 13, Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eB), excluding one hyper-kataegic outlier (47 events) derived from a single European patient. The median number of SNVs of a kataegis event is six, spanning a narrow range of 2.67 kb and differing between HR groups by ancestry (African 5 SNVs vs. European 7 SNVs, Wilcoxon\u0026rsquo;s rank-sum test, FDR = \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:5\\times\\:{10}^{-4}\\)\u003c/span\u003e\u003c/span\u003e). Kataegic regions were unique to each patient, as previously described [\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e], and only a few were within functional genomic regions (Additional file2: Table S5).\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec14\" class=\"Section2\"\u003e\u003ch2\u003eKataegis is associated with genomic instability and co-occurs with cancer drivers\u003c/h2\u003e\u003cp\u003eKataegis-positive tumours exhibited increased genomic instability marked by various genomic features observed in one or more groups of risk levels and genetic ancestries with Wilcoxon rank sum test (Additional file1: Fig. \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e). 
Elevated TMB, SVs, and chromothripsis were observed in kataegis-positives across ancestries and cancer aggressiveness (FDRs = \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:3\\times\\:{10}^{-5}\\)\u003c/span\u003e\u003c/span\u003e\u0026ndash;0.02, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:7\\times\\:{10}^{-6}\\)\u003c/span\u003e\u003c/span\u003e\u0026ndash;0.002, and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:2\\times\\:{10}^{-5}\\)\u003c/span\u003e\u003c/span\u003e\u0026ndash;0.002, respectively). Notably, the SV burden was the most significant factor, showing a further association with kataegis burden in both ancestral groups (negative binomial model, \u003cem\u003eP\u003c/em\u003e = 0.001), consistent with the previous study of European patients [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e]. CNVs were significantly correlated with kataegis exclusive to HR groups of African and European patients, characterised by gains (FDRs = 0.01\u0026ndash;0.04), losses (FDRs = \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:2\\times\\:{10}^{-4}\\)\u003c/span\u003e\u003c/span\u003e\u0026ndash;0.002) or both as measured by PGA (FDRs =\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:2\\times\\:{10}^{-4}\\)\u003c/span\u003e\u003c/span\u003e\u0026ndash;0.002). Significantly shorter telomere lengths were observed in kataegis-positive tumours derived from African patients within the LR group (FDR = 0.03).\u003c/p\u003e\u003cp\u003eSignificant co-occurrences of kataegis and cancer driver point mutations were observed using Fisher's exact test (Additional file2: Table \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e). 
The significant co-occurrence of kataegis and \u003cem\u003eRBFOX1\u003c/em\u003e in HR groups was found for both ancestries (FDRs = \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:7\\times\\:{10}^{-4}\\)\u003c/span\u003e\u003c/span\u003e\u0026ndash;0.003) and validated by the LR group of public European patients (n = 234, FDR = \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:3\\times\\:{10}^{-6}\\)\u003c/span\u003e\u003c/span\u003e). Additionally, significant co-occurrence of kataegis with \u003cem\u003ePDE4D, TP53\u003c/em\u003e, and \u003cem\u003eZFHX3\u003c/em\u003e was observed in the HR group of European patients of the study cohort (FDRs = \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:5\\times\\:{10}^{-4}\\)\u003c/span\u003e\u003c/span\u003e, 0.04, and 0.005, respectively), and with \u003cem\u003eATM\u003c/em\u003e, \u003cem\u003eATRX\u003c/em\u003e, and \u003cem\u003eCHEK2\u003c/em\u003e in the LR group of the public European patients (FDRs = 0.002, 0.03, and 0.03, respectively). However, no significant co-occurrence was found in the public Asian cohort (n = 207, FDR \u0026gt;0.3).\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec15\" class=\"Section2\"\u003e\u003ch2\u003eKataegis correlates with adverse PCa clinical outcomes\u003c/h2\u003e\u003cp\u003eTo study the clinical implication of kataegis, we examined the PSA level of patients, a widely used clinical measurement for PCa detection [\u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e] and post-treatment recurrence [\u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e]. Higher PSA levels were observed in patients with kataegis-positive tumours than in those with kataegis-negative tumours in the HR group of African patients (median: 100 vs. 
43.0 ng/mL; Wilcoxon\u0026rsquo;s rank-sum test FDR\u0026thinsp;=\u0026thinsp;0.002; Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eA). Appreciating that high PSA levels may be an indicator of metastasis risk [\u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e], lack of associated clinical follow-up data, including associated magnetic resonance imaging (MRI) data, limited further investigation. For the HR group of our European patients sampled at surgery, neither prevalence nor burden of kataegis was a significant predictor of BCR (Kaplan-Meier test), likely due to their small cohort size. Leveraging a larger LR-biased European PCa data resource, we showed LR patients with elevated kataegis burden (more than one event) to be significantly susceptible to metastasis (Log-rank test, \u003cem\u003eP\u003c/em\u003e\u0026thinsp;=\u0026thinsp;0.03), while observing no association for BCR (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eB).\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec16\" class=\"Section2\"\u003e\u003ch2\u003eAPOBEC3B is the main aetiology for kataegis in prostate tumours\u003c/h2\u003e\u003cp\u003eOur analysis of kataegis aetiology identified APOBEC as the primary contributing factor to kataegis across ancestries, except for the LR group of African patients. Particularly, SBS2 and SBS13 signatures accounted for approximately 80% (median: 79.1\u0026ndash;93.6% for eight subgroups; Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eA; all SBS proportions in Additional file1: Fig. \u003cspan refid=\"MOESM2\" class=\"InternalRef\"\u003eS2\u003c/span\u003e), consistent with previous report (81.7%) [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e]. 
Consistently, more than 50% (51.6\u0026ndash;71.8%) of kataegis were APOBEC-enriched identified based on the motif preferences (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eB). Between ancestries, the LR group of Asian patients exhibited significantly more APOBEC-enriched kataegis than other ancestries (Fisher's exact test, \u003cem\u003eP\u003c/em\u003e\u0026thinsp;=\u0026thinsp;0.02 and 0.04 for LR group of African and public European data, respectively). Between risk-levels, the LR group of African patients showed significantly less APOBEC-enriched kataegis than the HR group (Fisher\u0026rsquo;s exact test, \u003cem\u003eP\u003c/em\u003e\u0026thinsp;=\u0026thinsp;0.04).\u003c/p\u003e\u003cp\u003eAfter observing APOBEC as the main contribution of kataegis events, we conducted a focused comparison between the two APOBEC-related signatures SBS2 and SBS13. The predominance of SBS13 (median, 40\u0026ndash;62% for eight subgroups) over SBS2 was observed with significance in the HR group of African patients, and in both LR and HR groups of public European and Asian data (Wilcoxon\u0026rsquo;s rank sum test, FDR = \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:4\\times\\:{10}^{-10}\\)\u003c/span\u003e\u003c/span\u003e\u0026ndash;0.005; Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eA). Different from the other groups, the LR group of African patients showed the lowest proportion of APOBEC-related SBS2, significantly lower than the HR group (median, 0% vs. 25.4%; Wilcoxon\u0026rsquo;s rank sum test, \u003cem\u003eP\u003c/em\u003e = 0.048).\u003c/p\u003e\u003cp\u003eWe further attributed kataegis to the two APOBEC enzymes A3A and A3B. 
We observed higher proportions of A3B enrichment in all groups except for the LR group of African patients, with significance observed for the larger public European LR and Asian LR/HR data (Wilcoxon\u0026rsquo;s rank sum test, FDR = \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:1\\times\\:{10}^{-4}\\)\u003c/span\u003e\u003c/span\u003e\u0026ndash;0.02; Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eB). This differs from the previous observation in hypermutated samples where A3A was strongly associated [\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e], probably because our samples exhibit lower levels of APOBEC activity. This argument is supported by the observation that APOBEC-related signatures were found exclusively within kataegic SNVs and not among genome-wide SNVs (Additional file1: Fig. S3). Also, our PCa patients showed none of the \u003cem\u003eAPOBEC3A\u003c/em\u003e and \u003cem\u003eAPOBEC3B\u003c/em\u003e germline predispositions previously reported in other cancers [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e, \u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e, \u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e], including rs12628403 [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e] and rs1014971 [\u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e]. Kataegis status was not associated with somatic CNVs in the \u003cem\u003eAPOBEC3A\u003c/em\u003e and \u003cem\u003eAPOBEC3B\u003c/em\u003e genes or in the regions within and between them. 
These findings align with the low frequency and burden of kataegis observed in PCa.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec17\" class=\"Section2\"\u003e\u003ch2\u003eGenomic rearrangement processes of ancestry predominant kataegis\u003c/h2\u003e\u003cp\u003eHaving observed a close association between SV and kataegis abundance, we sought to further determine their distributions across the tumour genome. Kataegis events observed across ancestries and risk levels were significantly enriched around SV breakpoints, with 50% (413/831) within a 10 kb distance, 40% (335/831) within a 1-kb distance, and 13% spanning across SV breakpoints (109/831). Comparing kataegis to simulated kataegis events (1,000 times) with randomly selected non-clustering SNVs, we defined the ranges where kataegis were significantly enriched or sparse from an SV. Kataegis were significantly enriched around SV regions with varying ranges (0\u0026ndash;10 kb to 0\u0026ndash;1 Mb) and sparse at distances beyond 10 Mb or 100 Mb between groups of ancestries and risk levels (simulation tests on log-spaced bins, FDR\u0026thinsp;=\u0026thinsp;0.003\u0026ndash;0.01; Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e, Additional file1: Fig. S4). We categorised kataegis to be SV-associated and independent for events located within enriched and sparse regions, respectively. The two types of kataegis varied in proportions between risk levels and between African and European ancestries. More SV-associated kataegis was observed in HR over LR groups (Fisher\u0026rsquo;s exact test, public European data, FDR\u0026thinsp;=\u0026thinsp;0.004), and in the HR groups of European over African patients (Fisher\u0026rsquo;s exact one-way test, \u003cem\u003eP\u003c/em\u003e\u0026thinsp;=\u0026thinsp;0.03). 
Focusing on SV types, chromothripsis was significantly enriched around kataegis (Fisher\u0026rsquo;s exact test, FDR\u0026thinsp;=\u0026thinsp;0.04; Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e), aligned with previous findings [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]. Conversely, kataegis did not occur close to translocations (FDR = \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:1\\times\\:{10}^{-4}\\)\u003c/span\u003e\u003c/span\u003e) as shown in the European public data analysed in this study (Additional file1: Fig. S5, S6).\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eThe analysis of genome-wide SV signatures for HR groups of the study cohort revealed an association between translocation SV type and kataegis. Compared to kataegis-negative tumours, kataegis-positives from both ancestries exhibited significantly lower proportion and less presence of the predominant SV2 signature and higher proportions and/or more presences of SV4 and SV10 (Fisher\u0026rsquo;s one-way exact test, FDR = \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:1\\times\\:{10}^{-4}\\)\u003c/span\u003e\u003c/span\u003e\u0026ndash;0.01; Wilcoxon\u0026rsquo;s rank sum test, FDR = \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:1\\:\\times\\:{10}^{-3}\\)\u003c/span\u003e\u003c/span\u003e\u0026ndash; \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:9\\:\\times\\:{10}^{-3}\\)\u003c/span\u003e\u003c/span\u003e; Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e; all SV signatures identified in Additional file1: Fig. S7). 
According to the COSMIC SV signature database [\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e], simple translocations and clustered translocations are the primary components of SV2 and SV4, respectively, while SV10 encompasses simple rearrangements of other types. These results suggest that kataegis-positive prostate tumours are characterised by an increase in clustered translocations alongside non-clustered SVs of other types.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec18\" class=\"Section2\"\u003e\u003ch2\u003eDifferential evolution of prostate tumour kataegis events between ancestries\u003c/h2\u003e\u003cp\u003eWe revealed the uneven emergence of kataegis across different evolutionary timeframes by assigning kataegis to clonal epochs (early, late, and unspecified) and the subclonal epoch for the study cohort (Additional file1: Supplementary methods). Both ancestries showed a bias towards clonal origins (65.0% clonal kataegis, 128/197; Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e). The clonal proportion of kataegis was significantly higher than that of genome-wide SNVs (median, 100% \u003cem\u003evs\u003c/em\u003e 68.3%; paired Wilcoxon's rank sum test, \u003cem\u003eP\u003c/em\u003e\u0026thinsp;=\u0026thinsp;0.01), aligning with the clonal origin of chromothripsis [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e], which could arise along with kataegis during telomere crisis [\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e]. This clonal bias of kataegis appears to be unreported in previous PCa studies [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e, \u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e], while subclonal bias was reported for cancers with high kataegis burdens, excluding PCa [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e]. 
Between ancestries, early clonal kataegis events were more frequent in the European patients studied (EUR\u0026thinsp;=\u0026thinsp;17.2% vs. AFR\u0026thinsp;=\u0026thinsp;6.9%, Fisher\u0026rsquo;s exact test on HR groups, \u003cem\u003eP\u003c/em\u003e\u0026thinsp;=\u0026thinsp;0.04). In contrast, African-derived tumours exhibited an increased proportion of subclonal kataegis in both LR and HR groups; the latter showed significance when compared to the European patients (19%, Fisher\u0026rsquo;s exact test on HR groups, \u003cem\u003eP\u003c/em\u003e\u0026thinsp;=\u0026thinsp;0.002). These findings suggest ancestry-specific dynamics during carcinogenesis.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003c/div\u003e"},{"header":"Discussion","content":"\u003cp\u003eKataegis is largely overlooked in PCa research due to its low frequency and burden compared to other cancer types, such as bladder and lung cancer [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e]. To the best of our knowledge, no study has investigated the potential contribution of kataegis to ancestrally associated PCa health disparities. Here, using a unique multi-ancestral PCa resource, including southern African men, who represent the global region with the highest PCa-associated mortality, complemented with published data [\u003cspan additionalcitationids=\"CR20\" citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e], we present a detailed characterisation of kataegis features in prostate tumours, highlighting its implications in worse clinical outcomes and ancestrally different mutational processes. We observed that prostate tumours exhibiting kataegis, often accompanied by cancer driver mutations and elevated genomic instability, are linked to adverse clinical outcomes. 
Tumours derived from African patients exhibited a higher proportion of kataegis independent of SVs and later occurrence in subclones. Among African patients, the proportion of kataegis attributed to APOBEC varied between cancer risk levels. These results refine earlier findings of ancestry-related cancer progression trajectories [\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e] by emphasising disparities in hypermutations, further underscoring the importance of African-inclusive investigations.\u003c/p\u003e\u003cp\u003eFurthermore, we propose kataegis as an indicator of adverse PCa which is independent of both ancestry and risk level. The similar prevalence of kataegis between risk levels highlights a limitation of current cancer grading, which fails to detect any morphological or physical changes resulting from the interplay of kataegis, cancer drivers and genomic instability. Patients with kataegis-positive tumours may be recommended for more frequent monitoring during the remission period, given their potentially higher metastatic risk. The heightened metastatic risk may be driven by genomic instability [\u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e] and by the two cancer driver genes co-occurring with kataegis, \u003cem\u003eRBFOX1\u003c/em\u003e and \u003cem\u003eTP53\u003c/em\u003e [\u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e, \u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e]. Also, PSA levels, known to be implicated in bone metastasis via the stimulation of osteoprotegerin [\u003cspan citationid=\"CR42\" class=\"CitationRef\"\u003e42\u003c/span\u003e], were found to be elevated in African patients with kataegis-positive aggressive PCa. However, our observation challenges a previous statement that kataegis is a marker of good prognosis for BRCA [\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e]. 
The BRCA study showed significantly shorter survival time for patients with kataegis, but proposed that aging might be the driving force. Therefore, further follow-up data from African patients are required for investigation, ideally in a large cohort to exclude potential confounding by age. Moreover, we propose that the implication of kataegis in prostate tumour progression is different from that in BRCA and other cancer types with high kataegis burden, despite shared features including elevated genomic instability, close association with SVs, and attribution to APOBEC enzyme activity [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e, \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e]. Specific to prostate tumours, only the clustered mutations are attributed to APOBEC enzyme activity, mostly to A3B, and no germline predisposition effects of the kind previously reported in BRCA [\u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e] have been identified through variation in the \u003cem\u003eA3A\u003c/em\u003e and \u003cem\u003eA3B\u003c/em\u003e genes.\u003c/p\u003e\u003cp\u003eOur African-inclusive study design has revealed ancestral disparities in kataegis development through evolutionary timing and mutational processes. Our evolutionary analyses have shown that clonal kataegis predominated irrespective of patient ancestry. In particular, European patients exhibited a high proportion of early clonal kataegis, indicating an implication in cancer initiation. In contrast, the subclonal kataegis identified in this study are notably biased towards African patients, regardless of clinicopathological presentation, suggesting a high level of genomic instability in cancer and, therefore, marked tumour heterogeneity and associated chemoresistance [\u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e43\u003c/span\u003e]. 
However, we acknowledge that our computational estimation of subclonal kataegis is a simplified model; further investigation with additional sequencing techniques may help discern subclones and multiclonal origins, the prevalence of which is unknown in African patients.\u003c/p\u003e\u003cp\u003eAdditionally, we describe two kataegis mutational mechanisms, SV-associated and SV-independent, observing varying proportions by ancestry, with the former significantly more frequent in European than African patients. While kataegis are attributed to APOBEC deamination of cytosines from exposed ssDNA, mostly by APOBEC3B in this study, the deamination may take place under different processes for the two kataegis types (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003e). We speculate that SV-associated kataegis could have arisen during telomere crisis [\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e] and double-strand break (DSB) repair mechanisms, such as break-induced replication (BIR) [\u003cspan additionalcitationids=\"CR45\" citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR46\" class=\"CitationRef\"\u003e46\u003c/span\u003e], as well as non-homologous end-joining (NHEJ) and alternative end-joining (A-EJ), given the close association with chromothripsis [\u003cspan citationid=\"CR47\" class=\"CitationRef\"\u003e47\u003c/span\u003e]. The concurrence of driver mutations in \u003cem\u003eTP53\u003c/em\u003e, a cell-cycle checkpoint gene, observed in European patients from this study supports the hypothesis that telomere crisis may result in chromothripsis-associated kataegis bypassing a cell cycle checkpoint due to checkpoint deficiency [\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e]. 
Also, significantly shorter telomere length has been observed in kataegis-positive tumours derived from the low-risk group of African patients and has been previously reported for the aggressive tumours derived from African men [\u003cspan citationid=\"CR48\" class=\"CitationRef\"\u003e48\u003c/span\u003e]. Conversely, we hypothesise that SV-independent kataegis, which we found to be more common in tumours derived from patients of African ancestry, may arise on R-loops in transcription bubbles or on the lagging strand of the DNA replication fork [\u003cspan citationid=\"CR49\" class=\"CitationRef\"\u003e49\u003c/span\u003e, \u003cspan citationid=\"CR50\" class=\"CitationRef\"\u003e50\u003c/span\u003e]. Transcription and replication may interact, as R-loops are one of the sources that increase replication stress, leading to an elevated exposure of ssDNA at the replication fork [\u003cspan citationid=\"CR51\" class=\"CitationRef\"\u003e51\u003c/span\u003e]. We acknowledge, however, that our proposed hypotheses require further validation through cell-based experiments, such as DNA/RNA immunoprecipitation sequencing (DRIP)\u0026ndash;R-loop experiments. Altogether, these findings suggest that tumour pathways diverge to some extent between ancestries.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eWhile this study provides novel insights into kataegis in relation to ancestries and cancer aggressiveness, several limitations must be acknowledged. The lack of relevant data has hindered further validation or investigation, although this has been mitigated by integrating public cohorts. The clinical implications of kataegis for African patients need future research due to a lack of African follow-up and validation data. More LR data derived from African patients are also required to differentiate features between levels of cancer aggressiveness. 
To scrutinise the ancestrally shared and distinctive features of kataegis, we integrated publicly available PCa data from patients of European and Asian ancestry. However, this study and the public cohorts differ in their composition of cancer aggressiveness and variant identification pipelines. While our study cohort is biased towards very HR disease (ISUP GG4/5), the public European dataset focused on intermediate-risk disease (82%, ISUP GG2/3) [\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e], and the public Asian data largely lack ISUP GG5 (0.5%, 1/207) [\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e]. Additionally, although we applied consistent methods for kataegis identification and downstream analyses, our somatic variant identification is more stringent due to filtering by a panel of normal samples. These limitations highlight areas for future research and, importantly, underscore the need for tailored data collection and analysis.\u003c/p\u003e"},{"header":"Conclusions","content":"\u003cp\u003eThe available PCa whole genome cohort remains one of the largest of its kind for the African continent and benefits from the inclusion of clinically, technically, and analytically matched non-African data, allowing for direct, unbiased comparative analyses. This African-inclusive resource, supported by published non-African data, enabled us to discern both universal (or shared) and ancestrally unique features of kataegis-positive prostate tumours, particularly with regard to advanced disease. Demonstrating heightened African-specific kataegis-associated heterogeneity, our study emphasises the need for further African inclusion, specifically to elucidate the potential of kataegis and APOBEC3 enzymes as biomarkers for targeted cancer therapy. 
Collectively, by elucidating the occurrence of kataegis from tumorigenesis to later subclonal stage in African and European patients, we highlight the significance of different underlying mutational processes between ancestries, which provides a valuable resource for targeted therapeutic interventions and emphasises the need for continued exploration of biological behaviours and environmental exposures in African patients.\u003c/p\u003e"},{"header":"Abbreviations","content":"\u003cdiv class=\"DefinitionList\"\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eA-EJ\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003ealternative end-joining\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eA3A\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eAPOBEC3A\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eA3B\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eAPOBEC3B\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eAFR\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eAfrican ancestry\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eASI\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eAsian ancestry\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eBCR\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003ebiochemical relapse\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv 
class=\"Term\"\u003e\u003cb\u003eBIR\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003ebreak-induced replication\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eBRCA\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003ebreast cancer\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eCN\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003ecopy number\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eCNVs\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003ecopy number variants\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eCOSMIC\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eCatalogue Of Somatic Mutations In Cancer\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eCPGEA\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eChinese Prostate Cancer Genome and Epigenome Atlas\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eDRIP\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eDNA/RNA immunoprecipitation sequencing\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eDSBs\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003edouble-strand breaks\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv 
class=\"Term\"\u003e\u003cb\u003eEGA\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eEuropean Genome-Phenome Archive\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eEUR\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eEuropean ancestry\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eFDR\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003efalse discovery rate\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eGG\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eGrade Group\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eHR\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003ehigh-risk, Grade Group 3\u0026ndash;5\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eICGC\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eInternational Cancer Genome Consortium\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eISUP\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eInternational Society of Urological Pathology\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eLR\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003elow-risk, Grade Group 1/2\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv 
class=\"Term\"\u003e\u003cb\u003eMRI\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003emagnetic resonance imaging\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eNHEJ\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003enon-homologous end-joining\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eNMF\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003enonnegative matrix factorisation\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003ePCa\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eprostate cancer\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003ePCAWG\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003ePan-Cancer Analysis of Whole Genomes\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003ePCF\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003epiecewise constant fitting\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003ePGA\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003epercentage of genome alteration\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003ePSA\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eprostate specific antigen\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv 
class=\"Term\"\u003e\u003cb\u003eSAPCS\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eSouthern African Prostate Cancer Study\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eSBS\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003esingle base substitution\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eSNVs\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003esingle nucleotide variants\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003essDNAs\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003esingle-strand DNAs\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eSV\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003estructural variants\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eTMB\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003etumour mutational burden\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003c/div\u003e"},{"header":"Declarations","content":"\u003ch2\u003eEthics approval and consent to participate\u003c/h2\u003e\u003cp\u003e Conforming to the principles of the Helsinki Declaration, South African patients were recruited as part of the Southern African Prostate Cancer Study (SAPCS) with approval granted by the University of Pretoria Faculty of Health Research Ethics Committee (HRECs, with US Federal wide assurance FWA00002567 and IRB00002235 IORG0001762; #43/2010), while in Australia participant recruitment was approved by the St Vincent\u0026rsquo;s HREC (#SVH/12/231). 
Samples were shipped to the Garvan Institute of Medical Research and/or the University of Sydney in accordance with institutional Material Transfer Agreements (MTAs) and appropriate Republic of South Africa Department of Health Export Permit (National Health Act 2003; J1/2/4/2 #1/12). Genomic interrogation required for this study was approved by the St. Vincent\u0026rsquo;s HREC (#SVH/15/227), with additional IRB review and approval granted by the Human Research Protection Office of the US Army Medical Research and Development Command E02371 (TARGET Africa) and E03280 (HEROIC PCaPH Africa1K).\u003c/p\u003e\u003cp\u003e\u003cstrong\u003eConsent to publication\u003c/strong\u003e\u003cp\u003eAll data used in this study has been previously published, which for the Southern African Prostate Cancer Study (SAPCS) data was accessed through Data Access Committee approval.\u003c/p\u003e\u003ch2\u003eCompeting Interests\u003c/h2\u003e\u003cp\u003e Hayes is a Member of Active Surveillance Movember Committee and received an honorarium from The Korean Urological Oncology Society for 2024 Annual Conference as a guest speaker.\u003c/p\u003e\u003ch2\u003eFunding\u003c/h2\u003e\u003cp\u003eGenomic sequencing was supported by the National Health and Medical Research Council (NHMRC) of Australia through a Project Grant (2018/GNT1165762 to V.M.H.) and Ideas Grants (2020/GNT2001098 and 2021/GNT2010551 to V.M.H.). Further analytics was supported by the U.S.A. Congressionally Directed Medical Research Programs (CDMRP) Prostate Cancer Research Program (PCRP) Idea Development Award (PC200390, TARGET Africa to V.M.H.), HEROIC Consortium Award (PC210168 and PC23067, HEROIC PCaPH Africa1K to V.M.H. and M.S.R.B., with co-Principal Investigators Professors Gail Prins, University of Illinois at Chicago, U.S.A. and Mungai Peter Ngugi, University of Nairobi, Kenya), U.S.A. National Institute of Health (NIH) National Cancer Institute (NCI) Award (1R01CA285772-01 to V.M.H.), U.S.A. 
Prostate Cancer Foundation (PCF) 2023 Challenge Award (2023CHAL4150 to V.M.H.) and NHMRC Ideas Grant (2024/GNT2037298 to W.J.). V.M.H. is supported by the Petre Foundation via the University of Sydney Foundation, while J.J. is supported by a U.S.A. Prostate Cancer Foundation (PCF) Scholarship as part of the 2023 Challenge Award.\u003c/p\u003e\u003ch2\u003eAuthor Contribution\u003c/h2\u003e\u003cp\u003eJJ analysed and interpreted the data, and wrote and reviewed the manuscript. AT elaborated the methods regarding signature analyses, and reviewed and edited the draft. RH curated the analysed data regarding telomere lengths. MSRB collected data, and investigated and conceptualised the study. PDS collected data and contributed to the investigation. DCW elaborated the methods regarding enrichment of structural variants, supervised the study, and reviewed and edited the draft. WJ supervised and conceptualised the study, curated the data regarding evolution of kataegis and clustering of patients, and wrote and reviewed the draft. VMH supervised and conceptualised the study, collected and curated data, wrote and reviewed the manuscript, and provided resources and funding acquisition. All authors reviewed the manuscript.\u003c/p\u003e\u003ch2\u003eAcknowledgement\u003c/h2\u003e\u003cp\u003eWe are forever grateful to the patients who contributed their time and samples to make this study possible, as well as the clinical staff who have participated in patient recruitment and maintenance of the SAPCS (South Africa) and the Garvan/St Vincent\u0026rsquo;s Hospital (Australia) Bioresources, with specific acknowledgement for Bioresource Managers Ms Tumisang Mbeke (University of Pretoria, South Africa) and Sr Anne-Maree Haynes (Garvan Institute of Medical Research, Australia), respectively. We are thankful to Dr Pamela X.Y. Soh (University of Sydney, Australia) for providing ancestral clarifications for the study participants. 
We also acknowledge the use of the National Computational Infrastructure (NCI), which is supported by the Australian Government and accessed through the National Computational Merit Allocation Scheme (V.M.H., J.J. and W.J.), as well as the Sydney Informatics Hub, Core Research Facility at the University of Sydney. This work will form part of a Ph.D. thesis for J.J.\u003c/p\u003e\u003ch2\u003eData Availability\u003c/h2\u003e\u003cp\u003eThe analysed sequence data of the study cohort are available through the European Genome‐Phenome Archive (EGA; https://ega‐archive.org) under overarching accession EGAS00001006425, available from the authors upon reasonable request with the permission of Southern African Prostate Cancer Study (SAPCS) (EGAD00001009067) and Garvan/St Vincent\u0026rsquo;s Dataset (EGAD00001009066). The analysed variant data of two public cohorts are available from the ICGC Data Portal ( [http://dcc.icgc.org/](http:/dcc.icgc.org) ) for a European cohort and from the Genome Sequence Archive for Human (http://bigd.big.ac.cn/gsa-human/) under accession number PRJCA001124 for an Asian cohort.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eBray, F., et al., \u003cem\u003eGlobal cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries\u003c/em\u003e. CA Cancer J Clin, 2024. 74(3): p. 229\u0026ndash;263.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLee, K.M., et al., \u003cem\u003eAssociation between prediagnostic prostate-specific antigen and prostate cancer probability in Black and non-Hispanic White men\u003c/em\u003e. Cancer, 2024. 130(2): p. 224\u0026ndash;231.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eNair, S.S., et al., \u003cem\u003eWhy do African-American men face higher risks for lethal prostate cancer?\u003c/em\u003e Curr Opin Urol, 2022. 32(1): p. 
96\u0026ndash;101.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eNik-Zainal, S., et al., \u003cem\u003eThe life history of 21 breast cancers\u003c/em\u003e. Cell, 2012. 149(5): p. 994\u0026ndash;1007.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eAaltonen, L.A., et al., \u003cem\u003ePan-cancer analysis of whole genomes\u003c/em\u003e. Nature, 2020. 578(7793): p. 82\u0026ndash;93.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eTaylor, B.J., et al., \u003cem\u003eDNA deaminases induce break-associated mutation showers with implication of APOBEC3B and 3A in breast cancer kataegis\u003c/em\u003e. elife, 2013. 2: p. e00534.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eVeerla, S. and J. Staaf, \u003cem\u003eKataegis in clinical and molecular subgroups of primary breast cancer\u003c/em\u003e. npj Breast Cancer, 2024. 10(1): p. 32.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eChen, L., et al., \u003cem\u003eDeep whole-genome analysis of 494 hepatocellular carcinomas\u003c/em\u003e. Nature, 2024. 627(8004): p. 586\u0026ndash;593.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eAlexandrov, L.B., et al., \u003cem\u003eSignatures of mutational processes in human cancer\u003c/em\u003e. Nature, 2013. 500(7463): p. 415\u0026ndash;421.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eAnsari-Pour, N., et al., \u003cem\u003eWhole-genome analysis of Nigerian patients with breast cancer reveals ethnic-driven somatic evolution and distinct genomic subtypes\u003c/em\u003e. Nature communications, 2021. 12(1): p. 6946.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eJakobsdottir, G.M., et al., \u003cem\u003eAPOBEC3 mutational signatures are associated with extensive and diverse genomic instability across multiple tumour types\u003c/em\u003e. BMC Biology, 2022. 20(1): p. 
117.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eD\u0026rsquo;Antonio, M., et al., \u003cem\u003eKataegis expression signature in breast cancer is associated with late onset, better prognosis, and higher HER2 levels\u003c/em\u003e. Cell reports, 2016. 16(3): p. 672\u0026ndash;683.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMaciejowski, J., et al., \u003cem\u003eAPOBEC3-dependent kataegis and TREX1-driven chromothripsis during telomere crisis\u003c/em\u003e. Nature genetics, 2020. 52(9): p. 884\u0026ndash;890.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMaciejowski, J., et al., \u003cem\u003eChromothripsis and kataegis induced by telomere crisis\u003c/em\u003e. Cell, 2015. 163(7): p. 1641\u0026ndash;1654.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eCooper, C.S., et al., \u003cem\u003eAnalysis of the genetic phylogeny of multifocal prostate cancer identifies multiple independent clonal expansions in neoplastic and morphologically normal prostate tissue\u003c/em\u003e. Nature Genetics, 2015. 47(4): p. 367\u0026ndash;372.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eJaratlerdsiri, W., et al., \u003cem\u003eAfrican-specific molecular taxonomy of prostate cancer\u003c/em\u003e. Nature, 2022. 609: p. 552\u0026ndash;559.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePatrick, S.M., et al., \u003cem\u003eProstate cancer clinicopathological presentation in South-East Africa during the 2010 decade\u003c/em\u003e. JNCI: Journal of the National Cancer Institute, 2025: p. djaf117.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eJiang, J., et al., \u003cem\u003eScaling for African Inclusion in High-Throughput Whole Cancer Genome Bioinformatic Workflows\u003c/em\u003e. Cancers, 2025. 17(15): p. 2481.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eZhang, J., et al., \u003cem\u003eThe international cancer genome consortium data portal\u003c/em\u003e. 
Nature biotechnology, 2019. 37(4): p. 367\u0026ndash;369.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eFraser, M., et al., \u003cem\u003eGenomic hallmarks of localized, non-indolent prostate cancer\u003c/em\u003e. Nature, 2017. 541(7637): p. 359\u0026ndash;364.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLi, J., et al., \u003cem\u003eA genomic and epigenomic atlas of prostate cancer in Asian populations\u003c/em\u003e. Nature, 2020. 580(7801): p. 93\u0026ndash;99.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLin, X., et al., \u003cem\u003ekataegis: an R package for identification and visualization of the genomic localized hypermutation regions using high-throughput sequencing\u003c/em\u003e. BMC genomics, 2021. 22(1): p. 440.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eTeam, R.C., \u003cem\u003eR: A language and environment for statistical computing. R Foundation for Statistical Computing\u003c/em\u003e. (No Title), 2013.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKassambara, A., \u003cem\u003eggpubr:'ggplot2'based publication ready plots.\u003c/em\u003e R package version, 2018: p. 2.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eArmenia, J., et al., \u003cem\u003eThe long tail of oncogenic drivers in prostate cancer\u003c/em\u003e. Nat Genet, 2018. 50(5): p. 645\u0026ndash;651.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDing, L., et al., \u003cem\u003eThe roles of cyclin-dependent kinases in cell-cycle progression and therapeutic strategies in human breast cancer.\u003c/em\u003e International journal of molecular sciences, 2020. 21(6): p. 1960.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eTherneau, T.M. and T. Lumley, \u003cem\u003ePackage \u0026lsquo;survival\u0026rsquo;\u003c/em\u003e. R Top Doc, 2015. 128(10): p. 
28\u0026ndash;33.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKassambara, A., et al., \u003cem\u003esurvminer: Drawing Survival Curves using \u0026lsquo;ggplot2\u0026rsquo;.\u003c/em\u003e R package version 0.4, 2021. 9: p. 2021.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eIslam, S.A., et al., \u003cem\u003eUncovering novel mutational signatures by de novo extraction with SigProfilerExtractor\u003c/em\u003e. Cell genomics, 2022. 2(11).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKuhn, R.M., D. Haussler, and W.J. Kent, \u003cem\u003eThe UCSC genome browser and associated tools\u003c/em\u003e. Briefings in bioinformatics, 2013. 14(2): p. 144\u0026ndash;161.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eEverall, A., et al., \u003cem\u003eComprehensive repertoire of the chromosomal alteration and mutational signatures across 16 cancer types from 10,983 cancer patients.\u003c/em\u003e medRxiv, 2023: p. 2023.06. 07.23290970.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eChan, K., et al., \u003cem\u003eAn APOBEC3A hypermutation signature is distinguishable from the signature of background mutagenesis by APOBEC3B in human cancers\u003c/em\u003e. Nature Genetics, 2015. 47(9): p. 1067\u0026ndash;1072.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRoberts, S.A., et al., \u003cem\u003eAn APOBEC cytidine deaminase mutagenesis pattern is widespread in human cancers\u003c/em\u003e. Nature Genetics, 2013. 45(9): p. 970\u0026ndash;976.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLaw, E.K., et al., \u003cem\u003eAPOBEC3A catalyzes mutation and drives carcinogenesis in vivo\u003c/em\u003e. J Exp Med, 2020. 
217(12).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMerriel, S.W.D., et al., \u003cem\u003eSystematic review and meta-analysis of the diagnostic accuracy of prostate-specific antigen (PSA) for the detection of prostate cancer in symptomatic patients\u003c/em\u003e. BMC Medicine, 2022. 20(1): p. 54.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMilonas, D., et al., \u003cem\u003eThe significance of prostate specific antigen persistence in prostate cancer risk groups on long-term oncological outcomes\u003c/em\u003e. Cancers, 2021. 13(10): p. 2453.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eNik-Zainal, S., et al., \u003cem\u003eAssociation of a germline copy number polymorphism of APOBEC3A and APOBEC3B with burden of putative APOBEC-dependent mutations in breast cancer\u003c/em\u003e. Nature Genetics, 2014. 46(5): p. 487\u0026ndash;491.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMiddlebrooks, C.D., et al., \u003cem\u003eAssociation of germline variants in the APOBEC3 region with cancer risk and enrichment with APOBEC-signature mutations in tumors\u003c/em\u003e. Nature Genetics, 2016. 48(11): p. 1330\u0026ndash;1338.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eFares, J., et al., \u003cem\u003eMolecular principles of metastasis: a hallmark of cancer revisited\u003c/em\u003e. Signal Transduction and Targeted Therapy, 2020. 5(1): p. 28.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePerron, G., et al., \u003cem\u003ePan-cancer analysis of mRNA stability for decoding tumour post-transcriptional programs\u003c/em\u003e. Communications Biology, 2022. 5(1): p. 851.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDe Laere, B., et al., \u003cem\u003eTP53 outperforms other androgen receptor biomarkers to predict abiraterone or enzalutamide outcome in metastatic castration-resistant prostate cancer\u003c/em\u003e. Clinical cancer research, 2019. 25(6): p. 
1766\u0026ndash;1773.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWong, S.K., et al., \u003cem\u003eProstate cancer and bone metastases: the underlying mechanisms\u003c/em\u003e. International journal of molecular sciences, 2019. 20(10): p. 2587.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eAshrafizadeh, M., et al., \u003cem\u003eMolecular panorama of therapy resistance in prostate cancer: a pre-clinical and bioinformatics analysis for clinical translation\u003c/em\u003e. Cancer and Metastasis Reviews, 2024. 43(1): p. 229\u0026ndash;260.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eElango, R., et al., \u003cem\u003eRepair of base damage within break-induced replication intermediates promotes kataegis associated with chromosome rearrangements\u003c/em\u003e. Nucleic acids research, 2019. 47(18): p. 9666\u0026ndash;9684.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSakofsky, C.J., et al., \u003cem\u003eBreak-induced replication is a source of mutation clusters underlying kataegis\u003c/em\u003e. Cell reports, 2014. 7(5): p. 1640\u0026ndash;1648.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGreen, A.M. and M.D. Weitzman, \u003cem\u003eThe spectrum of APOBEC3 activity: From anti-viral agents to anti-cancer opportunities\u003c/em\u003e. DNA repair, 2019. 83: p. 102700.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGelot, C., I. Magdalou, and B.S. Lopez, \u003cem\u003eReplication stress in Mammalian cells and its consequences for mitosis\u003c/em\u003e. Genes, 2015. 6(2): p. 267\u0026ndash;298.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eHuang, R., et al., \u003cem\u003eThe impact of telomere length on prostate cancer aggressiveness, genomic instability and health disparities\u003c/em\u003e. Sci Rep, 2024. 14(1): p. 
7706.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMcCann, J.L., et al., \u003cem\u003eAPOBEC3B regulates R-loops and promotes transcription-associated mutagenesis in cancer\u003c/em\u003e. Nature Genetics, 2023. 55(10): p. 1721\u0026ndash;1734.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSeplyarskiy, V.B., et al., \u003cem\u003eAPOBEC-induced mutations in human cancers are strongly enriched on the lagging DNA strand during replication\u003c/em\u003e. Genome research, 2016. 26(2): p. 174\u0026ndash;182.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSaxena, S. and L. Zou, \u003cem\u003eHallmarks of DNA replication stress\u003c/em\u003e. Mol Cell, 2022. 82(12): p. 2298\u0026ndash;2314.\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":true,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"genome-medicine","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"","sideBox":"Learn more about [Genome Medicine](https://genomemedicine.biomedcentral.com/)","snPcode":"13073","submissionUrl":"https://submission.springernature.com/new-submission/13073/3","title":"Genome Medicine","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"BMC/SO AJ","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"kataegis, prostate cancer, ancestral disparity, APOBEC, cancer evolution","lastPublishedDoi":"10.21203/rs.3.rs-7624142/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-7624142/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003e\u003cstrong\u003eBackground\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eKataegis, the focal hypermutation of single base positions in tumour genomes, has received little attention with regards to prostate cancer (PCa) molecular features, tumour evolution and associated clinical presentation. Most notably, the impact of this phenomenon is yet to be explored across ancestral lineages representing the extremities of PCa presentation and outcomes, with men of African ancestry disproportionately disadvantaged. The purpose of this study is to address the knowledge gap through African inclusive multi-ancestral interrogation.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eMethods\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe assessed for ancestrally shared and unique molecular, evolutionary and clinical features of kataegis in 669 multi-ancestral whole PCa genomes. 
Access to raw whole-genome sequenced data allowed for direct single-pipeline comparative analysis between 109 southern African and 57 European derived treatment naïve high-risk-biased primary tumours (74% and 88%) with paired blood samples, further assessed against publicly available 207 Asian high-risk-leaning comparative (65%) and 296 European low-risk-biased alternative (79%) resources. Comparisons between ancestries and risk groups were made using Wilcoxon’s rank sum test and Fisher’s exact test, with \u003cem\u003eP\u003c/em\u003e values adjusted by false discovery rate.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eResults\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eConfirming relatively low burdens, we found kataegis to be significantly associated with genomic instability, cancer drivers, and clinical adversity across ancestries (false discovery rate = 7 × 10\u003csup\u003e-6\u003c/sup\u003e to 0.04). Notably, kataegis-positive tumours were associated with elevated prostate-specific antigen levels at presentation in African patients (false discovery rate = 0.002) and higher risk for metastatic progression in European patients (Kaplan-Meier estimator, P=0.03). Enrichment of APOBEC’s context preferences showed more attribution from APOBEC3B than APOBEC3A. Furthermore, analyses of evolution and structural variant (SV) cooccurrence showed that the more common, ancestry-agnostic SV-associated kataegis predominated in the clonal evolutionary state, while the less common SV-independent kataegis (P=0.002) and subclonal kataegis (P=0.03) showed African specificity.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConclusions\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe found kataegis-positivity to be associated with poor PCa presentation and prognosis, irrespective of patient ancestry. 
Kataegis-related genomic instability occurring early and late during African derived tumourigenesis, may partly explain the heightened tumour and clinical heterogeneity observed for patients of African ancestry.\u003c/p\u003e","manuscriptTitle":"Single base focal hypermutation cooccurs with structural variation as an early event in advanced prostate tumourigenesis with ancestry specific independence: a multi-ancestral observational study","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-10-22 19:20:21","doi":"10.21203/rs.3.rs-7624142/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"decision","content":"Revision requested","date":"2025-12-09T19:42:38+00:00","index":"","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-11-11T11:20:33+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-11-06T02:32:27+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"316540894661508316920277629121213999049","date":"2025-10-27T03:25:42+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"117617886972036237035634784554270460281","date":"2025-10-23T20:55:45+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-10-23T18:50:54+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"116499713770424758572195954103330532888","date":"2025-10-09T15:04:15+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2025-10-08T16:43:53+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2025-09-26T18:03:43+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2025-09-16T07:33:49+00:00","index":"","fulltext":""},{"type":"submitted","content":"Genome Medicine","date":"2025-09-15T21:50:49+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"genome-medicine","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"","sideBox":"Learn more about [Genome Medicine](https://genomemedicine.biomedcentral.com/)","snPcode":"13073","submissionUrl":"https://submission.springernature.com/new-submission/13073/3","title":"Genome Medicine","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"BMC/SO AJ","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"fe595faa-375f-4b5d-9f42-dbc105fdcc72","owner":[],"postedDate":"October 22nd, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"under-review","subjectAreas":[],"tags":[],"updatedAt":"2026-04-01T20:53:43+00:00","versionOfRecord":[],"versionCreatedAt":"2025-10-22 19:20:21","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-7624142","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-7624142","identity":"rs-7624142","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}