Single cell multiomic analyses reveal divergent effects of DNMT3A and TET2 mutant clonal hematopoiesis in inflammatory response

doi:10.21203/rs.3.rs-4481664/v1

2024 · doi:10.21203/rs.3.rs-4481664/v1

preprint OA: closed

Research Article

Wazim Ismail Mohammed, Jenna Fernandez, Moritz Binder, Terra Lasho, and 21 more

This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-4481664/v1

This work is licensed under a CC BY 4.0 License. Status: Posted. Version 1 posted.

Abstract

Background
DNMT3A and TET2 are epigenetic regulator genes commonly mutated in age-related clonal hematopoiesis (CH). Despite having opposing epigenetic functions, these mutations are associated with increased all-cause mortality and a low risk for progression to hematological neoplasms. While individual impacts on the epigenome have been described using different model systems, the phenotypic complexity in humans remains to be elucidated.
Results
Here we make use of the natural inflammatory response occurring during coronavirus disease 2019 (COVID-19) to understand the association of these mutations with inflammatory morbidity and mortality. We demonstrate the age-independent, negative impact of DNMT3A mutant CH on COVID-19-related cytokine release severity and mortality. Using single cell proteogenomics we show that DNMT3A mutations involve myeloid and lymphoid cells. Using single cell multiomics sequencing, we identify cell-specific gene expression changes associated with DNMT3A mutations, along with significant epigenomic deregulation affecting enhancer accessibility, resulting in overexpression of IL32, a proinflammatory cytokine that can result in inflammasome activation in monocytes and macrophages. Finally, we show with single cell resolution that the loss of function of DNMT3A is directly associated with increased chromatin accessibility in mutant cells.

Conclusions
We demonstrate the negative prognostic impact of DNMT3A mt CH on COVID-19 related inflammatory morbidity and mortality. DNMT3A mt CH involves myeloid and lymphoid cells and, in the context of COVID-19, was associated with inflammatory transcriptional priming, resulting in overexpression of IL32. This overexpression was secondary to increased chromatin accessibility, specific to DNMT3A mt CH cells. DNMT3A mt CH can serve as a potential biomarker for adverse inflammatory outcomes.

INTRODUCTION

DNMT3A and TET2 are key epigenetic regulator genes with opposing effects on DNA methylation. While DNMT3A is responsible for the de novo conversion of cytosine (C) to methylcytosine (mC), resulting in gene silencing, TET2 catalyzes the conversion of mC to 5-hydroxy-mC and subsequent oxidative metabolites, resulting in gene activation.
1 DNMT3A and TET2 are also the two most frequently mutated genes in age related clonal hematopoiesis (CH; >70%) and, in spite of opposing epigenetic effects, have a convergent impact with regards to hematopoietic stem and progenitor cell (HSPC) fitness, inflammaging, low rates of progression to hematological neoplasms and increased all-cause mortality, largely due to atherosclerotic cardiovascular disease (ASCD). 2 , 3 , 4 In CH, while TET2 mutations have been associated with a myeloid lineage bias, DNMT3A mutations have a broader distribution, affecting myeloid and lymphoid lineage cells. 5

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the pathogen responsible for coronavirus disease 2019 (COVID-19), has resulted in an ongoing pandemic associated with significant morbidity, mortality, and long-term sequelae. 6 , 7 , 8 While infections can vary from asymptomatic carrier states to severe cytokine release syndrome (CRS), acute respiratory distress syndrome (ARDS) and associated multi organ dysfunction syndrome (MODS), the reasons for this clinical heterogeneity have only partially been investigated, with several viral (strain type) and host factors (age, comorbidities, immune status, ACE2 receptor polymorphisms, among others) jointly being involved. 9 In a large cohort of 515 patients with COVID-19, CH was associated with severe COVID-19 outcomes, including increased mortality. 10 However, many of these patients had underlying visceral malignancies, with several receiving chemo- / immunotherapy, and except for a non-significant trend with PPM1D mutations, there were no clear mutational associations detected. 10 In a subsequent study of non-cancer patients, CH was identified in 33% of 568 patients affected by COVID-19, with DNMT3A and TET2 mutant (mt) CH being most frequent. In this study, neither the presence of DNMT3A/TET2 mt CH nor the variant allele fractions (VAF) impacted COVID-19 related outcomes. 11
Studies with smaller sample sizes and varying COVID-19 severity have demonstrated similar prevalence rates of CH, with no clear impact on outcomes. 12 , 13 Autosomal mosaic chromosomal abnormalities, considered a form of CH, have been documented across large biobanks (n = 768,762) and have been associated with increased infections, including a higher incidence of sepsis, respiratory tract infections, gastrointestinal and genitourinary infections. 14

In a cohort of 243 community patients, we demonstrate an age- and comorbidity-independent adverse impact of DNMT3A mt CH on COVID-19 related inflammatory outcomes and overall survival (OS). DNMT3A mt CH was the most frequent CH subtype, with DNA methylation studies showing a decrease in CpG site DNA methylation, mostly in distal enhancer-like elements, compared to other CH genotypes and wildtype cases. Single cell proteogenomics indicated that DNMT3A mutations were distributed across lymphoid and myeloid lineage cells, unlike in TET2 mt CH, where the mutations were enriched in monocytes/macrophages. Single-cell transcriptomics demonstrated an increased expression of IL32, originating from NK and T lymphocytes and to some extent from classical monocytes. IL32 is a proinflammatory cytokine that can lead to monocyte/macrophage related inflammasome activation and production of cytokines such as TNF-alpha, IL8 and MIP-2. 15 Using proximity extension assay-based proteomics (O-link), we demonstrate a negative impact of IL32 expression levels on mortality. Finally, using a combination of single cell multiome and genotyping of targeted loci with chromatin accessibility (GoTChA), we identify putative epigenetic mechanisms regulating this response. These data highlight the role of DNMT3A mt CH in enhancing immune cell dysregulation in the context of COVID-19 infection, which may account for increased disease severity.
RESULTS

DNMT3A mt CH in patients with COVID-19 is associated with increased severity of cytokine release and increased age- and comorbidity-independent mortality

A cohort of 243 community-based patients with COVID-19 (alpha strain, pre-immunization era) was included in the study, with a median age of 60 years (range 19–99 years), of which 72 (29.6%) patients had evidence of CH (Supp. table 1 and Fig. 1 A). Apart from the fact that patients with both COVID-19 and CH were older (median age 68.5 years for COVID-19 with CH versus 57 years for COVID-19 without CH; p < 0.0001), there were no significant differences in sex (p = 0.82), race/ethnicity (p = 0.07), hospitalization rates (p = 0.99), oxygen requirements (p = 0.79), or incidence of CRS (p = 0.53). There were differences in the distribution of comorbidities between the two groups (p = 0.008), with the non-CH group having a higher frequency of obesity (Supp. table 1). There were, however, no differences between the two groups in baseline blood indices (mean corpuscular volume and red cell distribution width), hemoglobin levels (p = 0.099), absolute neutrophil counts (ANC; p = 0.15), absolute monocyte counts (AMC; p = 0.86), or platelet counts (p = 0.41) (Supp. table 2). Apart from elevated MCP-1 (monocyte chemoattractant protein-1) levels obtained at COVID-19 diagnosis in the COVID-19 CH cohort compared to the COVID-19 cohort without CH (p = 0.045), there were no other significant differences in clinically measured cytokines / chemokines, including IL1b, IL6, GM-CSF and TNF-alpha, or inflammatory surrogates like C-reactive protein (p = 0.087) and serum ferritin (p = 0.62) (Supp. table 2 and 3). Ninety-seven CH mutations were seen in 72 patients, with 21 (29%) having 2 CH mutations, 1 (1.3%) having 3 CH mutations and 2 (2.7%) having 4 CH mutations (Fig. 1 A).
While none of these patients underwent bone marrow (BM) biopsies, they did not have an underlying diagnosed hematological neoplasm at the time of COVID-19 detection and none of these patients demonstrated disease evolution at last follow-up (median 27 months). The most common CH mutations encountered included DNMT3A (n = 30, 30%), TET2 (n = 26, 28%), ASXL1 (n = 7, 9.7%), SF3B1 (n = 6, 8%), TP53 (n = 3, 4%) and PPM1D (n = 3, 4%) (Fig. 1 A). None of the TP53 and PPM1D mutated patients had prior exposures to cytotoxic chemotherapy or ionizing radiation therapy. Four patients had 2 DNMT3A mutations, 2 patients had 2 TET2 mutations, and one patient had both a TET2 and a DNMT3A mutation. There were 4 patients with DNMT3A R882 hot spot mutations, commonly seen in AML and MDS. 16 Patients with CH and COVID-19 had higher grades of CRS in comparison to those without CH as documented by CTCAE v5.0 criteria (p = 0.006; Fig. 1 B) and by the WHO COVID-19 severity criteria (p = 0.023). 17, 18 There were no significant differences between the CH and no-CH groups with regards to incidence rates of acute lung injury (p = 0.8), ARDS (p = 0.49), acute kidney injury (p = 0.5), MODS (p = 0.09) and venous thromboembolism (p = 0.76) (Supp. table 4). We then compared these outcomes in the two most common CH mutant groups, DNMT3A and TET2. DNMT3A mt CH patients had a higher frequency of ARDS in comparison to TET2 mt CH (p = 0.007; Fig. 1 C). The DNMT3A mt CH group also had significantly higher levels of MCP-1 in comparison to the TET2 mt CH group (p = 0.014; Fig. 1 D). Both groups received similar therapeutic interventions including remdesivir, IL-6 directed monoclonal antibody therapies (tocilizumab and siltuximab), corticosteroids, and access to clinical trials (p = 0.45, Supp. table 4). At last follow-up (median 27 months), 16 deaths (6.5%) had been documented, 10 (4%) in COVID-19 patients with CH and 6 (2.4%) in COVID-19 patients without CH.
On univariate and multivariate survival analyses that included several clinical and laboratory variables, the presence of CH negatively impacted OS in patients with COVID-19 (p = 0.001, Fig. 1 E). The one-month estimated survival rate was 97.6% for COVID-19 patients without CH and 81.8% in COVID-19 patients with CH (median OS not reached versus 13 months). We then analyzed the impact of DNMT3A and TET2 mutations (Supp. Figure 1), the two most common somatic mosaic states in our cohort, on COVID-19 related morbidity and mortality. There were no significant differences between the two mutational cohorts regarding age and other comorbidities. While both TET2 mt CH and DNMT3A mt CH negatively impacted survival, after adjustment for age and comorbidities, only DNMT3A mt CH retained an independent prognostic effect (p < 0.001; Fig. 1 F). We hence demonstrate the prevalence, clinical characteristics, and the age- and comorbidity-independent impact of DNMT3A mt CH on inflammatory morbidity and overall mortality in community dwelling patients infected by the alpha strain of SARS-CoV-2.

DNMT3A mt CH in the context of COVID-19 is associated with decreased DNA methylation at CpG residues in contrast to patients with TET2 mt CH and COVID-19

Given that DNMT3A and TET2 have opposing impacts on DNA methylation, we first assessed DNA methylation status using the Illumina Infinium Methylation EPIC array on peripheral blood mononuclear cells (PBMC) from the COVID-19 and CH cohort. DNMT3A mutations have been associated with DNA hypomethylation at key enhancer sites in granulocytes and mononuclear cells in patients with CH, with these elements known to regulate leukocyte function, inflammation, and adaptive immune responses. 19 We included 7 patients with CH and COVID-19 (TET2 mt, n = 4; DNMT3A mt, n = 3). Even though there were no significant global changes in DNA methylation between the two groups (p = 0.057, Fig.
2 A), DNMT3A mt patients with COVID-19 demonstrated decreased methylation at highly methylated CpG sites (β > 0.75) (Kolmogorov-Smirnov p < 2.2 × 10^-16; Fig. 2 B). 20 Site-specific differential methylation analysis also revealed an increased number of hypomethylated sites in DNMT3A mt patients with COVID-19 in comparison to TET2 mt patients with COVID-19, with 10,944 hypomethylated sites and 1,160 hypermethylated sites (Δβ > 0.1 and p < 0.01; Fig. 2 C). We then annotated the differentially methylated regions using the ENCODE Epigenomics Roadmap reference data. 21 We found that actively transcribed states (Tx, TxWk) were more commonly hypomethylated in DNMT3A mt CH compared to TET2 mt CH. Even though there were fewer hypermethylated sites, these were more common at enhancers (Enh) and promoters (TssA, TssAFlnk) (Fig. 2 D). Pathway analysis revealed that the hypomethylated sites are in or near genes involved in many diseases and functions related to inflammation and immune response, especially in leukocyte function (Supp. Figure 2 A, B). Hence, we demonstrate site specific differential methylation between DNMT3A mt CH and TET2 mt CH in patients with COVID-19, with more prominent hypomethylation occurring in actively transcribed regions in DNMT3A mt CH.

Single cell proteogenomics reveal that DNMT3A mutations involve myeloid and lymphoid cell lineages, unlike TET2 mutations which bias hematopoiesis towards monocytosis

We carried out comprehensive proteogenomic assessments at single-cell resolution on PBMC, on 5 patients with COVID-19 and CH (DNMT3A mt, n = 3; TET2 mt, n = 1; DNMT3A and TET2 co-mutant, n = 1) and 8 patients with COVID-19 and no CH (Fig. 3 A).
Given that the single cell DNA assay is an amplicon based assay and the fact that mutations in TET2 do not have common hot spot regions, we did have TET2 mt patients in our cohort where the mutant regions were not covered by the amplicons used in our assay, limiting the number of COVID-19 + TET2 mt CH cases that we could genomically profile at the single cell level. 22 Two of 3 (66%) COVID-19 + DNMT3A mt patients had 2 DNMT3A mutations each, while the COVID-19 + TET2 mt patient also had a concomitant CBL mutation, with normal blood counts and no monocytosis. In total, we included 28,941 single cells in the final analysis, after rigorous quality control and exclusion of cells with allele drop out, as previously described (Fig. 3 B and methods section). 23 Of these 28,941 sequenced cells, 2,004 (6.9%) had detectable CH mutations, of which 1,811 (90%) were DNMT3A mt, 361 (18%) were TET2 mt (Fig. 3 C) and 168 (8%) were co-mutated with both TET2 mt and DNMT3A mt. In comparison to TET2 mt CH, where CH mutations were largely present in classical and intermediate monocytes, in DNMT3A mt CH, the mutations were commonly seen in lymphoid lineage cells including CD4 + and CD8 + T-lymphocytes, T-regulatory cells and gamma delta T cells (Fig. 3 D-E), a lineage bias that has previously been described. 5 Given the smaller sample size of CH mutant cells in the COVID-19 cohort, we conducted single cell proteogenomics on an additional 4 patients with CH ( Supp. Figure 3 A) who did not have COVID-19 and cumulatively re-analyzed the data. Of 36,557 single cells successfully sequenced, 3,314 (9%) had detectable CH mutations ( Supp. Figure 3 B-C). Among the mutated cells 1,503 (45%) were TET2 mt, 1,643 (50%) were DNMT3A mt, and 168 (5%) were co-mutant cells ( Supp. Figure 3 C). 
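As a quick arithmetic check on the pooled cell counts reported above, the sketch below recomputes the genotype fractions from the raw numbers. It is only an illustrative Python snippet, not part of the study's analysis pipeline; the function name `genotype_percentages` and the dictionary layout are assumptions introduced here.

```python
def genotype_percentages(counts):
    """Return each genotype's share of all mutant cells, rounded to whole percent."""
    total = sum(counts.values())
    return {genotype: round(100 * n / total) for genotype, n in counts.items()}


# Counts taken from the pooled COVID-19 and non-COVID-19 analysis in the text.
total_sequenced = 36557
mutant_cells = {"TET2": 1503, "DNMT3A": 1643, "co-mutant": 168}

# Total number of cells carrying any detectable CH mutation.
total_mutant = sum(mutant_cells.values())
# Share of all sequenced cells that are mutant, in whole percent.
mutant_share = round(100 * total_mutant / total_sequenced)
# Share of each genotype among the mutant cells.
shares = genotype_percentages(mutant_cells)

print(total_mutant, mutant_share, shares)
```

The three genotype counts sum to the 3,314 mutant cells quoted in the text, and the rounded shares agree with the reported 9% overall and 45%, 50% and 5% per genotype.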
In this larger data set we once again demonstrate an enrichment of TET2 mutations in classical and intermediate monocytes, while DNMT3A mutations were commonly seen in lymphoid and myeloid lineage cells, especially T-lymphocytes (Supp. Figure 3 C-E). Based on these findings, we conclude that DNMT3A mutations are distributed in myeloid and lymphoid lineage cells, whereas TET2 mutations have a clear myeloid (monocytic) biased distribution.

Single cell transcriptomic analysis of DNMT3A and TET2 mutant patient samples during inflammatory response

To explore the underlying differences in expression between DNMT3A and TET2 mutants in the context of SARS-CoV-2 infection we used single cell RNA-seq from PBMC from a cohort of 24 patients, which included 15 patients with COVID-19 and no CH, 6 COVID-19 patients with TET2 mutations and 3 COVID-19 patients with DNMT3A mutations (Fig. 4 A). By pooling 78,083 cells from all 24 patients and subjecting them to dimensionality reduction using PCA and UMAP, followed by cell type identification using SingleR (see methods section and Supp. Figure 4 A), we identified the typical repertoire of lymphoid and myeloid cells (Fig. 4 B-C). For the comparison of the different cell types, we also included published data from healthy individuals separated into two age groups (under and over 50 years). 24 The most noticeable difference was the enrichment of classical and intermediate monocytes in patients with TET2 mutations, while patients with DNMT3A mutations showed enrichment of CD8 + and gamma delta T lymphocytes, plasma blasts, NK cells and non-classical monocytes (Fig. 4 C and Supp. Figure 4 B).
To further investigate a potential association of DNMT3A mutations with COVID-19 severity, we performed differential gene expression analysis within each cell type, first comparing COVID-19 + DNMT3A mt CH cells with COVID-19 + no CH cells, and found an overall increase in IL32 expression in CD4 + and CD8 + T lymphocytes, regulatory T cells and NK cells (Fig. 4 D), a potential biomarker of severity that has not been documented in plasma samples from patients with COVID-19. 25 We then compared COVID-19 + DNMT3A mt CH cells with COVID-19 + TET2 mt CH cells and found the same cell type specific overexpression pattern for IL32, as described above (Fig. 4 E, F). Pathway analysis of differentially expressed genes in this comparison revealed an enrichment of genes involved in lymphocyte proliferation, migration of blood cells, cytotoxicity of lymphocytes and NK cells and joint inflammation (Supp. Figure 4 C). Given that IL32 has not specifically been implicated in COVID-19 severity, we analyzed the abundance of cytokines in patients with COVID-19 from our cohort of 223 assessable patients using a multiplex proteogenomic panel (O-link; see methods section for details). There were no significant differences in IL32 levels between DNMT3A mt and TET2 mt CH COVID-19 patients, or between COVID-19 patients with and without CH. However, when we assessed this cohort for COVID-19 related morbidity and mortality by relative levels of IL32, we found that COVID-19 patients with higher IL32 levels had a higher mortality in comparison to those with lower levels (Supp. Figure 4 D). Here we demonstrate a lymphoid lineage enrichment for DNMT3A mt CH in comparison to TET2 mt CH, along with an inflammatory transcriptional signature resulting in overexpression of IL32 in DNMT3A mt CH, with higher IL32 protein levels correlating with increased mortality in the context of COVID-19.
Epigenetic up-regulation of IL32 occurs due to increased chromatin accessibility of a transcriptional program seen in CH patients with DNMT3A mutations

Since DNMT3A and TET2 are known to regulate chromatin accessibility with opposing effects across the genome, we conducted single-cell profiling of both gene expression (scRNA-seq) and open chromatin (scATAC-seq) from the same PBMC samples using the 10X Genomics Multiome platform (see methods for details). 26 We selected a cohort of 11 COVID-19 patients, which included 6 patients without CH, 3 patients with TET2 mt CH and 2 patients with DNMT3A mt CH (Supp. Figure 5 A). We pooled 25,725 single cells and performed dimensionality reduction using PCA and UMAP analysis, followed by cell type identification by mapping the expression data from Multiome onto the scRNA-seq data and transferring labels using Azimuth (Supp. Figure 5 C). From each cell type, we were able to gather expression signatures and open chromatin profiles, expressed in cut site counts (number of sites cut by the Tn5 transposase, a direct measure of accessibility of chromatin to the transposase). Using this technology, we were able to validate the lymphoid lineage enrichment in DNMT3A mt CH, comprising CD4 + T-lymphocytes, regulatory T cells, B-lymphocytes, and plasma blasts, in comparison to TET2 mt CH, where monocytic enrichment was more prominent (classical and intermediate monocytes, dendritic cells and CD8 + T-lymphocytes; Fig. 5 A, Supp. Figure 5 B). As chromatin accessibility is reflective of the active enhancer and promoter structure and is strongly associated with DNA methylation status, we aimed to provide mechanistic insights on the deregulation of the epigenetic landscape in DNMT3A mt CH in comparison with TET2 mt CH.
Notably, analysis of the global distribution of cut sites and differentially accessible peaks showed that there was increased chromatin accessibility in DNMT3A mt CH, especially in the two main cell types overexpressing IL32, CD4 + T lymphocytes and NK cells (Fig. 5 B, Supp. Figure 5 D). Moreover, co-accessibility analysis revealed an enrichment of cis-regulatory interactions associated with expression of up-regulated genes such as IL32 in DNMT3A mt CH cells by identifying several candidate enhancers linked to the transcription start site of IL32 (Fig. 5 C). 27 The epigenomic landscape around IL32 containing the ENCODE cis-regulatory elements mapped with cell specific open chromatin regions identified by scATAC analysis also allowed us to identify overlapping hypomethylated CpG regions in DNMT3A mt CH patients (cg01100763, cg09294055, cg04519177) obtained from bulk PBMC DNA methylation data, suggesting a direct link between loss of methylation and increased chromatin accessibility (Fig. 5 C, D). To understand which transcription factors (TF) are likely to be affected by DNMT3A and TET2 CH mutations, we first performed analysis of differentially accessible TF binding sites by differential enrichment analysis directly from ChIP-seq datasets found in the literature. 28 Expression of some of these transcription factors was also analyzed in the same cell types to ensure that the chromatin accessibility analysis could also reflect changes in expression levels (Supp. Figure 5 E, F). Interestingly, the IRF family of transcription factors was enriched both in transcription (scRNA) and in TF activity (scATAC) in DNMT3A mt CH, suggesting that a specific transcriptional program driven by IRF is involved in the pro-inflammatory response, particularly in CD4 + T lymphocytes.
Finally, to assess if the epigenetic dysregulation in DNMT3A mt CH can be traced back to the mutant cells, we performed genotyping of targeted loci with single-cell chromatin accessibility (GoTChA) in two patient samples with high VAF for a known DNMT3A loss of function mutation (Supp. Figure 6–8). 29 Comparing the number of cut sites in wild type and mutant cells from the same sample revealed a similar pattern of increased open chromatin in DNMT3A mt CH cells both globally (Fig. 5 E) and in a genomic locus specific manner, at CpG sites affected around the IL32 locus (Fig. 5 F). With one exception (Locus A in patient 2), DNMT3A mutations were directly associated with higher ATAC signal, indicating a direct link between loss of function of the DNA methyltransferase activity and increased chromatin accessibility.

DISCUSSION

Clonal hematopoiesis is defined by the acquisition of somatic mutations in HSPC, with the capacity to expand over time, with evolving cell intrinsic and extrinsic selection pressures. 2 , 3 , 4 CH is ubiquitous in the aging population, with studies demonstrating hematopoiesis to be largely oligoclonal in the elderly. 30 DNMT3A and TET2 are the two most common age-related CH mutated genes, with both being critical regulators of DNA methylation. 2 , 3 , 4 , 31 While age related CH is associated with a low risk of hematological neoplasms, its presence is associated with increased all-cause mortality, largely due to cardiovascular disease. 2 , 3 , 4 , 31 , 32 This is believed to be secondary to pervasive inflammatory transcriptional priming and inflammasome activation associated with these mutations. 1 , 33 , 34 While the clinical impact of these two mutations is convergent, their impact on the epigenome is not.
DNMT3A mutations are mostly loss of function mutations that lead to protein instability and loss of methyltransferase activity, resulting in DNA hypomethylation, whereas TET2 mutations are either truncating or hypomorphic, abrogating the catalytic activity of TET2, resulting in DNA hypermethylation. 19 , 33 , 35 , 36 In addition, lineage restriction analysis has shown that while DNMT3A mutations involve myeloid and lymphoid cell lineages, TET2 mutations are more myeloid restricted with a clear myelomonocytic bias. 5 , 37 In this study, using COVID-19 as a model for severe inflammation and applying bulk and single cell multiomics to patient samples obtained prior to the advent of the SARS-CoV-2 vaccine, we demonstrate the negative impact of CH on inflammatory morbidity (CRS) and mortality. We show that while this was accounted for by increasing age in TET2 mt CH patients, DNMT3A mt CH remained an independent adverse prognosticator. While several host and viral factors impacting COVID-19 severity have been described, we demonstrate the relevance of DNMT3A mt CH in this context. 6 , 7 , 8 , 10 , 38 Prior studies have shown a conflicting impact of CH on COVID-19 related morbidity and mortality. 10 , 11 , 12 , 13 While there could be several confounding factors explaining these discrepancies, our study population included unselected community dwelling individuals infected by the alpha strain of SARS-CoV-2, prior to immunization and without any underlying hematological or visceral neoplasms, or immunodeficiency states. The frequency of CH in the COVID-19 cohort was 29.6%, with DNMT3A (30%) and TET2 (28%) being the two most mutated CH genes, consistent with prior observations.
2 , 3 While the DNMT3A and TET2 mt CH groups were well balanced with regards to baseline blood counts and comorbidities, DNMT3A mt CH patients had higher levels of MCP-1, had a higher likelihood of ARDS and had higher grades of CRS as judged with CTCAE v5.0 criteria and by the WHO COVID-19 severity criteria. 17 , 18 There was no further stratification of this effect based on CH mutational VAF or the number of CH mutations. None of the patients included in this cohort, at last follow-up, had evidence of a hematological neoplasm. MCP-1, also called CCL2, is a key chemokine that regulates the migration of monocytes and macrophages in response to inflammation and has been implicated as a biomarker of COVID-19 severity in the recent past. 39 , 40 We then conducted methylation studies using the Illumina Methylation EPIC array in select cases, and while there were no global differences in DNA methylation, site-specific analysis revealed an increased number of hypomethylated sites in DNMT3A mt versus TET2 mt patients with COVID-19. Using the ENCODE Epigenomics Roadmap reference data, 21 we demonstrate that actively transcribed states (Tx, TxWk) were more commonly hypomethylated in DNMT3A mt CH compared to TET2 mt CH, with pathway analysis revealing that the hypomethylated sites were in or near genes involved in several diseases and functions related to inflammation. This is consistent with prior observations assessing DNA methylation on whole blood samples in patients with CH, CH associated cytopenias and AML. 19 Given the lack of a scalable single cell methylation assay, we were not able to validate these findings at the single cell level, appropriately adjusting for somatic mosaicism and duly acknowledge this limitation. 
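To make the site-level comparison concrete, the following Python sketch shows one way a CpG site could be labeled hypo- or hypermethylated from per-sample beta values, using the thresholds quoted in the Results (a beta difference above 0.1 and p < 0.01), together with a plain two-sample Kolmogorov-Smirnov statistic of the kind used to compare beta-value distributions. It is an illustration under stated assumptions rather than the authors' pipeline: the function names, the toy CpG identifiers and beta values, the use of the absolute difference in group means, and the externally supplied p-values are all hypothetical.

```python
from bisect import bisect_right


def classify_sites(beta_a, beta_b, p_values, min_delta=0.1, max_p=0.01):
    """Label each CpG site by the mean beta difference (group A minus group B).

    Assumption: a site is called only if |delta beta| > min_delta and p < max_p.
    'hypo' means lower methylation in group A than in group B.
    """
    calls = {}
    for site, p in p_values.items():
        mean_a = sum(beta_a[site]) / len(beta_a[site])
        mean_b = sum(beta_b[site]) / len(beta_b[site])
        delta = mean_a - mean_b
        if p < max_p and abs(delta) > min_delta:
            calls[site] = "hypo" if delta < 0 else "hyper"
        else:
            calls[site] = "ns"  # not significant under these thresholds
    return calls


def ks_statistic(x, y):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    d = 0.0
    for v in xs + ys:
        cdf_x = bisect_right(xs, v) / len(xs)
        cdf_y = bisect_right(ys, v) / len(ys)
        d = max(d, abs(cdf_x - cdf_y))
    return d


# Toy beta values (hypothetical): group A = DNMT3A mt, group B = TET2 mt.
beta_dnmt3a = {"cg_a": [0.60, 0.62, 0.58], "cg_b": [0.80, 0.82, 0.81],
               "cg_c": [0.30, 0.32, 0.31]}
beta_tet2 = {"cg_a": [0.85, 0.88, 0.86, 0.87], "cg_b": [0.79, 0.80, 0.81, 0.82],
             "cg_c": [0.10, 0.12, 0.11, 0.09]}
p_vals = {"cg_a": 0.001, "cg_b": 0.5, "cg_c": 0.002}  # assumed precomputed

calls = classify_sites(beta_dnmt3a, beta_tet2, p_vals)
print(calls)
```

In this toy input, `cg_a` is called hypomethylated in the DNMT3A mt group, `cg_c` hypermethylated, and `cg_b` is left uncalled. A real analysis would obtain the p-values from a dedicated differential methylation model and would derive a p-value for the KS statistic as well; both steps are omitted here.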
We then assessed the distribution of DNMT3A and TET2 mt CH in the COVID-19 cohort using single-cell proteogenomics and validated observations from prior lineage restriction analyses that while DNMT3A mt CH involved myeloid and lymphoid lineage cells, TET2 mt CH had a clear myeloid restriction, with a myelomonocytic bias. 5 These observations were also validated with single cell RNA and multiome-seq data. TET2 mutations were enriched in classical and intermediate monocytes, reflective of a granulocyte monocyte-biased hematopoiesis (GMP-bias) and classical monopoiesis, which has been well documented in TET2-driven hematological neoplasms such as chronic myelomonocytic leukemia and might explain the differential impact on inflammatory morbidity and mortality seen in the context of COVID-19. 41 , 42 , 43 Differential gene expression analysis demonstrated an overall increase in IL32 expression in CD4 + and CD8 + T lymphocytes, regulatory T cells and NK cells in patients with DNMT3A mt CH versus those without CH and those with TET2 mt CH. Pathway analysis of differentially expressed genes revealed an enrichment of genes involved in lymphocyte proliferation, migration of blood cells, cytotoxicity of lymphocytes and NK cells and joint inflammation. On O-link based cytokine analysis, while there were no significant differences in IL32 levels between DNMT3A mt and TET2 mt CH COVID-19 patients, or between COVID-19 patients with and without CH, patients with relatively higher IL32 levels had a higher mortality than those with lower levels. IL32 is a proinflammatory cytokine, initially detected in activated NK cells and T-lymphocytes, whose expression is strongly enhanced by microbes, mitogens, and inflammatory stimuli.
44 It can amplify production of other inflammatory cytokines including IL1b, IL6 and TNF-alpha and has not been reported as a biomarker of severity in COVID-19, or CH, thus far. We speculate that IL32 expression in DNMT3A mt CH is enhanced in the context of inflammatory stimuli such as COVID-19. To better understand the regulatory mechanism behind IL32 overexpression in DNMT3A mt CH, we conducted single cell multiome profiling using the 10X Genomics Multiome platform. On a global distribution analysis of cut sites and differentially accessible peaks, we found increased chromatin accessibility in DNMT3A mt CH, especially in CD4 + T lymphocytes and NK cells, the two cell types with predominant IL32 overexpression. Co-accessibility analysis revealed an enrichment of cis-regulatory interactions associated with expression of IL32 in DNMT3A mt CH cells, identifying candidate enhancers linked to the transcription start site of IL32. We found that the IRF family of transcription factors was enriched both in transcription and in TF activity in DNMT3A mt CH, particularly in CD4 + T lymphocytes. Finally, to address issues with somatic mosaicism, we performed genotyping of targeted loci with single-cell chromatin accessibility (GoTChA) in two patients with DNMT3A mt clonal cytopenias and found a similar pattern of increased open chromatin in DNMT3A mt CH cells, both globally and in a genomic locus specific manner, at CpG sites affected around the IL32 locus.

CONCLUSION

In summary, our results validate DNMT3A mt CH as an age and comorbidity independent risk factor for severe COVID-19. DNMT3A mt CH involves myeloid and lymphoid lineage cells and, in the context of inflammatory stimuli, results in the overexpression of IL32, a highly proinflammatory cytokine, predominantly in T and NK cells.
This regulation is mediated in part by DNMT3A mt-associated changes in chromatin accessibility, allowing transcription factors such as the IRF family to mediate transcriptional activation at IL32 promoter sites.

METHODS

Authorizations, patient cohorts, cell collection and sorting

This study was conducted at the Mayo Clinic in Rochester, Minnesota, after approval from the Mayo Clinic Institutional Review Board (IRB #20-005400, IRB #16-004173). In all cases, diagnosis was according to the 2016 iteration of the WHO classification of myeloid malignancies. 45 Peripheral blood (PB) and bone marrow (BM) samples were collected in EDTA tubes after informed consent.

Target Capture Chip Assay for Bulk Sequencing

Sample DNA was extracted from peripheral blood mononuclear cells isolated by gradient centrifugation and re-suspended at a concentration of 500 ng in 50 µL of low TE buffer. Paired-end indexed libraries were prepared using the SureSelect XT Low Input Library prep protocol on the Agilent Bravo liquid handler following the manufacturer’s protocol (New England Biolabs, Ipswich, MA, and Agilent Technologies, Ankeny, IA). Briefly, 200 ng of target DNA was fragmented using the Covaris LE220 plus sonicator. The settings (duty factor 30%, peak incident power (PIP) 450, cycles per burst 200, time 180 seconds) generated double-stranded DNA fragments with blunt or sticky ends and a fragment size mode of 150-200 bp. The ends were repaired using the SureSelect End-Repair-A-Tailing enzyme mix. Adapter-ligated DNA fragments were size-selected to enrich for 200 bp inserts (~ 320 bp total library size) using AMPure XP bead purification. The size-selected adapter-modified fragments were enriched, and specific indexes were added by 12 cycles of PCR using universal index primers. The concentration and size distribution of the libraries were determined on an Agilent Bioanalyzer DNA 1000 chip. 
The Custom Capture hybrid-target enrichment probes were designed using Agilent SureSelect design software (Agilent Technologies, Santa Clara, CA). The targeted gene panel comprised 62,962 single probes with a total size of 1.805 Mbp, and covered the coding regions, UTRs, and overlapping intron/exon regions for 205 genes described and / or enriched for CH mutations (see Appendix: Target Gene Panel). The custom capture was carried out using the Agilent Bravo liquid handler following Agilent’s SureSelect XT Low Input protocol. Purified capture products were then amplified using the SureSelect Post-Capture primer mix for 14 cycles. Libraries were validated and quantified on the Agilent Bioanalyzer. Samples were sequenced as 150 bp paired-end reads, 21 samples to a flow cell, on an Illumina NovaSeq 6000 SP (Illumina, San Diego, CA). Secondary bioinformatics analysis included quality assessment and alignment to the hg19 build reference genome using Novoalign (Novocraft Technologies, Malaysia), followed by GATK-based single nucleotide and small insertion / deletion variant calling, structural variation discovery, and annotation. The quality of sequencing chemistry was evaluated using FastQC. After alignment, PCR duplication rates and percent reads mapped on target were used to assess the quality of the sample preparations. Realignment and recalibration steps were implemented in the GATK. 46 Somatic single nucleotide variations (SNVs) were then genotyped using SomaticSniper, whereas insertions and deletions were called by GATK Somatic Indel Detector. 46 , 47 Each variant in coding regions was functionally annotated by SnpEff, SAVANT, ClinVar, dbNSFP, OMIM, and the Human Gene Annotation Database to predict biological effects. 48 , 49 , 50 , 51 , 52 , 53 Each variant was also annotated with allele frequency from the Exome Aggregation Consortium. 54 Variants of significant interest were visually inspected using IGV. 
55 The total list of all variants was internally compared for internal duplicates indicative of false positives, and variants of concern are additionally hand annotated for identification using Alamut Software. Interpretation for relevant alterations included absence in international normal variant allele databases (GnomAD, ExAC), deleterious effect on protein function by multiple phenotype prediction models, somatic and functional annotation in literature, consequence of variant (nonsense, truncating, etc.) and location proximal to important domains. Single-Cell Proteogenomics Single-cell DNAseq + proteogenomics was performed using the Mission Bio Tapestri platform according to the manufacturer’s specifications. Briefly, a cocktail of pre-titered oligo-linked antibodies targeting 42 unique cell-surface markers and 3 antibodies for isotype control, TotalSeq™ D Human Heme Oncology Kit (BioLegend®, San Diego, CA [ https://www.biolegend.com/en-us/totalseq/single-cell-dna ]) was utilized to capture the cell immunophenotype, along with targeted myeloid scDNAseq genotyping panel targeted against 45 genes with 312 amplicons covering relevant genes dysregulated in myeloid disorders (designed and manufactured by Mission Bio, San Francisco, CA [ https://designer.missionbio.com/catalogpanels/Myeloid ]). See Appendices: Myeloid Panel Coverage and Myeloid Panel Design for specific genomic coordinates. Cryopreserved PBMC patient samples were gently thawed, washed, and run through a Dead Cell Removal Kit (Miltenyi, Auburn, CA). The resulting viable cell fraction was counted, blocked with Human TruStain FcX™ (BioLegend®) and re-suspended at a concentration of 25,000 cells/µL. This fraction was incubated with BioLegend TotalSeq Kit Cocktail described, washed, and filtered through a Flowmi cell strainer (MilliporeSigma, St. Louis, MO), and quantified using a Countess II cell counter. The cells were then diluted to a concentration of 4,000 cells per µL in Cell Buffer. 
Next, 35 µL of cell suspension was loaded onto a microfluidics cartridge and cells were encapsulated on the Tapestri instrument, followed by cell lysis with protease digestion and heat inactivation using a thermal cycler. The cell lysate was reintroduced into the cartridge and processed such that each cell possessed a unique molecular barcode. Amplification of the targeted DNA regions was performed by incubating the barcoded DNA emulsions in a thermocycler as follows: 98°C for 6 min (4°C per s); ten cycles of 95°C for 30 s, 72°C for 10 s, 61°C for 9 min, 72°C for 20 s (1°C per s); ten cycles of 95°C for 30 s, 72°C for 10 s, 48°C for 9 min, 72°C for 20 s (1°C per s); and 72°C for 6 min (4°C per s). Emulsions were broken, and the DNA was digested and purified with 0.72X AMPure XP beads (Beckman Coulter). The beads were pelleted and washed with 80% ethanol, and the final libraries were generated by amplification with Mission Bio V2 index primers in the thermocycler using the following program: 95°C for 3 min; 10 cycles (DNA libraries) or 16–20 cycles (protein libraries) of 98°C for 20 s, 62°C for 20 s, 72°C for 45 s; and 72°C for 2 min. Final libraries were purified with 0.69X AMPure XP beads. All libraries were sized and quantified using an Agilent Bioanalyzer and pooled for sequencing on an Illumina NovaSeq 6000 SP with 2 x 150 bp multiplexed runs. FASTQ files generated by the sequencers were processed using the Tapestri Pipeline V2, which handles adapter trimming, sequence alignment (BWA), barcode correction, cell finding and variant calling (using GATK 4.1.7 HaplotypeCaller). Generated loom and H5 files were then processed with Tapestri Insights v2.2 (Mission Bio) and/or the Python-based Mosaic package (GitHub). Tapestri Insights analysis used default filter criteria (for example, genotype quality ≥ 30 and reads per cell per target ≥ 10) or whitelisting of known variants and annotation-based information (including ClinVar and DANN). 
Only cells with complete genotype information for all variants (previously detected in the bulk sample) were included for downstream processing.

Cell type identification for the proteogenomics platform

Rigorous quality control was performed on antibody-derived tag (ADT) count profiles of the Tapestri single-cell proteogenomic data. We first filtered out cells with fewer than 200 or more than 50,000 total ADT counts, and then removed cells with fewer than 10 unique measured ADTs. After ADT-based filtering, one sample was excluded from the analysis because fewer than 500 cells remained for that sample. ADT counts for each surface protein were then scaled using counts per 10 million with a pseudocount of + 1 and normalized using the centered-log ratio (CLR) transformation. Subsequently, following step II of the “dsb” (denoised and scaled by background) protocol, normalized ADT profiles of the isotype control antibodies were used to remove cell-to-cell technical noise. 56 As a result, we obtained a normalized and denoised ADT count matrix representing 42 different surface protein expression profiles of 36,557 single cells from 17 patients. This full matrix was used to generate UMAP plots following dimensionality reduction using PCA. To identify cell types for cells profiled by the Tapestri platform, we compared ADT count profiles of the Tapestri data with those of a reference CITE-seq dataset, i.e., the Azimuth reference. 57 Among the 42 cell-surface markers targeted by our Tapestri analysis, 36 were also targeted by the reference CITE-seq analysis. In the CITE-seq analysis, expression of 8 of the 36 shared markers was assayed using two different antibodies targeting the same antigen (CD3, CD4, CD11b, CD38, CD44, CD45, CD56, and CD138). For these antigens, we used geometric means of the ADT counts for both antibodies. CD5, CD7, CD10, CD33, CD62L, and FceRIa were the markers that were targeted by the Tapestri platform but not by the reference CITE-seq analysis. 
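The ADT filtering and scaling steps described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, and it assumes the CLR transformation is taken across the markers of each cell (the text does not state whether the centering is per cell or per marker). The dsb denoising step is not reproduced here.

```python
import math

def passes_adt_qc(counts, min_total=200, max_total=50_000, min_detected=10):
    """Return True if one cell's ADT count vector passes the stated thresholds.

    Cells with fewer than 200 or more than 50,000 total ADT counts, or with
    fewer than 10 detected (non-zero) ADTs, are rejected.
    """
    total = sum(counts)
    detected = sum(1 for c in counts if c > 0)
    return min_total <= total <= max_total and detected >= min_detected

def clr_normalize(counts, scale=10_000_000, pseudocount=1.0):
    """Scale to counts per 10 million, add a pseudocount of 1, then apply CLR.

    Assumption: the geometric mean used for centering is computed across the
    markers of this single cell.
    """
    total = sum(counts)
    scaled = [c / total * scale + pseudocount for c in counts]
    logs = [math.log(v) for v in scaled]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [v - mean_log for v in logs]

# Toy cell with 12 hypothetical markers (values are illustrative only)
cell = [50, 0, 120, 30, 10, 5, 80, 60, 40, 20, 15, 25]
if passes_adt_qc(cell):
    normalized = clr_normalize(cell)
```

By construction the CLR values of a cell sum to zero, which is a quick sanity check when adapting the sketch to a real count matrix.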
ADT count matrices of the 36 shared markers for the Tapestri and CITE-seq analyses were scaled, normalized, and denoised as described above (for denoising CITE-seq data, 3 IgG antibodies were selected as isotype controls). Subsequently, for each cell from the Tapestri data, the 20 nearest neighbor cells from the CITE-seq data were identified using Euclidean distance, and the cell type was called from their identities by majority vote.

Single-cell profiling of PBMCs (single-cell RNA-seq and 10X Genomics Multiome)

Frozen PBMCs were thawed in a 37°C water bath for 3–5 min until no ice was visible. Cells were washed twice with 1 mL PBS + 0.04% BSA and pelleted (300×g for 5 min at 4°C). Dead cells were removed according to the Demonstrated Protocol: Removal of Dead Cells from Single Cell Suspensions for Single Cell RNA Sequencing (10X Genomics). Using the MACS Dead Cell Removal Kit (Miltenyi Biotec), the pellet was resuspended in 100 µL Dead Cell Removal MicroBeads and incubated for 15 minutes at room temperature. After incubation, the cell suspension was diluted with 1X Binding Buffer and applied to an MS column. The dead cells were retained on the column, and the live cells passed through the column and were collected. After dead cell removal, the samples were washed twice with 1 mL PBS + 0.04% BSA and the cell concentration was determined using a Cellometer K2 cell counter (Nexcelom Biosciences). The cells were then aliquoted for scRNA-seq and Multiome (see Methods section).

Single-cell RNA-seq

The cells were first counted and measured for viability using either the Vi-Cell XR Cell Viability Analyzer (Beckman-Coulter, Brea, CA) or a basic hemocytometer and light microscope. The barcoded Gel Beads were thawed from − 80°C and the cDNA master mix was prepared according to the manufacturer’s instructions for the Chromium Next GEM Single Cell 3’ Library and Gel Bead Kit (10x Genomics, Pleasanton, CA). 
Based on the desired number of cells to be captured for each sample, a volume of live cells was mixed with the cDNA master mix. A per sample concentration of 400,000 cells per milliliter or better is required for the standard targeted cell recovery of approximately 4000 cells. The stock concentration requirements would not change for higher cell recovery numbers. The cell suspension and master mix, thawed Gel Beads and partitioning oil were added to a Chromium Single Cell G chip. The filled chip was loaded into the Chromium Controller, where each sample was processed and the individual cells within the sample were captured into uniquely labeled GEMs (Gel Beads-In-Emulsion). The GEMs were collected from the chip and taken to the bench for reverse transcription, GEM dissolution, and cDNA clean-up. The resulting cDNA contained a pool of uniquely barcoded molecules. A portion of the cleaned and measured pooled cDNA continued to library construction, where standard Illumina sequencing primers and a 10x Genomics unique i7 Sample index were added to each cDNA pool. All cDNA pools and resulting libraries were measured using Qubit High Sensitivity assays (Thermo Fisher Scientific, Waltham, MA) and Agilent Bioanalyzer High Sensitivity chips (Agilent, Santa Clara, CA). Libraries were sequenced at between 40,000 and 50,000 fragment reads per cell following Illumina’s standard protocol using the Illumina cBot and HiSeq 3000/4000 PE Cluster Kit (Illumina, San Diego, CA). The flow cells were sequenced as 100 X 2 paired end reads on an Illumina HiSeq 4000 HD using HiSeq 3000 / 4000 sequencing kit and HCS v3.4.0.38 collection software. Base-calling was performed using Illumina’s RTA version 2.7.7. Single-cell Multiome After approximately 4,000 cells were aliquoted for scRNA-seq, the remainder were used for single-cell Multiome ATAC + Gene Expression (10X Genomics). 
Nuclei were isolated according to the Demonstrated Protocol: Nuclei Isolation for Single Cell Multiome ATAC + Gene Expression (10x Genomics, CG000365 Rev A). Briefly, cells were added to a 2.0 mL low binding tube and centrifuged (300×g for 5 min at 4°C) using a swinging bucket rotor. The supernatant was removed, and the cell pellet was resuspended in 100 µL of chilled 10x Genomics Lysis Buffer (10 mM Tris-HCl pH 7.4, 10 mM NaCl, 3 mM MgCl 2 , 0.1% Tween-20, 0.1% NP-40 Substitute, 0.01% digitonin, 1% BSA, 1 mM DTT, 1 U/µL RNase inhibitor 40 U/mL) by pipette-mixing 10 times. Cells were incubated on ice for 3 min, followed by dilution with 1 mL of chilled Wash Buffer (10 mM Tris-HCl pH 7.4, 10 mM NaCl, 3 mM MgCl 2 , 0.1% Tween-20, 1% BSA, 1 mM DTT, 1 U/mL RNase inhibitor 40 U/mL). Nuclei were then centrifuged (500×g for 3 min at 4°C), and the supernatant was slowly removed. The nuclei were washed one additional time with 1 mL Wash Buffer. Nuclei were resuspended in chilled diluted nuclei buffer (1X Nuclei Buffer (10x Genomics), 1 mM DTT, 1 U/mL RNase inhibitor 40 U/mL); the concentration was determined using a Cellometer K2 cell counter (Nexcelom Biosciences) and the samples were adjusted to a concentration appropriate for our targeted nuclei recovery. The single-cell ATAC library construction and gene expression library construction were carried out as described in the Chromium Next GEM Single Cell Multiome ATAC + Gene Expression User Guide (CG000338 Rev A). ATAC and GEX libraries were sequenced separately on an HiSeq 4000 (Illumina) before demultiplexing, alignment to the reference genome, and post-alignment quality control. DNA Methylation Genomic DNA was isolated and checked for quality by standard protocols. 1 µg genomic DNA then underwent bisulfite treatment using the TrueMethyl oxBS Module (Tecan Genomics, Männedorf, Switzerland) according to the manufacturer’s specifications. 
Briefly, DNA was purified using magnetic beads, denatured, and underwent bisulfite conversion followed by desulfonation and purification. The TrueMethyl converted DNA samples were then eluted in 10 µL and processed through the Illumina Infinium MethylationEPIC BeadChip array (Illumina, San Diego, CA) protocol. In brief, 7 µL of converted DNA was denatured with 1 µL of 0.4N sodium hydroxide prior to whole genome amplification on the MSA4 plate. All other steps were followed as per the manufacturer’s guidelines. Quality control of Infinium MethylationEPIC BeadChips was performed via the Genome Studio Methylation Module (Illumina). Subset-quantile Within Array Normalization (SWAN) was performed on the Infinium MethylationEPIC BeadChip IDAT files via the R package “minfi”. 58 , 59 The resultant β-values were used to identify changes in DNA methylation (Δβ) between groups. Unless otherwise noted, a change in absolute methylation level of more than 10% (|Δβ| > 0.1) and a p value of < 0.01 were considered significant. CpG site relation within chromatin states was annotated using Bedtools v2.27.1 against the genome annotations provided for PBMCs by the Roadmap Epigenomics project. 60 The functional annotation of differentially methylated CpGs located within gene bodies and promoters was generated using QIAGEN IPA (QIAGEN Inc., https://digitalinsights.qiagen.com/IPA ). Gene ontology analysis of the differentially methylated CpGs within non-coding regions was performed using GREAT. 61

Genotyping of Targeted loci with single-cell Chromatin Accessibility (GoT-ChA)

The assay was performed following the published protocol (Myers et al., 2022) with some modifications. 10,000 nuclei were captured for each sample. The following primers were used to specifically amplify the genotyping fragment (DNMT3A R882). 
GoT-ChA R1N-F, 5’-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGGATGACTGGCACGCTCCAT-3’; GoT-ChA-R, 5’-CTAAGCAGGCGTCAGAGGAG-3’; GoT-ChA_nested-R, 5’-BiosG/CCTTGGCACCCGAGAATTCCATCCTGCTGTGTGGTTAGACG-3’. The underlined sequences are locus-specific. After index PCR, DNAs were digested with AatII and MscI to monitor the specificity of amplified DNAs. Final libraries were quantified using a Qubit dsDNA HS Assay Kit (Thermo Fisher Scientific, #Q32854) and were analyzed by the Fragment Analyzer (Advanced Analytical Technologies; AATI; Ankeny, IA) using the High Sensitivity NGS Fragment Analysis Kit (Cat. #DNF-486). The libraries were sequenced on a NovaSeq 6000 at the Mayo Clinic Genome Analysis Core. ATAC libraries were sequenced to a depth of 25,000 read pairs per nucleus and GoT-ChA libraries were sequenced to 5,000 read pairs per nucleus.

Olink

We used the Olink Explore 1536 panel assay (Olink Proteomics [Uppsala, Sweden]), which uses proximity extension assay technology coupled to a readout methodology based on next generation sequencing (Illumina NovaSeq 6000, NextSeq 550, and NextSeq 2000; all manufactured by Illumina; appendix 1 p 2), to quantify protein targets, as previously described. 62

Statistical analysis

Distributions of continuous variables were statistically compared using Mann-Whitney or Kruskal-Wallis tests, while nominal or categorical variables were compared using the Chi-Square or Fisher’s exact test. Time-to-event analyses used the method of Kaplan-Meier for univariate comparisons using the log-rank test. OS was calculated from the date of diagnosis to the date of death or last follow-up, while AML-free survival (LFS) was calculated from the date of diagnosis to the date of AML transformation or death. Poisson regression models were fit to compare the numbers of different cell types across groups. 
These Poisson models incorporated an offset term to reflect the total number of cells for a given patient, so that cell types could be compared across consistent intervals in the setting of varying total cell counts: \(\ln(Y) = \ln(N) + \beta_{0} + \beta_{1}X_{1} + \epsilon\), where \(Y\) is the number of cells of a given type, \(N\) is the total number of cells for the patient, and \(X_{1}\) encodes the group.

Single-cell data analysis

Single-cell RNA-seq: Sequenced reads from the droplet libraries were processed using 10x Genomics Cell Ranger v6.0.1. 63 The reads were aligned to the pre-built human reference transcriptome GRCh38 - v2020-A (July 7, 2020) provided by 10X Genomics. Read trimming, alignment, UMI counting, and cell calling were performed by Cell Ranger. Doublet prediction on scRNA-seq data was done using Scrublet v0.2.1 with default parameters. 64 Downstream processing was done using Seurat v4.0.4. 65 Count matrices from all samples were combined and batch-corrected using the Seurat v4 integration method. The count matrix from each sample was log-normalized, scaled to mean 0 and variance 1, and dimensionality-reduced using PCA on the top 2000 variable genes across all samples. The reciprocal PCA and reference-based integration options were applied in the anchor finding step due to the large data size. Four patient samples, one from each sex and each condition (COVID-19 + / CH − and COVID-19 + / CH + ), were chosen as references, and PCs 1–50 were used for the reference-based integration. The integrated data included a total of 85,019 cells from 24 patient samples. Cells with more than 50% of reads mapped to mitochondrial genes, those with fewer than 200 unique genes detected and those that were predicted as doublets by Scrublet were removed. After QC filtering the number of cells was reduced to 78,083. Genes that were not expressed in at least 3 cells were also removed from downstream analysis. Uniform manifold approximation and projection (UMAP) was computed using the top 50 PCs obtained by running PCA on the integrated (batch-corrected) gene expression matrix. 
Cell type identification was done with SingleR v1.10.0 using immune data from celldex v1.6.0 as reference. 66 , 67 Single-cell Multiome: Sequenced reads from the gene expression (GEX) and DNA accessibility (ATAC) droplet libraries of the Multiome assay were processed using 10x Genomics Cell Ranger ARC v2.0.0. The reads were aligned to the pre-built human reference genome GRCh38 - v2020-A-2.0.0 (May 3, 2021) provided by 10X Genomics. Read trimming, alignment, duplicate marking (ATAC), UMI counting (GEX), peak calling (ATAC) and joint cell calling were performed by Cell Ranger. Downstream processing was done using Seurat v4.0.4 and Signac v1.4.0. 68 GEX and ATAC count matrices were integrated (batch-corrected) independently using Seurat. Sample GEX count matrices were integrated following the same steps as used for the scRNA-seq data. Default options (CCA and pairwise anchor-finding) were used in the integration anchor finding step, since the data size was smaller than our scRNA-seq data. To merge the ATAC data from all samples, a common peak set was created by merging peaks from all samples using the reduce function from the R package GenomicRanges. Peaks that were smaller than 20 base pairs or larger than 10000 base pairs after merging were removed. The count matrix for each sample with the new common peak set was then recalculated using Signac. The count matrices were normalized using Term Frequency - Inverse Document Frequency (TF-IDF) and dimensionality reduced using singular value decomposition (SVD) using only peaks with non-zero counts in at least 20 cells, these two steps together known as latent semantic indexing (LSI) generating LSI components. The samples were then integrated using Seurat V4 integration with the reciprocal LSI (rLSI) method used on LSI components 2 to 50 (since the first LSI component correlates with sequencing depth) in the pairwise anchor finding step. The UMAP was calculated using the integrated LSI components 2 to 50. 
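The construction of the common ATAC peak set described above (merging peaks across samples, then discarding merged peaks shorter than 20 bp or longer than 10,000 bp) can be sketched in plain Python. This is an illustrative stand-in for the `reduce` step, not the authors' R code: the function names are hypothetical, intervals are assumed to be half-open `(chrom, start, end)` tuples, and directly adjacent intervals are merged along with overlapping ones.

```python
def reduce_peaks(peaks):
    """Merge overlapping or directly adjacent (chrom, start, end) intervals.

    Intervals are treated as half-open [start, end); this coordinate
    convention is an assumption made for the sketch.
    """
    merged = []
    for chrom, start, end in sorted(peaks):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged

def filter_by_width(peaks, min_width=20, max_width=10_000):
    """Keep merged peaks whose width lies within [min_width, max_width] bp."""
    return [p for p in peaks if min_width <= p[2] - p[1] <= max_width]

# Toy peaks pooled from several hypothetical samples
pooled = [
    ("chr1", 100, 600),
    ("chr1", 500, 900),    # overlaps the first peak
    ("chr1", 900, 950),    # adjacent to the merged peak
    ("chr1", 2000, 2010),  # too narrow after merging
    ("chr2", 0, 12_000),   # too wide
]
common_peaks = filter_by_width(reduce_peaks(pooled))
print(common_peaks)  # [('chr1', 100, 950)]
```

Per-sample count matrices would then be recomputed against `common_peaks`, which the source does with Signac.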
Seurat’s weighted nearest neighbor (WNN) algorithm was used on principal components 1 to 50 (GEX) and integrated LSI components 2 to 50 (ATAC) together to obtain a combined UMAP projection of both GEX and ATAC counterparts of the complete scMultiome dataset. The integrated data included 36,343 cells from 11 patient samples. Cells with more than 50% of reads mapped to mitochondrial genes, those with less than 200 unique genes detected (GEX), those with less than 200 unique peaks detected (ATAC) and those with transcription start site (TSS) enrichment score (as calculated by Signac) less than 1 were removed. After QC filtering the number of cells was reduced to 25,725. Cell type identification was done by using Azimuth algorithm to map the scMultiome GEX data to the scRNA-seq data since they were generated from the same cohort. The labels were then transferred from the scRNA-seq data to the single-cell Multiome data. Differential gene expression analysis: Differential gene expression testing was done on the log-normalized counts using Seurat’s FindMarkers function with default parameters unless specified otherwise. The statistical test applied was Wilcoxon Rank Sum test with p-values adjusted using Bonferroni correction based on the total number of genes in the dataset. Statistically significant differentially expressed genes were selected by keeping only genes that fall below the adjusted p-value threshold of 0.05. Differential gene expression testing comparing any two conditions (e.g., COVID-19 + / DNMT3A MT versus COVID-19 + / TET2 MT ) was always done for each cell type independently (although shown together in the volcano plots for efficient visualization), unless specified otherwise. To make sure that the results were not driven by a single patient sample we applied a leave-one-out approach on all tests where we removed cells from one sample at a time redoing the tests and keeping only the genes that passed the adjusted p-value threshold in all tests. 
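The leave-one-out robustness filter described above can be sketched as follows. This is a simplified Python illustration, not the Seurat workflow the authors ran: the rank-sum test uses a normal approximation without tie correction, the data layout (a list of per-cell dictionaries) and all function names are hypothetical, and Seurat's additional default filters in `FindMarkers` are not reproduced.

```python
import math

def rank_sum_p(x, y):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation, average ranks for ties)."""
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        return 1.0
    pooled = sorted((v, i) for i, v in enumerate(x + y))
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        for k in range(i, j + 1):
            ranks[pooled[k][1]] = (i + j) / 2 + 1  # average 1-based rank of the tie block
        i = j + 1
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - n1 * n2 / 2) / sigma
    return math.erfc(abs(z) / math.sqrt(2))

def significant_genes(cells, genes, group_a, group_b, alpha=0.05):
    """Genes whose Bonferroni-adjusted p-value (n tests = len(genes)) is below alpha."""
    sig = set()
    for gene in genes:
        x = [c["expr"][gene] for c in cells if c["group"] == group_a]
        y = [c["expr"][gene] for c in cells if c["group"] == group_b]
        p_adj = min(1.0, rank_sum_p(x, y) * len(genes))
        if p_adj < alpha:
            sig.add(gene)
    return sig

def leave_one_out_genes(cells, genes, group_a, group_b, alpha=0.05):
    """Keep genes significant on the full data and after dropping each sample in turn."""
    robust = significant_genes(cells, genes, group_a, group_b, alpha)
    for sample in sorted({c["sample"] for c in cells}):
        subset = [c for c in cells if c["sample"] != sample]
        robust &= significant_genes(subset, genes, group_a, group_b, alpha)
    return robust
```

Each cell is expected as a dictionary such as `{"sample": "P1", "group": "DNMT3A", "expr": {"IL32": 2.3}}`. A gene whose signal comes from a single patient drops out as soon as that patient is held out, which is the behavior the paragraph above aims for.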
Pathway analysis was done using the Ingenuity Pathway Analysis platform on differentially expressed genes.

Differential DNA accessibility and motif analysis: The differentially accessible peaks were identified by comparing the TF-IDF normalized cut-site counts of any pair of cell populations using Seurat’s FindMarkers function. Here, the method used was the logistic regression framework, with the number of fragments in peaks included as a latent variable. Motif enrichments in peaks were estimated using ChromVAR enrichment scores on the JASPAR2020 motif matrix set. 69 , 70 Enrichment of cut-sites in monocyte- and macrophage-specific ChIP-seq peaks of select DNA binding proteins was also estimated using ChromVAR by providing ChIP-seq peaks from ReMap2022 as input. 28 Co-accessibility scores between pairs of peaks were calculated using Cicero v1.3.5. 27

GoT-ChA 29 : Sequencing reads from the GoT-ChA experiments on two samples (DNMT3A mutants with the R882H mutation) were examined for sufficient sequencing depth. The GoT-ChA experiments yielded 168 million and 151 million reads for samples 1 and 2, respectively. For each sample, we generated four FASTQ files (*_I1_001.fastq.gz, *_R1_001.fastq.gz, *_R2_001.fastq.gz, and *_R3_001.fastq.gz). To precisely locate the cell barcode and mutation site, we randomly selected 5M reads from each FASTQ file and used the "sc_seqLogo.py" script from our RSeQC package to generate sequence logos. 71 Our analysis revealed that the "I1" FASTQ file contains an 8-nt sample barcode ( Supp. Figure 6B ), while the "R1" FASTQ file contains the 51-nt targeted DNA sequences, with the last three nucleotides representing the genotype of the R882 codon ( Supp. Figure 6C ). Notably, the codon is CGC (Arginine) for the wildtype and CAC (Histidine) for the mutant, given that the DNMT3A gene is located on the reverse strand. 
We confirmed that reads from the "R1" file could uniquely map to the targeted DNMT3A locus (chr2:25234324–25234374) on the human reference genome GRCh38/hg38. Furthermore, the "R2" FASTQ file contains the 16-nt cell barcode ( Supp. Figure 6D ), while the sequences from the "R3" FASTQ files are unknown ( Supp. Figure 6E ). We then combined the "R1" and "R2" FASTQ files into a single FASTA file, using the cell barcodes as the DNA sequence identifiers, thereby explicitly linking the cell barcode to the DNA sequence ( Supp. Figure 7 ). Any DNA sequence with a Phred-scaled quality score < 30 at the mutation site was discarded. This preprocessing step not only significantly reduces the file size but also enhances computation speed. For read-level genotyping we used the following method. We used an IUPAC (International Union of Pure and Applied Chemistry) string to represent both wildtype and mutant genotypes. In this study, "C" and "T" are denoted as "Y" ( Supp. Figure 6C ). Then, we employed the motility C++ library ( https://github.com/ctb/motility ) to match this IUPAC string to each read. Using an IUPAC string instead of a PWM (Position Weight Matrix) can dramatically enhance computation speed, since the differences between wildtype and mutant reads are minimal (only 1 nucleotide change). We found that allowing for 1 mismatch could rescue an additional 10 to 12 million reads (~ 7%) in each sample, compared with exact matching. Consequently, 77.5% and 75.6% of reads were successfully genotyped for the two samples, respectively. To distinguish real cells from background noise, we employed a methodology similar to Cell Ranger's approach in single-cell RNA sequencing analysis. After segregating reads based on cell barcode, we conducted nonparametric kernel density estimation using the "gaussian" kernel function ( Supp. Figure 8A, B ). 
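As an illustration of the read-level genotyping step described above (IUPAC string matching with up to one mismatch), here is a minimal Python sketch. It does not use the motility library and is not the authors' code. The pattern `GATTACAGYG` is a made-up placeholder rather than the real DNMT3A target sequence, and the mapping of "C" to wildtype and "T" to mutant at the "Y" position is an assumption.

```python
# IUPAC nucleotide codes and the bases each one accepts
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def genotype_read(read, pattern, max_mismatch=1):
    """Return 'WT', 'MUT', or None for one read against an IUPAC pattern.

    The pattern must contain a single 'Y' marking the variable site. A read
    is genotyped only if the variable site is C or T and the read has at
    most `max_mismatch` mismatches against the pattern.
    Assumption: C at the variable site means wildtype, T means mutant.
    """
    if len(read) != len(pattern):
        return None
    site = pattern.index("Y")
    if read[site] not in IUPAC["Y"]:
        return None
    mismatches = sum(1 for base, code in zip(read, pattern) if base not in IUPAC[code])
    if mismatches > max_mismatch:
        return None
    return "WT" if read[site] == "C" else "MUT"

pattern = "GATTACAGYG"  # hypothetical placeholder pattern
reads = ["GATTACAGCG", "GATTACAGTG", "GATAACAGTG", "GATAACTGTG", "GATTACAGAG"]
calls = [genotype_read(r, pattern) for r in reads]
print(calls)  # ['WT', 'MUT', 'MUT', None, None]
```

The third read is rescued by the one-mismatch allowance, the fourth carries two mismatches, and the fifth has an unexpected base at the variable site, so neither of the last two is genotyped.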
The cell barcode density plot revealed three distinct modes, likely corresponding to "cell", "cell-free DNA", and "empty droplets", respectively. To determine the cutoff point for cell calling, we identified the local minima (i.e., the points where the curve changes direction) from the density plot. For instance, in sample 1, the local minimum for the cell mode is 3.0712, corresponding to 10^3.0712 = 1178 reads. This means that a barcode with fewer than 1178 reads is categorized as background ( Supp. Figure 8A ). A comparison with the knee plot (or barcode rank plot) demonstrates that the detected threshold aligns well with the elbow point ( Supp. Figure 8A, B ). As a result, we identified 15185 and 5679 valid cells from samples 1 and 2, respectively. For cell-level genotyping we used the following method. We first calculated the mutant allele fraction (MAF) for each cell using the formula described in the code deposited on GitHub. Cells with MAF ≤ 0.2 were categorized as homozygous wildtype (WT/WT), cells with MAF ≥ 0.8 were classified as homozygous mutants (Mut/Mut), likely due to loss of heterozygosity (LOH), and the remaining cells were designated as heterozygotes (WT/Mut). 40% and 38% of cells in Samples 1 and 2, respectively, were classified as WT/WT ( Supp. Figure 8C ). The scATAC-seq data from the two samples were analyzed using the cellranger-atac workflow (version 2.1.0), using the reference file (refdata-cellranger-arc-GRCh38-2020-A-2.0.0) downloaded from https://support.10xgenomics.com/single-cell-atac/software/downloads/latest . 8858 and 6255 high-quality cells were identified in Samples 1 and 2, respectively.

Declarations

Ethics approval and consent to participate

Samples were collected after informed consent and approval by Mayo Clinic's Institutional Review Board (IRB #20-005400, IRB #16-004173). 
Consent for publication
Not applicable.

Availability of data and materials
The bulk sequencing, scDNA-seq + proteogenomics, DNA methylation, scRNA-seq, Multiome (GEX + ATAC) and GoTChA datasets generated and analyzed in this study have been deposited in the NCBI Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/) under accession number GSE210435 (reviewers' access token: qfcxaoecnjijxct). Olink data are provided as Supplementary Data. Code and scripts used for the analysis are available in the GitHub repository https://github.com/LabFunEpi/CC_multiomics.

Competing interests
M.M.P. has received research funding from Kura Oncology, Epigenetix, Solutherapeutics, Polaris and StemLine Pharmaceuticals.

Funding
This work was supported by the Mayo Clinic Center for Individualized Medicine, by the Mayo Clinic Center for Biomedical Discovery to W.M.I. and A.G.M., and by the DOD Ovarian Cancer Research Program (W81XWH2110475 to A.G.M.). M.M.P. would also like to acknowledge the NCI for R01 grant R01CA272496.

Author contributions
M.P., A.G.M., and N.C. conceived and designed the study with the help of M.B., T.L., W.M.I., J.F., and J.J.H. J.F., T.L., A.Mazzone, C.M.F., K.H.K., V.A.S., F.R.R., A.Munankarmy, S.K.B., M.R.S. and J-H.L. performed experiments. W.M.I., T.L., M.B., J.F., M.K., S.M.G., A.A.M., S.M.S. and L.W. analyzed the data. M.P., A.G.M. and M.B. wrote the manuscript with the help of W.M.I., J.F., K.R., N.C., A.P., and E.W. All authors critically revised and approved the final version of the manuscript.

Acknowledgements
The authors thank the Genome Analysis Core and the Biospecimens Accessioning and Processing (Mayo Clinic) for technical support, and Alessandro Gardini for help with access to their published datasets.

References
Yang L, Rau R, Goodell MA. DNMT3A in haematological malignancies. Nat Rev Cancer. 2015;15:152–65.
Jaiswal S, Ebert BL. Clonal hematopoiesis in human aging and disease. Science. 2019;366.
Jaiswal S, et al. Age-related clonal hematopoiesis associated with adverse outcomes. N Engl J Med. 2014;371:2488–98.
Jaiswal S, et al. Clonal Hematopoiesis and Risk of Atherosclerotic Cardiovascular Disease. N Engl J Med. 2017;377:111–21.
Buscarlet M, et al. Lineage restriction analyses in CHIP indicate myeloid bias for TET2 and multipotent stem cell origin for DNMT3A. Blood. 2018;132:277–80.
Goyal P, et al. Clinical Characteristics of Covid-19 in New York City. N Engl J Med. 2020;382:2372–4.
Moore JB, June CH. Cytokine release syndrome in severe COVID-19. Science. 2020;368:473–4.
Onder G, Rezza G, Brusaferro S. Case-Fatality Rate and Characteristics of Patients Dying in Relation to COVID-19 in Italy. JAMA. 2020;323:1775–6.
Sah P, et al. Asymptomatic SARS-CoV-2 infection: A systematic review and meta-analysis. Proc Natl Acad Sci U S A. 2021;118.
Bolton KL, et al. Clonal hematopoiesis is associated with risk of severe Covid-19. Nat Commun. 2021;12:5975.
Zhou Y, et al. Clonal hematopoiesis is not significantly associated with COVID-19 disease severity. Blood. 2022;140:1650–5.
Duployez N, et al. Clinico-Biological Features and Clonal Hematopoiesis in Patients with Severe COVID-19. Cancers (Basel). 2020;12.
Hameister E, et al. Clonal Hematopoiesis in Hospitalized Elderly Patients With COVID-19. Hemasphere. 2020;4:e453.
Zekavat SM, et al. Hematopoietic mosaic chromosomal alterations increase the risk for diverse types of infection. Nat Med. 2021;27:1012–24.
Netea MG, et al. IL-32 synergizes with nucleotide oligomerization domain (NOD) 1 and NOD2 ligands for IL-1beta and IL-6 production through a caspase 1-dependent mechanism. Proc Natl Acad Sci U S A. 2005;102:16309–14.
Patnaik MM, et al. DNMT3A mutations are associated with inferior overall and leukemia-free survival in chronic myelomonocytic leukemia. Am J Hematol. 2017;92:56–61.
National Cancer Institute. Common Terminology Criteria for Adverse Events (CTCAE) v5.0. 2017.
World Health Organization. Clinical management of COVID-19: Living guideline. 2020.
Tulstrup M, et al. TET2 mutations are associated with hypermethylation at key regulatory enhancers in normal and malignant hematopoiesis. Nat Commun. 2021;12:6061.
ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012;489:57–74.
Coltro G, et al. Clinical, molecular, and prognostic correlates of number, type, and functional localization of TET2 mutations in chronic myelomonocytic leukemia (CMML)-a study of 1084 patients. Leukemia. 2019.
Miles LA, et al. Single-cell mutation analysis of clonal evolution in myeloid malignancies. Nature. 2020;587:477–82.
Stephenson E, et al. Single-cell multi-omics analysis of the immune response in COVID-19. Nat Med. 2021;27:904–16.
Silvin A, et al. Elevated Calprotectin and Abnormal Myeloid Cell Subsets Discriminate Severe from Mild COVID-19. Cell. 2020;182:1401–1418.e18.
Sandoval L, et al. Characterization and Optimization of Multiomic Single-Cell Epigenomic Profiling. Genes (Basel). 2023;14.
Pliner HA, et al. Cicero Predicts cis-Regulatory DNA Interactions from Single-Cell Chromatin Accessibility Data. Mol Cell. 2018;71:858–871.e8.
Hammal F, de Langen P, Bergon A, Lopez F, Ballester B. ReMap 2022: a database of Human, Mouse, Drosophila and Arabidopsis regulatory regions from an integrative analysis of DNA-binding sequencing experiments. Nucleic Acids Res. 2022;50:D316–25.
Myers RM, Izzo F, Prieto T, Mimitou E, Raviram R, Chaligne R, Hoffman R, Stahl O, Marcellino B, Smibert P, Landau D. High Throughput Single-Cell Simultaneous Genotyping and Chromatin Accessibility Reveals Genotype to Phenotype Relationship in Human Myeloproliferation. Blood. 2021;138:678.
Mitchell E, et al. Clonal dynamics of haematopoiesis across the human lifespan. Nature. 2022;606:343–50.
Kusne Y, Xie Z, Patnaik MM. Clonal hematopoiesis: Molecular and clinical implications. Leuk Res. 2022;113:106787.
Sano S, Oshima K, Wang Y, Katanasaka Y, Sano M, Walsh K. CRISPR-Mediated Gene Editing to Assess the Roles of Tet2 and Dnmt3a in Clonal Hematopoiesis and Cardiovascular Disease. Circ Res. 2018;123:335–41.
Huang YH, et al. Systematic Profiling of DNMT3A Variants Reveals Protein Instability Mediated by the DCAF8 E3 Ubiquitin Ligase Adaptor. Cancer Discov. 2022;12:220–35.
Nam AS, et al. Single-cell multi-omics of human clonal hematopoiesis reveals that DNMT3A R882 mutations perturb early progenitor states through selective hypomethylation. Nat Genet. 2022;54:1514–26.
Yamazaki J, et al. Effects of TET2 mutations on DNA methylation in chronic myelomonocytic leukemia. Epigenetics. 2012;7:201–7.
Zhang Q, et al. Tet2 is required to resolve inflammation by recruiting Hdac2 to specifically repress IL-6. Nature. 2015;525:389–93.
Lasho T, et al. Single cell proteogenomic analysis of aberrant monocytosis in TET2 mutant premalignant and malignant hematopoiesis. Leukemia. 2023;37:1384–7.
Melenotte C, et al. Immune responses during COVID-19 infection. Oncoimmunology. 2020;9:1807836.
Deshmane SL, Kremlev S, Amini S, Sawaya BE. Monocyte chemoattractant protein-1 (MCP-1): an overview. J Interferon Cytokine Res. 2009;29:313–26.
Chen Y, et al. IP-10 and MCP-1 as biomarkers associated with disease severity of COVID-19. Mol Med. 2020;26:97.
Selimoglu-Buet D, et al. A miR-150/TET3 pathway regulates the generation of mouse and human non-classical monocyte subset. Nat Commun. 2018;9:5455.
Selimoglu-Buet D, et al. Accumulation of classical monocytes defines a subgroup of MDS that frequently evolves into CMML. Blood. 2017;130:832–5.
Patnaik MM, Tefferi A. Chronic myelomonocytic leukemia: 2024 update on diagnosis, risk stratification and management. Am J Hematol. 2024.
Kim SH, Han SY, Azam T, Yoon DY, Dinarello CA. Interleukin-32: a cytokine and inducer of TNFalpha. Immunity. 2005;22:131–42.
Arber DA, et al. The 2016 revision to the World Health Organization classification of myeloid neoplasms and acute leukemia. Blood. 2016;127:2391–405.
McKenna A, et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20:1297–303.
Larson DE, et al. SomaticSniper: identification of somatic point mutations in whole genome sequencing data. Bioinformatics. 2012;28:311–7.
Cingolani P, et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin). 2012;6:80–92.
Fiume M, Williams V, Brook A, Brudno M. Savant: genome browser for high-throughput sequencing data. Bioinformatics. 2010;26:1938–44.
Landrum MJ, et al. ClinVar: public archive of relationships among sequence variation and human phenotype. Nucleic Acids Res. 2014;42:D980–5.
Liu X, Jian X, Boerwinkle E. dbNSFP: a lightweight database of human nonsynonymous SNPs and their functional predictions. Hum Mutat. 2011;32:894–9.
Stenson PD, et al. Human Gene Mutation Database (HGMD): 2003 update. Hum Mutat. 2003;21:577–81.
Tryggvadottir L, et al. Prostate cancer progression and survival in BRCA2 mutation carriers. J Natl Cancer Inst. 2007;99:929–35.
Robinson D, et al. Integrative clinical genomics of advanced prostate cancer. Cell. 2015;161:1215–28.
Robinson JT, et al. Integrative genomics viewer. Nat Biotechnol. 2011;29:24–6.
Mule MP, Martins AJ, Tsang JS. Normalizing and denoising protein expression data from droplet-based single cell profiling. Nat Commun. 2022;13:2099.
Stuart T, et al. Comprehensive Integration of Single-Cell Data. Cell. 2019;177:1888–1902.e21.
Aryee MJ, et al. Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays. Bioinformatics. 2014;30:1363–9.
Maksimovic J, Gordon L, Oshlack A. SWAN: Subset-quantile within array normalization for Illumina Infinium HumanMethylation450 BeadChips. Genome Biol. 2012;13:R44.
Bernstein BE, et al. The NIH Roadmap Epigenomics Mapping Consortium. Nat Biotechnol. 2010;28:1045–8.
McLean CY, et al. GREAT improves functional interpretation of cis-regulatory regions. Nat Biotechnol. 2010;28:495–501.
Garapati K, et al. Multiomics single timepoint measurements to predict severe COVID-19 - Authors' reply. Lancet Digit Health. 2023;5:e57.
Zheng GX, et al. Massively parallel digital transcriptional profiling of single cells. Nat Commun. 2017;8:14049.
Wolock SL, Lopez R, Klein AM. Scrublet: Computational Identification of Cell Doublets in Single-Cell Transcriptomic Data. Cell Syst. 2019;8:281–291.e9.
Hao Y, et al. Integrated analysis of multimodal single-cell data. Cell. 2021;184:3573–3587.e29.
Aran D, et al. Reference-based analysis of lung single-cell sequencing reveals a transitional profibrotic macrophage. Nat Immunol. 2019;20:163–72.
Monaco G, et al. RNA-Seq Signatures Normalized by mRNA Abundance Allow Absolute Deconvolution of Human Immune Cell Types. Cell Rep. 2019;26:1627–1640.e7.
Stuart T, Srivastava A, Madad S, Lareau CA, Satija R. Single-cell chromatin state analysis with Signac. Nat Methods. 2021;18:1333–41.
Castro-Mondragon JA, et al. JASPAR 2022: the 9th release of the open-access database of transcription factor binding profiles. Nucleic Acids Res. 2022;50:D165–73.
Schep AN, Wu B, Buenrostro JD, Greenleaf WJ. chromVAR: inferring transcription-factor-associated accessibility from single-cell epigenomic data. Nat Methods. 2017;14:975–8.
Wang L, Wang S, Li W. RSeQC: quality control of RNA-seq experiments. Bioinformatics. 2012;28:2184–5.
Supplementary Files
SupplementaryData.xlsx
SupplementaryFigures.pdf
SupplementaryTables.docx

Authors (all Mayo Clinic): Wazim Ismail Mohammed, Jenna Fernandez, Moritz Binder, Terra Lasho, Minsuk Kim, Susan Geyer, Amelia Mazzone, Christy Finke, Abhishek Mangaonkar, Jeong-Heon Lee, Liguo Wang, Kwan Hyun Kim, Vernadette Simon, Fariborz Rakhshan Rohakthar, Amik Munankarmy, Seul Kee Byeon, Susan Schwager, Jonathan Harrington, Melissa Snyder, Keith Robertson, Akhilesh Pandey, Eric Wieben, Nicholas Chia, Alexandre Gaspar-Maia, Mrinal Patnaik (corresponding author).

Figure 1. Clinical features, prevalence of clonal hematopoiesis, and disease-related outcomes of 243 patients hospitalized with COVID-19 in the pre-vaccination era. A, Heatmap showing the spectrum of clonal hematopoiesis (CH) mutations, sex distribution, COVID-19 related complications, prevalence of cytokine release syndrome (CRS), and serum cytokines and inflammatory markers. NIV, non-invasive ventilation; IMV, invasive mechanical ventilation; AKI, acute kidney injury; ALI, acute lung injury; ARDS, acute respiratory distress syndrome; MODS, multiple organ dysfunction syndrome; CLOT, venous thromboembolism; CRP, C-reactive protein. B, Bar plots comparing the distribution of CRS severity (WHO CRS severity scale) between patients with CH and without CH (No CH). There was a decrease in mild cases and an increase in fatal occurrences of CRS among patients with underlying CH (Fisher's exact test, p = 0.006). C, Bar plots comparing the prevalence of ARDS among COVID-19 patients with TET2mt CH and DNMT3Amt CH. ARDS occurred exclusively in COVID-19 patients with underlying DNMT3Amt CH and not in those with TET2mt CH (Mann-Whitney test, p = 0.007). D, Box plots comparing serum Monocyte Chemoattractant Protein 1 (MCP-1) concentrations among COVID-19 patients with TET2mt CH and DNMT3Amt CH at the time of hospitalization. Serum MCP-1 concentration was higher in COVID-19 patients with underlying DNMT3Amt CH than in those with TET2mt CH (Mann-Whitney test, p = 0.014). E, Kaplan-Meier plot showing the overall survival estimates for 243 COVID-19 patients, stratified by CH status. There was increased all-cause mortality among COVID-19 patients with underlying CH (log-rank test, p < 0.001). F, Kaplan-Meier plot showing the overall survival estimates for 218 COVID-19 patients, stratified by CH status (further stratified into TET2mt CH and DNMT3Amt CH). The increased all-cause mortality among COVID-19 patients with underlying CH was mainly driven by DNMT3Amt CH (log-rank test, p < 0.001). This association remained consistent after adjusting for age at COVID-19 diagnosis: HR 2.84 (95% CI 1.16–6.94, p = 0.022).

Figure 2. DNA methylation changes across the genome in TET2mt CH and DNMT3Amt CH patients. A, Boxplot comparing global DNA methylation between DNMT3Amt and TET2mt CH. There was no significant difference by Wilcoxon signed-rank test (p = 0.057). B, Density plot demonstrating DNA methylation differences between TET2mt CH and DNMT3Amt CH, primarily affecting highly methylated CpGs (β > 0.75, Kolmogorov-Smirnov test p < 2.2 × 10^-16). C, Circos plot showing the number, genomic location, and density of differentially methylated regions between TET2mt CH and DNMT3Amt CH. There was an increased number of hypomethylated sites in DNMT3Amt CH compared to TET2mt CH. D, Functional annotation of the differentially methylated regions (with Δβ > 10% and p < 0.010) using the ENCODE Epigenomics Roadmap peripheral blood mononuclear cell reference data. Hypermethylation of enhancers (Enh) and promoters (TssA, TssAFlnk) was more commonly observed in DNMT3Amt CH (compared to TET2mt CH), whereas the hypomethylation observed in DNMT3Amt CH was predominantly found at actively transcribed states (Tx, TxWk).

Figure 3. Identification of cell types carrying CH mutations by combined single-cell surface protein and genotype analysis. A, Overview of COVID-19 patient cohorts with TET2^MT CH, DNMT3A^MT CH, TET2^MT DNMT3A^MT (co-mutant) CH, and without CH (CH-), analyzed using single cell proteogenomics (Tapestri assay). Mutations and variant allele frequencies are shown above, and patient ages are shown below the figurines. B, UMAP projections showing the distribution of 28,941 cells from single cell proteogenomics analysis of 13 patient samples, colored by cell type. Key: Treg, regulatory T-cells; gdT, gamma delta T-cells; MAIT, mucosal-associated invariant T cells; NK, natural killer cells; Mono, monocytes; Int, intermediate; cDC, classical dendritic cells; pDC, plasmacytoid dendritic cells; HSPC, hematopoietic stem and progenitor cells. The bar below shows the proportion of cells in each cell type. C, UMAP projections of the single cell proteogenomics data showing only mutated cells, stratified into TET2 and DNMT3A mutated cells, demonstrating the myeloid and lymphoid lineage restriction in TET2mt and DNMT3Amt cells, respectively. The bars below show the proportion of cells in each cell type. D-E, Bar plots showing the proportion of cells in each cell type, stratifying cells by sample/patient genotype (D) and by cell genotype (E). While TET2 mutations had a clear myeloid lineage restriction bias, DNMT3A mutations were seen in both myeloid and lymphoid lineages.

Figure 4. Identification of biomarkers of COVID-19 severity associated with CH mutations. A, Overview of COVID-19 patient cohorts without CH (CH-), with TET2^MT CH and with DNMT3A^MT CH, analyzed using scRNA-seq. Mutations and variant allele frequencies are shown above, and patient ages are shown below the figurines. B, UMAP projections showing the distribution of 78,083 cells from single cell RNA-seq analysis of 24 patient samples, colored by cell types identified using SingleR. Key: Treg, regulatory T-cells; gdT, gamma delta T-cells; MAIT, mucosal-associated invariant T cells; NK, natural killer cells; Mono, monocytes; Int, intermediate; cDC, classical dendritic cells; pDC, plasmacytoid dendritic cells; HSPC, hematopoietic stem and progenitor cells. The bar below shows the proportion of cells in each cell type. C, Proportion of cells in each cell type stratified by the 5 conditions shown on the y-axis, in the scRNA-seq analysis. The healthy cohort (from Stephenson et al. PMID: 33879890) is further stratified by age: below 50 and above 50 years, respectively. D-E, Volcano plots showing significantly differentially expressed genes (adjusted p < 0.05, Wilcoxon rank sum test) in comparisons between CH- and DNMT3A^MT CH (D) and between TET2^MT CH and DNMT3A^MT CH (E), both in the context of COVID-19. Cells from each cell type were tested independently. Bars below the volcano plots show the proportion of genes per cell type that are down- and up-regulated in DNMT3A^MT CH in each comparison. F, Violin plots showing expression of IL32 in cell types where DNMT3A^MT CH patients had significantly higher expression compared to TET2^MT CH. Black dots show mean expression. ****: p <= 0.0001 (Wilcoxon rank sum test).

Figure 5. Characterization of epigenetic deregulation in CH patients with DNMT3A mutations. A, Proportion of each cell type identified in the scRNA-seq data from the 10X Multiome platform, stratified by the 5 conditions shown on the y-axis. The healthy cohort is further stratified by age: below 50 and above 50 years, respectively. Key: Treg, regulatory T-cells; gdT, gamma delta T-cells; MAIT, mucosal-associated invariant T cells; NK, natural killer cells; Mono, monocytes; Int, intermediate; cDC, classical dendritic cells; pDC, plasmacytoid dendritic cells; HSPC, hematopoietic stem and progenitor cells. B, Violin plots showing cell type specific changes in chromatin accessibility, measured as the total number of cut sites (sum of TF-IDF normalized cut site counts across all peaks; scATAC-seq data from Multiome) in each cell type. Only cell types with more than 100 cells are shown. Black dots show the mean value. ns: p > 0.05, *: p <= 0.05, **: p <= 0.01, ***: p <= 0.001, ****: p <= 0.0001 (Wilcoxon rank sum test). NK and CD4 T cells showed a significant increase in chromatin accessibility in DNMT3A^MT CH patients compared to TET2^MT CH in the context of COVID-19. C, Coverage plot showing epigenomic dysregulation of IL32 in DNMT3A^MT CH compared to TET2^MT CH, through multiomics analysis. The plot shows the peaks co-accessible with the IL32 transcription start site (TSS) in TET2^MT CH patients and DNMT3A^MT CH patients, both with COVID-19 (co-accessibility score > 0.1; blue and red arcs), the chromatin accessibility signal per group of cells, IL32 gene expression (violin plots; right), candidate cis-regulatory elements predicted by ENCODE (color-coded bars), open-chromatin peaks (grey bars), differentially accessible peaks that are more accessible in DNMT3A^MT CH patients compared to TET2^MT CH patients in CD4 T cells (light blue bars), CpG sites hypomethylated in DNMT3A^MT CH patients compared to TET2^MT CH patients (dark blue bars) and CpG sites overlapping open-chromatin regions (black bars) around the IL32 gene locus. Labeled loci A and B are chr16:3123999-3124965 and chr16:3263558-3264913, respectively. These loci are regions where DNMT3A^MT CH patients gained accessibility in CD4 T cells, overlapped with hypomethylated CpG sites and gained co-accessibility with the IL32 TSS. D, Box plots showing methylation levels (β values) per patient in the TET2^MT CH and DNMT3A^MT CH cohorts (both with COVID-19) at three hypomethylated CpG sites shown in panel C. The middle line represents the median, the lower and upper edges of the rectangle represent the first and third quartiles, and the lower and upper whiskers represent the interquartile range (IQR) × 1.5. The groups were compared using the Wilcoxon rank sum test. E-F, Violin plots showing a significant increase in chromatin accessibility in DNMT3A^MT cells compared to DNMT3A^WT cells as determined by GoTChA analysis. The data shown are the total number of cut sites (E) and the number of cut sites at loci A and B from panel C (F), in DNMT3A wild-type and DNMT3A mutant cells from two samples (DNMT3A^MT clonal cytopenias of undetermined significance) profiled using GoTChA. Red dots show the mean value. The mutation site in the DNMT3A gene is shown at the bottom of panel E. ns: p > 0.05, *: p <= 0.05, **: p <= 0.01, ***: p <= 0.001, ****: p <= 0.0001 (Wilcoxon rank sum test).
02:02:09","extension":"docx","order_by":9,"title":"","display":"","copyAsset":false,"role":"supplement","size":35269,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryTables.docx","url":"https://assets-eu.researchsquare.com/files/rs-4481664/v1/3339d27a56bdf09ca406b1a9.docx"}],"financialInterests":"Competing interest reported. M.M.P., has received research funding from Kura Oncology, Epigenetix, Solutherapeutics, Polaris and StemLine Pharmaceuticals.","formattedTitle":"Single cell multiomic analyses reveal divergent effects of DNMT3A and TET2 mutant clonal hematopoiesis in inflammatory response","fulltext":[{"header":"INTRODUCTION","content":"\u003cp\u003e \u003cem\u003eDNMT3A\u003c/em\u003e and \u003cem\u003eTET2\u003c/em\u003e are key epigenetic regulator genes with opposing effects on DNA methylation. While DNMT3A is responsible for the \u003cem\u003ede novo\u003c/em\u003e conversion of cytosine (C) to methylcytosine (mC), resulting in gene silencing, TET2 catalyzes the conversion of mC to 5-hydroxy-mC and subsequent oxidative metabolites, resulting in gene activation.\u003csup\u003e\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e\u003c/sup\u003e \u003cem\u003eDNMT3A\u003c/em\u003e and \u003cem\u003eTET2\u003c/em\u003e are also the two most frequently mutated genes in age related clonal hematopoiesis (CH; \u0026gt;70%) and in spite of opposing epigenetic effects, have a convergent impact with regards to hematopoietic stem and progenitor cell (HSPC) fitness, inflammaging, low rates of progression to hematological neoplasms and increased all-cause mortality, largely due to atherosclerotic cardiovascular disease (ASCD).\u003csup\u003e\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e, \u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e, \u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e\u003c/sup\u003e In CH, while 
\u003cem\u003eTET2\u003c/em\u003e mutations have been associated with a myeloid lineage bias, \u003cem\u003eDNMT3A\u003c/em\u003e mutations have a broader distribution, affecting myeloid and lymphoid lineage cells.\u003csup\u003e\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e\u003c/sup\u003e\u003c/p\u003e \u003cp\u003eSevere acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the pathogen responsible for coronavirus disease 2019 (COVID-19), has resulted in an ongoing pandemic associated with significant morbidity, mortality, and long-term sequelae.\u003csup\u003e\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e, \u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e, \u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e\u003c/sup\u003e While infections can vary from asymptomatic carrier states to severe cytokine release syndrome (CRS), acute respiratory distress (ARDS) and associated multi organ dysfunction syndrome (MODS), reasons for clinical heterogeneity have partially been investigated, with several viral (strain type) and host factors (age, comorbidities, immune status, ACE2 receptor polymorphisms, among others) congruently being involved.\u003csup\u003e\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e\u003c/sup\u003e\u003c/p\u003e \u003cp\u003eIn a large cohort of 515 patients with COVID-19, CH was associated with severe COVID-19 outcomes, including increased mortality.\u003csup\u003e\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e\u003c/sup\u003e However, many of these patients had underlying visceral malignancies, with several getting chemo- / immunotherapy, and except for a non-significant trend with \u003cem\u003ePPM1D\u003c/em\u003e mutations, there were no clear mutational associations detected.\u003csup\u003e\u003cspan citationid=\"CR10\" 
class=\"CitationRef\"\u003e10\u003c/span\u003e\u003c/sup\u003e In a subsequent study of non-cancer patients, CH was identified in 33% of 568 patients affected by COVID-19, with \u003cem\u003eDNMT3A\u003c/em\u003e and \u003cem\u003eTET2\u003c/em\u003e mutant (mt) CH being most frequent. In this study, neither the presence of \u003cem\u003eDNMT3A/TET2\u003c/em\u003emt CH nor their variant allele fractions (VAF) impacted COVID-19 related outcomes.\u003csup\u003e\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e\u003c/sup\u003e Studies with smaller sample sizes and varying COVID-19 severity have demonstrated similar prevalence rates of CH, with no clear impact on outcomes.\u003csup\u003e\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e, \u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e\u003c/sup\u003e Autosomal mosaic chromosomal abnormalities, considered a form of CH, have been documented across large biobanks (n\u0026thinsp;=\u0026thinsp;768,762) and have been associated with increased infections, including a higher incidence of sepsis, respiratory tract infections, gastrointestinal and genitourinary infections.\u003csup\u003e\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e\u003c/sup\u003e\u003c/p\u003e \u003cp\u003eIn a cohort of 243 community patients, we demonstrate an age- and comorbidity-independent adverse impact of \u003cem\u003eDNMT3A\u003c/em\u003emt CH on COVID-19 related inflammatory outcomes and overall survival (OS). \u003cem\u003eDNMT3A\u003c/em\u003emt CH was the most frequent CH subtype, with DNA methylation studies showing a decrease in CpG site DNA methylation, mostly in distal enhancer-like elements, compared to other CH genotypes and wildtype cases. 
Single cell proteogenomics indicated that \u003cem\u003eDNMT3A\u003c/em\u003e mutations were distributed across lymphoid and myeloid lineage cells, unlike in \u003cem\u003eTET2\u003c/em\u003emt CH, where the mutations were enriched in monocytes/macrophages. Single-cell transcriptomics demonstrated an increased expression of \u003cem\u003eIL32\u003c/em\u003e, originating from NK and T lymphocytes and to some extent from classical monocytes. IL32 is a proinflammatory cytokine that can lead to monocyte/macrophage related inflammasome activation and production of cytokines such as TNF-alpha, IL8 and MIP-2.\u003csup\u003e\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e\u003c/sup\u003e Using proximity extension assay-based proteomics (O-link), we demonstrate that higher IL32 levels are associated with increased mortality. Finally, using a combination of single cell multiome and genotyping of targeted loci with chromatin accessibility (GoTChA), we identify putative epigenetic mechanisms regulating this response. These data highlight the role of \u003cem\u003eDNMT3A\u003c/em\u003emt CH in enhancing immune cell dysregulation in the context of COVID-19 infection, which may account for increased disease severity.\u003c/p\u003e"},{"header":"RESULTS","content":"\u003cp\u003e \u003cem\u003eDNMT3Amt CH in patients with COVID-19 is associated with increased severity of cytokine release and increased age- and comorbidity-independent mortality\u003c/em\u003e \u003c/p\u003e \u003cp\u003eA cohort of 243 community-based patients with COVID-19 (alpha strain, pre-immunization era) was included in the study, median age 60 years (range 19\u0026ndash;99 years), of whom 72 (29.6%) had evidence of CH (Supp. table 1 and Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eA). 
Apart from the fact that patients with both COVID-19 and CH were older (median age for CH with COVID-19 68.5 years versus 57 years for CH without COVID-19; p\u0026thinsp;\u0026lt;\u0026thinsp;0.0001), there were no significant differences in sex (p\u0026thinsp;=\u0026thinsp;0.82), race/ethnicity (p\u0026thinsp;=\u0026thinsp;0.07), hospitalization rates (p\u0026thinsp;=\u0026thinsp;0.99), oxygen requirements (p\u0026thinsp;=\u0026thinsp;0.79), or incidence of CRS (p\u0026thinsp;=\u0026thinsp;0.53). There were differences in the distribution of comorbidities between the two groups (p\u0026thinsp;=\u0026thinsp;0.008), with the non-CH group having a higher frequency of obesity (Supp. table 1). There, however, were no differences in baseline blood indices (mean corpuscular volume and red cell distribution width) between the two groups, in hemoglobin levels (p\u0026thinsp;=\u0026thinsp;0.099), absolute neutrophil counts (ANC, p\u0026thinsp;=\u0026thinsp;0.15), absolute monocyte counts (AMC; p\u0026thinsp;=\u0026thinsp;0.86), or platelet counts (p\u0026thinsp;=\u0026thinsp;0.41), respectively (Supp. table 2). Apart from elevated MCP-1 (monocyte chemoattractant protein-1) levels obtained at COVID-19 diagnosis in the COVID-19 CH cohort compared to the COVID-19 cohort without CH (p\u0026thinsp;=\u0026thinsp;0.045), there were no other significant differences in clinically measured cytokines / chemokines, including IL1b, IL6, GM-CSF and TNF-alpha, or inflammatory surrogates like C-reactive protein (p\u0026thinsp;=\u0026thinsp;0.087) and serum ferritin (p\u0026thinsp;=\u0026thinsp;0.62) (Supp. table 2 and 3).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eNinety-seven CH mutations were seen in 72 patients, with 21 (29%) having 2 CH mutations, 1 (1.3%) having 3 CH mutations and 2 (2.7%) having 4 CH mutations, respectively (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eA). 
While none of these patients underwent bone marrow (BM) biopsies, they did not have an underlying diagnosed hematological neoplasm at the time of COVID-19 detection and none of these patients demonstrated disease evolution at last follow-up (median 27 months). The most common CH mutations encountered included \u003cem\u003eDNMT3A\u003c/em\u003e (n\u0026thinsp;=\u0026thinsp;30, 30%), \u003cem\u003eTET2\u003c/em\u003e (n\u0026thinsp;=\u0026thinsp;26, 28%), \u003cem\u003eASXL1\u003c/em\u003e (n\u0026thinsp;=\u0026thinsp;7, 9.7%), \u003cem\u003eSF3B1\u003c/em\u003e (n\u0026thinsp;=\u0026thinsp;6, 8%), \u003cem\u003eTP53\u003c/em\u003e (n\u0026thinsp;=\u0026thinsp;3, 4%) and \u003cem\u003ePPM1D\u003c/em\u003e (n\u0026thinsp;=\u0026thinsp;3, 4%), respectively (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eA). None of the \u003cem\u003eTP53\u003c/em\u003e and \u003cem\u003ePPM1D\u003c/em\u003e mutated patients had prior exposures to cytotoxic chemotherapy or ionizing radiation therapy. Four patients had 2 \u003cem\u003eDNMT3A\u003c/em\u003e mutations, 2 patients had 2 \u003cem\u003eTET2\u003c/em\u003e mutations, and one patient had both a \u003cem\u003eTET2\u003c/em\u003e and a \u003cem\u003eDNMT3A\u003c/em\u003e mutation. There were 4 patients with \u003cem\u003eDNMT3A\u003c/em\u003e R882 hot spot mutations, commonly seen in AML and MDS. 
\u003csup\u003e\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e\u003c/sup\u003e\u003c/p\u003e \u003cp\u003ePatients with CH and COVID-19 had higher grades of CRS in comparison to those without CH as documented by CTCAE v5.0 criteria (p\u0026thinsp;=\u0026thinsp;0.006; Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eB) and by the WHO COVID-19 severity criteria (p\u0026thinsp;=\u0026thinsp;0.023).\u003csup\u003e17, \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e\u003c/sup\u003e There were no significant differences between the CH and no-CH groups with regard to incidence rates of acute lung injury (p\u0026thinsp;=\u0026thinsp;0.8), ARDS (p\u0026thinsp;=\u0026thinsp;0.49), acute kidney injury (p\u0026thinsp;=\u0026thinsp;0.5), MODS (p\u0026thinsp;=\u0026thinsp;0.09) and venous thromboembolism (p\u0026thinsp;=\u0026thinsp;0.76) (Supp. table 4). We then compared these outcomes in the two most common CH mutant groups, \u003cem\u003eDNMT3A\u003c/em\u003e and \u003cem\u003eTET2\u003c/em\u003e. \u003cem\u003eDNMT3A\u003c/em\u003emt CH patients had a higher frequency of ARDS in comparison to \u003cem\u003eTET2\u003c/em\u003emt CH (p\u0026thinsp;=\u0026thinsp;0.007; Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eC). The \u003cem\u003eDNMT3A\u003c/em\u003emt CH group also had significantly higher levels of MCP-1 in comparison to the \u003cem\u003eTET2\u003c/em\u003emt CH group (p\u0026thinsp;=\u0026thinsp;0.014; Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eD). Both groups received similar therapeutic interventions including remdesivir, IL-6 directed monoclonal antibody therapies (tocilizumab and siltuximab), corticosteroids, and access to clinical trials (p\u0026thinsp;=\u0026thinsp;0.45, Supp. table 4). 
At last follow-up (median 27 months), 16 deaths (6.5%) have been documented, 10 (4%) in COVID-19 patients with CH and 6 (2.4%) in COVID-19 patients without CH.\u003c/p\u003e \u003cp\u003eOn a univariate and multivariate survival analysis that included several clinical and laboratory variables, the presence of CH negatively impacted OS in patients with COVID-19 (p\u0026thinsp;=\u0026thinsp;0.001, Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eE). The one-month estimated survival rate was 97.6% for COVID-19 patients without CH and 81.8% in COVID-19 patients with CH (median OS not reached versus 13 months).\u003c/p\u003e \u003cp\u003eWe then analyzed the impact of \u003cem\u003eDNMT3A\u003c/em\u003e and \u003cem\u003eTET2\u003c/em\u003e mutations (\u003cb\u003eSupp.\u003c/b\u003e Figure\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e), the two most common somatic mosaic states in our cohort on COVID-19 related morbidity and mortality. There were no significant differences between the two mutational cohorts regarding age and other comorbidities. While both \u003cem\u003eTET2\u003c/em\u003emt CH and \u003cem\u003eDNMT3A\u003c/em\u003emt CH negatively impacted survival, after adjustment for age and comorbidities, only \u003cem\u003eDNMT3A\u003c/em\u003emt CH retained an independent prognostic effect (p\u0026thinsp;\u0026lt;\u0026thinsp;0.001; Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eF). 
We hence demonstrate the prevalence, clinical characteristics, and the age- and comorbidity-independent impact of \u003cem\u003eDNMT3Amt\u003c/em\u003e CH on inflammatory morbidity and overall mortality, in community-dwelling patients infected by the alpha strain of SARS-CoV-2.\u003c/p\u003e \u003cp\u003e \u003cem\u003eDNMT3Amt CH in the context of COVID-19 is associated with decreased DNA methylation at CpG residues in contrast to patients with TET2mt CH and COVID-19\u003c/em\u003e \u003c/p\u003e \u003cp\u003eGiven that \u003cem\u003eDNMT3A\u003c/em\u003e and \u003cem\u003eTET2\u003c/em\u003e have opposing impacts on DNA methylation, we first assessed DNA methylation status using the Illumina Infinium Methylation EPIC array on peripheral blood mononuclear cells (PBMC) from the COVID-19 and CH cohort. \u003cem\u003eDNMT3A\u003c/em\u003e mutations have been associated with DNA hypomethylation at key enhancer sites in granulocytes and mononuclear cells in patients with CH, with these elements known to regulate leukocyte function, inflammation, and adaptive immune responses.\u003csup\u003e\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e\u003c/sup\u003e We included 7 patients with CH and COVID-19 (\u003cem\u003eTET2mt\u003c/em\u003e \u0026ndash; 4 and \u003cem\u003eDNMT3Amt\u003c/em\u003e \u0026ndash; 3). 
Even though there were no significant global changes in DNA methylation between the two groups (p\u0026thinsp;=\u0026thinsp;0.057, Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eA), \u003cem\u003eDNMT3A\u003c/em\u003emt patients with COVID-19 demonstrated decreased methylation at highly methylated CpG sites (b\u0026thinsp;\u0026gt;\u0026thinsp;0.75) (Kolmogorov-Smirnov p\u0026thinsp;\u0026lt;\u0026thinsp;2.2x10\u003csup\u003e\u0026minus;\u0026thinsp;16\u003c/sup\u003e; Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eB).\u003csup\u003e\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e\u003c/sup\u003e Site-specific differential methylation analysis also revealed an increased number of hypomethylated sites in \u003cem\u003eDNMT3A\u003c/em\u003emt patients with COVID-19 in comparison to \u003cem\u003eTET2\u003c/em\u003emt patients with COVID-19, with 10,944 hypomethylated sites and 1,160 hypermethylated sites (Db\u0026thinsp;\u0026gt;\u0026thinsp;0.1 and p\u0026thinsp;\u0026lt;\u0026thinsp;0.01; Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eC). We then annotated the differentially methylated regions using the ENCODE Epigenomics Roadmap reference data.\u003csup\u003e\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e\u003c/sup\u003e We found that actively transcribed states (Tx, TxWk) were more commonly hypomethylated in \u003cem\u003eDNMT3A\u003c/em\u003emt CH compared to \u003cem\u003eTET2\u003c/em\u003emt CH. Even though there were fewer hypermethylated sites, these were more common at enhancers (Enh) and promoters (TssA, TssAFlnk) (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eD). 
Pathway analysis revealed that the hypomethylated sites are in or near genes involved in many diseases and functions related to inflammation and immune response, especially in leukocyte function (\u003cb\u003eSupp.\u003c/b\u003e Figure\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eA, B). Hence, we demonstrate site specific differential methylation between \u003cem\u003eDNMT3A\u003c/em\u003emt CH and \u003cem\u003eTET2\u003c/em\u003emt CH in patients with COVID-19, with more prominent hypomethylation occurring in actively transcribed regions in \u003cem\u003eDNMT3A\u003c/em\u003emt CH.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003cem\u003eSingle cell proteogenomics reveal that DNMT3A mutations involve myeloid and lymphoid cell lineages, unlike TET2 mutations which bias hematopoiesis towards monocytosis\u003c/em\u003e \u003c/p\u003e \u003cp\u003eWe carried out comprehensive proteogenomic assessments at single-cell resolution on PBMC, on 5 patients with COVID-19 and CH (\u003cem\u003eDNMT3A\u003c/em\u003emt\u0026ndash; 3, \u003cem\u003eTET2\u003c/em\u003emt\u0026ndash;1, \u003cem\u003eDNMT3A\u003c/em\u003e and \u003cem\u003eTET2\u003c/em\u003e co-mutant\u0026ndash;1) and 8 patients with COVID-19 and no CH (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eA). 
Given that the single cell DNA assay is an amplicon based assay and the fact that mutations in \u003cem\u003eTET2\u003c/em\u003e do not have common hot spot regions, we did have \u003cem\u003eTET2\u003c/em\u003emt patients in our cohort where the mutant regions were not covered by the amplicons used in our assay, limiting the number of COVID-19\u0026thinsp;+\u0026thinsp;\u003cem\u003eTET2\u003c/em\u003emt CH cases that we could genomically profile at the single cell level.\u003csup\u003e\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e\u003c/sup\u003e Two of 3 (66%) COVID-19\u0026thinsp;+\u0026thinsp;\u003cem\u003eDNMT3A\u003c/em\u003emt patients had 2 \u003cem\u003eDNMT3A\u003c/em\u003e mutations each, while the COVID-19\u0026thinsp;+\u0026thinsp;\u003cem\u003eTET2\u003c/em\u003emt patient also had a concomitant \u003cem\u003eCBL\u003c/em\u003e mutation, with normal blood counts and no monocytosis.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eIn total, we included 28,941 single cells in the final analysis, after rigorous quality control and exclusion of cells with allele drop out, as previously described (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eB and \u003cspan refid=\"Sec6\" class=\"InternalRef\"\u003emethods\u003c/span\u003e section).\u003csup\u003e\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e\u003c/sup\u003e Of these 28,941 sequenced cells, 2,004 (6.9%) had detectable CH mutations, of which 1,811 (90%) were \u003cem\u003eDNMT3A\u003c/em\u003emt, 361 (18%) were \u003cem\u003eTET2\u003c/em\u003emt (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eC) and 168 (8%) were co-mutated with both \u003cem\u003eTET2\u003c/em\u003emt and \u003cem\u003eDNMT3A\u003c/em\u003emt. 
In comparison to \u003cem\u003eTET2\u003c/em\u003emt CH, where CH mutations were largely present in classical and intermediate monocytes, in \u003cem\u003eDNMT3A\u003c/em\u003emt CH, the mutations were commonly seen in lymphoid lineage cells including CD4\u0026thinsp;+\u0026thinsp;and CD8\u0026thinsp;+\u0026thinsp;T-lymphocytes, T-regulatory cells and gamma delta T cells (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eD-E), a lineage bias that has previously been described.\u003csup\u003e\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e\u003c/sup\u003e\u003c/p\u003e \u003cp\u003eGiven the smaller sample size of CH mutant cells in the COVID-19 cohort, we conducted single cell proteogenomics on an additional 4 patients with CH (\u003cb\u003eSupp.\u003c/b\u003e Figure\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eA) who did not have COVID-19 and cumulatively re-analyzed the data. Of 36,557 single cells successfully sequenced, 3,314 (9%) had detectable CH mutations (\u003cb\u003eSupp.\u003c/b\u003e Figure\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eB-C). Among the mutated cells 1,503 (45%) were \u003cem\u003eTET2\u003c/em\u003emt, 1,643 (50%) were \u003cem\u003eDNMT3A\u003c/em\u003emt, and 168 (5%) were co-mutant cells (\u003cb\u003eSupp.\u003c/b\u003e Figure\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eC). 
In this larger data set we once again demonstrate an enrichment of \u003cem\u003eTET2\u003c/em\u003e mutations in classical and intermediate monocytes, while \u003cem\u003eDNMT3A\u003c/em\u003e mutations were commonly seen in lymphoid and myeloid lineage cells, especially T-lymphocytes (\u003cb\u003eSupp.\u003c/b\u003e Figure\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eC-E).\u003c/p\u003e \u003cp\u003eBased on these findings, we conclude that \u003cem\u003eDNMT3A\u003c/em\u003e mutations are distributed in myeloid and lymphoid lineage cells, whereas \u003cem\u003eTET2\u003c/em\u003e mutations have a clear myeloid (monocytic) biased distribution.\u003c/p\u003e \u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003eSingle cell transcriptomic analysis of DNMT3A and TET2 mutant patient samples during inflammatory response\u003c/h2\u003e \u003cp\u003eTo explore the underlying differences in expression between \u003cem\u003eDNMT3A\u003c/em\u003e and \u003cem\u003eTET2\u003c/em\u003e mutants in the context of SARS-CoV-2 infection we used single cell RNA-seq from PBMC from a cohort of 24 patients, which included 15 patients with COVID-19 and no CH, 6 COVID-19 patients with \u003cem\u003eTET2\u003c/em\u003e mutations and 3 COVID-19 patients with \u003cem\u003eDNMT3A\u003c/em\u003e mutations (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eA). By pooling 78,083 cells from all 24 patients and subjecting them to dimensionality reduction using PCA and UMAP, followed by cell type identification using SingleR (see \u003cspan refid=\"Sec6\" class=\"InternalRef\"\u003emethods\u003c/span\u003e section and \u003cb\u003eSupp.\u003c/b\u003e Figure\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eA), we identified the typical repertoire of lymphoid and myeloid cells (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eB-C). 
For the comparison of the different cell types, we also included published data from healthy individuals separated into two age groups (under and over 50 years).\u003csup\u003e\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e\u003c/sup\u003e The most noticeable difference was the enrichment of classical and intermediate monocytes in patients with \u003cem\u003eTET2\u003c/em\u003e mutations, while patients with \u003cem\u003eDNMT3A\u003c/em\u003e mutations show enrichment of CD8\u0026thinsp;+\u0026thinsp;and gamma delta T lymphocytes, plasma blasts, NK cells and non-classical monocytes (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eC and \u003cb\u003eSupp.\u003c/b\u003e Figure\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eB).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eTo further investigate a potential association of \u003cem\u003eDNMT3A\u003c/em\u003e mutations with COVID-19 severity, we performed differential gene expression analysis within each cell type, first comparing COVID-19\u0026thinsp;+\u0026thinsp;\u003cem\u003eDNMT3A\u003c/em\u003emt CH cells with COVID-19\u0026thinsp;+\u0026thinsp;no CH cells and found an overall increase in \u003cem\u003eIL32\u003c/em\u003e expression in CD4\u0026thinsp;+\u0026thinsp;and CD8\u0026thinsp;+\u0026thinsp;T lymphocytes, regulatory T cells and NK cells, a potential biomarker of severity that has not been documented in plasma samples from patients with COVID-19.\u003csup\u003e\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e\u003c/sup\u003e (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eD). 
We then compared COVID-19\u0026thinsp;+\u0026thinsp;\u003cem\u003eDNMT3A\u003c/em\u003emt CH cells with COVID-19\u0026thinsp;+\u0026thinsp;\u003cem\u003eTET2\u003c/em\u003emt CH cells and found the same cell type specific overexpression pattern for \u003cem\u003eIL32\u003c/em\u003e, as described above (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eE, F). Pathway analysis of differentially expressed genes in this comparison revealed an enrichment of genes involved in lymphocyte proliferation, migration of blood cells, cytotoxicity of lymphocytes and NK cells and joint inflammation (\u003cb\u003eSupp.\u003c/b\u003e Figure\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eC).\u003c/p\u003e \u003cp\u003eGiven that IL32 has not specifically been implicated in COVID-19 severity, we analyzed the abundance of cytokines in patients with COVID-19 from our cohort of 223 assessable patients using a multiplex proteogenomic panel (O-link- \u003cspan refid=\"Sec6\" class=\"InternalRef\"\u003emethods\u003c/span\u003e section for details). 
There were no significant differences in IL32 levels between \u003cem\u003eDNMT3A\u003c/em\u003emt and \u003cem\u003eTET2\u003c/em\u003emt CH COVID-19 patients, or between COVID-19 patients with and without CH. However, when we assessed this cohort for COVID-19 related morbidity and mortality according to relative levels of IL32, we found that COVID-19 patients with higher IL32 levels had a higher mortality than those with lower levels (\u003cb\u003eSupp.\u003c/b\u003e Figure\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eD).\u003c/p\u003e \u003cp\u003eHere we demonstrate a lymphoid lineage enrichment for \u003cem\u003eDNMT3A\u003c/em\u003emt CH in comparison to \u003cem\u003eTET2\u003c/em\u003emt CH, along with an inflammatory transcriptional signature resulting in overexpression of \u003cem\u003eIL32\u003c/em\u003e in \u003cem\u003eDNMT3A\u003c/em\u003emt CH, with higher IL32 protein levels correlating with increased mortality in the context of COVID-19.\u003c/p\u003e \u003cp\u003e \u003cem\u003eEpigenetic up-regulation of IL32 occurs due to increased chromatin accessibility of a transcriptional program seen in CH patients with DNMT3A mutations\u003c/em\u003e \u003c/p\u003e \u003cp\u003eSince \u003cem\u003eDNMT3A\u003c/em\u003e and \u003cem\u003eTET2\u003c/em\u003e are known to regulate chromatin accessibility with opposing effects across the genome, we conducted single-cell profiling of both gene expression (scRNA-seq) and open chromatin (scATAC-seq) from the same PBMC samples using the 10X Genomics Multiome platform (see methods for details).\u003csup\u003e\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e\u003c/sup\u003e A cohort of 11 COVID-19 patients, comprising 6 patients without CH, 3 patients with \u003cem\u003eTET2\u003c/em\u003emt CH and 2 patients with \u003cem\u003eDNMT3A\u003c/em\u003emt CH, was selected (\u003cb\u003eSupp.\u003c/b\u003e Figure\u0026nbsp;\u003cspan 
refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eA). We pooled 25,725 single cells and performed dimensionality reduction using PCA and UMAP analysis, followed by cell type identification by mapping the expression data from Multiome onto the scRNA-seq data and transferring labels using Azimuth (\u003cb\u003eSupp.\u003c/b\u003e Figure\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eC). From each cell type, we were able to gather expression signatures and open chromatin profiles, expressed as cut site counts (number of sites cut by the Tn5 transposase \u0026ndash; a direct measure of accessibility of chromatin to the transposase). Using this technology, we were able to validate the lymphoid lineage enrichment in \u003cem\u003eDNMT3A\u003c/em\u003emt CH, comprising CD4\u0026thinsp;+\u0026thinsp;T-lymphocytes, regulatory T cells, B-lymphocytes, and plasmablasts, in comparison to \u003cem\u003eTET2\u003c/em\u003emt CH, where monocytic enrichment was more prominent (classical and intermediate monocytes, dendritic cells and CD8\u0026thinsp;+\u0026thinsp;T-lymphocytes; Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eA, \u003cb\u003eSupp.\u003c/b\u003e Figure\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eB). As chromatin accessibility is reflective of the active enhancer and promoter structure and is strongly associated with DNA methylation status, we aimed to provide mechanistic insights into the deregulation of the epigenetic landscape in \u003cem\u003eDNMT3A\u003c/em\u003emt CH in comparison with \u003cem\u003eTET2\u003c/em\u003emt CH. 
Notably, analysis of the global distribution of cut sites and differentially accessible peaks showed that there was increased chromatin accessibility in \u003cem\u003eDNMT3A\u003c/em\u003emt CH, especially in the two main cell types overexpressing \u003cem\u003eIL32\u003c/em\u003e, CD4\u0026thinsp;+\u0026thinsp;T lymphocytes and NK cells (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eB, \u003cb\u003eSupp.\u003c/b\u003e Figure\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eD). Moreover, co-accessibility analysis revealed an enrichment of \u003cem\u003ecis\u003c/em\u003e-regulatory interactions associated with expression of up-regulated genes such as \u003cem\u003eIL32\u003c/em\u003e in \u003cem\u003eDNMT3A\u003c/em\u003emt CH cells by identifying several candidate enhancers linked to the transcription start site of \u003cem\u003eIL32\u003c/em\u003e (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eC).\u003csup\u003e\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e\u003c/sup\u003e The epigenomic landscape around \u003cem\u003eIL32\u003c/em\u003e containing the ENCODE \u003cem\u003ecis\u003c/em\u003e-regulatory elements mapped with cell-specific open chromatin regions identified by scATAC analysis also allowed us to identify overlapping hypomethylated CpG regions in \u003cem\u003eDNMT3A\u003c/em\u003emt CH patients (cg01100763, cg09294055, cg04519177) obtained from bulk PBMC DNA methylation data, suggesting a direct link between loss of methylation and increased chromatin accessibility (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eC, D). 
To understand transcription factors (TF) likely to be affected by \u003cem\u003eDNMT3A\u003c/em\u003e and \u003cem\u003eTET2\u003c/em\u003e CH mutations, we first performed analysis of differentially accessible TF binding sites by differential enrichment analysis directly from ChIP-seq datasets found in the literature.\u003csup\u003e\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e\u003c/sup\u003e Expression of some of these transcription factors was also analyzed in the same cell types to ensure that the chromatin accessibility analysis could also reflect changes in expression levels (\u003cb\u003eSupp\u003c/b\u003e Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eE, F). Interestingly, the IRF family of transcription factors was enriched both in transcription (scRNA) and in TF activity (scATAC) in \u003cem\u003eDNMT3A\u003c/em\u003emt CH, suggesting that a specific transcriptional program driven by IRF is involved in the pro-inflammatory response, particularly in CD4\u0026thinsp;+\u0026thinsp;T lymphocytes. Finally, to assess if the epigenetic dysregulation in \u003cem\u003eDNMT3A\u003c/em\u003emt CH can be traced back to the mutant cells, we performed genotyping of targeted loci with single-cell chromatin accessibility (GoTChA) in two patient samples with high VAF for a known \u003cem\u003eDNMT3A\u003c/em\u003e loss of function mutation (\u003cb\u003eSupp. 
Figure\u0026nbsp;6\u0026ndash;8\u003c/b\u003e).\u003csup\u003e\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e\u003c/sup\u003e Comparing number of cutsites in wild type and mutant cells from the same sample revealed a similar pattern of increased open chromatin in \u003cem\u003eDNMT3A\u003c/em\u003emt CH cells both globally (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eE) and in a genomic locus specific manner, at CpG sites affected around the \u003cem\u003eIL32\u003c/em\u003e locus (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eF). With one exception (Locus A in patient 2), \u003cem\u003eDNMT3A\u003c/em\u003e mutations were directly associated with higher ATAC signal, indicating a direct link between loss of function of the DNA methyltransferase activity and increased chromatin accessibility.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e"},{"header":"DISCUSSION","content":"\u003cp\u003eClonal hematopoiesis is defined by the acquisition of somatic mutations in HSPC, with the capacity to expand over time, with evolving cell intrinsic and extrinsic selection pressures.\u003csup\u003e\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e, \u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e, \u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e\u003c/sup\u003e CH is ubiquitous in the aging population, with studies demonstrating hematopoiesis to be largely oligoclonal in the elderly.\u003csup\u003e\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e\u003c/sup\u003e \u003cem\u003eDNMT3A\u003c/em\u003e and \u003cem\u003eTET2\u003c/em\u003e are the two most common age-related CH mutated genes, with both being critical regulators of DNA methylation.\u003csup\u003e\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e, \u003cspan 
citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e, \u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e, \u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e\u003c/sup\u003e While age related CH is associated with a low risk of hematological neoplasms, its presence is associated with increased all-cause mortality, largely due to cardiovascular disease.\u003csup\u003e\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e, \u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e, \u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e, \u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e, \u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e\u003c/sup\u003e This is believed to be secondary to pervasive inflammatory transcriptional priming and inflammasome activation associated with these mutations.\u003csup\u003e\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e, \u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e, \u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e\u003c/sup\u003e While the clinical impact of these two mutations is convergent, their impact on the epigenome is not. 
\u003cem\u003eDNMT3A\u003c/em\u003e mutations are mostly loss of function mutations that lead to protein instability and loss of methyltransferase activity, resulting in DNA hypomethylation, whereas \u003cem\u003eTET2\u003c/em\u003e mutations are either truncating or hypomorphic, abrogating the catalytic activity of TET2, resulting in DNA hypermethylation.\u003csup\u003e\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e, \u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e, \u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e, \u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e\u003c/sup\u003e In addition, lineage restriction analysis has shown that while \u003cem\u003eDNMT3A\u003c/em\u003e mutations involve myeloid and lymphoid cell lineages, \u003cem\u003eTET2\u003c/em\u003e mutations are more myeloid restricted with a clear myelomonocytic bias.\u003csup\u003e\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e, \u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e\u003c/sup\u003e\u003c/p\u003e \u003cp\u003eIn this study, using COVID-19 as a model for severe inflammation and applying bulk and single cell multiomics to patient samples obtained prior to the advent of the SARS-CoV-2 vaccine, we demonstrate the negative impact of CH on inflammatory morbidity (CRS) and mortality. We show that while this was accounted for by increasing age in \u003cem\u003eTET2\u003c/em\u003emt CH patients, \u003cem\u003eDNMT3A\u003c/em\u003emt CH remained an independent adverse prognosticator. 
While several host and viral factors impacting COVID-19 severity have been described, we demonstrate the relevance of \u003cem\u003eDNMT3A\u003c/em\u003emt CH in this context.\u003csup\u003e\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e, \u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e, \u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e, \u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e, \u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e\u003c/sup\u003e Prior studies have shown a conflicting impact of CH on COVID-19 related morbidity and mortality.\u003csup\u003e\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e, \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e, \u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e, \u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e\u003c/sup\u003e While there could be several confounding factors explaining these discrepancies, our study population included unselected community dwelling individuals infected by the alpha strain of SARS-CoV-2, prior to immunization and without any underlying hematological or visceral neoplasms, or immunodeficiency states.\u003c/p\u003e \u003cp\u003eThe frequency of CH in the COVID-19 cohort was 29.6%, with \u003cem\u003eDNMT3A\u003c/em\u003e (30%) and \u003cem\u003eTET2\u003c/em\u003e (28%) being the two most mutated CH genes, consistent with prior observations.\u003csup\u003e\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e, \u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e\u003c/sup\u003e While the \u003cem\u003eDNMT3A\u003c/em\u003e and \u003cem\u003eTET2\u003c/em\u003emt CH groups were well balanced with regards to baseline blood counts and comorbidities, \u003cem\u003eDNMT3A\u003c/em\u003emt CH patients 
had higher levels of MCP-1, had a higher likelihood of ARDS and had higher grades of CRS as judged with CTCAE v5.0 criteria and by the WHO COVID-19 severity criteria.\u003csup\u003e\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e, \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e\u003c/sup\u003e There was no further stratification of this effect based on CH mutational VAF or the number of CH mutations. None of the patients included in this cohort, at last follow-up, had evidence of a hematological neoplasm. MCP-1, also called CCL2, is a key chemokine that regulates the migration of monocytes and macrophages in response to inflammation and has been implicated as a biomarker of COVID-19 severity in the recent past.\u003csup\u003e\u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e, \u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e\u003c/sup\u003e\u003c/p\u003e \u003cp\u003eWe then conducted methylation studies using the Illumina Methylation EPIC array in select cases, and while there were no global differences in DNA methylation, site-specific analysis revealed an increased number of hypomethylated sites in \u003cem\u003eDNMT3A\u003c/em\u003emt versus \u003cem\u003eTET2\u003c/em\u003emt patients with COVID-19. Using the ENCODE Epigenomics Roadmap reference data,\u003csup\u003e\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e\u003c/sup\u003e we demonstrate that actively transcribed states (Tx, TxWk) were more commonly hypomethylated in \u003cem\u003eDNMT3A\u003c/em\u003emt CH compared to \u003cem\u003eTET2\u003c/em\u003emt CH, with pathway analysis revealing that the hypomethylated sites were in or near genes involved in several diseases and functions related to inflammation. 
This is consistent with prior observations assessing DNA methylation on whole blood samples in patients with CH, CH-associated cytopenias and AML.\u003csup\u003e\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e\u003c/sup\u003e Given the lack of a scalable single cell methylation assay, we were not able to validate these findings at the single cell level with appropriate adjustment for somatic mosaicism, and we acknowledge this limitation.\u003c/p\u003e \u003cp\u003eWe then assessed the distribution of \u003cem\u003eDNMT3A\u003c/em\u003e and \u003cem\u003eTET2\u003c/em\u003emt CH in the COVID-19 cohort using single-cell proteogenomics and validated observations from prior lineage restriction analyses: while \u003cem\u003eDNMT3A\u003c/em\u003emt CH involved myeloid and lymphoid lineage cells, \u003cem\u003eTET2\u003c/em\u003emt CH had a clear myeloid restriction, with a myelomonocytic bias.\u003csup\u003e\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e\u003c/sup\u003e These observations were also validated with single cell RNA and multiome-seq data. 
\u003cem\u003eTET2\u003c/em\u003e mutations were enriched in classical and intermediate monocytes, reflective of a granulocyte monocyte-biased hematopoiesis (GMP-bias) and classical monopoiesis, which has been well documented in \u003cem\u003eTET2\u003c/em\u003e-driven hematological neoplasms such as chronic myelomonocytic leukemia and might explain the differential impact on inflammatory morbidity and mortality seen in the context of COVID-19.\u003csup\u003e\u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e, \u003cspan citationid=\"CR42\" class=\"CitationRef\"\u003e42\u003c/span\u003e, \u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e43\u003c/span\u003e\u003c/sup\u003e Differential gene expression analysis comparing \u003cem\u003eDNMT3A\u003c/em\u003emt CH patients with those without CH, and those with \u003cem\u003eTET2\u003c/em\u003emt CH, demonstrated an overall increase in \u003cem\u003eIL32\u003c/em\u003e expression in CD4\u0026thinsp;+\u0026thinsp;and CD8\u0026thinsp;+\u0026thinsp;T lymphocytes, regulatory T cells and NK cells in patients with \u003cem\u003eDNMT3A\u003c/em\u003emt CH. Pathway analysis of differentially expressed genes revealed an enrichment of genes involved in lymphocyte proliferation, migration of blood cells, cytotoxicity of lymphocytes and NK cells, and joint inflammation. 
On O-link based cytokine analysis, while there were no significant differences in IL32 levels between \u003cem\u003eDNMT3A\u003c/em\u003emt and \u003cem\u003eTET2\u003c/em\u003emt CH COVID-19 patients, or between COVID-19 patients with and without CH, relative increments in IL32 levels were associated with a higher mortality, in comparison to those with lower levels.\u003c/p\u003e \u003cp\u003eIL32 is a proinflammatory cytokine, initially detected in activated NK cells and T-lymphocytes, whose expression is strongly enhanced by microbes, mitogens, and inflammatory stimuli.\u003csup\u003e\u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e\u003c/sup\u003e It can amplify production of other inflammatory cytokines including IL1b, IL6 and TNF-a and has not been reported as a biomarker of severity in COVID-19, or CH, thus far. We speculate that \u003cem\u003eIL32\u003c/em\u003e expression in \u003cem\u003eDNMT3A\u003c/em\u003emt CH is enhanced in the context of inflammatory stimuli such as COVID-19. To better understand the regulatory mechanism behind \u003cem\u003eIL32\u003c/em\u003e overexpression in \u003cem\u003eDNMT3A\u003c/em\u003emt CH, we conducted single cell multiome profiling using the 10X Genomics Multiome platform. On a global distribution analysis of cut sites and differentially accessible peaks, we found increased chromatin accessibility in \u003cem\u003eDNMT3A\u003c/em\u003emt CH, especially in CD4\u0026thinsp;+\u0026thinsp;T lymphocytes and NK cells, the two cell types with predominant \u003cem\u003eIL32\u003c/em\u003e overexpression. Co-accessibility analysis revealed an enrichment of \u003cem\u003ecis\u003c/em\u003e-regulatory interactions associated with expression of \u003cem\u003eIL32\u003c/em\u003e in \u003cem\u003eDNMT3A\u003c/em\u003emt CH cells, identifying candidate enhancers linked to the transcription start site of \u003cem\u003eIL32\u003c/em\u003e. 
We found that the IRF family of transcription factors was enriched both in transcription and in TF activity in \u003cem\u003eDNMT3A\u003c/em\u003emt CH, particularly in CD4\u0026thinsp;+\u0026thinsp;T lymphocytes. Finally, to address issues with somatic mosaicism, we performed genotyping of targeted loci with single-cell chromatin accessibility (GoTChA) in two patients with \u003cem\u003eDNMT3A\u003c/em\u003emt clonal cytopenias and found a similar pattern of increased open chromatin in \u003cem\u003eDNMT3A\u003c/em\u003emt CH cells, both globally and in a genomic locus-specific manner, at CpG sites affected around the \u003cem\u003eIL32\u003c/em\u003e locus.\u003c/p\u003e"},{"header":"CONCLUSION","content":"\u003cp\u003eIn summary, our results validate \u003cem\u003eDNMT3A\u003c/em\u003emt CH as an age- and comorbidity-independent risk factor for severe COVID-19. \u003cem\u003eDNMT3A\u003c/em\u003emt CH involves myeloid and lymphoid lineage cells and, in the context of inflammatory stimuli, results in the overexpression of \u003cem\u003eIL32\u003c/em\u003e, a highly proinflammatory cytokine, predominantly in T and NK cells. This regulation is mediated in part by \u003cem\u003eDNMT3A\u003c/em\u003emt-associated changes in chromatin accessibility, allowing transcription factors such as the IRF family to mediate transcriptional activation at \u003cem\u003eIL32\u003c/em\u003e promoter sites.\u003c/p\u003e"},{"header":"METHODS","content":"\u003cdiv id=\"Sec7\" class=\"Section2\"\u003e \u003ch2\u003eAuthorizations, patient cohorts, cell collection and sorting\u003c/h2\u003e \u003cp\u003e This study was conducted at the Mayo Clinic in Rochester, Minnesota, after approval from the Mayo Clinic Institutional Review Board (IRB #20-005400, IRB #16-004173). 
In all cases, diagnosis was according to the 2016 iteration of the WHO classification of myeloid malignancies.\u003csup\u003e\u003cspan citationid=\"CR45\" class=\"CitationRef\"\u003e45\u003c/span\u003e\u003c/sup\u003e PB and BM samples were collected in EDTA tubes after informed consent.\u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003eTarget Capture Chip Assay for Bulk Sequencing\u003c/h3\u003e\n\u003cp\u003eSample DNA was extracted from peripheral blood mononuclear cells isolated by gradient centrifugation and re-suspended at a concentration of 500 ng in 50\u0026micro;l of low TE buffer. Paired-end indexed libraries were prepared using the SureSelect XT Low Input Library prep protocol on the Agilent Bravo liquid handler following the manufacturer\u0026rsquo;s protocol (New England Biolabs, Ipswich, MA, and Agilent Technologies, Ankeny, IA). Briefly, 200 ng of target DNA was fragmented using the Covaris LE220 plus sonicator. The settings of duty factor 30%, peak incident power (PIP) 450, cycles per burst 200, time 180 seconds, generated double-stranded DNA fragments with blunt or sticky ends with a fragment size mode of between 150\u0026ndash;200 bp. The ends were repaired using the SureSelect End-Repair-A-Tailing enzyme mix. Adapter ligated DNA fragments were size-selected to enrich for 200 bp inserts (~\u0026thinsp;320 bp total library size) using AMPure XP bead purification. The size-selected adapter-modified fragments were enriched, and specific indexes were added by 12 cycles of PCR using universal index primers. The concentration and size distribution of the libraries were determined on an Agilent Bioanalyzer DNA 1000 chip.\u003c/p\u003e \u003cp\u003eThe Custom Capture hybrid-target enrichment probes were designed using Agilent SureSelect design software (Agilent Technologies, Santa Clara, CA). 
The targeted gene panel comprised 62,962 single probes spanning 1.805 Mbp, and covered the coding regions, UTRs, and overlapping intron/exon regions for 205 genes described and/or enriched for CH mutations (see Appendix: Target Gene Panel). The custom capture was carried out using the Agilent Bravo liquid handler following Agilent\u0026rsquo;s SureSelect XT Low. Purified capture products were then amplified using the SureSelect Post-Capture primer mix for 14 cycles. Libraries were validated and quantified on the Agilent Bioanalyzer. Samples were sequenced by 150 paired end reads, 21 samples to a flow cell, on an Illumina NovaSeq 6000 SP (Illumina, San Diego, CA).\u003c/p\u003e \u003cp\u003eSecondary bioinformatics analysis included quality assessment and alignment to the hg19 build reference genome using Novoalign (Novocraft Technologies, Malaysia), followed by GATK-based single nucleotide and small insertion/deletion variant calling, structural variation discovery, and annotation. The quality of sequencing chemistry was evaluated using FastQC. After alignment, PCR duplication rates and percent reads mapped on target were used to assess the quality of the sample preparations. 
Realignment and recalibration steps were implemented in the GATK.\u003csup\u003e\u003cspan citationid=\"CR46\" class=\"CitationRef\"\u003e46\u003c/span\u003e\u003c/sup\u003e Somatic single nucleotide variations (SNVs) were then genotyped using SomaticSniper, whereas insertions and deletions were called by GATK Somatic Indel Detector.\u003csup\u003e\u003cspan citationid=\"CR46\" class=\"CitationRef\"\u003e46\u003c/span\u003e, \u003cspan citationid=\"CR47\" class=\"CitationRef\"\u003e47\u003c/span\u003e\u003c/sup\u003e Each variant in coding regions was functionally annotated by SnpEff, SAVANT, ClinVar, dbNSFP, OMIM, and the Human Gene Annotation Database to predict biological effects.\u003csup\u003e\u003cspan citationid=\"CR48\" class=\"CitationRef\"\u003e48\u003c/span\u003e, \u003cspan citationid=\"CR49\" class=\"CitationRef\"\u003e49\u003c/span\u003e, \u003cspan citationid=\"CR50\" class=\"CitationRef\"\u003e50\u003c/span\u003e, \u003cspan citationid=\"CR51\" class=\"CitationRef\"\u003e51\u003c/span\u003e, \u003cspan citationid=\"CR52\" class=\"CitationRef\"\u003e52\u003c/span\u003e, \u003cspan citationid=\"CR53\" class=\"CitationRef\"\u003e53\u003c/span\u003e\u003c/sup\u003e Each variant was also annotated with allele frequency from the Exome Aggregation Consortium.\u003csup\u003e\u003cspan citationid=\"CR54\" class=\"CitationRef\"\u003e54\u003c/span\u003e\u003c/sup\u003e Variants of significant interest were visually inspected using IGV.\u003csup\u003e\u003cspan citationid=\"CR55\" class=\"CitationRef\"\u003e55\u003c/span\u003e\u003c/sup\u003e The total list of all variants was compared internally for duplicates indicative of false positives, and variants of concern were additionally hand annotated for identification using Alamut Software. 
Interpretation for relevant alterations included absence in international normal variant allele databases (GnomAD, ExAC), deleterious effect on protein function by multiple phenotype prediction models, somatic and functional annotation in literature, consequence of variant (nonsense, truncating, etc.) and location proximal to important domains.\u003c/p\u003e\n\u003ch3\u003eSingle-Cell Proteogenomics\u003c/h3\u003e\n\u003cp\u003eSingle-cell DNAseq\u0026thinsp;+\u0026thinsp;proteogenomics was performed using the Mission Bio Tapestri platform according to the manufacturer\u0026rsquo;s specifications. Briefly, a cocktail of pre-titered oligo-linked antibodies targeting 42 unique cell-surface markers and 3 antibodies for isotype control, TotalSeq\u0026trade; D Human Heme Oncology Kit (BioLegend\u0026reg;, San Diego, CA [\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.biolegend.com/en-us/totalseq/single-cell-dna\u003c/span\u003e\u003cspan address=\"https://www.biolegend.com/en-us/totalseq/single-cell-dna\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e]) was utilized to capture the cell immunophenotype, along with targeted myeloid scDNAseq genotyping panel targeted against 45 genes with 312 amplicons covering relevant genes dysregulated in myeloid disorders (designed and manufactured by Mission Bio, San Francisco, CA [\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://designer.missionbio.com/catalogpanels/Myeloid\u003c/span\u003e\u003cspan address=\"https://designer.missionbio.com/catalogpanels/Myeloid\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003cspan type=\"Underline\" class=\"Underline\" name=\"Emphasis\"\u003e]).\u003c/span\u003e See Appendices: Myeloid Panel Coverage and Myeloid Panel Design for specific genomic coordinates. 
Cryopreserved PBMC patient samples were gently thawed, washed, and run through a Dead Cell Removal Kit (Miltenyi, Auburn, CA). The resulting viable cell fraction was counted, blocked with Human TruStain FcX\u0026trade; (BioLegend\u0026reg;) and re-suspended at a concentration of 25,000 cells/\u0026micro;L. This fraction was incubated with the BioLegend TotalSeq Kit Cocktail described above, washed, filtered through a Flowmi cell strainer (MilliporeSigma, St. Louis, MO), and quantified using a Countess II cell counter. The cells were then diluted to a concentration of 4,000 cells per \u0026micro;L in Cell Buffer. Next, 35 \u0026micro;L of cell suspension was loaded onto a microfluidics cartridge and cells were encapsulated on the Tapestri instrument, followed by cell lysis with protease digestion and heat inactivation using a thermal cycler. The cell lysate was reintroduced into the cartridge and processed such that each cell possessed a unique molecular barcode. Amplification of the targeted DNA regions was performed by incubating the barcoded DNA emulsions in a thermocycler as follows: 98\u0026deg;C for 6 min (4\u0026deg;C per s); ten cycles of 95\u0026deg;C for 30s, 72\u0026deg;C for 10s, 61\u0026deg;C for 9 min, 72\u0026deg;C for 20s (1\u0026deg;C per s); ten cycles of 95\u0026deg;C for 30s, 72\u0026deg;C for 10s, 48\u0026deg;C for 9 min, 72\u0026deg;C for 20s (1\u0026deg;C per s); and 72\u0026deg;C for 6 min (4\u0026deg;C per s). Emulsions were broken, and the DNA was digested and purified with 0.72X AMPure XP beads (Beckman Coulter). The beads were pelleted and washed with 80% ethanol, and final libraries were generated by amplification with Mission Bio V2 index primers in the thermocycler using the following program: 95\u0026deg;C for 3 min; 10 cycles for DNA libraries or 16\u0026ndash;20 cycles for protein libraries of 98\u0026deg;C for 20s, 62\u0026deg;C for 20s, 72\u0026deg;C for 45s; 72\u0026deg;C for 2 min. Final libraries were purified with 0.69X AMPure XP beads. 
All libraries were sized and quantified using an Agilent Bioanalyzer and pooled for sequencing on an Illumina NovaSeq 6000 SP with 2 x 150bp multiplexed runs. FASTQ files generated by sequencers were processed using the Tapestri Pipeline V2 which handles adapter trimming, sequence alignment (BWA), barcode correction, cell finding and variant calling (using GATK 4.1.7 haplotype caller). Generated loom and H5 files were then processed with Tapestri Insights v2.2 (Mission Bio) and/or the Python-based Mosaic package (GitHub). Tapestri Insights analysis used default filter criteria (for example, genotype quality\u0026thinsp;\u0026ge;\u0026thinsp;30 and reads per cell per target\u0026thinsp;\u0026ge;\u0026thinsp;10) or whitelisting of known variants and annotation-based information (including ClinVar and DANN). Only cells with complete genotype information for all variants (previously detected in bulk sample) were included for downstream processing.\u003c/p\u003e \u003cdiv id=\"Sec10\" class=\"Section2\"\u003e \u003ch2\u003eCell type identification for the proteogenomics platform\u003c/h2\u003e \u003cp\u003eA rigorous quality control was performed on antibody-derived tag (ADT) count profiles of the Tapestri single-cell proteogenomic data. We first filtered out cells with less than 200 or more than 50,000 total ADT counts, and then removed cells with less than 10 unique measured ADTs. After ADT-based filtering, one sample was excluded from the analysis as there were less than 500 cells remaining for the sample. ADT counts for each surface protein were then scaled using counts per 10\u0026nbsp;million with a pseudocount of +\u0026thinsp;1 and normalized using the centered-log ratio (CLR) transformation. 
Subsequently, following step II of the \u0026ldquo;dsb\u0026rdquo; (denoised and scaled by background) protocol, normalized ADT profiles of antibodies for isotype control were used to remove cell-to-cell technical noise.\u003csup\u003e\u003cspan citationid=\"CR56\" class=\"CitationRef\"\u003e56\u003c/span\u003e\u003c/sup\u003e As a result, we obtained a normalized and denoised ADT count matrix representing 42 different surface protein expression profiles of 36,557 single cells from 17 patients. This full matrix was used to generate UMAP plots following dimensionality reduction using PCA.\u003c/p\u003e \u003cp\u003eTo identify cell types for cells profiled by the Tapestri platform, we compared ADT count profiles of the Tapestri data with those of a reference CITE-seq dataset, i.e., the Azimuth reference.\u003csup\u003e\u003cspan citationid=\"CR57\" class=\"CitationRef\"\u003e57\u003c/span\u003e\u003c/sup\u003e Among the 42 cell-surface markers targeted by our Tapestri analysis, 36 were also targeted by the reference CITE-seq analysis. In the CITE-seq analysis, expression of 8 out of the 36 shared markers was assayed using two different antibodies targeting the same antigen (CD3, CD4, CD11b, CD38, CD44, CD45, CD56, and CD138). For these antigens, we used geometric means of ADT counts for both antibodies. CD5, CD7, CD10, CD33, CD62L, and FceRIa were the markers that were targeted by the Tapestri platform but not by the reference CITE-seq analysis. ADT count matrices of the 36 shared markers for the Tapestri and CITE-seq analyses were scaled, normalized, and denoised as described above (for denoising CITE-seq data, 3 IgG antibodies were selected as isotype controls). 
Subsequently, for each cell from the Tapestri data, the 20 nearest neighbor cells from the CITE-seq data were identified using Euclidean distance, and the cell type was called by majority vote over the identities of these neighbors.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec11\" class=\"Section2\"\u003e \u003ch2\u003eSingle-cell profiling of PBMCs (single-cell RNA-seq and 10X Genomics Multiome)\u003c/h2\u003e \u003cp\u003eFrozen PBMCs were thawed in a 37\u0026deg;C water bath for 3\u0026ndash;5 min until no ice was visible. Cells were washed twice with 1 mL PBS\u0026thinsp;+\u0026thinsp;0.04% BSA and pelleted (300\u0026times;g for 5 min at 4\u0026deg;C). Dead cells were removed according to the Demonstrated Protocol: Removal of Dead Cells from Single Cell Suspensions for Single Cell RNA Sequencing (10X Genomics). Using the MACS Dead Cell Removal Kit (Miltenyi Biotec), the pellet was resuspended in 100 \u0026micro;L Dead Cell Removal MicroBeads and incubated for 15 minutes at room temperature. After incubation, the cell suspension was diluted with 1X Binding Buffer and applied to an MS column. The dead cells were retained on the column, and the live cells passed through and were collected. After dead cell removal, the samples were washed twice with 1 mL PBS\u0026thinsp;+\u0026thinsp;0.04% BSA and the cell concentration was determined using a Cellometer K2 cell counter (Nexcelom Biosciences). The cells were then aliquoted for scRNA-seq and Multiome (see \u003cspan refid=\"Sec6\" class=\"InternalRef\"\u003eMethods\u003c/span\u003e section).\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec12\" class=\"Section2\"\u003e \u003ch2\u003eSingle-cell RNA-seq\u003c/h2\u003e \u003cp\u003eThe cells were first counted and measured for viability using either the Vi-Cell XR Cell Viability Analyzer (Beckman-Coulter, Brea, CA) or a basic hemocytometer and light microscope. 
The barcoded Gel Beads were thawed from \u0026minus;\u0026thinsp;80\u0026deg;C and the cDNA master mix was prepared according to the manufacturer\u0026rsquo;s instruction for Chromium Next GEM Single Cell 3\u0026rsquo; Library and Gel Bead Kit (10x Genomics, Pleasanton, CA). Based on the desired number of cells to be captured for each sample, a volume of live cells was mixed with the cDNA master mix. A per sample concentration of 400,000 cells per milliliter or better is required for the standard targeted cell recovery of approximately 4000 cells. The stock concentration requirements would not change for higher cell recovery numbers. The cell suspension and master mix, thawed Gel Beads and partitioning oil were added to a Chromium Single Cell G chip. The filled chip was loaded into the Chromium Controller, where each sample was processed and the individual cells within the sample were captured into uniquely labeled GEMs (Gel Beads-In-Emulsion). The GEMs were collected from the chip and taken to the bench for reverse transcription, GEM dissolution, and cDNA clean-up. The resulting cDNA contained a pool of uniquely barcoded molecules. A portion of the cleaned and measured pooled cDNA continued to library construction, where standard Illumina sequencing primers and a 10x Genomics unique i7 Sample index were added to each cDNA pool.\u003c/p\u003e \u003cp\u003eAll cDNA pools and resulting libraries were measured using Qubit High Sensitivity assays (Thermo Fisher Scientific, Waltham, MA) and Agilent Bioanalyzer High Sensitivity chips (Agilent, Santa Clara, CA).\u003c/p\u003e \u003cp\u003eLibraries were sequenced at between 40,000 and 50,000 fragment reads per cell following Illumina\u0026rsquo;s standard protocol using the Illumina cBot and HiSeq 3000/4000 PE Cluster Kit (Illumina, San Diego, CA). The flow cells were sequenced as 100 X 2 paired end reads on an Illumina HiSeq 4000 HD using HiSeq 3000 / 4000 sequencing kit and HCS v3.4.0.38 collection software. 
Base-calling was performed using Illumina\u0026rsquo;s RTA version 2.7.7.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec13\" class=\"Section2\"\u003e \u003ch2\u003eSingle-cell Multiome\u003c/h2\u003e \u003cp\u003eAfter approximately 4,000 cells were aliquoted for scRNA-seq, the remainder were used for single-cell Multiome ATAC\u0026thinsp;+\u0026thinsp;Gene Expression (10X Genomics). Nuclei were isolated according to the Demonstrated Protocol: Nuclei Isolation for Single Cell Multiome ATAC\u0026thinsp;+\u0026thinsp;Gene Expression (10x Genomics, CG000365 Rev A). Briefly, cells were added to a 2.0 mL low binding tube and centrifuged (300\u0026times;g for 5 min at 4\u0026deg;C) using a swinging bucket rotor. The supernatant was removed, and the cell pellet was resuspended in 100 \u0026micro;L of chilled 10x Genomics Lysis Buffer (10 mM Tris-HCl pH 7.4, 10 mM NaCl, 3 mM MgCl\u003csub\u003e2\u003c/sub\u003e, 0.1% Tween-20, 0.1% NP-40 Substitute, 0.01% digitonin, 1% BSA, 1 mM DTT, 1 U/\u0026micro;L RNase inhibitor 40 U/mL) by pipette-mixing 10 times. Cells were incubated on ice for 3 min, followed by dilution with 1 mL of chilled Wash Buffer (10 mM Tris-HCl pH 7.4, 10 mM NaCl, 3 mM MgCl\u003csub\u003e2\u003c/sub\u003e, 0.1% Tween-20, 1% BSA, 1 mM DTT, 1 U/mL RNase inhibitor 40 U/mL). Nuclei were then centrifuged (500\u0026times;g for 3 min at 4\u0026deg;C), and the supernatant was slowly removed. The nuclei were washed one additional time with 1 mL Wash Buffer. Nuclei were resuspended in chilled diluted nuclei buffer (1X Nuclei Buffer (10x Genomics), 1 mM DTT, 1 U/mL RNase inhibitor 40 U/mL); the concentration was determined using a Cellometer K2 cell counter (Nexcelom Biosciences) and the samples were adjusted to a concentration appropriate for our targeted nuclei recovery. 
The single-cell ATAC library construction and gene expression library construction were carried out as described in the Chromium Next GEM Single Cell Multiome ATAC\u0026thinsp;+\u0026thinsp;Gene Expression User Guide (CG000338 Rev A). ATAC and GEX libraries were sequenced separately on an HiSeq 4000 (Illumina) before demultiplexing, alignment to the reference genome, and post-alignment quality control.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec14\" class=\"Section2\"\u003e \u003ch2\u003eDNA Methylation\u003c/h2\u003e \u003cp\u003eGenomic DNA was isolated and checked for quality by standard protocols. 1 \u0026micro;g genomic DNA then underwent bisulfite treatment using the TrueMethyl oxBS Module (Tecan Genomics, M\u0026auml;nnedorf, Switzerland) according to the manufacturer\u0026rsquo;s specifications. Briefly, DNA was purified using magnetic beads, denatured, and underwent bisulfite conversion followed by desulfonation and purification. The TrueMethyl converted DNA samples were then eluted in 10 \u0026micro;L and then processed through the Illumina Infinium MethylationEPIC BeadChip array (Illumina, San Diego, CA) protocol. In brief, 7 \u0026micro;L of converted DNA was denatured with 1 \u0026micro;L of 0.4N sodium hydroxide prior to whole genome amplification on the MSA4 plate. All other steps were followed as per the manufacturer\u0026rsquo;s guidelines.\u003c/p\u003e \u003cp\u003eQuality control of Infinium MethylationEPIC BeadChips was performed via the Genome Studio Methylation Module (Illumina). Subset-quantile Within Array Normalization (SWAN) was performed on the Infinium MethylationEPIC BeadChip IDAT files via the R package \u0026ldquo;minfi\u0026rdquo;.\u003csup\u003e\u003cspan citationid=\"CR58\" class=\"CitationRef\"\u003e58\u003c/span\u003e, \u003cspan citationid=\"CR59\" class=\"CitationRef\"\u003e59\u003c/span\u003e\u003c/sup\u003e The resultant β-values are used to identify changes in DNA methylation (Δβ) between groups. 
Unless otherwise noted, an absolute change in methylation level of more than 10% (|Δβ| \u0026gt; 0.1) and a \u003cem\u003ep\u003c/em\u003e value of \u0026lt;\u0026thinsp;0.01 were considered significant.\u003c/p\u003e \u003cp\u003eCpG sites were annotated to chromatin states using Bedtools v2.27.1 and the genome annotations provided for PBMCs by the Roadmap Epigenomics project.\u003csup\u003e\u003cspan citationid=\"CR60\" class=\"CitationRef\"\u003e60\u003c/span\u003e\u003c/sup\u003e The functional annotation of differentially methylated CpGs located within gene bodies and promoters was generated using QIAGEN IPA (QIAGEN Inc., \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://digitalinsights.qiagen.com/IPA\u003c/span\u003e\u003cspan address=\"https://digitalinsights.qiagen.com/IPA\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e). Gene ontology analysis of the differentially methylated CpGs within non-coding regions was performed using GREAT.\u003csup\u003e\u003cspan citationid=\"CR61\" class=\"CitationRef\"\u003e61\u003c/span\u003e\u003c/sup\u003e\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec15\" class=\"Section2\"\u003e \u003ch2\u003eGenotyping of Targeted loci with single-cell Chromatin Accessibility (GoT-ChA)\u003c/h2\u003e \u003cp\u003eThe assay was performed following the published protocol (Myers et al., 2022) with some modifications. 10,000 nuclei were captured for each sample. The following primers were used to specifically amplify the genotyping fragment (DNMT3A R882): GoT-ChA R1N-F, 5\u0026rsquo;-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGGATGACTGGCACGCTCCAT-3\u0026rsquo;; GoT-ChA-R, 5\u0026rsquo;-CTAAGCAGGCGTCAGAGGAG-3\u0026rsquo;; GoT-ChA_nested-R, 5\u0026rsquo;-BiosG/CCTTGGCACCCGAGAATTCCATCCTGCTGTGTGGTTAGACG-3\u0026rsquo;. The underlined sequences are locus-specific. After index PCR, DNAs were digested with AatII and MscI to monitor the specificity of amplified DNAs. 
Final libraries were quantified using a Qubit dsDNA HS Assay Kit (Thermo Fisher Scientific, #Q32854) and were analyzed by the Fragment Analyzer (Advanced Analytical Technologies; AATI; Ankeny, IA) using the High Sensitivity NGS Fragment Analysis Kit (Cat. #DNF-486). The libraries were sequenced on a NovaSeq 6000 at the Mayo Clinic Genome Analysis Core. ATAC libraries were sequenced to a depth of 25,000 read pairs per nucleus and GoT-ChA libraries were sequenced to 5,000 read pairs per nucleus.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec16\" class=\"Section2\"\u003e \u003ch2\u003eOlink\u003c/h2\u003e \u003cp\u003eWe used the Olink Explore 1536 panel assay (Olink Proteomics [Uppsala, Sweden]), which uses proximity extension assay technology coupled to a readout methodology based on next generation sequencing (Illumina NovaSeq 6000, NextSeq 550, and NextSeq 2000; all manufactured by Illumina; appendix 1 p 2), to quantify protein targets, as previously described.\u003csup\u003e\u003cspan citationid=\"CR62\" class=\"CitationRef\"\u003e62\u003c/span\u003e\u003c/sup\u003e\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec17\" class=\"Section2\"\u003e \u003ch2\u003eStatistical analysis\u003c/h2\u003e \u003cp\u003eDistributions of continuous variables were compared using Mann-Whitney or Kruskal-Wallis tests, while nominal or categorical variables were compared using the Chi-Square or Fisher\u0026rsquo;s exact test. Time to event analyses used the method of Kaplan-Meier for univariate comparisons using the log-rank test. OS was calculated from the date of diagnosis to date of death or last follow-up, while AML-free survival (LFS) was calculated from date of diagnosis to date of AML transformation or death. Poisson regression models were fit to compare the numbers of different cell types across groups. 
These Poisson models incorporated an offset term for the total number of cells from a given patient, so that cell type counts could be compared on a consistent scale despite varying total cell counts (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\text{ln}\\left(Y\\right)=\\text{ln}\\left(N\\right)+{\\beta }_{0}+{\\beta }_{1}{X}_{1}+ϵ\\)\u003c/span\u003e\u003c/span\u003e).\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec18\" class=\"Section2\"\u003e \u003ch2\u003eSingle-cell data analysis\u003c/h2\u003e \u003cp\u003eSingle-cell RNA-seq: Sequenced reads from the droplet libraries were processed using 10x Genomics Cell Ranger v6.0.1.\u003csup\u003e\u003cspan citationid=\"CR63\" class=\"CitationRef\"\u003e63\u003c/span\u003e\u003c/sup\u003e The reads were aligned to the pre-built human reference transcriptome GRCh38 - v2020-A (July 7, 2020) provided by 10X Genomics. Read trimming, alignment, UMI counting, and cell calling were performed by Cell Ranger. Doublet prediction on scRNA-seq data was done using Scrublet v0.2.1 with default parameters.\u003csup\u003e\u003cspan citationid=\"CR64\" class=\"CitationRef\"\u003e64\u003c/span\u003e\u003c/sup\u003e Downstream processing was done using Seurat v4.0.4.\u003csup\u003e\u003cspan citationid=\"CR65\" class=\"CitationRef\"\u003e65\u003c/span\u003e\u003c/sup\u003e Count matrices from all samples were combined and batch-corrected using the Seurat v4 integration method. The count matrix from each sample was log-normalized, scaled to mean 0 and variance 1, and reduced in dimensionality using PCA on the top 2000 variable genes across all samples. The reciprocal PCA and reference-based integration options were applied in the anchor finding step due to the large data size. 
Four patient samples, one from each sex and each condition (COVID-19\u003csup\u003e+\u003c/sup\u003e / CH\u003csup\u003e\u0026minus;\u003c/sup\u003e and COVID-19\u003csup\u003e+\u003c/sup\u003e / CH\u003csup\u003e+\u003c/sup\u003e), were chosen as references, and PCs 1\u0026ndash;50 were used for the reference-based integration. The integrated data included a total of 85,019 cells from 24 patient samples. Cells with more than 50% of reads mapped to mitochondrial genes, those with less than 200 unique genes detected and those that were predicted as doublets by Scrublet were removed. After QC filtering the number of cells was reduced to 78,083. Genes that were not expressed in at least 3 cells were also removed from downstream analysis. Uniform manifold approximation and projection (UMAP) was made using the top 50 PCs obtained by running PCA on the integrated (batch-corrected) gene expression matrix. Cell type identification was done with SingleR v1.10.0 using immune data from celldex v1.6.0 as reference.\u003csup\u003e\u003cspan citationid=\"CR66\" class=\"CitationRef\"\u003e66\u003c/span\u003e, \u003cspan citationid=\"CR67\" class=\"CitationRef\"\u003e67\u003c/span\u003e\u003c/sup\u003e\u003c/p\u003e \u003cp\u003eSingle-cell Multiome: Sequenced reads from the gene expression (GEX) and DNA accessibility (ATAC) droplet libraries of the Multiome assay were processed using 10x Genomics Cell Ranger ARC v2.0.0. The reads were aligned to the pre-built human reference genome GRCh38 - v2020-A-2.0.0 (May 3, 2021) provided by 10X Genomics. Read trimming, alignment, duplicate marking (ATAC), UMI counting (GEX), peak calling (ATAC) and joint cell calling were performed by Cell Ranger. Downstream processing was done using Seurat v4.0.4 and Signac v1.4.0.\u003csup\u003e\u003cspan citationid=\"CR68\" class=\"CitationRef\"\u003e68\u003c/span\u003e\u003c/sup\u003e GEX and ATAC count matrices were integrated (batch-corrected) independently using Seurat. 
Sample GEX count matrices were integrated following the same steps as used for the scRNA-seq data. Default options (CCA and pairwise anchor-finding) were used in the integration anchor finding step, since the data size was smaller than our scRNA-seq data. To merge the ATAC data from all samples, a common peak set was created by merging peaks from all samples using the reduce function from the R package GenomicRanges. Peaks that were smaller than 20 base pairs or larger than 10000 base pairs after merging were removed. The count matrix for each sample with the new common peak set was then recalculated using Signac. The count matrices were normalized using Term Frequency - Inverse Document Frequency (TF-IDF) and dimensionality reduced using singular value decomposition (SVD) using only peaks with non-zero counts in at least 20 cells, these two steps together known as latent semantic indexing (LSI) generating LSI components. The samples were then integrated using Seurat V4 integration with the reciprocal LSI (rLSI) method used on LSI components 2 to 50 (since the first LSI component correlates with sequencing depth) in the pairwise anchor finding step. The UMAP was calculated using the integrated LSI components 2 to 50. Seurat\u0026rsquo;s weighted nearest neighbor (WNN) algorithm was used on principal components 1 to 50 (GEX) and integrated LSI components 2 to 50 (ATAC) together to obtain a combined UMAP projection of both GEX and ATAC counterparts of the complete scMultiome dataset. The integrated data included 36,343 cells from 11 patient samples. Cells with more than 50% of reads mapped to mitochondrial genes, those with less than 200 unique genes detected (GEX), those with less than 200 unique peaks detected (ATAC) and those with transcription start site (TSS) enrichment score (as calculated by Signac) less than 1 were removed. After QC filtering the number of cells was reduced to 25,725. 
Cell type identification was done by using Azimuth algorithm to map the scMultiome GEX data to the scRNA-seq data since they were generated from the same cohort. The labels were then transferred from the scRNA-seq data to the single-cell Multiome data.\u003c/p\u003e \u003cp\u003eDifferential gene expression analysis: Differential gene expression testing was done on the log-normalized counts using Seurat\u0026rsquo;s FindMarkers function with default parameters unless specified otherwise. The statistical test applied was Wilcoxon Rank Sum test with p-values adjusted using Bonferroni correction based on the total number of genes in the dataset. Statistically significant differentially expressed genes were selected by keeping only genes that fall below the adjusted p-value threshold of 0.05. Differential gene expression testing comparing any two conditions (e.g., COVID-19\u003csup\u003e+\u003c/sup\u003e / \u003cem\u003eDNMT3A\u003c/em\u003e\u003csup\u003eMT\u003c/sup\u003e versus COVID-19\u003csup\u003e+\u003c/sup\u003e / \u003cem\u003eTET2\u003c/em\u003e\u003csup\u003eMT\u003c/sup\u003e) was always done for each cell type independently (although shown together in the volcano plots for efficient visualization), unless specified otherwise. To make sure that the results were not driven by a single patient sample we applied a leave-one-out approach on all tests where we removed cells from one sample at a time redoing the tests and keeping only the genes that passed the adjusted p-value threshold in all tests. Pathway analysis was done using the Ingenuity Pathway Analysis platform on differentially expressed genes.\u003c/p\u003e \u003cp\u003eDifferential DNA accessibility and motif analysis: The differentially accessible peaks were identified by comparing the TF-IDF normalized cut-site counts of any two pair of cell populations using Seurat\u0026rsquo;s FindMarkers function. 
Here, we used the logistic regression framework with the number of fragments in peaks as a latent variable. Motif enrichments in peaks were estimated using ChromVAR enrichment scores on the JASPAR2020 motif matrix set.\u003csup\u003e\u003cspan citationid=\"CR69\" class=\"CitationRef\"\u003e69\u003c/span\u003e, \u003cspan citationid=\"CR70\" class=\"CitationRef\"\u003e70\u003c/span\u003e\u003c/sup\u003e Enrichment of cut-sites in monocyte- and macrophage-specific ChIP-seq peaks of select DNA binding proteins was also estimated using ChromVAR by providing ChIP-seq peaks from ReMap2022 as input.\u003csup\u003e\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e\u003c/sup\u003e Co-accessibility scores between pairs of peaks were calculated using Cicero v1.3.5.\u003csup\u003e\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e\u003c/sup\u003e\u003c/p\u003e \u003cp\u003eGoT-ChA\u003csup\u003e\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e\u003c/sup\u003e: Sequencing reads from the GoT-ChA experiments on two samples (DNMT3A mutants with the R882H mutation) were examined for sufficient sequencing depth. The GoT-ChA experiments yielded 168\u0026nbsp;million and 151\u0026nbsp;million reads for samples 1 and 2, respectively. For each sample, we generated four FASTQ files (*_I1_001.fastq.gz, *_R1_001.fastq.gz, *_R2_001.fastq.gz, and *_R3_001.fastq.gz). To precisely locate the cell barcode and mutation site, we randomly selected 5M reads from each FASTQ file and used the \"sc_seqLogo.py\" script from our RSeQC package to generate sequence logos.\u003csup\u003e\u003cspan citationid=\"CR71\" class=\"CitationRef\"\u003e71\u003c/span\u003e\u003c/sup\u003e Our analysis revealed that the \"I1\" FASTQ file contains an 8-nt sample barcode (\u003cb\u003eSupp. 
Figure\u0026nbsp;6B\u003c/b\u003e), while the \"R1\" FASTQ file contains the 51-nt targeted DNA sequences, with the last three nucleotides representing the genotype of the R882 codon (\u003cb\u003eSupp. Figure\u0026nbsp;6C\u003c/b\u003e). Notably, the codon is CGC (Arginine) for the wildtype and CAC (Histidine) for the mutant, given that the DNMT3A gene is located on the reverse strand. We confirmed that reads from the \"R1\" file could uniquely map to the targeted DNMT3A locus (chr2:25234324\u0026ndash;25234374) on the human reference genome GRCh38/hg38. Furthermore, the \"R2\" FASTQ file contains the 16-nt cell barcode (\u003cb\u003eSupp. Figure\u0026nbsp;6D\u003c/b\u003e), while the sequences from the \"R3\" FASTQ files are unknown (\u003cb\u003eSupp. Figure\u0026nbsp;6E\u003c/b\u003e). We then combined the \"R1\" and \"R2\" FASTQ files into a single FASTA file, using the cell barcodes as the DNA sequence identifiers, thereby explicitly linking the cell barcode to the DNA sequence (\u003cb\u003eSupp. Figure\u0026nbsp;7\u003c/b\u003e). Any DNA sequence with a Phred-scaled quality score\u0026thinsp;\u0026lt;\u0026thinsp;30 at the mutation site was discarded. This preprocessing step not only significantly reduces the file size but also enhances computation speed. For read-level genotyping we used the following method. We used an IUPAC (International Union of Pure and Applied Chemistry) string to represent both wildtype and mutant genotypes. In this study, \"C\" and \"T\" are denoted as \"Y\" (\u003cb\u003eSupp. Figure\u0026nbsp;6C\u003c/b\u003e). Then, we employed the motility C++ library (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://github.com/ctb/motility\u003c/span\u003e\u003cspan address=\"https://github.com/ctb/motility\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e) to match this IUPAC string to each read. 
Using an IUPAC string instead of a PWM (Position Weight Matrix) can dramatically enhance computation speed, since the differences between wildtype and mutant reads are minimal (only 1 nucleotide change). We found that allowing for 1 mismatch rescued an additional 10 to 12\u0026nbsp;million reads (~\u0026thinsp;7%) in each sample, compared with exact matching. Consequently, 77.5% and 75.6% of reads were successfully genotyped for the two samples, respectively. To distinguish real cells from background noise, we employed a methodology similar to Cell Ranger's approach in single-cell RNA sequencing analysis. After segregating reads based on cell barcode, we conducted nonparametric kernel density estimation using the \"gaussian\" kernel function (\u003cb\u003eSupp. Figure\u0026nbsp;8A, B\u003c/b\u003e). The cell barcode density plot revealed three distinct modes likely corresponding to \"cell\", \"cell-free DNA\", and \"empty droplets\", respectively. To determine the cutoff point for cell calling, we identified the local minimum (i.e., the changing point where the curve bends) from the density plot. For instance, in sample 1, the local minimum for the cell mode is 3.0712, corresponding to 10^3.0712\u0026thinsp;=\u0026thinsp;1178 reads. This means that a cell barcode with fewer than 1178 reads was categorized as background (\u003cb\u003eSupp. Figure\u0026nbsp;8A\u003c/b\u003e). A comparison with the knee plot (or barcode rank plot) demonstrates that the detected threshold aligns well with the elbow point (\u003cb\u003eSupp. Figure\u0026nbsp;8A, B\u003c/b\u003e). As a result, we identified 15185 and 5679 valid cells from samples 1 and 2, respectively. For cell-level genotyping we used the following method. We first calculated the mutant allele fraction (MAF) for each cell using the formula described in the code deposited on GitHub. 
Cells with MAF\u0026thinsp;\u0026le;\u0026thinsp;0.2 were categorized as homozygous wildtype (WT/WT), cells with MAF\u0026thinsp;\u0026ge;\u0026thinsp;0.8 were classified as homozygous mutants (Mut/Mut), likely due to Loss of Heterozygosity (LOH), and the remaining cells were designated as heterozygotes (WT/Mut). 40% and 38% of cells in Samples 1 and 2, respectively, were classified as WT/WT (\u003cb\u003eSupp. Figure\u0026nbsp;8C\u003c/b\u003e). The scATAC-seq data from the two samples were analyzed using the cellranger-atac workflow (version 2.1.0), using the reference file (refdata-cellranger-arc-GRCh38-2020-A-2.0.0) downloaded from \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://support.10xgenomics.com/single-cell-atac/software/downloads/latest\u003c/span\u003e\u003cspan address=\"https://support.10xgenomics.com/single-cell-atac/software/downloads/latest\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. 8858 and 6255 high-quality cells were identified in Samples 1 and 2, respectively.\u003c/p\u003e \u003c/div\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cem\u003eEthics approval and consent to participate\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eSamples were collected after informed consent and approval by Mayo Clinic\u0026apos;s Institutional Review Board (IRB #20-005400, IRB #16-004173).\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eConsent for publication\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eAvailability of data and materials\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eThe bulk sequencing, scDNA-seq + proteogenomics, DNA Methylation, scRNA-seq, Multiome (GEX + ATAC) and GoTChA datasets generated and analyzed in this study have been deposited into the NCBI Gene Expression Omnibus (GEO) data base (https://www.ncbi.nlm.nih.gov/geo/) with accession number GSE210435 (reviewers access token: qfcxaoecnjijxct). 
Olink data have been provided as Supplementary Data. Code and scripts used for analysis are made available in the GitHub repository https://github.com/LabFunEpi/CC_multiomics.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eCompeting interests\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eM.M.P. has received research funding from Kura Oncology, Epigenetix, Solutherapeutics, Polaris and StemLine Pharmaceuticals.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eFunding\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eThis work was supported by the Mayo Clinic Center for Individualized Medicine, by the Mayo Clinic\u0026nbsp;Center for Biomedical Discovery to W.M.I. and A.G.M. and the DOD Ovarian Cancer Research Program (W81XWH2110475 to A.G.M.). M.M.P. would also like to acknowledge the NCI for R01 grant\u0026nbsp;R01CA272496.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eAuthor contributions\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eM.P., A.G.M., and N.C. conceived and designed the study with the help of M.B., T.L., W.M.I., J.F., and J.J.H. J.F., T.L., A.Mazzone, C.M.F., K.H.K., V.A.S., F.R.R., A.Munankarmy, S.K.B., M.R.S. and J-H.L. performed experiments. W.M.I., T.L., M.B., J.F., M.K., S.M.G., A.A.M., S.M.S. and L.W. analyzed the data. M.P., A.G.M. and M.B. wrote the manuscript with the help of W.M.I., J.F., K.R., N.C., A.P., and E.W. All authors critically revised and approved the final version of the manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eAcknowledgements\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eThe authors thank the Genome Analysis Core and the Biospecimens Accessioning and Processing (Mayo Clinic) for technical support, and Alessandro Gardini for help with access to their published datasets.\u0026nbsp;\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eYang L, Rau R, Goodell MA. DNMT3A in haematological malignancies. Nat Rev Cancer. 
2015;15:152\u0026ndash;65.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eJaiswal S, Ebert BL. Clonal hematopoiesis in human aging and disease. Science 366, (2019).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eJaiswal S, et al. Age-related clonal hematopoiesis associated with adverse outcomes. N Engl J Med. 2014;371:2488\u0026ndash;98.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eJaiswal S, et al. Clonal Hematopoiesis and Risk of Atherosclerotic Cardiovascular Disease. N Engl J Med. 2017;377:111\u0026ndash;21.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBuscarlet M, et al. Lineage restriction analyses in CHIP indicate myeloid bias for TET2 and multipotent stem cell origin for DNMT3A. Blood. 2018;132:277\u0026ndash;80.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGoyal P, et al. Clinical Characteristics of Covid-19 in New York City. N Engl J Med. 2020;382:2372\u0026ndash;4.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMoore JB, June CH. Cytokine release syndrome in severe COVID-19. Science. 2020;368:473\u0026ndash;4.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eOnder G, Rezza G, Brusaferro S. Case-Fatality Rate and Characteristics of Patients Dying in Relation to COVID-19 in Italy. JAMA. 2020;323:1775\u0026ndash;6.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSah P et al. Asymptomatic SARS-CoV-2 infection: A systematic review and meta-analysis. Proc Natl Acad Sci U S A 118, (2021).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBolton KL, et al. Clonal hematopoiesis is associated with risk of severe Covid-19. Nat Commun. 2021;12:5975.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhou Y, et al. Clonal hematopoiesis is not significantly associated with COVID-19 disease severity. Blood. 2022;140:1650\u0026ndash;5.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDuployez N et al. 
Clinico-Biological Features and Clonal Hematopoiesis in Patients with Severe COVID-19. Cancers (Basel) 12, (2020).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHameister E, et al. Clonal Hematopoiesis in Hospitalized Elderly Patients With COVID-19. Hemasphere. 2020;4:e453.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZekavat SM, et al. Hematopoietic mosaic chromosomal alterations increase the risk for diverse types of infection. Nat Med. 2021;27:1012\u0026ndash;24.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eNetea MG, et al. IL-32 synergizes with nucleotide oligomerization domain (NOD) 1 and NOD2 ligands for IL-1beta and IL-6 production through a caspase 1-dependent mechanism. Proc Natl Acad Sci U S A. 2005;102:16309\u0026ndash;14.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePatnaik MM, et al. DNMT3A mutations are associated with inferior overall and leukemia-free survival in chronic myelomonocytic leukemia. Am J Hematol. 2017;92:56\u0026ndash;61.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eNational Cancer Institute. Common Terminology Criteria for Adverse Events (CTCAE) v5.0. (2017).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWorld Health Organization. Clinical management of COVID-19: Living guideline. (2020).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTulstrup M, et al. TET2 mutations are associated with hypermethylation at key regulatory enhancers in normal and malignant hematopoiesis. Nat Commun. 2021;12:6061.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eEncode Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012;489:57\u0026ndash;74.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eConsortium EP. An integrated encyclopedia of DNA elements in the human genome. Nature. 
2012;489:57\u0026ndash;74.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eColtro G et al. Clinical, molecular, and prognostic correlates of number, type, and functional localization of TET2 mutations in chronic myelomonocytic leukemia (CMML)-a study of 1084 patients. Leukemia, (2019).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMiles LA, et al. Single-cell mutation analysis of clonal evolution in myeloid malignancies. Nature. 2020;587:477\u0026ndash;82.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eStephenson E, et al. Single-cell multi-omics analysis of the immune response in COVID-19. Nat Med. 2021;27:904\u0026ndash;16.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSilvin A, et al. Elevated Calprotectin and Abnormal Myeloid Cell Subsets Discriminate Severe from Mild COVID-19. Cell. 2020;182:1401\u0026ndash;1418.e18.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSandoval L et al. Characterization and Optimization of Multiomic Single-Cell Epigenomic Profiling. Genes (Basel) 14, (2023).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePliner HA, et al. Cicero Predicts cis-Regulatory DNA Interactions from Single-Cell Chromatin Accessibility Data. Mol Cell. 2018;71:858\u0026ndash;871.e8.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHammal F, de Langen P, Bergon A, Lopez F, Ballester B. ReMap 2022: a database of Human, Mouse, Drosophila and Arabidopsis regulatory regions from an integrative analysis of DNA-binding sequencing experiments. Nucleic Acids Res. 2022;50:D316\u0026ndash;25.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMyers RM, Izzo F, Prieto T, Mimitou E, Raviram R, Chaligne R, Hoffman R, Stahl O, Marcellino B, Smibert P, Landau D. High Throughput Single-Cell Simultaneous Genotyping and Chromatin Accessibility Reveals Genotype to Phenotype Relationship in Human Myeloproliferation. 
\u003cem\u003eBlood\u003c/em\u003e 138, 678 (2021).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMitchell E, et al. Clonal dynamics of haematopoiesis across the human lifespan. Nature. 2022;606:343\u0026ndash;50.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKusne Y, Xie Z, Patnaik MM. Clonal hematopoiesis: Molecular and clinical implications. Leuk Res. 2022;113:106787.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSano S, Oshima K, Wang Y, Katanasaka Y, Sano M, Walsh K. CRISPR-Mediated Gene Editing to Assess the Roles of Tet2 and Dnmt3a in Clonal Hematopoiesis and Cardiovascular Disease. Circ Res. 2018;123:335\u0026ndash;41.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHuang YH, et al. Systematic Profiling of DNMT3A Variants Reveals Protein Instability Mediated by the DCAF8 E3 Ubiquitin Ligase Adaptor. Cancer Discov. 2022;12:220\u0026ndash;35.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eNam AS, et al. Single-cell multi-omics of human clonal hematopoiesis reveals that DNMT3A R882 mutations perturb early progenitor states through selective hypomethylation. Nat Genet. 2022;54:1514\u0026ndash;26.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYamazaki J, et al. Effects of TET2 mutations on DNA methylation in chronic myelomonocytic leukemia. Epigenetics: official J DNA Methylation Soc. 2012;7:201\u0026ndash;7.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhang Q, et al. Tet2 is required to resolve inflammation by recruiting Hdac2 to specifically repress IL-6. Nature. 2015;525:389\u0026ndash;93.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLasho T, et al. Single cell proteogenomic analysis of aberrant monocytosis in TET2 mutant premalignant and malignant hematopoiesis. Leukemia. 2023;37:1384\u0026ndash;7.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMelenotte C, et al. Immune responses during COVID-19 infection. 
Oncoimmunology. 2020;9:1807836.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDeshmane SL, Kremlev S, Amini S, Sawaya BE. Monocyte chemoattractant protein-1 (MCP-1): an overview. J Interferon Cytokine Res. 2009;29:313\u0026ndash;26.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChen Y, et al. IP-10 and MCP-1 as biomarkers associated with disease severity of COVID-19. Mol Med. 2020;26:97.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSelimoglu-Buet D, et al. A miR-150/TET3 pathway regulates the generation of mouse and human non-classical monocyte subset. Nat Commun. 2018;9:5455.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSelimoglu-Buet D, et al. Accumulation of classical monocytes defines a subgroup of MDS that frequently evolves into CMML. Blood. 2017;130:832\u0026ndash;5.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePatnaik MM, Tefferi A. Chronic myelomonocytic leukemia: 2024 update on diagnosis, risk stratification and management. Am J Hematol, (2024).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKim SH, Han SY, Azam T, Yoon DY, Dinarello CA. Interleukin-32: a cytokine and inducer of TNFalpha. Immunity. 2005;22:131\u0026ndash;42.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eArber DA, et al. The 2016 revision to the World Health Organization classification of myeloid neoplasms and acute leukemia. Blood. 2016;127:2391\u0026ndash;405.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMcKenna A, et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20:1297\u0026ndash;303.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLarson DE, et al. SomaticSniper: identification of somatic point mutations in whole genome sequencing data. Bioinformatics. 2012;28:311\u0026ndash;7.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCingolani P, et al. 
A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin). 2012;6:80\u0026ndash;92.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFiume M, Williams V, Brook A, Brudno M. Savant: genome browser for high-throughput sequencing data. Bioinformatics. 2010;26:1938\u0026ndash;44.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLandrum MJ, et al. ClinVar: public archive of relationships among sequence variation and human phenotype. Nucleic Acids Res. 2014;42:D980\u0026ndash;985.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLiu X, Jian X, Boerwinkle E. dbNSFP: a lightweight database of human nonsynonymous SNPs and their functional predictions. Hum Mutat. 2011;32:894\u0026ndash;9.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eStenson PD, et al. Human Gene Mutation Database (HGMD): 2003 update. Hum Mutat. 2003;21:577\u0026ndash;81.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTryggvadottir L, et al. Prostate cancer progression and survival in BRCA2 mutation carriers. J Natl Cancer Inst. 2007;99:929\u0026ndash;35.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRobinson D, et al. Integrative clinical genomics of advanced prostate cancer. Cell. 2015;161:1215\u0026ndash;28.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRobinson JT, et al. Integrative genomics viewer. Nat Biotechnol. 2011;29:24\u0026ndash;6.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMule MP, Martins AJ, Tsang JS. Normalizing and denoising protein expression data from droplet-based single cell profiling. Nat Commun. 2022;13:2099.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eStuart T, et al. Comprehensive Integration of Single-Cell Data. Cell. 
2019;177:1888\u0026ndash;1902.e21.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAryee MJ, et al. Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays. Bioinformatics. 2014;30:1363\u0026ndash;9.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMaksimovic J, Gordon L, Oshlack A. SWAN: Subset-quantile within array normalization for illumina infinium HumanMethylation450 BeadChips. Genome Biol. 2012;13:R44.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBernstein BE, et al. The NIH Roadmap Epigenomics Mapping Consortium. Nat Biotechnol. 2010;28:1045\u0026ndash;8.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMcLean CY, et al. GREAT improves functional interpretation of cis-regulatory regions. Nat Biotechnol. 2010;28:495\u0026ndash;501.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGarapati K, et al. Multiomics single timepoint measurements to predict severe COVID-19 - Authors' reply. Lancet Digit Health. 2023;5:e57.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZheng GX, et al. Massively parallel digital transcriptional profiling of single cells. Nat Commun. 2017;8:14049.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWolock SL, Lopez R, Klein AM. Scrublet: Computational Identification of Cell Doublets in Single-Cell Transcriptomic Data. Cell Syst. 2019;8:281\u0026ndash;291.e9.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHao Y, et al. Integrated analysis of multimodal single-cell data. Cell. 2021;184:3573\u0026ndash;3587.e29.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAran D, et al. Reference-based analysis of lung single-cell sequencing reveals a transitional profibrotic macrophage. Nat Immunol. 2019;20:163\u0026ndash;72.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMonaco G, et al. 
RNA-Seq Signatures Normalized by mRNA Abundance Allow Absolute Deconvolution of Human Immune Cell Types. Cell Rep. 2019;26:1627\u0026ndash;1640.e7.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eStuart T, Srivastava A, Madad S, Lareau CA, Satija R. Single-cell chromatin state analysis with Signac. Nat Methods. 2021;18:1333\u0026ndash;41.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCastro-Mondragon JA et al. \u003cem\u003eJASPAR 2022: the 9th release of the open-access database of transcription factor binding profiles. Nucleic Acids Res 50, D165-D173 (2022).\u003c/em\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSchep AN, Wu B, Buenrostro JD, Greenleaf WJ. chromVAR: inferring transcription-factor-associated accessibility from single-cell epigenomic data. Nat Methods. 2017;14:975\u0026ndash;8.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWang L, Wang S, Li W. RSeQC: quality control of RNA-seq experiments. Bioinformatics. 
2012;28:2184\u0026ndash;5.\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"","lastPublishedDoi":"10.21203/rs.3.rs-4481664/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-4481664/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003ch2\u003eBackground\u003c/h2\u003e \u003cp\u003e \u003cem\u003eDNMT3A\u003c/em\u003e and \u003cem\u003eTET2\u003c/em\u003e are epigenetic regulator genes commonly mutated in age-related clonal hematopoiesis (CH). Despite having opposing epigenetic functions, these mutations are associated with increased all-cause mortality and a low risk for progression to hematological neoplasms. While individual impacts on the epigenome have been described using different model systems, the phenotypic complexity in humans remains to be elucidated.\u003c/p\u003e\u003ch2\u003eResults\u003c/h2\u003e \u003cp\u003eHere we make use of a natural inflammatory response occurring during coronavirus disease 2019 (COVID-19), to understand the association of these mutations with inflammatory morbidity and mortality. 
We demonstrate the age-independent, negative impact of \u003cem\u003eDNMT3A\u003c/em\u003e mutant CH on COVID-19-related cytokine release severity and mortality. Using single cell proteogenomics we show that \u003cem\u003eDNMT3A\u003c/em\u003e mutations involve myeloid and lymphoid cells. Using single cell multiomics sequencing, we identify cell-specific gene expression changes associated with \u003cem\u003eDNMT3A\u003c/em\u003e mutations, along with significant epigenomic deregulation affecting enhancer accessibility, resulting in overexpression of IL32, a proinflammatory cytokine that can result in inflammasome activation in monocytes and macrophages. Finally, we show with single cell resolution that the loss of function of DNMT3A is directly associated with increased chromatin accessibility in mutant cells.\u003c/p\u003e\u003ch2\u003eConclusions\u003c/h2\u003e \u003cp\u003eWe demonstrate the negative prognostic impact of \u003cem\u003eDNMT3A\u003c/em\u003emt CH on COVID-19 related inflammatory morbidity and mortality. \u003cem\u003eDNMT3A\u003c/em\u003emt CH involves myeloid and lymphoid cells and in the context of COVID-19, was associated with inflammatory transcriptional priming, resulting in overexpression of IL32. This overexpression was secondary to increased chromatin accessibility, specific to \u003cem\u003eDNMT3A\u003c/em\u003emt CH cells. 
\u003cem\u003eDNMT3Amt\u003c/em\u003e CH can serve as a potential biomarker for adverse inflammatory outcomes.\u003c/p\u003e","manuscriptTitle":"Single cell multiomic analyses reveal divergent effects of DNMT3A and TET2 mutant clonal hematopoiesis in inflammatory response","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2024-06-13 02:02:04","doi":"10.21203/rs.3.rs-4481664/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"41235a17-9979-4556-99de-8e3ead8b26fa","owner":[],"postedDate":"June 13th, 2024","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[],"tags":[],"updatedAt":"2024-06-19T03:44:13+00:00","versionOfRecord":[],"versionCreatedAt":"2024-06-13 02:02:04","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-4481664","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-4481664","identity":"rs-4481664","version":["v1"]},"buildId":"qtupq5eGEP_6zYnWcrvyt","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}

