DoseH-seq: A single-cell multiome platform to decode gene-dosage logic driving developmental reversion and cell fate reprogramming

Method Article

Ying Yang, Ralph Patrick, Xiaoli Chen, Stacey Anderson, Jingyu Zhang, and 20 more

Research Square. This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-8400524/v1 This work is licensed under a CC BY 4.0 License. Status: Posted. Version 1 posted.

Abstract

Cell identity is governed by graded transcription factor (TF) activity, yet current single-cell tools do not resolve how TF dosage shapes both gene expression and chromatin accessibility. We present DoseH-seq, a dosage-resolved single-nucleus RNA+ATAC multiome assay built on standard 10x Genomics workflows.
DoseH-seq integrates sample hashing with quantitative tracking of continuous lentiviral overexpression and knockdown across multiplexed conditions and time points. We validate DoseH-seq in both single and multiperturb designs, including dual overexpression and overexpression/knockdown experiments, across different cell types and conditions. We apply DoseH-seq to resolve the dose- and context-dependent roles of NFIX, a somatic TF enriched at regulatory elements more active in youthful cells, in fibroblasts, pluripotent stem cells (PSCs) and during reprogramming. In fibroblasts, increased NFIX opens regulatory elements whose motifs compete with myofibroblast identity TFs, consistent with counteracting mesenchymal drift and a restricted developmental reversion. During reprogramming, high NFIX overexpression activates AP-1 and stabilizes the somatic state. Conversely, transitory moderate-level NFIX overexpression, when Yamanaka factors are limiting, synergistically opens chromatin transiently to dismantle the somatic network, with potential analogous roles in oncogenic identity remodelling. During NFIX-induced PSC differentiation, transient reprogramming elements bearing NFI and degenerate pluripotency TF motifs (including OCT4) are re-engaged, consistent with developmental roles, mechanistically linking reprogramming with differentiation. Our data reveal graded dosage effects in somatic and pluripotency TF interactions, highlighting DoseH-seq as a generalisable perturbation-multiomics platform for resolving gene-dosage interactions governing cell identity and cell-state transitions.

INTRODUCTION

Single-cell technologies have expanded the ability to unravel cell states and cell fate transitions at unprecedented levels.
In particular, the dual profiling of gene expression and chromatin accessibility with single-nucleus (sn)RNA + ATAC-seq has enabled the probing of TF networks underlying cell fate decisions 1 – 3 . To understand the mechanistic role of TFs in regulating cell states, snRNA + ATAC-seq perturbation experiments that either overexpress the TF, or knock it down or out, are required. There has been great progress in developing perturb-seq methods for single-cell (sc)RNA-seq 4 – 6 , including with dosage-sensitivity 6 , 7 ; however, these methods lack the multiomic capacity of snRNA + ATAC-seq assays. Recently, Multiome Perturb-seq 8 and MultiPerturb-seq 9 were developed for profiling perturbation effects using single-cell multiomics. While these methods can capture the binary phenotype of a TF knockdown, TF activity is typically graded, and often sensitive to dosage through graded changes in expression levels 10 , 11 . Dose-dependent TF activity has been linked to context-specific outcomes in cell fate reprogramming 6 and may underlie paradoxical TF roles in cancer 12 , 13 , highlighting the imperative to develop technologies for interrogating graded shifts in TF networks. To resolve these dose-response mechanisms at single-cell, multiomic resolution, we developed Dosage and Hashtag sequencing (DoseH-seq), an expansion of the 10x Genomics snRNA + ATAC-seq platform. To our knowledge, DoseH-seq is the first single-cell multiome (and chromatin level) method that enables sensitive detection of lentiviral perturbations to study dosage effects. DoseH-seq tracks continuous TF overexpression or knockdown (including multi-perturb designs), driven by promoters heterogeneously expressed in different cells, for a targeted panel of TFs at high resolution. This contrasts DoseH-seq to current multiomic perturb-seq methods 8 , 9 , which assay discrete/binary CRISPRi knockdowns across many genes. 
As a further advance over existing approaches, DoseH-seq integrates sample hashing to enable multiplexed assaying of multiple conditions or time points in a single run. We applied DoseH-seq to interrogate the biological impact of somatic NFI transcription factors, which we previously identified as underpinning the activity of youthful gene regulatory elements, silenced during aging 14 . Here, we leverage DoseH-seq to resolve the dose-dependent impacts of overexpressing NFI factor Nfix in fibroblasts, pluripotent stem cells (PSCs) and during induced PSC (iPSC) reprogramming. We track Nfix activity both by itself, and in dual overexpression contexts ( Nfix plus Oct4 ) as well as combinatorial overexpression plus knockdown ( Nfix plus c- Jun shRNA) experiments. DoseH-seq revealed that Nfix overexpression in fibroblasts induces a dose-dependent restricted developmental reversion, resulting in loss of myofibroblast identity. This mirrors the effects of transient epigenetic reprogramming with Oct4 , Sox2 , Klf4 and c- Myc shown to reverse mesenchymal drift linked to fibroblast aging 15 . While somatic TFs expressed in the starting cell type have been reported to inhibit de-differentiation during reprogramming 16 – 22 , we uncover a dose-dependent switch in NFIX function. Moderate Nfix overexpression synergizes with OKSM early during reprogramming by acting as a chromatin co-opener to boost reprogramming, with potential analogous dosage-interactions in oncogenic dedifferentiation. In contrast, high Nfix blocks reprogramming by activating AP-1 and reinforcing somatic networks. In PSCs, Nfix destabilizes pluripotency by re-engaging transient reprogramming regions comprising degenerate motifs for pluripotency TFs, mechanistically linking reprogramming and differentiation processes. 
Collectively, DoseH-seq provides a critically needed perturbation-multiomics platform to dissect gene dosage-dependent effects, revealing context and dosage-dependent roles for Nfix in potentiating graded levels of developmental reversion. RESULTS We developed DoseH-seq to enable tracking of continuous perturbation levels in multiplexed snRNA+ATAC-seq 10x Genomics experiments ( Figure 1A ). The DoseH-seq assay consists of two main components: 1) design of dosage trackable lentiviral perturbations with cell-to-cell variation in expression levels and 2) a two-step nuclei hashing process. For design of a dosage trackable perturbation system, we developed barcoded lentiviral constructs harbouring a synthetic barcoded cassette inserted proximally to the “polyA-like” signal within the 3’LTR. To enable cost-effective multiplexing, we developed a two-step lysis procedure for nuclei hashing compatible with snRNA+ATAC-seq ( Figure S1A–E ). An initial “soft” lysis step enables nuclear hashtag antibody labelling by permeabilising only the outer cell membrane; the nuclear membrane is therefore preserved to protect RNA until the second “harsh” lysis step in preparation for the ATAC-seq transposition reaction in the presence of high levels of RNase inhibitor ( Figures 1A and S1E ). The design of DoseH-seq allows for flexible perturbation experiments across multiple experimental conditions (e.g. timepoint studies), including overexpression, knockdown and multi-perturb designs ( Figure 1A ). The capability to perform overexpression experiments and track perturbation dosage-levels distinguishes DoseH-seq from the existing perturb-seq methods Multiome Perturb-seq 8 and MultiPerturb-seq 9 ( Table S1 ). 
Benchmarking of antibody-hashing against genetic demultiplexing of human samples

We first validated the nuclei hashing step by performing a snRNA+ATAC-seq experiment in human fibroblast samples from 14 donors, plus a human embryonic stem cell (ESC) control line (annotated as hPSC hereafter; Figures 1B and S1F–M, Table S2.1). The ability to perform single nucleotide polymorphism (SNP)-based demultiplexing, leveraging genetic heterogeneity between human donors (see Methods) 23, provided a ground truth for evaluating hashing efficacy. Our experiment yielded high quality data (Figures S1G–M), including clear multiomic discrimination of hPSC and fibroblast cells (Figure S2A) and distinct peaks called across regulatory elements (Figure S2B). Within the fibroblasts, hashtag-based demultiplexing enabled clear discrimination of samples and donors (Figure S1M). We compared the hashtag-based demultiplexing to the SNP-based assignments 23 and found that our hashtag-based demultiplexing had a median assignment accuracy of 95% (Figures S2C and 1C). This experiment validated that DoseH-seq enables highly accurate recovery of numerous hashed sample conditions.

Tracking variable Nfix overexpression in mouse fibroblasts and during reprogramming

We recently reported that activator protein 1 (AP-1) TFs commonly drive chromatin opening with maturation and aging, while closing chromatin is enriched for cell identity TFs for their respective cell types 14. Of note, pan-somatic NFI factors were also commonly enriched in closing candidate cis-regulatory elements (cCREs) across cell types and were redistributed to opened chromatin as AP-1 co-factors 14. We therefore applied DoseH-seq to explore the impact of dose-resolved overexpression of the NFI member Nfix, which is widely expressed but switched off during iPSC reprogramming (Figure S2D), in different cellular contexts.
Briefly, reprogrammable mouse embryonic fibroblasts (rMEFs, carrying a doxycycline [DOX]-inducible Oct4, Klf4, Sox2 and cMyc [OKSM] cassette and an endogenous Oct4-Gfp reporter to mark fully reprogrammed iPSCs) 24 were transduced with either the barcoded EF1α-driven Nfix construct or the barcoded TagBFP2 control construct (Figure S2E–H and Table S2.2). Cells were collected under Nfix perturbation conditions at multiple time points before, during and after reprogramming (including mouse embryonic stem cells [mESCs] as a pluripotency reference) for multiome profiling (Figures 1D and S2I–M). Following demultiplexing (Figure 1E) and stringent quality control filtering, we progressed with 8,341 nuclei (median of 15,799 ATAC fragments and 8,952 RNA UMIs per nucleus) for analysis (Figure S2N). To track Nfix overexpression relative to the BFP control, we created a “lenti score”, defined as the log-odds from a logistic regression model classifying Nfix overexpression or control status based on lenti barcode counts. We calculated Uniform Manifold Approximation and Projection (UMAP) coordinates for fibroblasts before OKSM induction and observed a gradient of lenti scores, from negative in BFP control cells to low and then high in Nfix overexpressing cells (Figure 1F). We stratified the Nfix overexpressing fibroblasts into ‘low’, ‘mid’ and ‘high’ classifications based on the lenti score, corresponding to increasing levels of normalised lenti counts (Figure 1G). We calculated differentially accessible regions (DARs), comparing each overexpression stratification to the BFP rMEF control cells (Table S3). We observed an increasing number of cCREs opening and closing in response to increasing levels of Nfix overexpression (Figure 1H), which was not driven by differences in numbers of ATAC features or fragment counts (Figures S2O and S2P). NFI was the top motif enriched in opening DARs (Table S4), with increasing DAR numbers associated with increased enrichment (Figure 1I).
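The lenti score and its low/mid/high stratification described above can be illustrated with a minimal sketch. This is not the authors' implementation: the feature choice (log1p-transformed Nfix and BFP barcode counts), the plain gradient-descent fit, the rank-based tertile cut-offs and all function names and toy counts are assumptions made for illustration only.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(features, labels, lr=0.1, epochs=2000):
    """Fit weights and intercept by batch gradient descent on the log-loss."""
    n, d = len(features), len(features[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(features, labels):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) - y
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def lenti_score(x, w, b):
    # Log-odds of "Nfix overexpression" versus "BFP control" for one nucleus.
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def stratify(scores):
    """Rank-based tertiles (an assumed cut-off rule): 'low', 'mid', 'high'."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    labels = [None] * len(scores)
    for rank, i in enumerate(order):
        labels[i] = ("low", "mid", "high")[min(3 * rank // len(scores), 2)]
    return labels

def to_features(counts):
    # log1p dampens the heavy right tail of barcode UMI counts.
    return [[math.log1p(c) for c in cell] for cell in counts]

# Hypothetical barcode counts per nucleus: (Nfix barcode UMIs, BFP barcode UMIs).
control_counts = [(0, 5), (0, 12), (1, 8), (0, 3)]
nfix_counts = [(2, 0), (4, 0), (9, 0), (15, 0), (30, 0), (60, 0)]

X = to_features(control_counts + nfix_counts)
y = [0] * len(control_counts) + [1] * len(nfix_counts)
w, b = fit_logistic(X, y)

control_scores = [lenti_score(x, w, b) for x in to_features(control_counts)]
nfix_scores = [lenti_score(x, w, b) for x in to_features(nfix_counts)]
strata = stratify(nfix_scores)  # stratify only the Nfix-transduced nuclei
```

In this toy setting, control nuclei receive negative scores and Nfix nuclei receive positive scores that rise with barcode count, which mirrors the gradient reported in Figure 1F. A real analysis would likely use an established library and cut-offs chosen from the score distribution.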
We observed a similar increase in NFIX activity by calculating chromVAR motif deviations from the scATAC-seq library (Figure S2Q) and confirmed that we could capture increasing levels of 3’ LTR expression from the scRNA-seq library in correspondence with increasing stratifications of lenti score (Figure S2R). Together, this established that DoseH-seq captures the impact of dosage level-resolved TF overexpression in single cells.

DoseH-seq reveals dose-dependent effects of Nfix overexpression during reprogramming

We used DoseH-seq to track the impact of Nfix overexpression levels along the reprogramming time course to pluripotency. Directed Force Layout (DFL) plots displayed a transition from fibroblast to endogenous pluripotency marker expression (Figures 2A, 2B, S3A and S3B) and revealed that BFP control cells (Figure 2A) followed a different reprogramming trajectory from cells that overexpress Nfix (Figure 2B). chromVAR deviations calculated on NFIX+ cCREs confirmed that early reprogramming Nfix overexpressing cells were characterised by highly elevated NFIX motif activity (Figure S3C). However, THY1- cells from the NFIX group progressing beyond day 6 appeared to have silenced lentiviral inserts (unlike the BFP control), suggesting that sustained Nfix overexpression is not compatible with pluripotency progression (Figures 2E and 2F). Dendrogram analysis of average transcriptional similarity between conditions revealed that the progressing NFIX late-stage reprogramming (THY1-) cells were most similar to the mESCs (Figure 2C). In contrast, the late-stage BFP THY1- cells clustered with the earlier day 6 reprogramming time point. CytoTRACE2 25 analysis, which quantifies developmental potential, showed that the late-stage NFIX THY1- cells had the highest pluripotency scores after mESCs (Figures 2D and S3D).
NFIX THY1- cells also displayed higher activation of the pluripotency marker Epcam compared to late-stage BFP THY1- cells (Figure S3E), with the opposite pattern for the fibroblast marker Cd44 (Figure S3F). As somatic Nfix activity is silenced as cells progress towards the pluripotent state (Figures S3C and 2F), we explored dosage-specific effects of Nfix overexpression on days 2 and 6 of reprogramming. At day 2, increasing Nfix overexpression was associated with increased numbers of opening DARs (Figure 2G, Table S3). Interestingly, overlaying chromatin module scores of cCREs that we previously defined as stable during rMEF reprogramming 17 revealed a loss of accessibility within the NFIX day 2 cells (Figure 2H). This implies that the additional NFIX, in synergy with OKSM, induced chromatin opening and destabilised the somatic chromatin landscape in a more pronounced manner than OKSM alone (Figure 2H). At day 6 of reprogramming, the highest number of DARs relative to day 0 rMEFs was in the Nfix low cells (Figure 2I). This indicates that only a subset of lower Nfix overexpressing cells can progress to pluripotency, while cells with high levels of Nfix are refractory to reprogramming. In support of this, module scores of mESC marker genes (relative to rMEFs) at day 6 showed that only the low Nfix cells downregulated somatic genes (Figure 2J) or upregulated mESC genes (Figure 2K), compared to day 6 BFP cells. Our DoseH-seq analysis suggests that different Nfix overexpression dosages early during reprogramming may act to either enhance or block reprogramming. To confirm this, we used fluorescence-activated cell sorting (FACS) to sort rMEFs overexpressing Nfix into low, mid or high stratifications (via TagBFP2 levels linked to the Nfix cassette via an internal ribosome entry site [IRES], Figures 2L–N) and assessed reprogramming outcomes (Figures 2O and 2P).
Compared to BFP only cells, sorted Nfix-overexpressing cells revealed a progressive increase in Nfix transcript levels from low to high stratifications (~2- to 15-fold above native levels, Figure 2N). Evaluating the percentage of OCT4-GFP+ cells at the end of reprogramming revealed that only the ‘mid’ Nfix population showed a sharp and significant increase in both the percentage of iPSCs (Figures 2O and 2P) and absolute iPSC numbers per well (Figure S3G), confirming that Nfix exerts a dose-dependent effect on reprogramming as predicted by DoseH-seq. Despite starting with sorted BFP+ cells, >85% of fully reprogrammed OCT4-GFP+ cells from the Nfix overexpression group had silenced expression of the fluorescent reporter, unlike the BFP only control (Figures 2Q and S3H). This suggests either that only cells that randomly silence their Nfix-expressing viral inserts can progress or that the more extreme chromatin rearrangements in response to OKSM + Nfix overexpression trigger viral silencing. Given that by day 6 it is the Nfix low cells that are progressing to pluripotency, we surmised that lentiviral silencing (Figure 2F) results in the earlier ‘mid’ population transitioning to ‘low’ as part of pluripotency induction. Thus, Nfix overexpression can synergize with the Yamanaka factors in a dosage-dependent manner early during reprogramming to destabilise chromatin and drive pluripotency acquisition.

Nfix overexpression drives a limited program of developmental reversion in fibroblasts

We used our DoseH-seq data to better understand the effect of Nfix overexpression stratifications on the starting fibroblast population. Fibroblasts in culture differentiate into myofibroblasts, and do so at higher frequencies when plated at low densities, as with reprogramming 26. A recent study showed that the Yamanaka factors reverse this so-called “mesenchymal drift” during iPSC generation or in response to transient reprogramming 14.
Indeed, we observed that markers of myofibroblasts including Cthrc1 , Acta2 and Ptn 27 were expressed in the BFP-only control rMEFs, but downregulated in Nfix -overexpressing rMEFs ( Figure S4A ). In contrast, markers of resting fibroblasts 27 were upregulated in NFIX rMEFs. We therefore asked whether Nfix overexpression alone can also drive myofibroblast de-differentiation. Motif enrichment of DARs (Nfix overexpressing fibroblasts vs BFP control) showed that in addition to NFI, opening DARs contained an enrichment of AP-1 motifs, including JUN/FOS, as well as STAT3 and cell identity TFs for fibroblast lineage cells including TEAD1, RUNX1 and TCF21 ( Figure 3A ). Interestingly, closing DARs showed a similar enrichment pattern for AP-1 and cell identity factors ( Figure 3B ). Both opening and closing motif signals became progressively stronger with increasing Nfix overexpression. This implies that NFIX-linked chromatin opening is causing redistribution of AP-1 and cell identity TFs away from cCREs of the starting cell state. We correlated genes differentially expressed between NFIX low rMEFs and BFP rMEFs with marker genes from a pan-tissue fibroblast atlas 28 . The top hit was a strong negative correlation with myofibroblasts ( Figure 3C ), suggesting that even low levels of Nfix overexpression drive myofibroblast de-differentiation, consistent with reversal of mesenchymal drift. We likewise observed a loss of myofibroblast identity in BFP only cells at day 6 of reprogramming ( Figure S4B ), while the day 6 NFIX low cells gained a mesenchymal stromal cell (MSC) identity ( Figure S4C ). Our results indicate that NFIX alone, a somatic TF, can drive a limited program of de-differentiation that is enhanced during reprogramming with OKSM to shed the somatic identity. 
Shared regulatory element activation by Nfix in reprogramming and PSC differentiation Nfix overexpression is largely silenced by the end of reprogramming, implying incompatibility with pluripotency maintenance ( Figures 2Q and S3H ). Indeed, we overexpressed Nfix in mouse PSCs and found it drove differentiation towards a fibroblast-like state ( Figure 3D ). To understand the relationship between NFIX activity in differentiation from PSCs and during reprogramming, we performed a second DoseH-seq experiment ( Figures 3E and S4D–M ; see below for more details). Samples included an early differentiation time course of mESCs (cultured feeder-free) overexpressing Nfix or BFP only control ( Figure 3E ). Following integration with the first DoseH-seq run, we correlated chromatin accessibility changes in differentiation (vs mESCs) to reprogramming time points (also vs mESCs) ( Figure 3F ). The strongest correlation was between the reprogramming day 2 NFIX cells and the differentiation day 2+3 NFIX cells ( Figures 3F and 3G ), implying NFIX drives a shared program of accessibility remodelling between these two processes ( Figures 3F and 3G ). To investigate this further, we compared DARs ( Table S5 ) for the differentiating NFIX D2+3 cells (vs mESCs) to NFIX reprogramming day 2 cells (vs BFP rMEFs), which revealed a strong positive correlation in overall chromatin accessibility changes ( Figure 3H ). We next focussed specifically on NFIX-induced changes during reprogramming by calculating day 2 reprogramming DARs relative to the Nfix -overexpressing rMEFs and overlapped the opening DARs with the differentiation DARs. We found that 670 (42%) of differentiation DARs overlapped with reprogramming DARs ( Figure 3I ), representing a significantly higher overlap than with the day 2 BFP DARs (2.5%) ( Figure S4N ). Motif enrichment among these different DAR sets revealed significant enrichment for NFI as expected ( Figure 3J ). 
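The DAR overlap reported above (the fraction of differentiation DARs that intersect reprogramming DARs) amounts to an interval intersection count. The sketch below assumes BED-style half-open coordinates and uses a naive pairwise comparison; the function names and the toy intervals are hypothetical, and it does not reproduce the significance test against the day 2 BFP DARs.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share at least 1 bp."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def fraction_overlapping(query, reference):
    """Count query intervals that overlap any reference interval.

    Returns (number of hits, hits / number of query intervals).
    The pairwise scan is O(len(query) * len(reference)), which is fine for a
    sketch but too slow for genome-scale peak sets.
    """
    hits = sum(any(overlaps(q, r) for r in reference) for q in query)
    return hits, hits / len(query)

# Hypothetical opening DARs for the two comparisons.
differentiation_dars = [
    ("chr1", 100, 600),
    ("chr1", 1000, 1500),
    ("chr2", 200, 700),
    ("chr3", 50, 550),
    ("chrX", 10, 510),
]
reprogramming_dars = [
    ("chr1", 550, 900),   # overlaps chr1:100-600
    ("chr2", 700, 1200),  # only touches chr2:200-700, so no overlap
    ("chr3", 0, 100),     # overlaps chr3:50-550
    ("chr5", 1, 2),
]

n_hits, frac = fraction_overlapping(differentiation_dars, reprogramming_dars)
```

Applying the same function with the BFP day 2 DARs as the reference would give the comparison fraction; sorted sweeps or an interval tree would be the usual choice at scale.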
However, another shared feature of these DARs was the presence of OCT, SOX and KLF motifs. We considered that NFIX could be targeting regulatory elements that contain low-affinity OSK binding sites and that this could be contributing to its ability to enhance reprogramming in synergy with OSK. We analysed motif scores for OCT4, SOX2 and the OCT4:SOX2 heterodimer (Figures S4O–Q), and found that Nfix overexpression in both reprogramming and differentiation was opening regions with lower-affinity OCT4:SOX2 scores (Figure S4O), although this was less pronounced for SOX2 alone (Figure S4P). Gene ontology (GO) term analysis of genes linked to opening DARs in response to NFIX-mediated differentiation revealed enrichment for developmental processes (Figure S4R). A subset of these developmental genes was also linked to DARs opening on day 2 during OKSM reprogramming in combination with Nfix overexpression (Figure 3K). In summary, forced Nfix overexpression in PSCs drives differentiation by re-opening transient reprogramming elements with NFI/OCT/SOX/KLF motifs, suggesting they are early-developmental regulators.

Nfix overexpression synergizes with Oct4, Sox2 and Klf4 in a context-dependent manner

DoseH-seq revealed that Nfix can boost reprogramming in a dose-specific manner. We next sought to unravel the molecular mechanisms underlying this dosage effect. At day 2, NFIX mid cells had 77% more uniquely opening DARs than BFP only cells, compared to their uninduced rMEF controls (Figure 4A). Motif enrichment of opening DARs unique to the low, mid and high Nfix subsets revealed that NFIX mid cells displayed a more balanced enrichment pattern for both OSK and somatic TFs (Figures S5A and 4B). Compared to BFP cells, NFIX mid cells opened increased numbers of cCREs positive for SOX2 and KLF4 motifs, while NFIX low cells showed fewer uniquely opening OCT4/SOX2 binding sites.
In contrast, although NFIX high cells opened excess SOX2 and KLF4 binding sites, this was accompanied by very strong enrichment for the reprogramming barrier AP-1 14,16 (Figure 4B). This motif-count analysis reflects high-confidence motif calls; however, day 2 NFIX-unique opening regions also contain lower-affinity OCT4:SOX2 and OCT4 sites (Figure S4O–Q). By day 6, Nfix overexpressing cells progressing to pluripotency (Figures 2I–K) express low levels of Nfix lenti barcodes (Figure 2F). Analysis of DARs uniquely opening in day 6 NFIX low cells compared to the BFP control (Figure 4C) revealed strong enrichment and excess binding sites for OCT4, SOX2 and KLF4 motifs (Figures 4D and S5B). We generated chromatin module scores for DARs uniquely opening in day 6 NFIX low cells, which were highest in late-stage reprogramming THY1- cells and iPSCs (Figure S5C). In contrast, the unique BFP-opening DARs displayed highest activation in earlier reprogramming cell states (Figure S5D). In summary, mid-levels of Nfix overexpression early during reprogramming help open chromatin positive for OCT4, SOX2 and KLF4 binding sites and lead to more pronounced activation of late-stage reprogramming cCREs. Synergistic effects between Nfix and Yamanaka factors might be more pronounced when the reprogramming factors are expressed at low or insufficient levels. To explore this, we tested Nfix overexpression in a low-efficiency (~0.06%) reprogramming system based on an inducible polycistronic Sox2, Klf4 and cMyc (SKM) cassette 22,23 introduced into wild type (WT) MEFs along with the Nfix or TagBFP2-only overexpressing viruses (Figure 4E). Nfix overexpression resulted in ~10-fold more cells positive for pluripotency markers 29,30 at the end of SKM reprogramming compared to the BFP control (Figures 4F and 4G). This suggests that Nfix overexpression can also have reprogramming-enhancing effects in the low-efficiency SKM system lacking Oct4.
Vitamin C is known to enhance reprogramming, including by boosting the activity of epigenetic remodeller complexes attracted by reprogramming factors 31-33, a mechanism recently associated with NFI 34. We therefore asked whether Nfix overexpression might enhance reprogramming through the activity of Vitamin C-dependent epigenetic remodeller complexes. We performed a transcriptional similarity analysis between all samples (Figure 1D), including BFP control samples reprogrammed under Vitamin C-containing (Vc+) and Vitamin C-free (Vc-) conditions. This showed that both late-stage reprogramming BFP Vc+ and Nfix overexpressing (Vc-) cells matched closest to mESCs, in contrast to BFP Vc- cells (Figure 4H). This indicates that Nfix overexpression mirrors the reprogramming-enhancing effects of Vitamin C. Considering that Vitamin C boosts the activity of OKSM/NFIX-attracted chromatin remodeller complexes, lower Nfix overexpression levels might be required for beneficial effects on reprogramming outcomes in the presence of Vitamin C. To test this hypothesis, we overexpressed Nfix in the presence of Vitamin C and used FACS to stratify cells into low, mid and high levels of Nfix overexpression for reprogramming experiments (Figure 4I), as previously (Figures 2L and 2M). Under Vitamin C-containing conditions, only cells overexpressing Nfix at low levels resulted in a significantly increased percentage of OCT4-GFP+ cells (Figures 4J and S5E), unlike under Vitamin C-free conditions, where mid Nfix overexpression enhanced reprogramming (Figures 2O and 2P). We noted that high Nfix overexpression levels under Vitamin C conditions led to a significant reduction in reprogramming outcomes (Figures 4J and S5E). This is consistent with a previous report that Nfix overexpression using a TRE promoter under Vitamin C-containing conditions blocks reprogramming 22.
Outcomes of Nfix overexpression on Vc + reprogramming using two different promoters (TRE vs EF1α) showed that only the TRE promoter blocked reprogramming ( Figure S5G ), and overexpressed Nfix at significantly higher levels than the EF1α promoter construct ( Figure S5F ). Nfix overexpression via both promoter constructs enhanced reprogramming under Vc - conditions, although this was most pronounced for the EF1α construct with lower overexpression levels ( Figures S5G and S5H ). Overexpression of Nfi family member Nfia via the EF1α promoter ( Table S 2.2 ) likewise enhanced reprogramming outcomes demonstrating that this effect is not Nfix -specific ( Figure S5I ). Our data indicates that low Nfix overexpression can support reprogramming factors (at potentially limited levels) to open chromatin early during reprogramming ( Figures 2G and 2I ). We hypothesised that additional Oct4 , Sox2 or Klf4 might counter this supporting effect. Indeed, overexpression of additional Oct4, Sox2 or Klf4 in rMEFs, with or without Nfix overexpression (EF1α promoter driven), abrogated the reprogramming-enhancing effects of Nfix overexpression ( Figures 4K and S5J ). This suggests that under Vitamin C-free conditions, Nfix overexpression may compensate for insufficient levels of reprogramming factors; in particular for Oct4 , as its addition enhanced rMEF reprogramming outcomes ( Figure 4K ). Under Vc + conditions, neither elevated expression levels of Oct4 , Sox2 , or Klf4 improved generation of iPSCs ( Figure S5J ) and combined Oct4 plus Nfix overexpression resulted in a significant decrease in OCT4-GFP + iPSCs ( Figure S5J ). The mechanistic basis for this decrease is investigated in the next section. Collectively, our data suggest that when OSKM are insufficient (particularly Oct4 ), low-to-intermediate Nfix overexpression enhances reprogramming by promoting chromatin opening. 
In contrast, under higher OSKM levels Nfix overexpression blocks reprogramming, likely by reinforcing the somatic state.

Multi-perturb DoseH-seq reveals combinatorial TF interactions enhancing or blocking reprogramming

A key feature of DoseH-seq is its capacity for tracking multiple perturbations in the same cell. We next sought to demonstrate the multi-perturb capabilities of DoseH-seq using dual overexpression and overexpression/knockdown experiments during reprogramming, for which we selected two factor pairs. Firstly, we selected Oct4 to overexpress in combination with Nfix to unravel their combinatorial dosage role in blocking reprogramming under Vitamin C conditions (Figure S5). Secondly, given that high Nfix blocks reprogramming, we looked at genes induced in Nfix-overexpressing rMEFs that may contribute to this phenotype. The AP-1 subunit c-Jun, previously reported to block reprogramming 16,18, was specifically upregulated in Nfix high cells (Figure 5A). We tested the dosage effects of c-Jun overexpression in Vc+ and Vc- reprogramming (Figure S6A), finding that all c-Jun overexpression levels displayed a strong blocking effect (Figure S6B). This may reflect AP-1's established role in driving cell aging processes 14, given that senescence is incompatible with successful reprogramming 35-39. We next performed c-Jun knockdown via lentiviral shRNAs against c-Jun and selected a lentiviral multiplicity of infection (MOI) driving mild (~30%) knockdown in fibroblasts, to mitigate the impact on general cellular processes dependent on AP-1 activity (Figures S6C and S6D). Concurrent knockdown of c-Jun with Nfix overexpression significantly enhanced reprogramming under both Vc- and Vc+ conditions (Figure 5B), supporting a role for c-Jun in limiting the reprogramming-enhancing effects of Nfix overexpression.
We performed a DoseH-seq experiment, tracking multiple perturbations during reprogramming with Vitamin C (Figure 5C): 1) BFP control, 2) Nfix overexpression, 3) Nfix overexpression plus Oct4 overexpression (‘O+X’) and 4) Nfix overexpression plus c-Jun knockdown (‘J+X’). We integrated the dataset with the first run (including BFP Vc+ and Nfix overexpressing cells), and the BFP control and perturb conditions followed separate reprogramming trajectories, as before (Figures 5D and 5E). Furthermore, the Vc+ ‘perturb’ cells displayed the same chromatin destabilizing effects at day 2 (Figure 5F) as seen for Vc- Nfix overexpression alone (Figure 2H). For the J+X perturbation, we confirmed in starting rMEF cells that c-Jun expression was significantly downregulated in J+X cells compared to Nfix overexpression alone (Figure S6E). Top motifs enriched in closing DARs for J+X high rMEF cells corresponded to AP-1 family members (Figure S6F) and opening DARs were enriched most highly for NFI (Figure S6G), confirming that DoseH-seq could capture the dual knockdown/overexpression effects. We previously found that Nfix-induced chromatin remodelling at day 2 of OKSM induction was the precursor for enhancing reprogramming outcomes. We therefore investigated the additive impact of J+X at this time point. We used lenti scores for Nfix overexpression alone (Figure 5G) or combined J+X (Figure 5H) to stratify cells into low, mid and high subsets and calculated DARs relative to their rMEF controls (Table S6). Compared to Nfix alone, J+X drove additional chromatin opening as well as closing for all stratifications (Figure 5I). To understand the mechanistic effect of chromatin opening at different levels of overexpression/knockdown, we identified the DARs that were uniquely opening in the low, mid or high cells for Nfix overexpression alone (Figure S6H) or J+X (Figure S6I).
Motif enrichment analysis indicated a shift in the balance of OCT, SOX and KLF motifs to somatic TF motifs for both Nfix alone ( Figure S6J ) as well as J+X ( Figure S6K ) from low to high dose, respectively. Chromatin module scores of uniquely opening day 2 DAR subsets revealed that the DARs uniquely opening in low cells for both Nfix alone and J+X corresponded to cCREs activated in late reprogramming intermediates ( Figure 5J ). In contrast, the unique high and mid DARs corresponded to cCREs showing highest activity in day 2 cells themselves ( Figures 5J and S6L ). Thus, low levels of c-Jun knockdown and Nfix overexpression appear to cooperate to drive greater opening of late-stage reprogramming regulatory elements and shutdown of AP-1 binding site-rich somatic gene regulatory elements. We next examined the combinatorial impact of Oct4 and Nfix overexpression. Oct4 lenti scores were observed at day 2 (but not day 0 as Oct4 overexpression was driven from a DOX inducible TRE promoter) ( Figure 5K ) and motif enrichment of day 2 Oct4 high cells revealed OCT4 motifs ( Figure S6M ). To understand why combinatorial Oct4 and Nfix overexpression is blocking Vc + reprogramming ( Figure S5J ), we calculated DARs at day 2 ( Table S6 ) and compared opening cCREs between O+X high, Nfix high and Nfix low cells. High O+X drove excessive chromatin opening compared to high Nfix overexpression alone ( Figure 5L ) and analysis of motif scores showed that this was due to opening of cCREs with low-affinity OCT4 binding sites ( Figure 5M ). We sought to understand the difference between the blocking effects of high O+X or high Nfix overexpression alone and the enhancing effects of low Nfix overexpression. We performed motif enrichment on O+X high or Nfix high DARs against Nfix low DARs as background, and vice versa. 
While top motif enrichment for Nfix low-specific DARs yielded OCT4 and SOX factors ( Figure S6N ), O+X specific DARs overrepresented motifs for p53 ( Figure 5N ), a known reprogramming barrier that can cause cell cycle arrest 36-38,40,41 . Indeed, cell cycle scoring showed the O+X high cells had significantly lower S phase ( Figure 5O ) and G2/M phase scores ( Figure S6O ) than Nfix low cells. While there was no significant difference between Nfix high and low cells, Nfix high-specific DARs relative to Nfix low cells likewise overrepresented p53 motifs. We confirmed an elevated p53 pathway signature in O+X high and Nfix high cells, relative to Nfix low ( Figure S6P ). This was not reflected in stratifications of D2 BFP only cells ( Figure S6Q ), indicating that viral load alone is not driving elevated p53 pathway activity. In summary, multi-perturb DoseH-seq reveals that high Nfix constrains reprogramming by driving excessive AP-1 activation and sustaining somatic cCREs. When Oct4 and Nfix are co-overexpressed at high levels, this stress response is exacerbated, triggering p53 activation and cell-cycle arrest, further blocking reprogramming.

OCT4/NFI dosage levels in human tumours resemble partial dedifferentiation states

Considering oncogenic transformation shares many molecular aspects with reprogramming 42 , and OCT4 is reactivated in a subset of tumors 43,44 , we interrogated the OCT4–NFI dosage relationship in clinical cancer RNA-seq from TCGA. Comparing POU5F1 /OCT4 with NFI -family expression ( Figure 6A ) identified a subset of tumours with OCT4 activation and intermediate NFI expression (OCT4 + /NFI mid ). This state was most frequent in kidney tumours (84% of clear cell renal cell carcinoma [ccRCC]), followed by liver-related (49%) and pancreatic (48%) tumour types ( Figures 6B and S7A ).
In contrast, OCT4 high /NFI low tumours were exclusive to testicular cancers ( Figure S7A ), which displayed NANOG expression ( Figure 6C ) and high CytoTRACE2 pluripotency scores ( Figure 6D ). OCT4 + /NFI mid tumours showed moderate CytoTRACE2 scores, consistent with partial dedifferentiation ( Figures 2D and S3D ), mirroring our reprogramming data in which sustained NFI expression is incompatible with fully pluripotent-like states. Motif analysis further showed that DARs opening in ccRCC kidney tumors 45 were enriched for NFI and OCT motifs ( Figure 6E ). We assessed POU5F1 transcript isoforms and detected OCT4A-consistent transcripts, the canonical reprogramming-associated isoform. Among untransformed tissue samples, kidney displayed the highest percentage of OCT4A-positive samples ( Figure S7B ), with kidney tumour types KIRP (kidney renal papillary cell carcinoma) and KIRC (ccRCC) following testicular cancer as the highest ( Figure S7C ). We next examined whether OCT4/NFI-linked transient reprogramming pathways are engaged in kidney cancer by reanalysing a kidney cancer scRNA-seq dataset ( Figure S7D ) 46 . ccRCC tumour cells upregulated POU5F1 relative to normal proximal tubule (PT) cells ( Figures 6F and S7E ), alongside increased NFI-factor expression ( Figure 6F ). We obtained genes both upregulated and linked to opening DARs in NFIX reprogramming day 2 cells (compared to control rMEFs) that were co-positive for NFI and low affinity OCT4 motifs ( Figure S4Q ). Seurat module scores showed that these genes were highly elevated in ccRCC tumour cells compared to both normal PT cells and adjacent tumour cells ( Figure 6G ), as well as a kidney chromophobe (chRCC) tumour that did not display POU5F1 expression ( Figures S7F and S7G ). Together, these data suggest OCT4/NFI-associated dedifferentiation programs uncovered by DoseH-seq may also be engaged in select human tumour contexts. 
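As a rough illustration of what a gene-module score measures, the sketch below computes, per cell, the mean expression of the module genes minus the mean of control genes drawn from the same average-expression bin. This follows the general idea behind Seurat-style module scoring but is a simplified stand-in, not the Seurat implementation or the authors' analysis code; the bin count, number of control genes, gene names and expression values are all hypothetical.

```python
import random
from statistics import mean

def module_score(expr, module, n_bins=4, n_ctrl=2, seed=0):
    """Per-cell module score: mean(module genes) - mean(matched control genes).

    expr   : dict mapping gene name -> list of per-cell expression values
    module : list of gene names forming the module (at least one must be in expr)
    Control genes are sampled from the same average-expression bin as each
    module gene, which is a simplification of expression-matched control sets.
    """
    rng = random.Random(seed)
    genes = sorted(expr, key=lambda g: mean(expr[g]))  # rank by average expression
    n_cells = len(next(iter(expr.values())))
    bin_size = max(1, -(-len(genes) // n_bins))        # ceiling division
    bin_of = {g: i // bin_size for i, g in enumerate(genes)}
    module = [g for g in module if g in expr]
    controls = []
    for g in module:
        pool = [c for c in genes if bin_of[c] == bin_of[g] and c not in module]
        controls.extend(rng.sample(pool, min(n_ctrl, len(pool))))
    scores = []
    for i in range(n_cells):
        m = mean(expr[g][i] for g in module)
        c = mean(expr[g][i] for g in controls) if controls else 0.0
        scores.append(m - c)
    return scores

# Hypothetical toy matrix: four genes, two cells
toy = {"A": [1, 3], "B": [1, 1], "C": [5, 5], "D": [5, 5]}
scores = module_score(toy, ["A"], n_bins=2)
```

Subtracting expression-matched controls keeps the score from simply tracking overall expression level or sequencing depth of a cell.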
DISCUSSION

We present DoseH-seq, a single-cell multiome assay that allows profiling graded perturbation levels in time-course experiments at an affordable cost by capitalising on BGI sequencing technology ( Table S1 ). Multi-perturb experiments including Nfix , Oct4 and c-Jun modulation demonstrated that DoseH-seq can be applied to investigate dual perturbations. Our examination of TF dosage effects in fibroblasts, during reprogramming and in the context of PSC differentiation further highlights the flexibility and broad applicability of DoseH-seq. Aging and stress processes linked to in vitro culture are known to induce myofibroblast programs in fibroblasts, with persistent activation contributing to age-related disorders 15,47 . Reprogramming or partial reprogramming with Yamanaka factors can attenuate these programs, termed mesenchymal drift 15 , a finding confirmed by our current data. We previously linked declining NFI TF activity to shutdown of youthful cCREs across cell types 14 . Here, we show that Nfix overexpression in fibroblasts suppresses myofibroblast programs, similar to OKSM. We propose this is driven by redistribution of AP-1 transcription factors towards chromatin exposed by graded Nfix overexpression, underpinning a limited dedifferentiation program. By tracking dose-dependent Nfix overexpression in reprogramming, we challenge the view that somatic TFs already expressed in the starting cell type necessarily inhibit de-differentiation during reprogramming 16 – 22 . Rather, moderate-level Nfix overexpression early during reprogramming can synergize with insufficient Yamanaka factors 48 , 49 to promote transient chromatin opening and collapse of the fibroblast transcriptional network (Fig. 7). In contrast, high-level Nfix dosage blocks reprogramming by activating AP-1 to enforce the somatic state.
Combinatorial high overexpression of Nfix and Oct4 drives excessive opening of chromatin harbouring low-affinity OCT4 motifs and increased activation of p53 pathways, culminating in cell cycle decline (Fig. 7). Conversely, low levels of c- Jun knockdown cooperate with moderate-level Nfix overexpression to drive opening of late-stage reprogramming regulatory elements by countering Nfix -induced AP-1 activation. Thus, DoseH-seq can reveal the dose-dependent interactions of TFs during both overexpression and knockdown, enabling unparalleled high-resolution interrogation of shifts in TF networks. Transiently accessible cCREs activated during OKSM overexpression are a hallmark of reprogramming and contribute to erasure of the somatic network by exposing competing somatic TF binding sites 1 , 17 – 20 . These transient cCREs are thought to harbor both somatic TF binding sites and lower-affinity binding sites for Yamanaka factors. Our DoseH-seq experiments in PSCs reveal that NFI induction re-engages the same transient cCREs activated during exit from pluripotency in response to Nfix and OKSM overexpression. As these regions contain motifs for NFI and pluripotency-linked TFs (OCT, SOX, KLF), Nfix overexpression may promote collapse of pluripotency by exposing competing sites, analogous to how the somatic network is dismantled during reprogramming (Fig. 7) 1 , 17 – 20 . This suggests that transient regulatory elements active during iPSC generation may have bi-directional roles to drive both de-differentiation and re-differentiation. Aligned with this notion, a recent preprint study implicated epigenetic derepression of viral elements with competing pluripotency TF sites in early PSC differentiation processes 50 . 
Our dose-resolved results may help reconcile reports of NFI family members acting as oncogenes or tumour suppressors in a context-dependent manner 12 , 13 , and more broadly suggest that endogenous somatic TF levels can shape cellular responses to Yamanaka-factor exposure. In this light, it is notable that mouse studies report the kidney as among the most susceptible organs to tumour formation following transient OSKM/OKS (but not KMS) induction 51 and we detected OCT4A transcripts in ccRCC and in a subset of untransformed kidney samples at higher frequency than in other untransformed tissues (Figs. S7B and S7C). Consistent with engagement of reprogramming-like programs, OCT4/NFI-associated gene modules linked to early transient reprogramming intermediates were elevated in ccRCC tumours. These correlative observations raise the possibility that dose windows and cell-type context may be important considerations for epigenetic rejuvenation strategies using the Yamanaka factors (e.g., in settings with elevated NFI via copy-number gains 52 – 54 ). DoseH-seq provides a scalable framework to quantify such interaction rules and support more controlled, and potentially safer, cell-state transitions. In summary, we introduce DoseH-seq as a generalisable multiome platform for mapping gene-dosage effects during cell-state transitions. We leverage DoseH-seq to explore the role of Nfix , linked to youthful cell states, in driving developmental reversion, differentiation and its dose-dependent interaction with Yamanaka factor expression. In doing so, DoseH-seq uncovered a class of transient reprogramming cCREs with bi-directional accessibility, linking erasure of somatic identity with controlled exit from pluripotency. 
DoseH-seq constitutes a technology platform to dissect TF dosage logic in broad contexts including adult stem cell compartments, tumour lineage programs, and engineered immune cells (e.g., CAR-T/NK), where fine control of TF activity is likely to determine regenerative potential, lineage stability, and therapeutic efficacy.

Limitations of the study

Our two-step hashing/lysis preserved RNA integrity ( Figure S1E ) and yielded highly robust multiome QC data ( Figures S2J–R and S4G–M ) as benchmarked against the existing Multi-perturb workflows ( Table S1 ), with coherent reprogramming/differentiation trajectories (Figs. 2A–B, 3G, 5D–E). While we did not benchmark against standard 10x nuclei isolation (no hashing), our primary conclusions rely on within-run, dose-stratified comparisons under identical chemistry. For pooled multi-perturb samples, timepoints were assigned by label transfer; analyses therefore focus on high-confidence early-stage cells and within-dataset contrasts. Where cross-experiment comparisons were required (Fig. 3), we compared effect sizes anchored to each experiment’s matched reference controls (e.g., mESC or rMEF) to prevent batch-driven artefacts. Nfix barcode detection declines at late reprogramming stages, consistent with lentiviral silencing, limiting longitudinal dose tracking. We did not directly profile NFIX or OCT4/SOX2 occupancy; motif and accessibility analyses provide indirect evidence of potential TF binding site competition.

METHODS

Animal studies and use of human cell cultures

All animal experiments were reviewed and approved by The University of Queensland Animal Ethics Committee and the Monash Animal Services Ethics Committee and were conducted in compliance with the specified ethical regulations. Biopsy sample collection for derivation of fibroblast cultures was approved by The University of Queensland Human Ethics Committee and conducted in compliance with the specified ethical regulations.
Cell culture

Fibroblasts and 293T cell lines

Human fibroblasts (human dermal fibroblasts, HDFs), 293T cell lines, and mouse embryonic fibroblasts (MEFs) were cultured in MEF media (DMEM basal medium; Thermo Fisher Scientific, Cat#11995065) supplemented with 10% (v/v) fetal bovine serum (FBS; Thermo Fisher Scientific, Cat#10099141), 1% (v/v) Penicillin-Streptomycin (Thermo Fisher Scientific, Cat#15070063), 1% (v/v) MEM-NEAA (non-essential amino acids; Thermo Fisher Scientific, Cat#11140050), and 0.1% (v/v) 55 mM 2-mercaptoethanol (Thermo Fisher Scientific, Cat#21985023). Media change was performed every two to three days.

Human Pluripotent Stem Cells

To culture human pluripotent stem cells (H9 hPSC cell line; WiCell, WA09), 6-well plates were first coated with 5 µg/mL Vitronectin (VTN-N; Thermo Fisher Scientific, Cat#A14700) in DPBS (Thermo Fisher Scientific, Cat#14190144) at room temperature for 1 hr. H9 hPSCs were cultured in Essential 8™ basal medium supplemented with 2% E8 Supplement (Thermo Fisher Scientific, Cat#A1517001), with daily media replacement.

Mouse Pluripotent Stem Cells

Mouse pluripotent stem cells (mESCs or iPSCs) were cultured as previously described 55 in ESC media (KnockOut™ DMEM basal medium supplemented with 15% (v/v) FBS, 1% (v/v) Penicillin-Streptomycin, 1% (v/v) MEM-NEAA, 1% (v/v) GlutaMAX™ Supplement (Thermo Fisher Scientific, Cat#35050061), 0.1% (v/v) 55 mM 2-mercaptoethanol (Thermo Fisher Scientific, Cat#21985023), and 10 ng/mL leukemia inhibitory factor (LIF; Sigma-Aldrich, Cat#ESG1107)). For feeder-dependent culture of mouse pluripotent stem cells, growth-inactivated MEFs (iMEFs, produced as previously described 55 ) were seeded onto cell culture vessels coated with 0.1% gelatin (Thermo Fisher Scientific, Cat#G1393) at a density of 50,000 per cm² as a feeder layer one day before seeding mouse pluripotent stem cells. Media change was performed every other day.
For feeder layer-free culture, iMEF depletion was performed leveraging differences in cell attachment speed between iMEFs and mESCs. Cells (mESCs cultured on iMEFs) harvested from a 6-well were reseeded into a T75 culture flask coated with 0.1% gelatin and placed back into the incubator for ~40 mins. Afterwards, the flask was examined under the microscope to ensure most iMEFs were attached. The supernatant containing the mESCs was then carefully aspirated and transferred to vessels coated with 0.1% gelatin. The iMEF-depleted mESCs were cultured in iMEF-conditioned ESC media (i.e. ESC media collected from 6 wells with iMEFs after 24 hrs and then filtered through a 0.22 µm membrane (Sigma-Aldrich, Cat#SLGPR33RS)). iMEF-conditioned mESC media was changed every day. All cells were maintained at 37ºC in a humidified incubator with 5% CO2.

Lentiviral particle generation and titer determination

Lentiviral transfer vectors encoding viral inserts to knock down or overexpress genes of interest were designed and purchased from VectorBuilder Inc. (see Table S2.2 ). As a reporter, blue fluorescent protein (TagBFP2) was employed in nearly all lentiviral constructs (on its own or linked to genes of interest via an internal ribosome entry site [IRES]). Lentiviruses were generated using a 2nd generation lentivirus packaging system. Briefly, 8,500,000 293T cells were seeded into a T75 flask in 15 mL MEF media one day before viral packaging. On the following day, MEF media were replaced with Opti-MEM™ medium (Thermo Fisher Scientific, Cat#31985070). For transfection, Lipofectamine™ 3000 reagent mix (Thermo Fisher Scientific, Cat#L3000015) and DNA vector mix (5.9 μg transfer vector carrying gene of interest, 10.8 μg psPAX2 and 7 μg pMD2G) were prepared separately as per manufacturer’s instructions (packaging plasmids psPAX2 and pMD2G as per Nefzger, C. M. et al., Stem Cells, 2011 56 ). The two solutions were then combined and incubated at room temperature for 15 mins.
Lipid-DNA complexes were then added dropwise to the T75 flask previously overlaid with Opti-MEM™ medium followed by gentle rocking to ensure even distribution. After 6 hrs of transfection, media were replaced with fresh, pre-warmed virus production media (VPM; Advanced DMEM (Thermo Fisher Scientific, Cat#12491015) supplemented with 2% (v/v) FBS, 1% (v/v) GlutaMAX™ Supplement, 1% (v/v) MEM-NEAA, and 1% (v/v) P/S). VPM media containing lentiviral particles were collected at 24 hrs and 48 hrs post transfection and concentrated ~150 times by Amicon® Ultra Centrifugal Filters (Merck Life Science, Cat# UFC910024) at 3,181xg for 15 mins. Concentrated lentiviruses were aliquoted and stored in a -80ºC freezer until usage. Titers of each concentrated lentivirus stock were determined through serial dilution on fibroblast cultures followed by flow cytometry analysis for quantification as described 57 . Briefly, MEFs were seeded at a density of 20,000 per cm² 4–24 hrs ahead of infection for titration. Duplicate wells were used for each viral dilution. On the day of titration, lentiviral concentrates were serially diluted in 10-fold steps from 1:1,000 down to 1:1,000,000 into MEF culture media containing the transduction enhancer polybrene (Merck, Cat# TR-1003-G) at 5.8 µg/mL. MEFs were overlaid with the viral dilutions followed by a centrifugal inoculation at 610xg, room temperature for 60 mins. Cells were subsequently cultured in a 5% CO2, 37ºC incubator. The infected fibroblasts were harvested 72 hrs post transduction and stained with Propidium iodide (PI; Sigma-Aldrich, Cat#P4864) at 1:1000 dilution for flow cytometry analysis. Cultures with between 1% and 30% fluorescence reporter positive cells were used for titer calculations.

Gene overexpression/knockdown using lentivirus

rMEFs or wild type MEFs were plated at a density of 20,000 per cm² in 6-well plates one day before lentiviral transduction.
Viral media were prepared using target cell type culture media supplemented with 5.8 µg/mL polybrene and lentivirus at the indicated MOI. For overexpression, cells were infected with lentiviruses encoding genes of interest (see vector information in Table S2.2 ) at an MOI of 2.5 unless specified otherwise. Lentiviruses carrying only the TagBFP2 reporter at matched MOI were used as experimental controls. For c-Jun knockdown, lentiviruses carrying shRNAs against c-Jun (see vector information in Table S2.2 ) were utilized at an MOI of 2.5 or 10. Lentiviruses carrying non-targeting shRNA with scrambled sequence were used as experimental controls. Cells overlaid with virus-containing media were centrifuged at 610xg, room temperature for 1 hr and afterwards maintained in a 5% CO2, 37ºC incubator. Spent media containing viral particles were removed after overnight incubation. The efficiency of gene knockdown or overexpression via lentiviruses was measured by qPCR assay at day 3 post transduction. For reprogramming experiments with TF perturbations, infected rMEFs were reseeded at the indicated density at day 3 post transduction in ESC media on 0.1% gelatin-coated tissue culture plates and reprogramming was initiated by 1 µg/mL DOX. For overexpression experiments with mouse pluripotent stem cells, mouse ESCs or iPSCs (the latter carrying a Gfp reporter in the endogenous Oct4 locus) were thawed and cultured on iMEFs for 1-2 passages. Prior to lentiviral infection, PSCs were depleted of iMEFs (see “Cell culture - Mouse Pluripotent Stem Cells” section) and seeded at 20,000 per cm² on vessels coated with 0.1% gelatin in conditioned ESC media supplemented with additional 10 ng/mL LIF and allowed to attach for at least 6 hrs. Lentiviral infection was performed via centrifugal inoculation as described for MEFs. An MOI of 10 (note that titers were determined on MEFs) was employed for each lentivirus for efficient PSC infection.
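The arithmetic linking a titration well to a titer, and a titer to the stock volume needed for a given MOI, can be sketched as follows. This assumes the commonly used functional-titer formula (cells at infection × fraction reporter-positive × dilution factor ÷ inoculum volume); the exact calculation of the cited titration protocol is not reproduced in this section, and the function names and example numbers are hypothetical.

```python
def titer_tu_per_ml(cells_at_infection, fraction_positive, dilution_factor, volume_ml):
    """Transducing units (TU) per mL of concentrated stock from one titration well.

    fraction_positive : reporter-positive fraction among live cells (0-1)
    dilution_factor   : fold dilution of the stock (e.g. 10_000 for 1:10,000)
    volume_ml         : volume of diluted virus added to the well, in mL
    """
    # Only wells inside the 1-30% window stated in the text are accepted
    if not 0.01 <= fraction_positive <= 0.30:
        raise ValueError("well outside the 1-30% reporter-positive window")
    return cells_at_infection * fraction_positive * dilution_factor / volume_ml

def stock_volume_ul(moi, n_cells, titer):
    """Microlitres of concentrated stock needed to infect n_cells at the given MOI."""
    return moi * n_cells * 1000.0 / titer

# Hypothetical numbers: 40,000 cells, 10% BFP-positive, 1:10,000 dilution, 0.5 mL inoculum
titer = titer_tu_per_ml(40_000, 0.10, 10_000, 0.5)  # about 8e7 TU/mL
# Stock volume for MOI 2.5 on 200,000 cells with a hypothetical 1e8 TU/mL stock
volume = stock_volume_ul(2.5, 200_000, 1e8)         # about 5 µL
```

Restricting the calculation to wells with a low positive fraction limits the undercounting that arises when many cells carry more than one integration. Because titers were determined on MEFs, the same nominal MOI may correspond to a different effective dose on PSCs, as the text notes.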
Immunofluorescence staining

To confirm Nfix overexpression via lentivirus in MEFs at the protein level, MEFs infected with lentivirus overexpressing either Nfix or TagBFP2 only were subjected to immunofluorescence staining. MEFs at day 5 post viral infection were fixed on tissue culture plates with 4% (w/v) paraformaldehyde (PFA; ProSciTech, Cat#C004; diluted to 4% with DPBS) at room temperature for 15 mins. Following three DPBS washes, cells were permeabilized with 0.3% (v/v) Triton X-100 (Sigma-Aldrich, Cat#X100) in DPBS at room temperature for 30 mins. After three washes with PBST buffer (0.1% (v/v) Tween-20 (Sigma-Aldrich, Cat# P5927) in DPBS), cells were incubated with blocking solution (2% (w/v) BSA (Sigma-Aldrich, Cat#A9418) in PBST) at room temperature for 1 hr. For antibody labelling, mouse monoclonal anti-NFIX antibody (Sigma-Aldrich, Cat#SAB1401263; clone 3D2) was employed at 1:800 dilution for overnight incubation at 4ºC. Following three PBST washes, cells were stained with donkey anti-mouse IgG conjugated with Alexa Fluor™ Plus 488 (Thermo Fisher Scientific, Cat# A32766) at 1:500 dilution in the dark at room temperature for 1 hr. For nuclei visualization, 4′,6-diamidino-2-phenylindole (DAPI; Sigma-Aldrich, Cat#D9542) was used at 1:1000 dilution at room temperature for 10 mins. Following the DAPI labelling step, cells were washed with PBST again and left in PBS for microscopic inspection.

Quantitative Polymerase Chain Reaction (qPCR)

Harvested cells were pelleted and washed with DPBS. Cell pellets were then lysed using TRIzol reagent (Thermo Fisher Scientific, Cat#15596018). TRIzol-treated cell pellets were stored at -80ºC until processing for RNA extraction. Total RNA was extracted using Direct-zol™ RNA Microprep kit (Zymo Research, Cat#R2062) as per manufacturer’s instructions.
RNA was quantified by Nanodrop Spectrophotometer (Thermo Fisher Scientific) or Qubit™ RNA High Sensitivity assay (Thermo Fisher Scientific, Cat#Q32855) and stored at -80ºC until further processing. To synthesize cDNA, 200-600 ng RNA was reverse transcribed using QuantiTect® reverse transcription kit (Qiagen, Cat#205311) according to the manufacturer's recommendation. To assess expression levels, qPCR was performed on a ViiA™ 7 qPCR platform using 384-well plates (Applied Biosystems). Each qPCR reaction mix contained 2.5 µL of 2X SYBR™ Green qPCR Master Mix (Thermo Fisher Scientific, Cat#4312704), 0.8 µM each of forward and reverse primer, 2.5 ng of cDNA template from reverse-transcribed RNA, and PCR grade water to a total volume of 5 µL. Housekeeper genes mouse HPRT and mouse β-actin were utilized as dual internal references for normalisation. qPCR reactions were performed in triplicate for each sample and thermal cycling was run as follows: 1 cycle of initial denaturation at 95ºC for 10 mins, followed by 40 cycles of denaturation at 95ºC for 15 seconds and annealing/extension at 60ºC for 60 seconds. Table S2.3 lists the primer pairs used in this study.

Pluripotency induction in fibroblasts from reprogrammable mouse model

rMEFs (from E13.5 embryos as per 58 ) were isolated from a reprogrammable mouse strain 24 and were heterozygous for both a DOX-inducible polycistronic cassette harbouring Oct4 , Sox2 , Klf4 , and c-Myc (TRE-OKSM) at the Col1a1 locus and a reverse tetracycline transactivator (m2rtTA) cassette expressed from the Rosa26 locus. rMEFs also harboured a heterozygous GFP reporter knocked in at the end of the genomic Oct4 locus to allow monitoring of re-activation of the endogenous pluripotency network. For reprogramming experiments, cryopreserved rMEFs were thawed and recovered in MEF media.
rMEFs were seeded at a density of ~50,000 per cm² onto 12-well plates coated with 0.1% gelatin and maintained in ESC media supplemented with DOX (Sigma-Aldrich, Cat#D9891) at a final concentration of 1 µg/mL (reprogramming media). Reprogramming of fibroblasts was initiated no later than passage 3. Media was replaced every other day or as required. Ectopic OKSM expression through DOX was maintained for 10 days, unless otherwise specified. Afterwards, DOX was withdrawn for 3 days to allow formation of stable iPSC colonies independent of ectopic OKSM expression. Cells were subsequently used for quantitative analysis of reprogramming efficiency and/or, when indicated, cultured for additional passages. For reprogramming experiments in the presence of Vitamin C (L-ascorbic acid; Sigma-Aldrich, Cat#A92902), Vitamin C was freshly added to reprogramming media at a final concentration of 50 µg/mL.

Alkaline Phosphatase Staining

To visualize stable iPSC colonies (in addition to assessing Oct4 -GFP expression 24 ), Alkaline Phosphatase (AP) staining was performed 3 days post DOX withdrawal using Vector® Black Substrate Kit (Vector Laboratories, Cat#SK-5200). Briefly, cells were washed with DPBS and fixed with 4% PFA at room temperature for 15 mins. AP staining solution was freshly prepared using 100 mM Tris-HCl buffer (Thermo Fisher Scientific, Cat#15567027) with pH adjusted to 9.5 as per manufacturer’s recommendation. Cultures exposed to staining reagent were incubated at room temperature in the dark for 20–35 mins. After incubation, staining solution was rinsed off with DPBS and subsequently with distilled water, followed by imaging.

Flow Cytometry

Cell samples were dissociated from culture plates using 0.25% EDTA/Trypsin (Thermo Fisher Scientific, Cat#25200072) at 37ºC for 3 mins. After washing cells once with 2% FBS in PBS via a pelleting step, cells were counted and ready for flow cytometry experiments.
To measure reprogramming efficiency in rMEF experiments, cell samples from reprogramming experiments were stained with PI and fully reprogrammed cells were quantified by assessing the percentage of live cells that had activated the endogenous Oct4 -GFP reporter as per previous work 30,59 . To assess the degree of differentiation in mouse ESC/PSCs, a two-step labelling protocol was performed using established pluripotency and somatic state associated cell surface markers 30,58 . Briefly, cells were labelled with monoclonal anti-mouse SSEA1-biotin (Invitrogen, Cat#13-8813-82) at 1:400, followed by a second labelling step with anti-mouse EpCAM-PeCy7 (BioLegend, Cat#118215) at 1:400, anti-mouse THY1.2-PE (BD Biosciences, Cat#553006) at 1:800 and Streptavidin-APC (BD Biosciences, Cat#554067) at 1:800. Labelling solutions were prepared in 2% FBS in PBS and antibody labelling was conducted on ice for 10 mins in a volume of 100 µL containing ~1 million cells. DAPI (Sigma-Aldrich, Cat#D9542) at 1:1000 dilution was used for live/dead cell discrimination. To account for spectral spillover in multi-colour labelling, for each flow cytometry run either single-colour compensation beads (BD Biosciences, Cat#552845, 552843) or single-colour stained cells were used for compensation. Prior to flow cytometry acquisition, labelled samples were filtered through a 35 μm mesh strainer (Corning, Cat#352235). For flow cytometry analysis, filtered samples were loaded on a Becton Dickinson LSRFortessa™ X-20 (BD Biosciences) and data for more than 10,000 live cells were recorded. For fluorescence-activated cell sorting (FACS), filtered samples were loaded on either a BD FACSAria Fusion (BD Biosciences) or a MoFlo Astrios Cell Sorter (Beckman Coulter Life Sciences) to isolate specific cell populations. All FACS experiments were performed using a 100-μm nozzle setup. Flow cytometry data were processed with FlowJo (v10) software for analysis and visualization.
SKM-mediated pluripotency induction

To perform SKM-mediated reprogramming, wild-type MEFs (isolated from E13.5 embryos as per 58 ) at passage 1 were infected with lentiviruses carrying an inducible TRE-SKM cassette ( Table S2.2 , VB230504-1547mzs; a DOX-inducible polycistronic SKM cassette 48,49 ) at an MOI of 10 and m2rtTA -expressing lentivirus ( Table S2.2 , VB220105-1463rbh) at an MOI of 3. For testing the effects of Nfix overexpression on SKM-induced reprogramming, cells were co-spin-infected with Nfix-overexpressing lentivirus ( Table S2.2 , VB201207-1202ujp). As a control, lentivirus carrying TagBFP2 only ( Table S2.2 , VB191219-1157xye) was used for coinfection. At 72 hrs after lentiviral infection, MEFs were reseeded onto 0.1% gelatin-coated 12-well plates at 15,000-30,000 cells per well. To start reprogramming by inducing SKM expression, 1 µg/mL DOX was added to ESC media containing 50 µg/mL Vitamin C (note that in our hands reprogramming with the inefficient SKM system was only possible in the presence of Vitamin C). Media were replaced every other day or as required. On day 11 or 12 post SKM induction, DOX was withdrawn for 3 days before quantification via flow cytometry. Because wild-type MEFs lack an endogenous OCT4-GFP reporter, surface marker labelling was performed (as per Flow Cytometry section) to quantify iPSC generation on day 3 post DOX removal using an established cell surface marker panel 30,58 . Briefly, cells were labelled with monoclonal anti-mouse SSEA1-biotin (Invitrogen, Cat#13-8813-82) at 1:400, followed by a second labelling step with anti-mouse EpCAM-PeCy7 (BioLegend, Cat#118215) at 1:400, anti-mouse THY1.2-FITC (BD Biosciences, Cat#553003) at 1:400 and Streptavidin-APC-Cy™7 (BD Biosciences, Cat#554063) at 1:400. DAPI (Sigma-Aldrich, Cat#D9542) at 1:1000 dilution was used for live/dead cell discrimination.
Statistical Analysis (Wet-lab experiments)

Statistical analysis for wet-lab experiments and qPCR was performed with GraphPad Prism 9.0 (GraphPad Software Inc), and statistical significance in the report was defined as P values < 0.05. Unpaired or paired (as appropriate) two-tailed t-test was used to measure difference between two groups with ≥3 replicates. Statistical comparisons among multiple groups were performed using one-way ANOVA with Dunnett’s multiple comparison tests. Data in bar charts are presented as Mean ± Standard Error of the Mean (SEM) unless specified differently.

Design of barcoded lentiviral constructs

For barcoded lentiviral constructs in “sense” direction (used for DoseH-seq experiments), the barcoding cassette was placed right before the 3’LTR of the lentiviral expression constructs. For “antisense”/inverted barcoded lentivirus constructs, a pA element (polyadenylation; BGH pA) was inserted at the end of the antisense-oriented promoter-ORF cassette and the barcoding cassette placed upstream of the pA element. Below is the general sequence design for the uniquely barcoded cassettes integrated into sense or antisense lentiviral constructs. A primer binding site for library amplification is highlighted in cyan, and the variable part of a 15-bp barcoding cassette is highlighted in yellow with 4000 possible barcodes provided in Table S2.4 .

Sequence design for barcoded cassettes in sense constructs:

TTTCCCATGATTCCTTCATATTTGC GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG XXXXXXXXXXXXXXX GGCCCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCTGAAC

Sequence design for barcoded cassettes in antisense constructs:

TTTCCCATGATTCCTTCATATTTGC GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG XXXXXXXXXXXXXXX GGCCCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCTGAACGGCGAGTGGCGCTTTGCCTGGTTTCCGGCACCAGAAGCGGTGCCGGAAAGCTGGCTGGAGTGCGATCTTCCTGAGGCCGATACTGTCGTCGTCCCCTCAAACTGGCAG

Lentiviral vectors for both barcoded construct designs were purchased from VectorBuilder Inc. ( Table S2.2 ).
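To illustrate how the cassette layout above lets a barcode be recovered from sequence, the following sketch searches for the 15-bp variable region between the constant flanks given in the sequence design. It is an assumption-laden illustration rather than the pipeline used in the study: it requires exact flank matches on the strand shown, ignores sequencing errors and reverse-complement reads, and the whitelist and reads in the example are made up.

```python
import re

# Constant flanks copied from the cassette design above (last 19 nt upstream of
# the barcode and first 15 nt downstream); exact matching is a simplification.
UPSTREAM = "AGATGTGTATAAGAGACAG"
DOWNSTREAM = "GGCCCGCACCGATCG"
BARCODE_RE = re.compile(UPSTREAM + r"([ACGT]{15})" + DOWNSTREAM)

def extract_barcode(read):
    """Return the 15-bp barcode found between the two flanks, or None."""
    match = BARCODE_RE.search(read.upper())
    return match.group(1) if match else None

def count_barcodes(reads, whitelist):
    """Count reads per barcode, keeping only barcodes present in the whitelist."""
    counts = {}
    for read in reads:
        barcode = extract_barcode(read)
        if barcode in whitelist:
            counts[barcode] = counts.get(barcode, 0) + 1
    return counts

# Hypothetical reads and whitelist (real barcodes are listed in Table S2.4)
bc = "ACGTACGTACGTACG"
reads = ["TTT" + UPSTREAM + bc + DOWNSTREAM + "CCC", "GGGGGGGGGG"]
counts = count_barcodes(reads, whitelist={bc})
```

Checking extracted sequences against a known whitelist is a simple guard against counting reads whose variable region was corrupted during amplification or sequencing.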
The corresponding lentiviral titers are listed in Table S2.5.

Single-cell multiome DoseH-seq assay (scRNA-seq + scATAC-seq)

Human and mouse cell sample collection
To validate nuclei hashtagging, HDFs from different donors were thawed and maintained in standard culture conditions as described in the cell culture section. Cells were harvested using 0.25% trypsin/EDTA, with HDFs collected and used for experiments at passage 8 and hPSCs (H9 line, karyotypically normal) at passage 69 (Table S2.1). Cell samples were subsequently processed for nuclei isolation (see Nuclei isolation / 1st “soft” lysis-step). To study mouse reprogramming at different stages and under different conditions, cells were harvested using 0.25% trypsin/EDTA at the indicated time points (see Figure 1D, Figure 5E, day 0 [= three days post infection with lentiviruses]) and immediately frozen in cryopreservation solution composed of 90% FBS and 10% DMSO. To study Nfix-induced mESC differentiation (Figure 3F), mESCs cultured under feeder-free conditions were used and immediately frozen at day 0, day 1.5, day 2 and day 3 post lentiviral infection. To process cryopreserved samples for nuclei preparation (reprogramming/differentiation), samples were thawed and resuspended in cell culture medium. Cell sorting was performed to exclude dead cells and/or to enrich cells progressing towards pluripotency (based on cell surface marker profile) as described in the figure legends, followed by nuclei isolation (see next section).

Nuclei isolation / 1st “soft” lysis-step
Detailed compositions for solutions in nuclei isolation are listed in Table S2.6, unless otherwise indicated. For each sample, 100,000-750,000 cells were pelleted by centrifugation at 300xg for 5 mins at 4℃ and the supernatant was carefully removed. To the cell pellet, 100 µL of cold Lysis Buffer was added and gently pipetted 6-8 times (the 1st “soft” lysis step).
Human cell samples (HDFs and hPSCs) and mouse cell samples (rMEFs, reprogramming rMEF cultures and mESCs) were incubated on ice for 3 minutes. After incubation, 0.5 mL of chilled Wash Buffer was added to the lysed cells and mixed by gentle pipetting 5 times with a 1000 µL tip. Following lysis, permeabilised cells were pelleted by centrifugation. All centrifugations to pellet the cells/nuclei post lysis were performed at 500xg for 5 mins at 4℃. For DoseH-seq experiments, permeabilised cells/nuclei were immediately used for the steps described in the “Sample multiplexing with nuclear hashtag antibodies and 10x capture” section.

Titration of anti-Nuclear Pore Complex Proteins Antibody
Alexa Fluor®647 conjugated anti-Nuclear Pore Complex Proteins Antibody (mouse monoclonal antibody, clone: Mab414, BioLegend, Cat#682203) was used for initial antibody titration experiments on both HDFs and hPSCs. The antibody titration/optimisation was performed at four different concentrations (1 ng/µL, 2.5 ng/µL, 5 ng/µL, 10 ng/µL). Briefly, cell samples were lysed with Lysis Buffer 2 (Table S2.6). For each antibody concentration, 500,000-800,000 nuclei were resuspended in 100 µL of the chilled staining buffer solution (ST-SB, Table S2.6) containing AF647 anti-Nuclear Pore Complex Antibody. Nuclei samples were incubated for 10 minutes on ice, followed by three washes with 1 mL Wash Buffer (Table S2.6). Finally, strained/filtered nuclei samples, stained with PI (Sigma-Aldrich, Cat#P4864) at 1:1000, were used for flow cytometry analysis. AF647 signals of PI-positive nuclei/permeabilised cells were assessed to determine the optimal antibody concentration. RNase inhibitor was not added to the buffers used for these optimisation experiments. Note that the optimised antibody concentration (2.5 ng/µL) was also determined to be effective on the mouse cell types of interest.

Optimisation of RNase inhibitor treatment conditions
To optimise RNase inhibitor type and concentration (Figure S1D), HDFs were first subjected to the “soft” lysis step using cold Lysis Buffer (Table S2.6) containing different RNase inhibitors or no inhibitor (negative control). Two different brands of RNase inhibitor were selected for testing, purchased from Sigma-Aldrich (recommended by the 10x Genomics protocol; Cat#03335399001) and Takara Bio (used in a publication performing sample multiplexing via anti-Nuclear Pore Complex antibodies for transcriptional profiling 60; Cat#2313A). For the Sigma-Aldrich inhibitor, 1 U/µL, 0.2 U/µL and 0.1 U/µL were tested. For the Takara Bio RNase inhibitor, 0.04 U/µL was tested as per 60. To simulate the time required for labelling with anti-Nuclear Pore Complex antibodies, the lysed cells/nuclei were left on ice for 60 mins. RNA extraction was performed for both freshly lysed cells and the lysed samples after the 60-minute incubation on ice. To examine RNA integrity preservation throughout the procedure (Figure S1E), HDFs and hPSCs were subjected to the entire two-step lysis process with the Sigma RNase inhibitor at optimised concentrations (0.1 U/µL during and after the “soft” lysis step and 1 U/µL during and after the “harsh” lysis step). For these experiments, RNA extraction was performed for fresh cells and cell/nuclei samples collected throughout the procedure. RNA integrity number (RIN) values for extracted RNA were determined with an Agilent Bioanalyzer.

Sample multiplexing with hashtag antibodies and 10x capture
Nuclear hashtag antibody solutions were prepared by adding 0.25 µg of antibody (BioLegend, see Table S2.7 for details) to 125 µL of ST-SB solution (Table S2.6) per sample and kept on ice during the last spin of the nuclei isolation procedure (see Nuclei isolation section). For 5 antibodies with lower labelling efficiency (see Table S2.7), 0.5 µg was used instead.
Following the last spin of the first “soft” lysis step (see section: Nuclei isolation / 1st “soft” lysis-step), all supernatant was carefully removed to leave behind the pellet. The pellet was resuspended in the prepared antibody solution by gentle pipetting and incubated on ice for 10 mins. After incubation, 0.5 mL of chilled Wash Buffer was added to the tubes and gently pipetted 5 times. Tubes were then spun down, and the supernatant was carefully removed. Next, nuclei were washed two more times by resuspending the pellet in 0.5 mL of chilled Wash Buffer (Table S2.6); nuclei were then filtered through a 40 µm strainer (Corning, Cat# 352340) and counted. Subsequently, nuclei were pooled at a desired ratio into one sample tube. After centrifugation, the pellet was resuspended in 100 µL of Lysis Buffer 2 (Table S2.6) by gently pipetting 5 times (the 2nd “harsh” lysis-step). Human samples (HDFs and hPSCs), rMEFs and reprogramming rMEFs were incubated on ice for 2 mins, and mESCs and their early differentiation products were incubated on ice for 1 min. Immediately afterwards, 0.5 mL of chilled Wash Buffer 2 (Table S2.6) was added to the lysed cells and gently mixed by pipetting. The tubes were then centrifuged, and the supernatant was carefully removed. This washing step was repeated twice more. At the third wash, prior to centrifugation to pellet the cells, the nuclei were filtered through a 40 µm strainer. Nuclei counts were performed again to quantify nuclei concentration and ensure optimal cell lysis. Finally, pooled nuclei were pelleted and resuspended in NBuffer (Table S2.6) at a density of 15,000 nuclei/µL and processed immediately for the Tn5 reaction and nuclei capture with a 10x Genomics instrument according to the manufacturer’s instructions (Chromium Next GEM Single Cell Multiome ATAC + Gene Expression, CG000338 Rev F).
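The pooling and resuspension arithmetic in this section can be sketched as follows. This is a minimal Python illustration under stated assumptions: the sample names, nuclei concentrations, the 1:1 ratio and both function names are hypothetical, and only the target density of 15,000 nuclei/µL comes from the text.

```python
def pooling_volumes(concentrations, ratio, total_nuclei):
    """Volume (µL) to take from each sample so the pool matches the ratio.

    concentrations: nuclei per µL for each sample (from the nuclei counts).
    ratio: relative share of each sample in the pool.
    total_nuclei: total number of nuclei wanted in the pool.
    """
    ratio_sum = sum(ratio.values())
    volumes = {}
    for sample, nuclei_per_ul in concentrations.items():
        nuclei_needed = total_nuclei * ratio[sample] / ratio_sum
        volumes[sample] = nuclei_needed / nuclei_per_ul
    return volumes


def resuspension_volume(nuclei_count, target_density=15000):
    """Buffer volume (µL) that brings a pellet to target_density nuclei/µL."""
    return nuclei_count / target_density


# Hypothetical counts for two hashed samples pooled 1:1.
volumes = pooling_volumes(
    concentrations={"sample_A": 2000, "sample_B": 4000},
    ratio={"sample_A": 1, "sample_B": 1},
    total_nuclei=400000,
)
print(volumes)
print(resuspension_volume(300000))
```

Recounting after the second lysis step, as described above, matters here because the resuspension volume depends on the number of nuclei actually recovered rather than the number pooled.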

Preparation of the four library types
Single cell multiome (RNA + ATAC) libraries were prepared using 10x Genomics Chromium Next GEM Single Cell Multiome ATAC + Gene Expression Reagents (PN-1000285) according to the User Guide (Document CG000338) with the following modifications. Nuclei input concentration was 15,000/µL, with targeted nuclei recovery of 30,000. For Post-GEM Incubation Cleanup - SPRISelect (Step 3.2), bead volume was adjusted to 100 µL (2.0X bead ratio). 1 µL of 5 µM HTO_additive_v2 primer (GTGACTGGAGTTCAGACGTGTGCTCTTCCGAT*C*T) was added to Pre-amplification Mix (Step 4.1) for a final reaction volume of 101 µL. For Pre-amplification SPRI Cleanup (Step 4.3), the reaction was transferred to a 1.5 mL tube for clean-up, and bead volume was adjusted to 202 µL (2.0X bead ratio). 1 µL of 5 µM HTO_additive_v2 primer (GTGACTGGAGTTCAGACGTGTGCTCTTCCGAT*C*T) and 1 µL of 5 µM Chromium_Read2N primer (CTCGTGGGCTCGGAGATGTGTATAAGAGAC) were added to cDNA Amplification Mix (Step 6.1) for a final reaction volume of 102 µL. cDNA Cleanup - SPRISelect (Step 6.2) was not performed. Instead, a SPRISelect cleanup was performed to separate fragment sizes as per Step 2.3A and 2.3B of the Chromium Next GEM Single Cell 3’ Reagent Kit v3.1 with Feature Barcoding Technology for Cell Surface Protein User Guide (Document CG000206). The cDNA eluted from the pellet fraction (2.3A) was used for gene expression library construction. For gene expression Sample Index PCR (Step 7.5), Dual Index TT Set A was not used. Instead, Single Index Kit T Set A and SI-PCR primer (AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTC) were used according to the User Guide for the single indexing gene expression workflow (Document CG000206; Step 3.5). Nuclear hashtag antibody libraries were generated according to the BioLegend protocol for Total-SeqA, using 2x Kapa Hifi HotStart Readymix (Roche, KK2601), 5 µL of purified supernatant (2.3B) fraction from cDNA cleanup as input, and 12 cycles of PCR.
For lentiviral perturbation libraries, an additional 0.6X/0.9X double-sided SPRISelect cleanup was performed with 15 µL of purified supernatant (2.3B) fraction from cDNA cleanup, eluted in 40 µL Buffer EB. 12.5 µL of re-purified cDNA was used as input into the library prep reaction, with 2x Kapa HiFi HotStart Readymix (Roche, KK2601), 2.5 µL of 10 µM SI-PCR primer (AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTC), and 2.5 µL of Single Index Kit N Set A index in a final volume of 50 µL. 14 cycles of 98°C denaturation, 62°C annealing and 72°C extension were performed, with the final product cleaned up using a 0.65X SPRISelect bead ratio. Library size distribution of all four purified libraries (snRNA-seq, snATAC-seq, Hashtag antibody library, Lentiviral perturbation library) was assessed using a High Sensitivity DNA Kit (Agilent, 5067-4626) and a BioAnalyzer 2100.

Concurrent sequencing of all 4 libraries with BGI T7 technology
We previously benchmarked BGI sequencing technology against Illumina sequencing technology and demonstrated comparable outcomes at reduced cost for BGI technology 61. Library sequencing was performed on the BGI DNBSEQ-T7 instrument, for which we optimised a strategy to sequence all four libraries concurrently at a 30/60/5/5 ratio for Gene expression, ATAC, Hashtag antibody and Lentiviral perturbation libraries, respectively. Sequencing was performed by MGI Australia on a DNBSEQ-T7 instrument in 2x100bp paired end mode using run configuration Read1 50, Read2 90, i7 Index 8, i5 Index 24 (an i7 length of 8bp is sufficient due to the single indexing strategy). As part of one T7 run yielding 5 billion read pairs, we concurrently sequenced the 8 libraries from two DoseH-seq 10x Genomics capture runs. The data sets introduced in Figure 3E and the multi-perturb data set introduced in Figure 5C were sequenced as described above, enabling discrete demultiplexing of the different library types and economical sequencing costs.
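To make the 30/60/5/5 loading ratio concrete, the short Python sketch below works out the nominal read-pair budget per library. It assumes, purely for illustration, that the 5 billion read pairs are divided equally between the two capture runs and that the ratio is applied to read pairs; the function name and library labels are hypothetical, and actual yields will deviate from these nominal figures.

```python
def allocate_read_pairs(total_read_pairs, n_captures, ratio):
    """Split a sequencing run across captures, then across library types."""
    per_capture = total_read_pairs // n_captures
    ratio_sum = sum(ratio.values())
    return {
        library: per_capture * share // ratio_sum
        for library, share in ratio.items()
    }


# Ratio and run yield as stated in the text; equal split per capture is assumed.
budget = allocate_read_pairs(
    total_read_pairs=5_000_000_000,
    n_captures=2,
    ratio={"GEX": 30, "ATAC": 60, "HTO": 5, "Lenti": 5},
)
for library, read_pairs in budget.items():
    print(library, read_pairs)
```

Under these assumptions each capture would receive about 750 million read pairs for gene expression, 1.5 billion for ATAC and 125 million each for the hashtag and lentiviral libraries.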
Before the DNBSEQ-T7 strategy for concurrent sequencing of all 4 library types in a single run was established (used for the differentiation and multiperturb experiments in Figure 3 and Figure 5, respectively), the data sets introduced in Figure 1B and 1D were sequenced through a strategy utilising BGI T7, BGI G400 and Illumina NextSeq500 instruments. Initially, Gene Expression (scRNA/GEX) libraries were sequenced together with lentiviral barcode and HTO libraries on an MGI DNBSEQ-T7 using a paired-end run configured as R1 50 cycles, R2 90 cycles, i5 16 cycles, i7 16 cycles, with demultiplexing performed using the i7 sample index; for human data set generation (Figure 1B) the HTO library was sequenced on a NextSeq500 instrument. Multiome ATAC libraries were sequenced separately on an MGI DNBSEQ-G400 instrument (R1 50, i7 8, i5 24, R2 49), where the i5 read captures the full ATAC barcode structure (16-bp 10x cell barcode + 8-bp spacer). In accordance with 10x guidance that index reads may be longer than required, any extra index bases beyond the required sample index were trimmed prior to downstream analysis.

Human fibroblast scATAC+RNA-seq processing
Prior to mapping, we performed preprocessing to ensure compatibility between BGI sequencing Fastq files and CellRanger. To demultiplex the hashtag reads from BGI Fastq files (which included transcript reads), we first converted raw Fastqs from BGI format to Illumina format (https://github.com/powellgenomicslab/BGI_vs_Illumina_Benchmark). Next, R2 reads were trimmed to 101bp using trimmomatic 62. For each lane, we filtered the trimmed R2 reads for the 15 hashtag antibody barcodes at the beginning of each read. Finally, the reads were separated into Fastq files containing HTO or transcript reads. Fastq files were then processed using CellRanger ARC v 2.0.0 and mapped to the hg38 genome (10x CellRanger ARC reference GRCh38-2020-A-2.0.0). CellRanger-ready Fastq files are provided in the data repository.
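The separation of HTO and transcript reads described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the study's script: the function names are hypothetical, the two barcodes are placeholders (the real hashtag sequences are not reproduced here), matching is exact, and all barcodes are assumed to have the same length.

```python
from itertools import islice


def parse_fastq(lines):
    """Yield (header, sequence, quality) from an iterable of FASTQ lines."""
    iterator = iter(lines)
    while True:
        record = [line.rstrip("\n") for line in islice(iterator, 4)]
        if len(record) < 4:
            return
        header, sequence, _separator, quality = record
        yield header, sequence, quality


def split_hto_reads(records, hto_barcodes):
    """Separate reads that start with a known HTO barcode from all others."""
    barcode_length = len(next(iter(hto_barcodes)))  # assumes equal lengths
    hto_reads, transcript_reads = [], []
    for record in records:
        sequence = record[1]
        if sequence[:barcode_length] in hto_barcodes:
            hto_reads.append(record)
        else:
            transcript_reads.append(record)
    return hto_reads, transcript_reads


# Placeholder barcodes and reads, for illustration only.
PLACEHOLDER_BARCODES = {"ACGTACGTACGTACG", "TTGGCCAATTGGCCA"}
fastq_lines = [
    "@read1", "ACGTACGTACGTACGAAAA", "+", "I" * 19,
    "@read2", "GGGGTTTTCCCCAAAAGGG", "+", "I" * 19,
]
hto, transcript = split_hto_reads(parse_fastq(fastq_lines), PLACEHOLDER_BARCODES)
print([r[0] for r in hto], [r[0] for r in transcript])
```

In practice the matching R1 records would need to be routed to the same output as their R2 mates so that read pairs stay synchronised; that bookkeeping is omitted here.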
Demultiplexing and removal of multiplets were performed using the deMULTIplex2 R package (v 1.0.1) 63. The demultiplexTags function was run on the HTO counts with default settings, and fibroblast cells annotated as ‘multiplet’ or ‘negative’ were filtered out prior to further analysis. Individual fibroblast donors were determined based on deMULTIplex2 HTO assignments. As validation of HTO demultiplexing, we compared the assignments to SNP-based demultiplexing using the Souporcell package 64, which was run on the gene expression BAM file produced by CellRanger ARC. Downstream analysis of scATAC+RNA-seq data was performed using Seurat (v 5.1.0) 65 and Signac (v 1.14.0) 66. We generated a UMAP plot of fibroblasts and ESCs based on the multimodal RNA and ATAC-seq data, following the Seurat/Signac protocol for multimodal analysis. For RNA data we first ran the Seurat SCTransform and RunPCA functions. For the ATAC data we used the RunTFIDF function, followed by FindTopFeatures with min.cutoff set to ‘q75’ to capture the top quartile of peaks, and finally the RunSVD function. We ran FindMultiModalNeighbors on the PCA and LSI coordinates, with 20 nearest neighbours and based on the first 15 dimensions of both PCA and LSI. The coverage track over GAPDH was created using the Signac CoveragePlot function.

Mouse scATAC+RNA-seq processing
Prior to mapping, preprocessing was performed to ensure compatibility between BGI Fastq files and CellRanger. The provided R2 reads were processed to extract the 16bp I2 read from the end of the read and generate updated R2 reads. Fastq files were then processed using CellRanger ARC v 2.0.2 and mapped to the mm10 genome (10x CellRanger ARC reference mm10-2020-A-2.0.0). The processed data were subsequently analysed using Seurat and Signac. For both DoseH-seq runs we required a minimum of 5000 RNA UMI and ATAC fragment counts per cell (see Table S2.8 for full filtering criteria).
Demultiplexing and removal of multiplets were performed using the deMULTIplex2 R package 63 as above, with cells annotated as ‘multiplet’ or ‘negative’ filtered out prior to further analysis. Sample conditions were determined based on deMULTIplex2 HTO assignments (see Table S2.7 for HTO sample mapping). CellRanger-ready Fastq files are provided in the data repository. Peak calling was performed in a condition-specific manner using the MACS2 67 wrapper in Signac (CallPeaks) and counts were regenerated against the updated peak set. We then generated UMAP coordinates in Signac based on the ATAC-seq data. The top 20% of peaks were selected with FindTopFeatures and min.cutoff = 'q80'. The RunSVD function was run with default parameters and RunUMAP with reduction = ‘lsi’, dimensions = 1-15 and n.neighbors = 50. The UMAP coordinates were used as an initial input to SCANPY 68 to generate a force-directed layout (FDL) visualisation using 50 nearest neighbours calculated on the first 15 LSI dimensions. The FDL map revealed that the day 10+3 cells were largely indistinguishable from the day 6 cells, implying that many of the cells in this group were refractory to reprogramming. These were therefore removed from downstream analyses to mitigate potential confounding effects in UMAP and FDL calculations. UMAP and FDL were regenerated on the filtered cells using the same parameters as above. To calculate dendrograms of average transcriptional similarity, we used the Seurat BuildClusterTree function, utilising the top 2000 variable genes for the subset of conditions without Vitamin C, and 3000 for the dendrogram on all conditions. The dendrogram was then visualised using the dendextend R package 69.

Lenti score generation
Lenti scores were obtained using logistic regression models trained to predict perturbation status based on lenti count data. First, the raw lenti counts were normalised using SCTransform 70.
For the first DoseH-seq run, which had one gene perturbation (Nfix OE), a model was trained to predict OE vs control samples. The R glm function was used with family = binomial(link='logit') and the model Condition ~ OE + Control, where Condition is a binary vector (1=OE, 0=BFP), OE is the normalised Nfix lenti counts and Control is the normalised BFP lenti counts. The log-odds scores from the model were then used to represent the lenti score. For the multi-perturb dataset, we built logistic regression models as above, but incorporating interaction terms between the multiple perturbations. For O+X (Oct4 + Nfix), we built a model predicting the O+X conditions vs the remainder using the formula: Condition ~ Oct4 + Control + shRNA + Nfix + Nfix:Oct4. For J+X (c-Jun shRNA + Nfix), we built a model predicting the J+X conditions vs the remainder using the formula: Condition ~ shRNA + Oct4 + Control + Nfix + Nfix:shRNA. For Nfix (which included Nfix by itself as well as O+X and J+X) we built a model predicting all Nfix OE conditions vs BFP control using the formula: Condition ~ Nfix + Control + Oct4 + shRNA + Nfix:shRNA + Nfix:Oct4. Low, mid and high cell stratifications were determined by ranking the lenti score within a timepoint/condition and sub-setting the cells by thirds. Low cells were defined as the bottom 33% of ranked cells, high as the top 33% and mid as the remainder.

Integration across DoseH-seq runs
Integration of conditions between the first and second DoseH-seq runs was performed on the scATAC-seq data. For integration of the reprogramming conditions, we included from the first run: the Nfix overexpression conditions (rMEF [day 0], day 2, day 6 and THY1-), the BFP +VitC samples (day 2, day 6 and THY1-) as well as the mESCs and BFP rMEF control. From the second run we included all reprogramming samples, as well as the mESCs and rMEF control. We merged the peak sets from both runs using the GRanges 71 reduce function.
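For readers unfamiliar with the reduce operation used for peak merging, the following Python sketch shows the underlying idea: peaks from both runs are concatenated, sorted, and any overlapping or directly adjacent intervals on the same chromosome are collapsed into one. It is a simplified stand-in written for illustration, not the GenomicRanges implementation; strand is ignored, coordinates are assumed to be 1-based and inclusive, and the example peaks are invented.

```python
def reduce_peaks(peaks):
    """Merge overlapping or adjacent peaks on each chromosome.

    peaks: iterable of (chromosome, start, end) tuples with 1-based,
    inclusive coordinates. Returns a sorted list of merged tuples.
    """
    merged = []
    for chromosome, start, end in sorted(peaks):
        previous = merged[-1] if merged else None
        if previous and previous[0] == chromosome and start <= previous[2] + 1:
            # Extend the previous interval instead of starting a new one.
            merged[-1] = (chromosome, previous[1], max(previous[2], end))
        else:
            merged.append((chromosome, start, end))
    return merged


# Invented peak sets standing in for the two DoseH-seq runs.
run1_peaks = [("chr1", 100, 200), ("chr1", 500, 600)]
run2_peaks = [("chr1", 150, 250), ("chr1", 601, 700), ("chr2", 10, 20)]
print(reduce_peaks(run1_peaks + run2_peaks))
```

Merging in this way yields one shared set of non-overlapping regions, which is what allows fragments from both runs to be counted against identical features.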
The merged peak sets were then recounted using the Signac FeatureMatrix function. Integration analysis was performed on the scATAC-seq data in Signac by first running FindTopFeatures with min.cutoff = ‘q80’ followed by RunTFIDF and RunSVD for both datasets. We identified integration anchors using FindIntegrationAnchors with reduction = ‘rlsi’ and dims = 1:15. Finally, LSI embeddings were integrated using the IntegrateEmbeddings function, with dims.to.integrate set to 1:15. A UMAP of integrated conditions was then generated using the integrated LSI, with dims = 1:15. As the second run ‘pool’ reprogramming conditions contained pools of reprogramming timepoints, we mapped the originating timepoint of individual cells based on timepoints from run 1 using the Seurat/Signac label transfer approach. The mapping was performed separately for the BFP (+VitC) reprogramming timepoints and the Nfix OE reprogramming timepoints. In both cases, transfer anchors were first identified using the FindTransferAnchors function with the first 25 LSI dimensions and used as input to the MapQuery function. We performed integration between the first run and the differentiation samples from run 2 using the same process as above. For peak merging, recounting and integration we used the full reprogramming dataset from the first DoseH-seq run and, from the second run, the samples from the differentiation time course as well as the mESCs and the BFP and Nfix OE rMEFs. For the UMAP of differentiation vs Nfix OE reprogramming samples, we retained only the Nfix-overexpressing samples from the first DoseH-seq experiment, in addition to the mESC and BFP rMEF control.

CytoTRACE2 analysis
The CytoTRACE2 25 v1.0.0 R package was used for CytoTRACE2 analyses. For the first DoseH-seq run, corrected UMI counts were generated using SCTransform, with percent mitochondrial content specified in the vars.to.regress parameter.
The corrected UMI counts were then input to the cytotrace2 function with default parameters. For the TCGA samples, Ensembl IDs for unstranded, protein coding counts were first converted to HGNC gene symbols, then input to the cytotrace2 function with species = “human”.

Calling differentially accessible regions
Differentially accessible region calling was performed using a pseudo-bulk approach adapted from 72. Cells within groups to be tested were randomly pooled into 10 pseudo-bulk replicates and ATAC fragments summed. Testing was performed using DESeq2 29 and peaks with FDR < 0.001 were considered significant.

Differentially expressed genes
Differential gene expression testing was performed using the Seurat FindMarkers function with MAST testing 73. Genes with adjusted p-value < 0.5 were considered significant.

Motif enrichment analysis
Motif enrichment analysis was performed using the HOMER 74 findMotifsGenome.pl (mm10) function with parameters -mknown to match against known vertebrate motifs. Motif enrichment was assessed against the default HOMER GC-matched background unless otherwise indicated.

Peak:gene linkage analysis
We performed peak:gene linkage analysis for the run 2 samples. We used the LinkPeaks function in Signac, with distance set to 500,000, min.cells set to 20 and a p-value cutoff of 0.05.

Gene ontology terms
GO term tests were performed on genes linked to peaks for the indicated analysis. GO term over-representation testing was performed using the ViSEAGO 75 (v 1.11.0) R package. The biomaRt package 76 was used to convert gene symbols to Uniprot/Swissprot identifiers, and GO annotations for the genes were obtained using the ViSEAGO Uniprot2GO and annotate functions. Over-representation testing was performed using the topGO 77 (v 2.56.0) runTest function, with algorithm = “classic” and statistic = “fisher” and the full set of protein coding genes with potential links to peaks used as background.
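The over-representation test described above amounts to asking how surprising the overlap between a gene list and a GO term is, given the chosen background. A minimal Python sketch of that calculation, written as a one-sided Fisher's exact test (the hypergeometric upper tail), is shown below. The function name and the counts in the example are hypothetical, and the sketch is a conceptual illustration rather than a reimplementation of the R packages named in the text.

```python
from math import comb


def enrichment_p_value(n_background, n_annotated, n_selected, n_overlap):
    """One-sided Fisher's exact p-value for over-representation.

    n_background: genes in the background set.
    n_annotated:  background genes annotated with the GO term.
    n_selected:   genes in the tested list (all within the background).
    n_overlap:    tested genes that carry the GO term.
    """
    largest_overlap = min(n_annotated, n_selected)
    all_draws = comb(n_background, n_selected)
    tail = sum(
        comb(n_annotated, k) * comb(n_background - n_annotated, n_selected - k)
        for k in range(n_overlap, largest_overlap + 1)
    )
    return tail / all_draws


# Hypothetical counts: 20 background genes, 5 annotated, 4 selected, 3 overlap.
print(enrichment_p_value(20, 5, 4, 3))
```

The choice of background matters as much as the test itself: restricting it to genes that could have been linked to a peak, as done here, avoids inflating significance for terms dominated by genes that were never eligible.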
The returned p-values were adjusted for multiple testing using Benjamini & Hochberg correction, with GO terms considered significant if they obtained an adjusted p-value/FDR < 0.05.

STRING network analysis
STRING network analysis was performed on genes linked to the 670 opening DARs overlapping between the reprogramming day 2 Nfix (vs BFP rMEF) and differentiation days 2+3 Nfix (vs mESCs) conditions. The linked genes were submitted to the STRING (v 12) 78 web server and high confidence (score > 700) connections were retained. The connections were plotted in Cytoscape 79, with connected subnetworks of more than 2 genes retained for visualisation and genes coloured according to annotation with the GO Biological Process term “Developmental process”.

Motif scoring
We mapped motif position weight matrix (PWM) scores using HOMER findMotifsGenome.pl with the -find parameter set to a PWM set of interest. For analysis of OCT4 PWM scores in O+X and Nfix high vs Nfix low cells, we used the HOMER OCT4 motif (Oct4(POU,Homeobox)/mES-Oct4-ChIP-Seq(GSE11431)/Homer). For comparison of PWM scores in iPSCs to reprogramming and differentiation, we added the same HOMER OCT4 motif as well as the SOX2 motif (Sox2(HMG)/mES-Sox2-ChIP-Seq(GSE11431)/Homer). For the OCT4:SOX2 heterodimer we used the JASPAR Pou5f1::Sox2 PWM (MA0142.1). To provide a consistent scoring threshold, we set a cutoff at 25% of the maximum motif score to enable comparison of low affinity motif scores.

Comparison with pan-tissue fibroblast atlas
We downloaded the mouse perturbed-state fibroblast atlas 28 as a Seurat object from https://www.fibroxplorer.com/download and calculated differentially expressed genes for each of the fibroblast subtypes compared to the remainder. We compared the log2 fold-changes for each of the fibroblast subtypes to the reprogramming comparisons using the R cor.test function with method = “spearman”.
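The rank-based comparison of fold-changes can be illustrated with a small, dependency-free Python sketch. It computes only the Spearman coefficient (cor.test additionally reports a p-value, which is not reproduced here), restricts the comparison to genes present in both result sets, and handles ties by average ranks. The function names, gene symbols and fold-change values are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda index: values[index])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    mean_x = sum(rank_x) / len(rank_x)
    mean_y = sum(rank_y) / len(rank_y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rank_x, rank_y))
    variance_x = sum((a - mean_x) ** 2 for a in rank_x)
    variance_y = sum((b - mean_y) ** 2 for b in rank_y)
    return covariance / (variance_x * variance_y) ** 0.5


def correlate_fold_changes(lfc_a, lfc_b):
    """Correlate two {gene: log2 fold-change} dicts over their shared genes."""
    shared = sorted(set(lfc_a) & set(lfc_b))
    return spearman_rho([lfc_a[g] for g in shared], [lfc_b[g] for g in shared])


# Hypothetical fold-changes for a fibroblast subtype and a reprogramming contrast.
subtype_lfc = {"GeneA": 1.2, "GeneB": -0.5, "GeneC": 0.3, "GeneD": 2.0}
contrast_lfc = {"GeneA": 0.9, "GeneB": -1.1, "GeneC": 0.1, "GeneD": 1.5, "GeneE": 0.4}
print(correlate_fold_changes(subtype_lfc, contrast_lfc))
```

A rank correlation is a reasonable choice for this kind of comparison because fold-changes from different datasets are rarely on the same scale, while their ordering is comparable.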
The reprogramming comparisons were: 1) rMEF Nfix low vs rMEF BFP, 2) day 6 BFP vs rMEF BFP and 3) day 6 Nfix low vs rMEF Nfix.

Comparison with TCGA RNA-seq
TCGA tumour and normal tissue samples were downloaded via the TCGAbiolinks R/Bioconductor package 80. The GDCquery function was used to download samples annotated as “Primary Tumor” or “Solid Tissue Normal” and RNA-seq counts were obtained with data.category = “Transcriptome Profiling”, data.type = “Gene Expression Quantification” and workflow.type = “STAR - Counts”. Unstranded, protein coding counts were retained and normalised as log2(1 + counts per million [CPM]). The R biomaRt 76 package (v. 2.60.1) was used to convert Ensembl IDs to HGNC (HUGO Gene Nomenclature Committee) gene symbols. The ‘project_id’ and ‘name’ columns from the sample meta data were used to categorise the frequency of tumour types in the NFI / POU5F1 expression stratifications. For expression stratifications, we defined POU5F1 activation (bottom bar in Figure 6A) as expression above the 95th percentile of normal tissue expression, and high POU5F1 (top bar) as expression above the 99th percentile of tumour expression. For each NFI factor, the left and right bars in Figure 6A represent the 25th and 75th percentiles, respectively, of all samples. For analysis of POU5F1/OCT4 isoforms, Kallisto 81 transcript count estimates were downloaded from UCSC 82 (https://xenabrowser.net/datapages/?dataset=tcga_Kallisto_est_counts&host=https%3A%2F%2Ftoil.xenahubs.net&removeHub=https%3A%2F%2Fxena.treehouse.gi.ucsc.edu%3A443). After cross-referencing with meta data, the samples were grouped into subsets based on disease status (normal tissue/tumour) and tumour type. In total, 727 normal tissue samples and 9,634 tumour samples were obtained. To detect the counts of POU5F1 isoforms, the HGNC symbols of the Ensembl transcript IDs were first annotated using biomaRt (v. 2.62.1) 83,84.
To evaluate percentages of sample types positive for OCT4A, we considered samples with an estimated count of the OCT4A isoform (Ensembl ID ENST00000259915.12) greater than 10 to be OCT4A positive.

Comparison with kidney scRNA-seq
The filtered_gene_bc_matrices_h5.h5 files were downloaded from GEO (GSE159115) along with the annotation files. For joint UMAP visualisation, we followed the Seurat 65 integration pipeline employed by the original authors. For each individual sample, NormalizeData and FindVariableFeatures were run, followed by FindIntegrationAnchors with dims = 1:30 and IntegrateData with dims = 1:30. We then ran ScaleData, RunPCA and RunUMAP with dims = 1:30. For analysis of module scores, we first identified peaks from the run 1/run 2 differentiation integration that contained both a match for the HOMER NFI motif “NF1-halfsite(CTF)/LNCaP-NF1-ChIP-Seq(Unpublished)/Homer” and the OCT4 motif “Oct4(POU,Homeobox)/mES-Oct4-ChIP-Seq(GSE11431)/Homer”. For the OCT4 motif, we allowed for low affinity motif occurrences as described in subsection ‘Motif scoring’ above. Using the peak:gene links described above (subsection ‘Peak:gene linkage analysis’), we identified genes linked to the NFI/OCT4 positive peaks and overlapped these with genes upregulated in reprogramming day 2 NFIX cells relative to BFP rMEF cells (MAST testing; p adj < 0.5). These genes were converted to human gene symbols using the nichenetr 85 convert_mouse_to_human_symbols function and input to the Seurat AddModuleScore function.

Declarations

Acknowledgments
We acknowledge the high-quality scientific and technical assistance of the Translational Research Institute Flow Cytometry facility. The authors thank the BGI Australia facility for sequencing on BGI instruments and the IMB/UQ sequencing facility for sequencing on Illumina instruments. We also thank UQ Biological Resources (Queensland Bioscience Precinct) and Monash Animal Services for help with animal husbandry. Imaging was performed at the IMB Microscopy Facility at UQ.
This work was supported by an NHMRC Ideas Grant to C.M.N. (APP2013574), a collaborative project grant between C.M.N. and UQ’s Genome Innovation Hub, and through start-up funding committed by Prof Brandon Wainwright from the Institute of Molecular Bioscience, UQ. It received further support through an MGI Collaborative grant from BGI Australia and through two Innovations Connections grants from the Australian Government with C.M.N. as the academic lead and financial support through industry partner Scott Needham (Systemic Medicine). M.A. and W.A. extend their appreciation to the Deputyship for Research and Innovation, Ministry of Education, Saudi Arabia, for funding through project numbers 1-441-120 and 1-441-121. M.N-S. and X.C. received support through the Women’s Research Assistance Program from the Queensland state government. Results shown in this study include data generated by the TCGA Research Network: https://www.cancer.gov/tcga. Finally, we would like to thank Dr Geoffrey McDermott and Dr Cathrine King from 10x Genomics for practical advice as part of DoseH-seq assay development.

Contributions
C.M.N., Y.Y. and R.P. designed the experiments with additional support from S.A.; Y.Y. and R.P. contributed equally to this study. Y.Y. performed the study’s cell culture experiments and wet-lab based aspects of DoseH-seq assay development with support from X.C., S.A., J.Z., Y.H., M.E., M.G., D.P., C.M.S. and S.W. under supervision of C.M.N., J.B., F.J.S. and S.T.N.; R.P. performed the study’s computational analyses, with additional support from S.L., K.T. and J.C.M. under guidance of C.M.N.; BioRender schematics were created by Y.Y. and R.P.; W.A., M.J.S., M.P., S.C., S.S. and M.D.W. contributed intellectually. M.N-S., R.P., Y.Y. and C.M.N. prioritised NFI transcription factors as the study’s targets. Y.Y., R.P. and C.M.N. wrote the manuscript. All authors approved and contributed to the final version of the manuscript.

Declaration of interests
The authors declare no competing interests.

Data availability
Single-nucleus RNA+ATAC-seq DoseH-seq runs generated as part of this study will be made publicly available upon publication.

References
Nair, S., Ameen, M., Sundaram, L., Pampari, A., Schreiber, J., Balsubramani, A., Wang, Y.X., Burns, D., Blau, H.M., Karakikes, I., Wang, K.C., and Kundaje, A. (2023). Transcription factor stoichiometry, motif affinity and syntax regulate single-cell chromatin dynamics during fibroblast reprogramming to pluripotency. bioRxiv. 10.1101/2023.10.04.560808.
Fei, L., Zhang, K., Poddar, N., Hautaniemi, S., and Sahu, B. (2023). Single-cell epigenome analysis identifies molecular events controlling direct conversion of human fibroblasts to pancreatic ductal-like cells. Dev Cell 58, 1701-1715 e1708. 10.1016/j.devcel.2023.08.023.
Boudreau-Pinsonneault, C., David, L.A., Lourenco Fernandes, J.A., Javed, A., Fries, M., Mattar, P., and Cayouette, M. (2023). Direct neuronal reprogramming by temporal identity factors. Proc Natl Acad Sci U S A 120, e2122168120. 10.1073/pnas.2122168120.
Schraivogel, D., Gschwind, A.R., Milbank, J.H., Leonce, D.R., Jakob, P., Mathur, L., Korbel, J.O., Merten, C.A., Velten, L., and Steinmetz, L.M. (2020). Targeted Perturb-seq enables genome-scale genetic screens in single cells. Nat Methods 17, 629-635. 10.1038/s41592-020-0837-5.
Replogle, J.M., Norman, T.M., Xu, A., Hussmann, J.A., Chen, J., Cogan, J.Z., Meer, E.J., Terry, J.M., Riordan, D.P., Srinivas, N., et al. (2020). Combinatorial single-cell CRISPR screens by direct guide RNA capture and targeted sequencing. Nat Biotechnol 38, 954-961. 10.1038/s41587-020-0470-y.
Liu, W., Saelens, W., Rainer, P., Biocanin, M., Gardeux, V., Gralak, A.J., van Mierlo, G., Gebhart, A., Russeil, J., Liu, T., Chen, W., and Deplancke, B. (2025). Dissecting the impact of transcription factor dose on cell reprogramming heterogeneity using scTF-seq. Nat Genet 57, 2522-2535.
10.1038/s41588-025-02343-7. Jost, M., Santos, D.A., Saunders, R.A., Horlbeck, M.A., Hawkins, J.S., Scaria, S.M., Norman, T.M., Hussmann, J.A., Liem, C.R., Gross, C.A., and Weissman, J.S. (2020). Titrating gene expression using libraries of systematically attenuated CRISPR guide RNAs. Nat Biotechnol 38 , 355-364. 10.1038/s41587-019-0387-5. Metzner, E., Southard, K.M., and Norman, T.M. (2025). Multiome Perturb-seq unlocks scalable discovery of integrated perturbation effects on the transcriptome and epigenome. Cell Syst 16 , 101161. 10.1016/j.cels.2024.12.002. Yan, R.E., Corman, A., Katgara, L., Wang, X., Xue, X., Gajic, Z.Z., Sam, R., Farid, M., Friedman, S.M., Choo, J., et al. (2024). Pooled CRISPR screens with joint single-nucleus chromatin accessibility and transcriptome profiling. Nat Biotechnol. 10.1038/s41587-024-02475-x. Liu, X., Nefzger, C.M., Rossello, F.J., Chen, J., Knaupp, A.S., Firas, J., Ford, E., Pflueger, J., Paynter, J.M., Chy, H.S., et al. (2017). Comprehensive characterization of distinct states of human naive pluripotency generated by reprogramming. Nat Methods 14 , 1055-1062. 10.1038/nmeth.4436. Naqvi, S., Kim, S., Hoskens, H., Matthews, H.S., Spritz, R.A., Klein, O.D., Hallgrímsson, B., Swigut, T., Claes, P., Pritchard, J.K., and Wysocka, J. (2023). Precise modulation of transcription factor levels identifies features underlying dosage sensitivity. Nat Genet 55 , 841-851. 10.1038/s41588-023-01366-2. Fane, M., Harris, L., Smith, A.G., and Piper, M. (2017). Nuclear factor one transcription factors as epigenetic regulators in cancer. Int J Cancer 140 , 2634-2641. 10.1002/ijc.30603. Chen, K.S., Lim, J.W.C., Richards, L.J., and Bunt, J. (2017). The convergent roles of the nuclear factor I transcription factors in development and cancer. Cancer Lett 410 , 124-138. 10.1016/j.canlet.2017.09.015. Patrick, R., Naval-Sanchez, M., Deshpande, N., Huang, Y., Zhang, J., Chen, X., Yang, Y., Tiwari, K., Esmaeili, M., Tran, M., et al. (2024). 
The activity of early-life gene regulatory elements is hijacked in aging through pervasive AP-1-linked chromatin opening. Cell Metab 36 , 1858-1881 e1823. 10.1016/j.cmet.2024.06.006. Lu, J.Y., Tu, W.B., Li, R., Weng, M., Sanketi, B.D., Yuan, B., Reddy, P., Rodriguez Esteban, C., and Izpisua Belmonte, J.C. (2025). Prevalent mesenchymal drift in aging and disease is reversed by partial reprogramming. Cell. 10.1016/j.cell.2025.07.031. Liu, J., Han, Q., Peng, T., Peng, M., Wei, B., Li, D., Wang, X., Yu, S., Yang, J., Cao, S., et al. (2015). The oncogene c-Jun impedes somatic cell reprogramming. Nat Cell Biol 17 , 856-867. 10.1038/ncb3193. Knaupp, A.S., Buckberry, S., Pflueger, J., Lim, S.M., Ford, E., Larcombe, M.R., Rossello, F.J., de Mendoza, A., Alaei, S., Firas, J., et al. (2017). Transient and Permanent Reconfiguration of Chromatin and Transcription Factor Occupancy Drive Reprogramming. Cell Stem Cell 21 , 834-845.e836. 10.1016/j.stem.2017.11.007. Li, D., Liu, J., Yang, X., Zhou, C., Guo, J., Wu, C., Qin, Y., Guo, L., He, J., Yu, S., et al. (2017). Chromatin Accessibility Dynamics during iPSC Reprogramming. Cell Stem Cell 21 , 819-833.e816. 10.1016/j.stem.2017.10.012. Xing, Q.R., El Farran, C.A., Gautam, P., Chuah, Y.S., Warrier, T., Toh, C.D., Kang, N.Y., Sugii, S., Chang, Y.T., Xu, J., et al. (2020). Diversification of reprogramming trajectories revealed by parallel single-cell transcriptome and chromatin accessibility sequencing. Sci Adv 6 . 10.1126/sciadv.aba1190. Chronis, C., Fiziev, P., Papp, B., Butz, S., Bonora, G., Sabri, S., Ernst, J., and Plath, K. (2017). Cooperative Binding of Transcription Factors Orchestrates Reprogramming. Cell 168 , 442-459.e420. 10.1016/j.cell.2016.12.016. Markov, G.J., Mai, T., Nair, S., Shcherbina, A., Wang, Y.X., Burns, D.M., Kundaje, A., and Blau, H.M. (2021). AP-1 is a temporally regulated dual gatekeeper of reprogramming to pluripotency. Proc Natl Acad Sci U S A 118 . 10.1073/pnas.2104841118. Wille, C.K., and Sridharan, R. 
(2022). DOT1L inhibition enhances pluripotency beyond acquisition of epithelial identity and without immediate suppression of the somatic transcriptome. Stem Cell Reports 17 , 384-396. 10.1016/j.stemcr.2021.12.004. Kang, H.M., Subramaniam, M., Targ, S., Nguyen, M., Maliskova, L., McCarthy, E., Wan, E., Wong, S., Byrnes, L., Lanata, C.M., et al. (2018). Multiplexed droplet single-cell RNA-sequencing using natural genetic variation. Nat Biotechnol 36 , 89-94. 10.1038/nbt.4042. Stadtfeld, M., Maherali, N., Borkent, M., and Hochedlinger, K. (2010). A reprogrammable mouse strain from gene-targeted embryonic stem cells. Nat Methods 7 , 53-55. 10.1038/nmeth.1409. Kang, M., Gulati, G.S., Brown, E.L., Qi, Z., Avagyan, S., Armenteros, J.J.A., Gleyzer, R., Zhang, W., Steen, C.B., D'Silva, J.P., et al. (2025). Improved reconstruction of single-cell developmental potential with CytoTRACE 2. Nat Methods 22 , 2258-2263. 10.1038/s41592-025-02857-2. Masur, S.K., Dewal, H.S., Dinh, T.T., Erenburg, I., and Petridou, S. (1996). Myofibroblasts differentiate from fibroblasts when plated at low density. Proc Natl Acad Sci U S A 93 , 4219-4223. 10.1073/pnas.93.9.4219. Patrick, R., Janbandhu, V., Tallapragada, V., Tan, S.S.M., McKinna, E.E., Contreras, O., Ghazanfar, S., Humphreys, D.T., Murray, N.J., Tran, Y.T.H., et al. (2024). Integration mapping of cardiac fibroblast single-cell transcriptomes elucidates cellular principles of fibrosis in diverse pathologies. Sci Adv 10 , eadk8501. 10.1126/sciadv.adk8501. Buechler, M.B., Pradhan, R.N., Krishnamurty, A.T., Cox, C., Calviello, A.K., Wang, A.W., Yang, Y.A., Tam, L., Caothien, R., Roose-Girma, M., et al. (2021). Cross-tissue organization of the fibroblast lineage. Nature 593 , 575-579. 10.1038/s41586-021-03549-5. Love, M.I., Huber, W., and Anders, S. (2014). Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15 , 550. 10.1186/s13059-014-0550-8. 
Polo, J.M., Anderssen, E., Walsh, R.M., Schwarz, B.A., Nefzger, C.M., Lim, S.M., Borkent, M., Apostolou, E., Alaei, S., Cloutier, J., et al. (2012). A molecular roadmap of reprogramming somatic cells into iPS cells. Cell 151 , 1617-1632. 10.1016/j.cell.2012.11.039. Wang, T., Chen, K., Zeng, X., Yang, J., Wu, Y., Shi, X., Qin, B., Zeng, L., Esteban, M.A., Pan, G., and Pei, D. (2011). The histone demethylases Jhdm1a/1b enhance somatic cell reprogramming in a vitamin-C-dependent manner. Cell Stem Cell 9 , 575-587. 10.1016/j.stem.2011.10.005. Cimmino, L., Neel, B.G., and Aifantis, I. (2018). Vitamin C in Stem Cell Reprogramming and Cancer. Trends Cell Biol 28 , 698-708. 10.1016/j.tcb.2018.04.001. Chen, J., Guo, L., Zhang, L., Wu, H., Yang, J., Liu, H., Wang, X., Hu, X., Gu, T., Zhou, Z., et al. (2013). Vitamin C modulates TET1 function during somatic cell reprogramming. Nat Genet 45 , 1504-1509. 10.1038/ng.2807. Göös, H., Kinnunen, M., Salokas, K., Tan, Z., Liu, X., Yadav, L., Zhang, Q., Wei, G.H., and Varjosalo, M. (2022). Human transcription factor protein interaction networks. Nat Commun 13 , 766. 10.1038/s41467-022-28341-5. Banito, A., Rashid, S.T., Acosta, J.C., Li, S., Pereira, C.F., Geti, I., Pinho, S., Silva, J.C., Azuara, V., Walsh, M., Vallier, L., and Gil, J. (2009). Senescence impairs successful reprogramming to pluripotent stem cells. Genes Dev 23 , 2134-2139. 10.1101/gad.1811609. Hong, H., Takahashi, K., Ichisaka, T., Aoi, T., Kanagawa, O., Nakagawa, M., Okita, K., and Yamanaka, S. (2009). Suppression of induced pluripotent stem cell generation by the p53-p21 pathway. Nature 460 , 1132-1135. 10.1038/nature08235. Kawamura, T., Suzuki, J., Wang, Y.V., Menendez, S., Morera, L.B., Raya, A., Wahl, G.M., and Izpisua Belmonte, J.C. (2009). Linking the p53 tumour suppressor pathway to somatic cell reprogramming. Nature 460 , 1140-1144. 10.1038/nature08311. Li, H., Collado, M., Villasante, A., Strati, K., Ortega, S., Canamero, M., Blasco, M.A., and Serrano, M. 
(2009). The Ink4/Arf locus is a barrier for iPS cell reprogramming. Nature 460 , 1136-1139. 10.1038/nature08290. Marion, R.M., Strati, K., Li, H., Murga, M., Blanco, R., Ortega, S., Fernandez-Capetillo, O., Serrano, M., and Blasco, M.A. (2009). A p53-mediated DNA damage response limits reprogramming to ensure iPS cell genomic integrity. Nature 460 , 1149-1153. 10.1038/nature08287. Utikal, J., Polo, J.M., Stadtfeld, M., Maherali, N., Kulalert, W., Walsh, R.M., Khalil, A., Rheinwald, J.G., and Hochedlinger, K. (2009). Immortalization eliminates a roadblock during cellular reprogramming into iPS cells. Nature 460 , 1145-1148. 10.1038/nature08285. Sarig, R., Rivlin, N., Brosh, R., Bornstein, C., Kamer, I., Ezra, O., Molchadsky, A., Goldfinger, N., Brenner, O., and Rotter, V. (2010). Mutant p53 facilitates somatic cell reprogramming and augments the malignant potential of reprogrammed cells. J Exp Med 207 , 2127-2140. 10.1084/jem.20100797. Huyghe, A., Furlan, G., Schroeder, J., Cascales, E., Trajkova, A., Ruel, M., Stüder, F., Larcombe, M., Yang Sun, Y.B., Mugnier, F., et al. (2022). Comparative roadmaps of reprogramming and oncogenic transformation identify Bcl11b and Atoh8 as broad regulators of cellular plasticity. Nat Cell Biol 24 , 1350-1363. 10.1038/s41556-022-00986-w. Wang, Y.J., and Herlyn, M. (2015). The emerging roles of Oct4 in tumor-initiating cells. Am J Physiol Cell Physiol 309 , C709-718. 10.1152/ajpcell.00212.2015. Chen, W., and Wang, Y.J. (2025). Multifaceted roles of OCT4 in tumor microenvironment: biology and therapeutic implications. Oncogene 44 , 1213-1229. 10.1038/s41388-025-03408-x. Terekhanova, N.V., Karpova, A., Liang, W.W., Strzalkowski, A., Chen, S., Li, Y., Southard-Smith, A.N., Iglesia, M.D., Wendl, M.C., Jayasinghe, R.G., et al. (2023). Epigenetic regulation during cancer transitions across 11 tumour types. Nature 623 , 432-441. 10.1038/s41586-023-06682-5. 
Zhang, Y., Narayanan, S.P., Mannan, R., Raskind, G., Wang, X., Vats, P., Su, F., Hosseini, N., Cao, X., Kumar-Sinha, C., et al. (2021). Single-cell analyses of renal cell cancers reveal insights into tumor microenvironment, cell of origin, and therapy response. Proc Natl Acad Sci U S A 118 . 10.1073/pnas.2103240118. Chan, M., Yuan, H., Soifer, I., Maile, T.M., Wang, R.Y., Ireland, A., O'Brien, J.J., Goudeau, J., Chan, L.J.G., Vijay, T., et al. (2022). Novel insights from a multiomics dissection of the Hayflick limit. Elife 11 . 10.7554/eLife.70283. An, Z., Liu, P., Zheng, J., Si, C., Li, T., Chen, Y., Ma, T., Zhang, M.Q., Zhou, Q., and Ding, S. (2019). Sox2 and Klf4 as the Functional Core in Pluripotency Induction without Exogenous Oct4. Cell Rep 29 , 1986-2000.e1988. 10.1016/j.celrep.2019.10.026. Velychko, S., Adachi, K., Kim, K.P., Hou, Y., MacCarthy, C.M., Wu, G., and Schöler, H.R. (2019). Excluding Oct4 from Yamanaka Cocktail Unleashes the Developmental Potential of iPSCs. Cell Stem Cell 25 , 737-753.e734. 10.1016/j.stem.2019.10.002. O’Hara, R., and Banaszynski, L.A. (2022). Loss of heterochromatin at endogenous retroviruses creates competition for transcription factor binding. bioRxiv, 2022.2004.2028.489907. 10.1101/2022.04.28.489907. Ohnishi, K., Semi, K., Yamamoto, T., Shimizu, M., Tanaka, A., Mitsunaga, K., Okita, K., Osafune, K., Arioka, Y., Maeda, T., et al. (2014). Premature termination of reprogramming in vivo leads to cancer development through altered epigenetic regulation. Cell 156 , 663-677. 10.1016/j.cell.2014.01.005. Shiraishi, R., Cancila, G., Kumegawa, K., Torrejon, J., Basili, I., Bernardi, F., Silva, P., Wang, W., Chapman, O., Yang, L., et al. (2024). Cancer-specific epigenome identifies oncogenic hijacking by nuclear factor I family proteins for medulloblastoma progression. Dev Cell 59 , 2302-2319 e2312. 10.1016/j.devcel.2024.05.013. Wu, N., Jia, D., Ibrahim, A.H., Bachurski, C.J., Gronostajski, R.M., and MacPherson, D. (2016). 
NFIB overexpression cooperates with Rb/p53 deletion to promote small cell lung cancer. Oncotarget 7 , 57514-57524. 10.18632/oncotarget.11583. Semenova, E.A., Kwon, M.C., Monkhorst, K., Song, J.Y., Bhaskaran, R., Krijgsman, O., Kuilman, T., Peters, D., Buikhuisen, W.A., Smit, E.F., et al. (2016). Transcription Factor NFIB Is a Driver of Small Cell Lung Cancer Progression in Mice and Marks Metastatic Disease in Patients. Cell Rep 16 , 631-643. 10.1016/j.celrep.2016.06.020. Paynter, J.M., Chen, J., Liu, X., and Nefzger, C.M. (2019). Propagation and Maintenance of Mouse Embryonic Stem Cells. Methods Mol Biol 1940 , 33-45. 10.1007/978-1-4939-9086-3_3. Nefzger, C.M., Haynes, J.M., and Pouton, C.W. (2011). Directed expression of Gata2, Mash1, and Foxa2 synergize to induce the serotonergic neuron phenotype during in vitro differentiation of embryonic stem cells. Stem Cells 29 , 928-939. 10.1002/stem.640. Larcombe, M.R., Manent, J., Chen, J., Mishra, K., Liu, X., and Nefzger, C.M. (2019). Production of High-Titer Lentiviral Particles for Stable Genetic Modification of Mammalian Cells. Methods Mol Biol 1940 , 47-61. 10.1007/978-1-4939-9086-3_4. Nefzger, C.M., Alaei, S., Knaupp, A.S., Holmes, M.L., and Polo, J.M. (2014). Cell surface marker mediated purification of iPS cell intermediates from a reprogrammable mouse model. J Vis Exp, e51728. 10.3791/51728. Nefzger, C.M., Rossello, F.J., Chen, J., Liu, X., Knaupp, A.S., Firas, J., Paynter, J.M., Pflueger, J., Buckberry, S., Lim, S.M., et al. (2017). Cell Type of Origin Dictates the Route to Pluripotency. Cell Rep 21 , 2649-2660. 10.1016/j.celrep.2017.11.029. Gaublomme, J.T., Li, B., McCabe, C., Knecht, A., Yang, Y., Drokhlyansky, E., Van Wittenberghe, N., Waldman, J., Dionne, D., Nguyen, L., et al. (2019). Nuclei multiplexing with barcoded antibodies for single-nucleus genomics. Nat Commun 10 , 2907. 10.1038/s41467-019-10756-2. 
Naval-Sanchez, M., Deshpande, N., Tran, M., Zhang, J., Alhomrani, M., Alsanie, W., Nguyen, Q., and Nefzger, C.M. (2022). Benchmarking of ATAC Sequencing Data From BGI's Low-Cost DNBSEQ-G400 Instrument for Identification of Open and Occupied Chromatin Regions. Front Mol Biosci 9 , 900323. 10.3389/fmolb.2022.900323. Bolger, A.M., Lohse, M., and Usadel, B. (2014). Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30 , 2114-2120. 10.1093/bioinformatics/btu170. Zhu, Q., Conrad, D.N., and Gartner, Z.J. (2024). deMULTIplex2: robust sample demultiplexing for scRNA-seq. Genome Biol 25 , 37. 10.1186/s13059-024-03177-y. Heaton, H., Talman, A.M., Knights, A., Imaz, M., Gaffney, D.J., Durbin, R., Hemberg, M., and Lawniczak, M.K.N. (2020). Souporcell: robust clustering of single-cell RNA-seq data by genotype without reference genotypes. Nat Methods 17 , 615-620. 10.1038/s41592-020-0820-1. Hao, Y., Stuart, T., Kowalski, M.H., Choudhary, S., Hoffman, P., Hartman, A., Srivastava, A., Molla, G., Madad, S., Fernandez-Granda, C., and Satija, R. (2024). Dictionary learning for integrative, multimodal and scalable single-cell analysis. Nat Biotechnol 42 , 293-304. 10.1038/s41587-023-01767-y. Stuart, T., Srivastava, A., Madad, S., Lareau, C.A., and Satija, R. (2021). Single-cell chromatin state analysis with Signac. Nat Methods 18 , 1333-1341. 10.1038/s41592-021-01282-5. Zhang, Y., Liu, T., Meyer, C.A., Eeckhoute, J., Johnson, D.S., Bernstein, B.E., Nusbaum, C., Myers, R.M., Brown, M., Li, W., and Liu, X.S. (2008). Model-based analysis of ChIP-Seq (MACS). Genome Biol 9 , R137. 10.1186/gb-2008-9-9-r137. Wolf, F.A., Angerer, P., and Theis, F.J. (2018). SCANPY: large-scale single-cell gene expression data analysis. Genome Biol 19 , 15. 10.1186/s13059-017-1382-0. Galili, T. (2015). dendextend: an R package for visualizing, adjusting and comparing trees of hierarchical clustering. Bioinformatics 31 , 3718-3720. 10.1093/bioinformatics/btv428. 
Hafemeister, C., and Satija, R. (2019). Normalization and variance stabilization of single-cell RNA-seq data using regularized negative binomial regression. Genome Biol 20 , 296. 10.1186/s13059-019-1874-1. Lawrence, M., Huber, W., Pages, H., Aboyoun, P., Carlson, M., Gentleman, R., Morgan, M.T., and Carey, V.J. (2013). Software for computing and annotating genomic ranges. PLoS Comput Biol 9 , e1003118. 10.1371/journal.pcbi.1003118. Patrick, R., Humphreys, D.T., Janbandhu, V., Oshlack, A., Ho, J.W.K., Harvey, R.P., and Lo, K.K. (2020). Sierra: discovery of differential transcript usage from polyA-captured single-cell RNA-seq data. Genome Biol 21 , 167. 10.1186/s13059-020-02071-7. Finak, G., McDavid, A., Yajima, M., Deng, J., Gersuk, V., Shalek, A.K., Slichter, C.K., Miller, H.W., McElrath, M.J., Prlic, M., Linsley, P.S., and Gottardo, R. (2015). MAST: a flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA sequencing data. Genome Biol 16 , 278. 10.1186/s13059-015-0844-5. Duttke, S.H., Guzman, C., Chang, M., Delos Santos, N.P., McDonald, B.R., Xie, J., Carlin, A.F., Heinz, S., and Benner, C. (2024). Position-dependent function of human sequence-specific transcription factors. Nature 631 , 891-898. 10.1038/s41586-024-07662-z. Brionne, A., Juanchich, A., and Hennequet-Antier, C. (2019). ViSEAGO: a Bioconductor package for clustering biological functions using Gene Ontology and semantic similarity. BioData Min 12 , 16. 10.1186/s13040-019-0204-1. Durinck, S., Spellman, P.T., Birney, E., and Huber, W. (2009). Mapping identifiers for the integration of genomic datasets with the R/Bioconductor package biomaRt. Nat Protoc 4 , 1184-1191. 10.1038/nprot.2009.97. Alexa, A., and Rahnenfuhrer, J. (2024). topGO: Enrichment Analysis for Gene Ontology. Szklarczyk, D., Kirsch, R., Koutrouli, M., Nastou, K., Mehryary, F., Hachilif, R., Gable, A.L., Fang, T., Doncheva, N.T., Pyysalo, S., et al. (2023). 
The STRING database in 2023: protein-protein association networks and functional enrichment analyses for any sequenced genome of interest. Nucleic Acids Res 51 , D638-D646. 10.1093/nar/gkac1000. Shannon, P., Markiel, A., Ozier, O., Baliga, N.S., Wang, J.T., Ramage, D., Amin, N., Schwikowski, B., and Ideker, T. (2003). Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13 , 2498-2504. 10.1101/gr.1239303. Colaprico, A., Silva, T.C., Olsen, C., Garofano, L., Cava, C., Garolini, D., Sabedot, T.S., Malta, T.M., Pagnotta, S.M., Castiglioni, I., et al. (2016). TCGAbiolinks: an R/Bioconductor package for integrative analysis of TCGA data. Nucleic Acids Res 44 , e71. 10.1093/nar/gkv1507. Bray, N.L., Pimentel, H., Melsted, P., and Pachter, L. (2016). Near-optimal probabilistic RNA-seq quantification. Nat Biotechnol 34 , 525-527. 10.1038/nbt.3519. Vivian, J., Rao, A.A., Nothaft, F.A., Ketchum, C., Armstrong, J., Novak, A., Pfeil, J., Narkizian, J., Deran, A.D., Musselman-Brown, A., et al. (2017). Toil enables reproducible, open source, big biomedical data analyses. Nature Biotechnology 35 , 314-316. 10.1038/nbt.3772. Durinck, S., Moreau, Y., Kasprzyk, A., Davis, S., De Moor, B., Brazma, A., and Huber, W. (2005). BioMart and Bioconductor: a powerful link between biological databases and microarray data analysis. Bioinformatics 21 , 3439-3440. 10.1093/bioinformatics/bti525. Durinck, S., Spellman, P.T., Birney, E., and Huber, W. (2009). Mapping identifiers for the integration of genomic datasets with the R/Bioconductor package biomaRt. Nature Protocols 4 , 1184-1191. 10.1038/nprot.2009.97. Browaeys, R., Saelens, W., and Saeys, Y. (2020). NicheNet: modeling intercellular communication by linking ligands to target genes. Nat Methods 17 , 159-162. 10.1038/s41592-019-0667-5. Additional Declarations The authors declare no competing interests. 
Supplementary Files SupplementaryFigures.pdf Supplementary Figures SuppTable1.xlsx Table S1: Comparison between DoseH-seq and alternative single-cell multiome perturb-seq tools. SuppTable2.xlsx Table S2: List of materials and buffer preparation for DoseH-seq development and validation. SuppTable3.xlsx Table S3: DARs related to Figure 1 and 2: impact of Nfix overexpression on fibroblasts and during reprogramming. SuppTable4.xlsx Table S4: Motif enrichment for DARs opening in response to Nfix overexpression stratifications in fibroblasts. SuppTable5.xlsx Table S5: DARs related to Figure 3: reprogramming and differentiation comparisons. SuppTable6.xlsx Table S6: DARs related to Figure 5: single- and multi-perturb comparisons.
Authors and affiliations
Ying Yang: Institute for Molecular Bioscience, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
Ralph Patrick: Institute for Molecular Bioscience, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
Xiaoli Chen: Institute for Molecular Bioscience, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
Stacey Anderson: Genome Innovation Hub, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
Jingyu Zhang: Institute for Molecular Bioscience, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
Yifei Huang: Institute for Molecular Bioscience, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
Mohammadhossein Esmaeili: Institute for Molecular Bioscience, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
Kanupriya Tiwari: Institute for Molecular Bioscience, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
Shivangi Wani: Genome Innovation Hub, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
Monisha Ganesan: Institute for Molecular Bioscience, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
Hsin-Yi Chou: Institute for Molecular Bioscience, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
Dominique Power: Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
Cassy M Spiller: School of Biomedical Sciences, The University of Queensland, Brisbane, Queensland 4072, Australia
Sas Loganathan: Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
Solal Chauquet: Institute for Molecular Bioscience, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
Michael Piper: School of Biomedical Sciences, The University of Queensland, Brisbane, Queensland 4072, Australia
Majid Alhomrani: Department of Clinical Laboratories Sciences, Faculty of Applied Medical Sciences, Taif University, Taif, Saudi Arabia
Walaa Alsanie: Department of Clinical Laboratories Sciences, Faculty of Applied Medical Sciences, Taif University, Taif, Saudi Arabia
Sonia Shah: Institute for Molecular Bioscience, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
Josephine Bowles: School of Biomedical Sciences, The University of Queensland, Brisbane, Queensland 4072, Australia
Jessica C Mar: Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
Shyuan T Ngo: Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
Melanie D White: Institute for Molecular Bioscience, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
Marina Naval-Sanchez: Institute for Molecular Bioscience, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
Christian M Nefzger (corresponding author; ORCID https://orcid.org/0000-0003-4631-5633): Institute for Molecular Bioscience, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
16:10:11","extension":"html","order_by":4,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":313334,"visible":true,"origin":"","legend":"","description":"","filename":"earlyproof.html","url":"https://assets-eu.researchsquare.com/files/rs-8400524/v1/6bb9dda49870507031213d0e.html"},{"id":98885645,"identity":"269aee08-510e-4ce0-adf6-3f8309c041e1","added_by":"auto","created_at":"2025-12-23 15:04:51","extension":"jpg","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":790929,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eDevelopment and validation of DoseH-seq: a single-cell platform for dosage-resolved, multiplexed perturbations. (A)\u003c/strong\u003e\u0026nbsp;Barcoded lentiviral constructs enable gene overexpression or knockdown, while hashtag oligonucleotides (HTOs) allow demultiplexing of conditions or time points following 10x Genomics capture. DoseH-seq generates four sequencing libraries that can be sequenced concurrently on BGI’s low-cost T7 sequencer. At the time of submission, up to 24 hashtag antibodies were available from BioLegend for sample multiplexing.\u0026nbsp;\u003cstrong\u003e(B)\u003c/strong\u003e\u0026nbsp;Single-cell multiome workflow for multiplexing 16 human samples: 15 human fibroblast samples from 14 donors (one donor labelled with two different hashtags as an additional control), and one unlabelled hPSC line.\u0026nbsp;\u003cstrong\u003e(C)\u003c/strong\u003e\u0026nbsp;Accuracy of nuclear hashtag assignment compared with SNP-based demultiplexing (Souporcell).\u0026nbsp;\u003cstrong\u003e(D)\u003c/strong\u003e\u0026nbsp;DoseH-seq applied to a reprogramming time course from rMEFs, with\u0026nbsp;\u003cem\u003eNfix\u0026nbsp;\u003c/em\u003eoverexpression or BFP-only control. 
Live mESCs and mid-to-late reprogramming samples (days 6, 10 and 13) were FACS-sorted, with THY1\u003csup\u003e-\u003c/sup\u003e\u0026nbsp;intermediates sorted and pooled.\u003cstrong\u003e\u0026nbsp;(E)\u003c/strong\u003e\u0026nbsp;UMAP of cells based on HTO counts, coloured by deMULTIplex2 classification; pre-OSKM induction rMEFs are indicated.\u0026nbsp;\u003cstrong\u003e(F)\u003c/strong\u003e\u0026nbsp;Chromatin accessibility UMAP of pre-OSKM induction rMEFs, showing\u0026nbsp;\u003cem\u003eNfix\u003c/em\u003e\u0026nbsp;overexpression (OE) level inferred by lentiviral barcode-derived lenti scores. Lenti scores represent log-odds values from logistic regression models predicting HTO-defined conditions using\u0026nbsp;barcode counts from the lentiviral cassettes\u0026nbsp;(\u003cem\u003eNfix\u003c/em\u003e\u0026nbsp;overexpression\u0026nbsp;or\u0026nbsp;\u003cem\u003eTagBFP2\u003c/em\u003e\u0026nbsp;only).\u0026nbsp;\u003cstrong\u003e(G)\u003c/strong\u003e\u0026nbsp;\u003cem\u003eNfix\u003c/em\u003e\u0026nbsp;barcode expression (log scale) in rMEFs, comparing BFP control with low, mid and high stratifications defined by\u0026nbsp;\u003cem\u003eNfix\u003c/em\u003e\u0026nbsp;lenti score tertiles; *p\u003csub\u003eadj\u003c/sub\u003e\u0026nbsp;\u0026lt; 0.05 (MAST testing).\u0026nbsp;\u003cstrong\u003e(H)\u003c/strong\u003e\u0026nbsp;Numbers of opening or closing differentially accessible regions (DARs; FDR \u0026lt; 0.001) across\u0026nbsp;\u003cem\u003eNfix\u003c/em\u003eoverexpression stratifications\u0026nbsp;\u003cstrong\u003e(I)\u003c/strong\u003e\u0026nbsp;NFI motif enrichment in opening or closing DARs by\u0026nbsp;\u003cem\u003eNfix\u003c/em\u003e\u0026nbsp;overexpression 
stratifications.\u0026nbsp;\u0026nbsp;\u003c/p\u003e","description":"","filename":"Figure1.jpg","url":"https://assets-eu.researchsquare.com/files/rs-8400524/v1/9cb557727161c9a9dd382524.jpg"},{"id":98885648,"identity":"a3ca447a-bf98-497d-a554-e6dac9b98aeb","added_by":"auto","created_at":"2025-12-23 15:04:51","extension":"jpg","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":1169208,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eDoseH-seq reveals dosage-dependent synergy between\u0026nbsp;\u003c/strong\u003e\u003cem\u003e\u003cstrong\u003eNfix\u003c/strong\u003e\u003c/em\u003e\u003cstrong\u003e\u0026nbsp;overexpression and OKSM that promotes reprogramming by destabilizing somatic chromatin.\u0026nbsp;(A,B)\u003c/strong\u003e\u0026nbsp;Force-directed layout (FDL) of the DoseH-seq reprogramming time course highlighting cells from\u0026nbsp;\u003cstrong\u003e(A)\u003c/strong\u003e\u0026nbsp;BFP control or\u0026nbsp;\u003cstrong\u003e(B)\u003c/strong\u003e\u0026nbsp;\u003cem\u003eNfix\u003c/em\u003e\u0026nbsp;overexpression conditions, coloured by time point.\u0026nbsp;\u003cstrong\u003e(C)\u003c/strong\u003e\u0026nbsp;Dendrogram of conditions based on average expression of the top 2000 most variable genes.\u0026nbsp;\u003cstrong\u003e(D)\u003c/strong\u003eCytoTRACE2 scores of conditions, ordered from more differentiated to more pluripotent.\u0026nbsp;\u003cstrong\u003e(E,F)\u0026nbsp;\u003c/strong\u003eOverlay of lentiviral overexpression (lenti) scores for\u0026nbsp;\u003cstrong\u003e(E)\u003c/strong\u003e\u0026nbsp;BFP controls and\u0026nbsp;\u003cstrong\u003e(F)\u003c/strong\u003e\u0026nbsp;\u003cem\u003eNfix\u0026nbsp;\u003c/em\u003eoverexpression samples across reprogramming.\u0026nbsp;\u003cstrong\u003e(G)\u003c/strong\u003eNumbers of opening DARs in day 2 cells stratified by increasing\u0026nbsp;\u003cem\u003eNfix\u003c/em\u003e\u0026nbsp;lenti scores or BFP 
control.\u0026nbsp;\u003cstrong\u003e(H)\u003c/strong\u003e\u0026nbsp;Chromatin module scores for regulatory elements previously defined as stable during reprogramming (Knaupp et al., Cell Stem Cell, 2017); inset highlights lower scores in day 2\u0026nbsp;\u003cem\u003eNfix\u003c/em\u003e\u0026nbsp;overexpressing cells, indicating chromatin destabilisation.\u0026nbsp;\u003cstrong\u003e(I)\u003c/strong\u003e\u0026nbsp;Numbers of opening DARs in day 6 cells stratified by increasing\u0026nbsp;\u003cem\u003eNfix\u003c/em\u003e\u0026nbsp;lenti scores or BFP control.\u0026nbsp;\u003cstrong\u003e(J,K)\u003c/strong\u003e\u0026nbsp;Seurat module scores based on expression of genes\u0026nbsp;\u003cstrong\u003e(J)\u003c/strong\u003e\u0026nbsp;downregulated or\u003cstrong\u003e\u0026nbsp;(K)\u003c/strong\u003e\u0026nbsp;upregulated in mESCs.\u0026nbsp;\u003cstrong\u003e(L,M)\u0026nbsp;\u003c/strong\u003eFACS strategy\u0026nbsp;\u003cstrong\u003e(L)\u003c/strong\u003e\u0026nbsp;and representative flow cytometry plots\u0026nbsp;\u003cstrong\u003e(M)\u003c/strong\u003e\u0026nbsp;used to isolate reprogramming subsets with varying\u0026nbsp;\u003cem\u003eNfix\u003c/em\u003e\u0026nbsp;overexpression levels.\u0026nbsp;\u003cstrong\u003e(N)\u0026nbsp;\u003c/strong\u003eqPCR of\u0026nbsp;\u003cem\u003eNfix\u003c/em\u003e\u0026nbsp;expression in FACS-isolated MEF subsets (n = 4 biological replicates). 
Statistical significance determined by paired t-tests (two-tailed).\u0026nbsp;\u003cstrong\u003e(O,P)\u003c/strong\u003e\u0026nbsp;Flow cytometry\u0026nbsp;analysis of\u0026nbsp;OCT4-GFP\u003csup\u003e+\u003c/sup\u003e\u0026nbsp;cells (%) at reprogramming end point showing\u0026nbsp;\u003cstrong\u003e(O)\u003c/strong\u003erepresentative flow cytometry comparison of\u0026nbsp;\u003cem\u003eNfix\u003c/em\u003e\u0026nbsp;overexpressing or BFP subsets\u0026nbsp;and\u0026nbsp;\u003cstrong\u003e(P)\u003c/strong\u003e\u0026nbsp;quantification of\u0026nbsp;OCT4-GFP\u003csup\u003e+\u003c/sup\u003ecells (n = 3 experimental replicates). Statistical significance determined by paired t-tests (two-tailed), according to paired experimental replicates.\u0026nbsp;\u003cstrong\u003e(Q)\u003c/strong\u003e\u0026nbsp;Percentage of TagBFP2\u003csup\u003e-\u003c/sup\u003e\u0026nbsp;cells within the OCT4-GFP\u003csup\u003e+\u003c/sup\u003e\u0026nbsp;population for mid-level overexpression subsets at reprogramming end point (n = 3 experimental replicates). Statistical significance was determined by paired t-tests (two-tailed).\u0026nbsp;Data are mean ± SEM. Not significant (ns),\u0026nbsp;*\u003cem\u003eP\u003c/em\u003e\u0026nbsp;\u0026lt;0.05,\u0026nbsp;**\u003cem\u003eP\u003c/em\u003e\u0026nbsp;\u0026lt;0.01, ***\u003cem\u003eP\u003c/em\u003e\u0026nbsp;\u0026lt;0.001.\u0026nbsp;****\u003cem\u003eP\u003c/em\u003e\u0026nbsp;\u0026lt;0.0001. 
D, day.\u003c/p\u003e","description":"","filename":"Figure2.jpg","url":"https://assets-eu.researchsquare.com/files/rs-8400524/v1/ab37ec38b99a985ff48f7de4.jpg"},{"id":98885647,"identity":"6711aec4-7396-4f81-9469-e0eeb5c07ac8","added_by":"auto","created_at":"2025-12-23 15:04:51","extension":"jpg","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":980005,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eDoseH-seq reveals that\u0026nbsp;\u003c/strong\u003e\u003cem\u003e\u003cstrong\u003eNfix\u003c/strong\u003e\u003c/em\u003e\u003cstrong\u003e\u0026nbsp;overexpression alone induces partial fibroblast dedifferentiation and promotes\u0026nbsp;PSC differentiation via reactivation of transient reprogramming regions.\u0026nbsp;(A,B)\u003c/strong\u003e\u0026nbsp;Representative TF motif enrichment of\u0026nbsp;\u003cstrong\u003e(A) \u003c/strong\u003eopening or\u0026nbsp;\u003cstrong\u003e(B)\u003c/strong\u003e\u0026nbsp;closing DARs in rMEFs (day 0 [D0] of reprogramming) with increasing\u0026nbsp;\u003cem\u003eNfix\u0026nbsp;\u003c/em\u003eoverexpression relative to BFP controls; Circle size denotes the number of motif-positive DARs.\u0026nbsp;\u003cstrong\u003e(C)\u003c/strong\u003e\u0026nbsp;Spearman correlation of differential gene expression log\u003csub\u003e2\u003c/sub\u003e\u0026nbsp;fold-changes between NFIX low rMEFs (vs BFP control) and fibroblast subsets from a pan-tissue fibroblast atlas\u0026nbsp;(Buechler et al., Nature, 2021). 
Indicated are Spearman rho values.\u0026nbsp;\u003cstrong\u003e(D)\u003c/strong\u003e\u0026nbsp;Representative images of PSCs carrying an endogenous OCT4-GFP reporter under BFP or\u0026nbsp;\u003cem\u003eNfix\u003c/em\u003e\u0026nbsp;OE conditions.\u0026nbsp;\u003cstrong\u003e(E)\u0026nbsp;\u003c/strong\u003eSchematic of sample collection for single-cell multiome profiling of\u0026nbsp;\u003cem\u003eNfix\u003c/em\u003e\u0026nbsp;overexpression-mediated differentiation from mESCs.\u0026nbsp;\u003cstrong\u003e(F)\u0026nbsp;\u003c/strong\u003eCorrelation of chromatin accessibility log\u003csub\u003e2\u003c/sub\u003e\u0026nbsp;fold-changes (across all accessible peaks) between differentiation and reprogramming experiments, each compared to their mESC controls.\u0026nbsp;\u003cstrong\u003e(G)\u003c/strong\u003e\u0026nbsp;UMAP of integrated reprogramming and differentiation cells based on chromatin accessibility.\u0026nbsp;\u003cstrong\u003e(H)\u003c/strong\u003e\u0026nbsp;Comparison of\u0026nbsp;\u003cem\u003eNfix\u003c/em\u003e\u0026nbsp;overexpression-induced\u003cstrong\u003e\u0026nbsp;\u003c/strong\u003elog\u003csub\u003e2\u003c/sub\u003e\u0026nbsp;fold-changes between reprogramming and differentiation.\u0026nbsp;\u003cstrong\u003e(I)\u0026nbsp;\u003c/strong\u003eOverlap of opening DARs between\u0026nbsp;\u003cem\u003eNfix\u0026nbsp;\u003c/em\u003ereprogramming (day 2 vs BFP rMEF control) and\u0026nbsp;\u003cem\u003eNfix\u0026nbsp;\u003c/em\u003edifferentiation (days 2+3 vs mESCs). 
The indicated\u0026nbsp;\u003cem\u003ep\u003c/em\u003e-value is from a Fisher’s exact test.\u003cstrong\u003e\u0026nbsp;(J)\u0026nbsp;\u003c/strong\u003eRepresentative TF motif enrichment of DARs for indicated differentiation or reprogramming conditions, and the 670 overlapping DARs from\u0026nbsp;\u003cstrong\u003e(I)\u003c/strong\u003e.\u0026nbsp;\u003cstrong\u003e(K) \u003c/strong\u003eSTRING network of genes linked to the 670 overlapping DARs, highlighting genes annotated with the ‘Developmental process’ gene ontology term. iMEFs, growth-inactivated mouse embryonic fibroblasts; D, day.\u003c/p\u003e","description":"","filename":"Figure3.jpg","url":"https://assets-eu.researchsquare.com/files/rs-8400524/v1/878ab3d0988827d7bd82f2bb.jpg"},{"id":99309408,"identity":"ff5e9457-0de6-4046-a0e0-a6d1cb986405","added_by":"auto","created_at":"2025-12-31 16:10:21","extension":"jpg","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":605431,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eLow\u0026nbsp;\u003c/strong\u003e\u003cem\u003e\u003cstrong\u003eNfix\u003c/strong\u003e\u003c/em\u003e\u003cstrong\u003e\u0026nbsp;overexpression promotes reprogramming under limiting OKSM by acting as a chromatin co-opener, but inhibits reprogramming at high levels when OKSM is sufficient\u003c/strong\u003e.\u003cstrong\u003e\u0026nbsp;(A)\u003c/strong\u003e\u0026nbsp;Overlap of opening DARs between\u0026nbsp;day 2\u0026nbsp;BFP cells (vs rMEF BFP control) and day 2 NFIX mid cells (vs rMEF NFIX).\u0026nbsp;\u003cstrong\u003e(B)\u0026nbsp;\u003c/strong\u003eRepresentative TF motif enrichment of unique opening DARs in day 2\u0026nbsp;\u003cem\u003eNfix\u0026nbsp;\u003c/em\u003eoverexpressing cells (by lenti score stratification), compared to day 2 BFP cells; colour indicates the NFIX minus BFP difference in motif counts and size indicates the absolute count difference.\u0026nbsp;\u003cstrong\u003e(C)\u0026nbsp;\u003c/strong\u003eOverlap of opening DARs 
between day 6 BFP cells (vs rMEF BFP control) and\u0026nbsp;day 6\u0026nbsp;NFIX low cells (vs rMEF NFIX).\u003cstrong\u003e\u0026nbsp;(D)\u003c/strong\u003e\u0026nbsp;Representative TF motif enrichment of unique opening DARs in day 6\u0026nbsp;\u003cem\u003eNfix\u003c/em\u003e\u0026nbsp;overexpressing cells stratified by lenti score, compared to day 6 BFP cells. Dot colour and size as per\u0026nbsp;\u003cstrong\u003e(B)\u003c/strong\u003e.\u0026nbsp;\u003cstrong\u003e(E)\u003c/strong\u003e\u0026nbsp;Experimental workflow for SKM-induced reprogramming using wild-type MEFs (wtMEFs) under Vitamin C conditions (+Vc).\u0026nbsp;\u003cstrong\u003e(F,G)\u003c/strong\u003e\u0026nbsp;Representative flow cytometry plots (concatenated replicates from each condition)\u0026nbsp;\u003cstrong\u003e(F)\u003c/strong\u003e\u0026nbsp;and quantification\u0026nbsp;\u003cstrong\u003e(G)\u003c/strong\u003e\u0026nbsp;for THY1\u003csup\u003e-\u003c/sup\u003e\u0026nbsp;SSEA1\u003csup\u003e+\u003c/sup\u003e\u0026nbsp;EpCAM\u003csup\u003e+\u003c/sup\u003e\u0026nbsp;cells (%) at the end of SKM-induced reprogramming (+Vc) with or without\u0026nbsp;\u003cem\u003eNfix\u003c/em\u003e\u0026nbsp;overexpression (n = 4 biological replicates). Quantification relative to\u0026nbsp;\u003cem\u003eNfix\u003c/em\u003eoverexpression group. 
Statistical significance determined by unpaired t-test (two-tailed).\u0026nbsp;\u003cstrong\u003e(H)\u003c/strong\u003e\u0026nbsp;Dendrogram of reprogramming timepoints, including Vitamin C conditions, based on average expression of the top 3000 most variable genes.\u0026nbsp;\u003cstrong\u003e(I,J)\u003c/strong\u003e\u0026nbsp;FACS strategy\u0026nbsp;\u003cstrong\u003e(I)\u003c/strong\u003e\u0026nbsp;and percentage quantification of OCT4-GFP\u003csup\u003e+\u003c/sup\u003e\u0026nbsp;cells\u0026nbsp;\u003cstrong\u003e(J)\u003c/strong\u003e\u0026nbsp;isolated at the end of reprogramming (+Vc) by level of\u0026nbsp;\u003cem\u003eNfix\u003c/em\u003e\u0026nbsp;overexpression (n = 4 experimental replicates). Statistical significance determined by paired t-test (two-tailed).\u0026nbsp;\u003cstrong\u003e(K)\u0026nbsp;\u003c/strong\u003eOCT4-GFP\u003csup\u003e+\u003c/sup\u003e\u0026nbsp;cell quantification\u0026nbsp;(%) at end of reprogramming (-Vc) for rMEFs infected with lentiviruses overexpressing\u0026nbsp;\u003cem\u003eNfix\u003c/em\u003e\u0026nbsp;and/or additional\u0026nbsp;\u003cem\u003eOct4\u003c/em\u003e/\u003cem\u003eSox2\u003c/em\u003e/\u003cem\u003eKlf4\u003c/em\u003e\u0026nbsp;or\u0026nbsp;\u003cem\u003eTagBFP2\u003c/em\u003e\u0026nbsp;control (n = 3-6 experimental replicates). Statistical analyses were performed by one-way ANOVA. Dunnett’s multiple comparisons test was used for comparisons between control and treatments. Not significant (ns), *\u003cem\u003eP\u003c/em\u003e\u0026lt;0.05, **\u003cem\u003eP\u003c/em\u003e\u0026lt;0.01, ****\u003cem\u003eP\u003c/em\u003e\u0026nbsp;\u0026lt;0.0001. 
Data represented as mean ± SEM.\u0026nbsp;SKM,\u0026nbsp;\u003cem\u003eSox2\u003c/em\u003e,\u0026nbsp;\u003cem\u003eKlf4\u003c/em\u003e\u0026nbsp;and\u0026nbsp;\u003cem\u003ec-Myc\u003c/em\u003e; Vc, Vitamin C.\u003c/p\u003e","description":"","filename":"Figure4.jpg","url":"https://assets-eu.researchsquare.com/files/rs-8400524/v1/c4e5579c76a984facb741a95.jpg"},{"id":98885652,"identity":"8dccedfb-baf9-4a67-a5f8-3cebc018590e","added_by":"auto","created_at":"2025-12-23 15:04:51","extension":"jpg","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":1390291,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eMulti-perturb DoseH-seq reveals that high-dose\u0026nbsp;\u003c/strong\u003e\u003cem\u003e\u003cstrong\u003eNfix\u003c/strong\u003e\u003c/em\u003e\u003cstrong\u003e\u0026nbsp;and\u0026nbsp;\u003c/strong\u003e\u003cem\u003e\u003cstrong\u003eOct4\u003c/strong\u003e\u003c/em\u003e\u003cstrong\u003e\u0026nbsp;overexpression trigger AP-1, p53, and cell cycle arrest programs, while c-\u003c/strong\u003e\u003cem\u003e\u003cstrong\u003eJun\u003c/strong\u003e\u003c/em\u003e\u003cstrong\u003e\u0026nbsp;knockdown enhances reprogramming by blunting NFIX\u003c/strong\u003e\u003cem\u003e\u003cstrong\u003e-\u003c/strong\u003e\u003c/em\u003e\u003cstrong\u003einduced AP-1 activation.\u0026nbsp;(A)\u003c/strong\u003e\u0026nbsp;Differential gene expression (DEG) of\u0026nbsp;\u003cem\u003eNfi\u003c/em\u003e\u0026nbsp;factors and\u0026nbsp;\u003cem\u003ec-Jun\u003c/em\u003e\u0026nbsp;in day 0\u0026nbsp;\u003cem\u003eNfix\u003c/em\u003e-overexpressing (OE) rMEFs relative to day 0 BFP control (no Vitamin C). Significant DEGs defined by p\u003csub\u003eadj\u003c/sub\u003e\u0026nbsp;\u0026lt; 0.05 (MAST testing). 
Barcoded\u0026nbsp;\u003cem\u003eNfix\u003c/em\u003e:\u003cem\u003e\u0026nbsp;\u003c/em\u003eunique\u0026nbsp;\u003cem\u003eNfix\u003c/em\u003elentiviral barcode expression.\u003cstrong\u003e\u0026nbsp;(B)\u003c/strong\u003e\u0026nbsp;Quantification of OCT4-GFP\u003csup\u003e+\u003c/sup\u003e\u0026nbsp;iPSCs (%) at the end of reprogramming for combined\u0026nbsp;\u003cem\u003eNfix\u003c/em\u003e\u0026nbsp;overexpression and\u0026nbsp;\u003cem\u003ec-Jun\u003c/em\u003e\u0026nbsp;knockdown at MOI = 2.5 (n = 4\u0026nbsp;experimental\u0026nbsp;replicates).\u0026nbsp;For -Vc group,\u0026nbsp;statistical significance was determined by\u0026nbsp;repeated measures one-way ANOVA analysis; Dunnett’s multiple comparisons test was used for comparisons between groups; for +Vc group, statistical significance was determined by paired t-test (two-tailed). \u003cstrong\u003e(C)\u003c/strong\u003e\u0026nbsp;Multi-perturb experimental design combining\u0026nbsp;\u003cem\u003eNfix\u003c/em\u003e\u0026nbsp;overexpression with concurrent\u0026nbsp;\u003cem\u003eOct4\u003c/em\u003e\u0026nbsp;overexpression (‘O+X’) or\u0026nbsp;\u003cem\u003ec-Jun\u003c/em\u003e\u0026nbsp;knockdown via shRNA (‘J+X’). See vector information in\u0026nbsp;\u003cstrong\u003eTable S2\u003c/strong\u003e.\u0026nbsp;\u003cstrong\u003e(D,E)\u003c/strong\u003e\u0026nbsp;UMAP showing the multi-perturb dataset integrated with the\u0026nbsp;\u003cem\u003eNfix\u003c/em\u003e\u0026nbsp;overexpression reprogramming dataset. 
Highlighted are\u0026nbsp;\u003cstrong\u003e(D)\u003c/strong\u003e\u0026nbsp;cells from the BFP Vitamin C reprogramming conditions and mESCs or\u0026nbsp;\u003cstrong\u003e(E)\u003c/strong\u003e\u0026nbsp;cells from the perturb conditions and mESCs.\u0026nbsp;\u003cstrong\u003e(F)\u0026nbsp;\u003c/strong\u003eChromatin module scores for regions previously defined as stable\u0026nbsp;(Knaupp et al., Cell Stem Cell, 2017)\u0026nbsp;during reprogramming, showing reduced scores in day 2 perturb cells.\u0026nbsp;\u003cstrong\u003e(G,H)\u003c/strong\u003e\u0026nbsp;Lenti score overlays for\u0026nbsp;\u003cstrong\u003e(G)\u003c/strong\u003e\u0026nbsp;\u003cem\u003eNfix\u003c/em\u003e\u0026nbsp;overexpression\u0026nbsp;\u003cstrong\u003e(H)\u003c/strong\u003e\u0026nbsp;combined\u0026nbsp;\u003cem\u003eNfix\u003c/em\u003e\u0026nbsp;overexpression and\u0026nbsp;\u003cem\u003ec-Jun\u003c/em\u003e\u0026nbsp;shRNA.\u0026nbsp;\u003cstrong\u003e(I)\u003c/strong\u003e\u0026nbsp;Numbers of opening or closing DARs at day 2 across stratified\u0026nbsp;\u003cem\u003eNfix\u003c/em\u003e\u0026nbsp;overexpression levels and with\u0026nbsp;\u003cem\u003ec-Jun\u003c/em\u003e\u0026nbsp;shRNA.\u0026nbsp;\u003cstrong\u003e(J)\u003c/strong\u003e\u0026nbsp;Chromatin module scores for DARs uniquely opening in low or high cells, for\u0026nbsp;\u003cem\u003eNfix\u003c/em\u003e\u0026nbsp;overexpression or combined with\u0026nbsp;\u003cem\u003ec-Jun\u0026nbsp;\u003c/em\u003eshRNA.\u0026nbsp;\u003cstrong\u003e(K)\u003c/strong\u003e\u0026nbsp;Lenti score overlay for\u0026nbsp;\u003cem\u003eOct4\u003c/em\u003e\u0026nbsp;overexpression, highlighting activation of the\u0026nbsp;\u003cem\u003eOct4\u0026nbsp;\u003c/em\u003ebarcodes following reprogramming initiation.\u0026nbsp;\u003cstrong\u003e(L)\u003c/strong\u003e\u0026nbsp;Overlap of opening DARs in day 2 O+X high cells (relative to O+X rMEFs) and in day 2 NFIX low and NFIX\u003cem\u003e\u0026nbsp;\u003c/em\u003ehigh cells (compared to NFIX 
rMEFs).\u0026nbsp;\u003cstrong\u003e(M)\u003c/strong\u003e\u0026nbsp;Motif position weight matrix (PWM) scores for OCT4 hits in day 2 opening DARs unique to the indicated condition as per\u0026nbsp;\u003cstrong\u003e(L)\u003c/strong\u003e;\u0026nbsp;\u003cem\u003ep\u003c/em\u003e-values from 1-sided Wilcoxon rank-sum tests.\u0026nbsp;\u003cstrong\u003e(N)\u0026nbsp;\u003c/strong\u003eMotif enrichment for O+X or NFIX high vs NFIX\u003cem\u003e\u0026nbsp;\u003c/em\u003elow and vice versa. Shown is a combination of the top 10 most significant motifs from the O+X high and NFIX high comparisons.\u0026nbsp;\u003cstrong\u003e(O)\u003c/strong\u003e\u0026nbsp;Seurat cell cycle scores\u0026nbsp;for S phase genes in day 2 NFIX low, NFIX high or O+X high cells;\u0026nbsp;\u003cem\u003ep\u003c/em\u003e-values from 1-sided Wilcoxon rank-sum tests.\u0026nbsp;Not significant (ns), *\u003cem\u003eP\u003c/em\u003e\u0026nbsp;\u0026lt;0.05, **\u003cem\u003eP\u003c/em\u003e\u0026nbsp;\u0026lt;0.01, ***\u003cem\u003eP\u003c/em\u003e\u0026nbsp;\u0026lt;0.001, ****\u003cem\u003eP\u003c/em\u003e\u0026nbsp;\u0026lt;0.0001. Data are represented as mean ± SEM. D, day; Vc, Vitamin C.\u003c/p\u003e","description":"","filename":"Figure5.jpg","url":"https://assets-eu.researchsquare.com/files/rs-8400524/v1/2d6b3d2fb5a07590a527b37b.jpg"},{"id":98885655,"identity":"fc3659ca-5037-42af-a35a-29067eece2b9","added_by":"auto","created_at":"2025-12-23 15:04:51","extension":"jpg","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":849412,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eNFI/OCT4 dynamics in human tumours.\u0026nbsp;(A)\u003c/strong\u003e\u0026nbsp;Expression of\u0026nbsp;\u003cem\u003eNFI\u003c/em\u003e\u0026nbsp;family members and\u0026nbsp;\u003cem\u003ePOU5F1\u003c/em\u003e\u0026nbsp;(OCT4) in TCGA tumour or normal tissue samples. 
Indicated sections are defined based on percentile-based expression thresholds for\u0026nbsp;\u003cem\u003ePOU5F1\u003c/em\u003e\u0026nbsp;or each\u0026nbsp;\u003cem\u003eNFI\u003c/em\u003e\u0026nbsp;family member.\u0026nbsp;\u003cstrong\u003e(B)\u003c/strong\u003e\u0026nbsp;Percentage of each tumour type that falls within the NFI\u003csup\u003emid\u003c/sup\u003e/OCT4\u003csup\u003e+\u003c/sup\u003e\u0026nbsp;section. Shown are the top 7 tumours.\u0026nbsp;\u003cstrong\u003e(C)\u003c/strong\u003e\u0026nbsp;Comparison of\u0026nbsp;\u003cem\u003eNFIX\u003c/em\u003e\u0026nbsp;and\u0026nbsp;\u003cem\u003ePOU5F1\u003c/em\u003e\u0026nbsp;expression, with\u0026nbsp;\u003cem\u003eNANOG\u003c/em\u003e\u0026nbsp;expression overlaid.\u0026nbsp;\u003cstrong\u003e(D) \u003c/strong\u003eComparison of CytoTRACE2 scores between normal tissue samples and the 3 sections indicated in\u0026nbsp;\u003cstrong\u003e(A)\u003c/strong\u003e.\u0026nbsp;\u003cstrong\u003e(E)\u003c/strong\u003e\u0026nbsp;Motif enrichment for scATAC-seq DARs opening in indicated tumours relative to cell type of origin; DARs obtained from Terekhanova et al.\u0026nbsp;\u003cstrong\u003e(F)\u003c/strong\u003e\u0026nbsp;Kidney scRNA-seq expression of\u0026nbsp;\u003cem\u003eNFI\u003c/em\u003e\u0026nbsp;factors or\u0026nbsp;\u003cem\u003ePOU5F1\u003c/em\u003e\u0026nbsp;in normal proximal tubule (PT) cells or clear cell renal cell carcinoma (ccRCC).\u0026nbsp;\u003cstrong\u003e(G)\u003c/strong\u003e\u0026nbsp;Module scores in kidney scRNA-seq for genes both upregulated (MAST testing; p\u003csub\u003eadj\u003c/sub\u003e\u0026nbsp;\u0026lt; 0.05 and LFC \u0026gt; 0.5) in day 2 NFIX reprogramming cells (compared to rMEFs) and linked to opening DARs positive for NFI and low-affinity OCT4 
motifs.\u0026nbsp;\u0026nbsp;\u003c/p\u003e","description":"","filename":"Figure6.jpg","url":"https://assets-eu.researchsquare.com/files/rs-8400524/v1/64f1e383d06a96ad7244a436.jpg"},{"id":99309306,"identity":"d03b2624-737c-4b2f-9a18-49ccb48eaf7e","added_by":"auto","created_at":"2025-12-31 16:10:04","extension":"jpg","order_by":7,"title":"Figure 7","display":"","copyAsset":false,"role":"figure","size":4405601,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eDoseH-seq–based model of \u003c/strong\u003e\u003cem\u003e\u003cstrong\u003eNfix\u003c/strong\u003e\u003c/em\u003e\u003cstrong\u003e overexpression levels governing reprogramming and PSC differentiation. \u003c/strong\u003eLow-level \u003cem\u003eNfix\u003c/em\u003e overexpression beyond native levels reverses mesenchymal drift in fibroblasts as a limited dedifferentiation (when overexpressed on its own) and synergises with insufficient OKSM levels (c-MYC not shown for simplicity) to promote more widespread chromatin opening early during reprogramming and collapse of the somatic transcriptional network. In contrast, high levels of \u003cem\u003eNfix\u003c/em\u003e lead to AP-1 activation, reinforcing the somatic network and blocking reprogramming. Additional high \u003cem\u003eOct4\u003c/em\u003e results in p53 activation and cell cycle arrest, exacerbating the reprogramming block. 
In PSCs, \u003cem\u003eNfix \u003c/em\u003einduction drives re-engagement of a subset of regulatory regions (linked to developmental processes) that are activated during reprogramming and carry binding sites for pluripotency TFs and NFI, destabilizing the pluripotency network.\u003c/p\u003e","description":"","filename":"Figure7.jpg","url":"https://assets-eu.researchsquare.com/files/rs-8400524/v1/f13c9ba13ce340646151c5db.jpg"},{"id":100593930,"identity":"5d4db18e-fbff-4124-a9c6-cd20aa94303d","added_by":"auto","created_at":"2026-01-19 13:27:59","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":13657194,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-8400524/v1/5c0956dd-eaf5-4bcc-93f9-55d60855587c.pdf"},{"id":99309427,"identity":"6762766c-4ab3-4ae2-b37b-35cdc5893a7e","added_by":"auto","created_at":"2025-12-31 16:10:21","extension":"pdf","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":2445025,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eSupplementary Figures\u003c/strong\u003e\u003c/p\u003e","description":"","filename":"SupplementaryFigures.pdf","url":"https://assets-eu.researchsquare.com/files/rs-8400524/v1/62b7c650d1bc254e03a1ea24.pdf"},{"id":98885644,"identity":"8531b5a1-1731-4cb8-bc31-3aafd409f9db","added_by":"auto","created_at":"2025-12-23 15:04:51","extension":"xlsx","order_by":2,"title":"","display":"","copyAsset":false,"role":"supplement","size":11753,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eTable S1: \u003c/strong\u003eComparison between DoseH-seq and alternative single-cell multiome perturb-seq 
tools.\u003c/p\u003e","description":"","filename":"SuppTable1.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-8400524/v1/5552253a776c8d31f47ae7bd.xlsx"},{"id":99309452,"identity":"01996fef-d6b2-4df9-9fa4-e9a02afec7fb","added_by":"auto","created_at":"2025-12-31 16:10:25","extension":"xlsx","order_by":3,"title":"","display":"","copyAsset":false,"role":"supplement","size":127336,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eTable S2:\u003c/strong\u003e List of materials and buffer preparation for DoseH-seq development and validation.\u003c/p\u003e","description":"","filename":"SuppTable2.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-8400524/v1/18b02b0a38b7195fcf37c03f.xlsx"},{"id":98885657,"identity":"dc57ed9c-f1c0-440d-8d4d-6dc229eb430a","added_by":"auto","created_at":"2025-12-23 15:04:51","extension":"xlsx","order_by":4,"title":"","display":"","copyAsset":false,"role":"supplement","size":3744457,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eTable S3:\u003c/strong\u003e DARs related to Figure 1 and 2: impact of \u003cem\u003eNfix\u003c/em\u003e overexpression on fibroblasts and during reprogramming.\u003c/p\u003e","description":"","filename":"SuppTable3.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-8400524/v1/e76c1b0794a883a5f1bfe42d.xlsx"},{"id":99309461,"identity":"605d324c-595f-45d2-a558-0237c8d50d49","added_by":"auto","created_at":"2025-12-31 16:10:25","extension":"xlsx","order_by":5,"title":"","display":"","copyAsset":false,"role":"supplement","size":103157,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eTable S4:\u003c/strong\u003e Motif enrichment for DARs opening in response to \u003cem\u003eNfix\u003c/em\u003e overexpression stratifications in 
fibroblasts.\u003c/p\u003e","description":"","filename":"SuppTable4.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-8400524/v1/e3fa948178fa52a6e0fe4d48.xlsx"},{"id":98885658,"identity":"71f1246f-ead8-4b73-9122-ae6e19ed37c7","added_by":"auto","created_at":"2025-12-23 15:04:51","extension":"xlsx","order_by":6,"title":"","display":"","copyAsset":false,"role":"supplement","size":2372343,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eTable S5:\u003c/strong\u003e DARs related to Figure 3: reprogramming and differentiation comparisons.\u003c/p\u003e","description":"","filename":"SuppTable5.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-8400524/v1/9a9dd0036a561d28253766de.xlsx"},{"id":98885661,"identity":"d5436547-4864-48ea-a0c3-a57177fc75b3","added_by":"auto","created_at":"2025-12-23 15:04:51","extension":"xlsx","order_by":7,"title":"","display":"","copyAsset":false,"role":"supplement","size":5787887,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eTable S6:\u003c/strong\u003e DARs related to Figure 5: single- and multi-perturb comparisons.\u003c/p\u003e","description":"","filename":"SuppTable6.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-8400524/v1/e51db0a9fb49e27adfd51aad.xlsx"}],"financialInterests":"The authors declare no competing interests.","formattedTitle":"\u003cp\u003eDoseH-seq: A single-cell multiome platform to decode gene-dosage logic driving developmental reversion and cell fate reprogramming\u003c/p\u003e","fulltext":[{"header":"INTRODUCTION","content":"\u003cp\u003eSingle-cell technologies have expanded the ability to unravel cell states and cell fate transitions at unprecedented levels. 
In particular, the dual profiling of gene expression and chromatin accessibility with single-nucleus (sn)RNA\u0026thinsp;+\u0026thinsp;ATAC-seq has enabled the probing of TF networks underlying cell fate decisions\u003csup\u003e\u003cspan additionalcitationids=\"CR2\" citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e\u003c/sup\u003e. To understand the mechanistic role of TFs in regulating cell states, snRNA\u0026thinsp;+\u0026thinsp;ATAC-seq perturbation experiments that either overexpress the TF, or knock it down or out, are required. There has been great progress in developing perturb-seq methods for single-cell (sc)RNA-seq\u003csup\u003e\u003cspan additionalcitationids=\"CR5\" citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e\u003c/sup\u003e, including with dosage-sensitivity\u003csup\u003e\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e,\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e\u003c/sup\u003e; however, these methods lack the multiomic capacity of snRNA\u0026thinsp;+\u0026thinsp;ATAC-seq assays. Recently, Multiome Perturb-seq\u003csup\u003e\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e\u003c/sup\u003e and MultiPerturb-seq\u003csup\u003e\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e\u003c/sup\u003e were developed for profiling perturbation effects using single-cell multiomics. While these methods can capture the binary phenotype of a TF knockdown, TF activity is typically graded, and often sensitive to dosage through graded changes in expression levels\u003csup\u003e\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e,\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e\u003c/sup\u003e. 
Dose-dependent TF activity has been linked to context-specific outcomes in cell fate reprogramming\u003csup\u003e\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e\u003c/sup\u003e and may underlie paradoxical TF roles in cancer\u003csup\u003e\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e,\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e\u003c/sup\u003e, highlighting the need for technologies that interrogate graded shifts in TF networks.\u003c/p\u003e \u003cp\u003eTo resolve these dose-response mechanisms at single-cell, multiomic resolution, we developed Dosage and Hashtag sequencing (DoseH-seq), an expansion of the 10x Genomics snRNA\u0026thinsp;+\u0026thinsp;ATAC-seq platform. To our knowledge, DoseH-seq is the first single-cell multiome (and chromatin-level) method that enables sensitive detection of lentiviral perturbations to study dosage effects. DoseH-seq tracks continuous TF overexpression or knockdown (including multi-perturb designs), driven by promoters heterogeneously expressed in different cells, for a targeted panel of TFs at high resolution. This distinguishes DoseH-seq from current multiomic perturb-seq methods\u003csup\u003e\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e,\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e\u003c/sup\u003e, which assay discrete/binary CRISPRi knockdowns across many genes. 
As a further advance over existing approaches, DoseH-seq integrates sample hashing to enable multiplexed assaying of multiple conditions or time points in a single run.\u003c/p\u003e \u003cp\u003eWe applied DoseH-seq to interrogate the biological impact of somatic NFI transcription factors, which we previously identified as underpinning the activity of youthful gene regulatory elements, silenced during aging\u003csup\u003e\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e\u003c/sup\u003e. Here, we leverage DoseH-seq to resolve the dose-dependent impacts of overexpressing NFI factor \u003cem\u003eNfix\u003c/em\u003e in fibroblasts, pluripotent stem cells (PSCs) and during induced PSC (iPSC) reprogramming. We track \u003cem\u003eNfix\u003c/em\u003e activity both by itself, and in dual overexpression contexts (\u003cem\u003eNfix\u003c/em\u003e plus \u003cem\u003eOct4\u003c/em\u003e) as well as combinatorial overexpression plus knockdown (\u003cem\u003eNfix\u003c/em\u003e plus c-\u003cem\u003eJun\u003c/em\u003e shRNA) experiments.\u003c/p\u003e \u003cp\u003eDoseH-seq revealed that \u003cem\u003eNfix\u003c/em\u003e overexpression in fibroblasts induces a dose-dependent restricted developmental reversion, resulting in loss of myofibroblast identity. This mirrors the effects of transient epigenetic reprogramming with \u003cem\u003eOct4\u003c/em\u003e, \u003cem\u003eSox2\u003c/em\u003e, \u003cem\u003eKlf4\u003c/em\u003e and c-\u003cem\u003eMyc\u003c/em\u003e shown to reverse mesenchymal drift linked to fibroblast aging\u003csup\u003e\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e\u003c/sup\u003e. 
While somatic TFs expressed in the starting cell type have been reported to inhibit de-differentiation during reprogramming\u003csup\u003e\u003cspan additionalcitationids=\"CR17 CR18 CR19 CR20 CR21\" citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e\u003c/sup\u003e, we uncover a dose-dependent switch in NFIX function. Moderate \u003cem\u003eNfix\u003c/em\u003e overexpression synergizes with OKSM early during reprogramming by acting as a chromatin co-opener to boost reprogramming, with potential analogous dosage-interactions in oncogenic dedifferentiation. In contrast, high \u003cem\u003eNfix\u003c/em\u003e blocks reprogramming by activating AP-1 and reinforcing somatic networks. In PSCs, \u003cem\u003eNfix\u003c/em\u003e destabilizes pluripotency by re-engaging transient reprogramming regions comprising degenerate motifs for pluripotency TFs, mechanistically linking reprogramming and differentiation processes. Collectively, DoseH-seq provides a critically needed perturbation-multiomics platform to dissect gene dosage-dependent effects, revealing context and dosage-dependent roles for \u003cem\u003eNfix\u003c/em\u003e in potentiating graded levels of developmental reversion.\u003c/p\u003e"},{"header":"RESULTS","content":"\u003cp\u003eWe developed DoseH-seq to enable tracking of continuous perturbation levels in multiplexed snRNA+ATAC-seq 10x Genomics experiments (\u003cstrong\u003eFigure 1A\u003c/strong\u003e). The DoseH-seq assay consists of two main components: 1) design of dosage trackable lentiviral perturbations with cell-to-cell variation in expression levels and 2) a two-step nuclei hashing process. For design of a dosage trackable perturbation system, we developed barcoded lentiviral constructs harbouring a synthetic barcoded cassette inserted proximally to the “polyA-like” signal within the 3’LTR. 
To enable cost-effective multiplexing, we developed a two-step lysis procedure for nuclei hashing compatible with snRNA+ATAC-seq (\u003cstrong\u003eFigure S1A–E\u003c/strong\u003e). An initial “soft” lysis step enables nuclear hashtag antibody labelling by permeabilising only the outer cell membrane; the nuclear membrane is therefore preserved to protect RNA until the second “harsh” lysis step in preparation for the ATAC-seq transposition reaction in the presence of high levels of RNase inhibitor (\u003cstrong\u003eFigures 1A\u003c/strong\u003e and\u003cstrong\u003e\u0026nbsp;S1E\u003c/strong\u003e). The design of DoseH-seq allows for flexible perturbation experiments across multiple experimental conditions (e.g. timepoint studies), including overexpression, knockdown and multi-perturb designs (\u003cstrong\u003eFigure 1A\u003c/strong\u003e). The capability to perform overexpression experiments and track perturbation dosage-levels distinguishes DoseH-seq from the existing perturb-seq methods Multiome Perturb-seq\u003csup\u003e8\u003c/sup\u003e and MultiPerturb-seq\u003csup\u003e9\u003c/sup\u003e (\u003cstrong\u003eTable\u0026nbsp;\u003c/strong\u003e\u003cstrong\u003eS1\u003c/strong\u003e).\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eBenchmarking of antibody-hashing against genetic demultiplexing of human samples\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe first validated the nuclei hashing step by performing a snRNA+ATAC-seq experiment in human fibroblast samples from 14 donors, plus a human embryonic stem cell (ESC) control line (annotated as hPSC hereafter; \u003cstrong\u003eFigures 1B\u0026nbsp;\u003c/strong\u003eand\u003cstrong\u003e\u0026nbsp;S1F–M, Table S\u003c/strong\u003e\u003cstrong\u003e2.1\u003c/strong\u003e). 
The ability to perform single nucleotide polymorphism (SNP)-based demultiplexing, leveraging genetic heterogeneity between human donors (see \u003cstrong\u003eMethods\u003c/strong\u003e)\u003csup\u003e23\u003c/sup\u003e, provided a ground truth for evaluating hashing efficacy. Our experiment yielded high quality data (\u003cstrong\u003eFigures S1G–M\u003c/strong\u003e), including clear multiomic discrimination of hPSC and fibroblast cells (\u003cstrong\u003eFigure S2A\u003c/strong\u003e) and distinct peaks called across regulatory elements (\u003cstrong\u003eFigure S2B\u003c/strong\u003e). Within the fibroblasts, hashtag-based demultiplexing enabled clear discrimination of samples and donors (\u003cstrong\u003eFigure S1M\u003c/strong\u003e). We compared the hashtag-based demultiplexing to the SNP-based\u003csup\u003e23\u003c/sup\u003e and found our hashtag-based demultiplexing had a median assignment accuracy of 95% (\u003cstrong\u003eFigure\u003c/strong\u003e\u003cstrong\u003es\u003c/strong\u003e \u003cstrong\u003eS2C\u0026nbsp;\u003c/strong\u003eand\u003cstrong\u003e\u0026nbsp;1C\u003c/strong\u003e). This experiment validated that DoseH-seq enables highly accurate recovery of numerous hashed sample conditions.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTracking variable \u003cem\u003eNfix\u003c/em\u003e overexpression in mouse fibroblasts and during reprogramming\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe recently reported that activator protein 1 (AP-1) TFs commonly drive chromatin opening with maturation and aging, while closing chromatin is enriched for cell identity TFs for their respective cell types\u003csup\u003e14\u003c/sup\u003e. Of note, pan-somatic NFI factors were also commonly enriched in closing candidate cis-regulatory elements (cCREs) across cell types and were redistributed to opened chromatin as AP-1 co-factors\u003csup\u003e14\u003c/sup\u003e. 
We therefore applied DoseH-seq to explore the impact of dose-resolved overexpression of widely expressed (but switched off during iPSC reprogramming [\u003cstrong\u003eFigure S2D\u003c/strong\u003e]) NFI member \u003cem\u003eNfix\u003c/em\u003e in different cellular contexts.\u003c/p\u003e\n\u003cp\u003eBriefly, reprogrammable mouse embryonic fibroblasts (rMEFs with a doxycycline [DOX] inducible Oct4, Klf4, Sox2 and cMyc [OKSM] cassette and an endogenous \u003cem\u003eOct4\u003c/em\u003e-\u003cem\u003eGfp\u003c/em\u003e reporter to mark fully-reprogrammed iPSCs)\u003csup\u003e24\u003c/sup\u003e, were transduced with either the barcoded EF1α-driven \u003cem\u003eNfix\u0026nbsp;\u003c/em\u003econstruct or the barcoded \u003cem\u003eTagBFP2\u0026nbsp;\u003c/em\u003econtrol construct\u0026nbsp;(\u003cstrong\u003eFigure\u003c/strong\u003e \u003cstrong\u003eS2E–H\u003c/strong\u003e and \u003cstrong\u003eTable S\u003c/strong\u003e\u003cstrong\u003e2.2\u003c/strong\u003e). Cells were collected under \u003cem\u003eNfix\u003c/em\u003e perturbation conditions at multiple time points before, during and after reprogramming (including mouse embryonic stem cells [mESCs] as pluripotency reference) for multiome profiling (\u003cstrong\u003eFigures 1D\u003c/strong\u003e and \u003cstrong\u003eS2I–M\u003c/strong\u003e).\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eFollowing demultiplexing (\u003cstrong\u003eFigure 1E\u003c/strong\u003e) and stringent quality control filtering, we progressed with 8,341 nuclei (median of 15,799 ATAC fragments, 8,952 RNA UMIs per nucleus) for analysis (\u003cstrong\u003eFigure S2N\u003c/strong\u003e). To track \u003cem\u003eNfix\u003c/em\u003e overexpression relative to BFP control, we created a “lenti score”, defined by log-odds values from a logistic regression model classifying \u003cem\u003eNfix\u003c/em\u003e overexpression or control status based on lenti barcode counts. 
We calculated Uniform Manifold Approximation and Projection (UMAP) coordinates for fibroblasts before OKSM induction and observed a gradient of lenti scores, from negative in BFP control cells to low and then high in \u003cem\u003eNfix\u003c/em\u003e overexpressing cells (\u003cstrong\u003eFigure 1F\u003c/strong\u003e). We stratified the \u003cem\u003eNfix\u003c/em\u003e overexpressing fibroblasts into ‘low’, ‘mid’ and ‘high’ classifications based on the lenti score, corresponding to increasing levels of normalised lenti counts (\u003cstrong\u003eFigure 1G\u003c/strong\u003e). We calculated differentially accessible regions (DARs), comparing each overexpression stratification to the BFP rMEF control cells (\u003cstrong\u003eTable S3\u003c/strong\u003e). We observed an increasing number of cCREs opening and closing in response to increasing levels of \u003cem\u003eNfix\u003c/em\u003e overexpression (\u003cstrong\u003eFigure 1H\u003c/strong\u003e), a trend not driven by differences in numbers of ATAC features or fragment counts (\u003cstrong\u003eFigures S2O\u0026nbsp;\u003c/strong\u003eand\u003cstrong\u003e\u0026nbsp;S2P\u003c/strong\u003e). NFI was the top motif enriched in opening DARs (\u003cstrong\u003eTable S4\u003c/strong\u003e), with increasing DAR numbers associated with increased enrichment (\u003cstrong\u003eFigure 1I\u003c/strong\u003e). 
We observed a similar increase in NFIX activity through calculating ChromVAR motif deviances from the scATAC-seq library (\u003cstrong\u003eFigure S2Q\u003c/strong\u003e) and confirmed that we could capture increasing levels of 3’ LTR expression from the scRNA-seq library in correspondence to increasing stratifications of lenti score (\u003cstrong\u003eFigure S2R\u003c/strong\u003e). Together, this established that DoseH-seq captures the impact of dosage level-resolved TF overexpression in single cells.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eDoseH-seq reveals dose-dependent effects of \u003cem\u003eNfix\u003c/em\u003e overexpression during reprogramming\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe used DoseH-seq to track the impact of \u003cem\u003eNfix\u003c/em\u003e overexpression levels along the reprogramming time course to pluripotency. Directed Force Layout (DFL) plots displayed a transition from fibroblast to endogenous pluripotency marker expression (\u003cstrong\u003eFigures 2A\u003c/strong\u003e,\u003cstrong\u003e\u0026nbsp;2B, S3A\u0026nbsp;\u003c/strong\u003eand\u003cstrong\u003e\u0026nbsp;S3B\u003c/strong\u003e) and revealed that BFP control cells (\u003cstrong\u003eFigure 2A\u003c/strong\u003e) followed a different reprogramming trajectory to cells that overexpress \u003cem\u003eNfix\u003c/em\u003e (\u003cstrong\u003eFigure 2B\u003c/strong\u003e). ChromVAR deviances calculated on NFIX\u003csup\u003e+\u003c/sup\u003e cCREs confirmed that early reprogramming \u003cem\u003eNfix\u003c/em\u003e overexpressing cells were characterised by highly elevated NFIX motif activity (\u003cstrong\u003eFigure S3C\u003c/strong\u003e). 
However, THY1\u003csup\u003e-\u003c/sup\u003e cells from the NFIX group progressing beyond day 6 appeared to have silenced lentiviral inserts (unlike the BFP control), suggesting that sustained \u003cem\u003eNfix\u003c/em\u003e overexpression is not compatible with pluripotency progression (\u003cstrong\u003eFigures 2E\u0026nbsp;\u003c/strong\u003eand\u003cstrong\u003e\u0026nbsp;2F\u003c/strong\u003e).\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eDendrogram analysis of average transcriptional similarity between conditions revealed that the progressing NFIX late-stage reprogramming (THY1\u003csup\u003e-\u003c/sup\u003e) cells were most similar to the mESCs (\u003cstrong\u003eFigure 2C\u003c/strong\u003e). In contrast, the late-stage BFP THY1\u003csup\u003e-\u003c/sup\u003e cells clustered with the earlier day 6 reprogramming time point. CytoTRACE2\u003csup\u003e25\u003c/sup\u003e analysis, which quantifies developmental potential, showed that the late-stage NFIX THY1\u003csup\u003e-\u003c/sup\u003e cells had the highest pluripotency scores following mESCs (\u003cstrong\u003eFigures 2D\u003c/strong\u003e and \u003cstrong\u003eS3D\u003c/strong\u003e). NFIX THY1\u003csup\u003e-\u003c/sup\u003e cells also displayed higher activation of the pluripotency marker \u003cem\u003eEpcam\u003c/em\u003e compared to late-stage BFP THY1\u003csup\u003e-\u003c/sup\u003e cells (\u003cstrong\u003eFigure S3E\u003c/strong\u003e), with the opposite pattern for the fibroblast marker \u003cem\u003eCd44\u003c/em\u003e (\u003cstrong\u003eFigure S3F\u003c/strong\u003e).\u003c/p\u003e\n\u003cp\u003eAs somatic \u003cem\u003eNfix\u003c/em\u003e activity is silenced as cells progress towards a pluripotent state (\u003cstrong\u003eFigures S3C\u003c/strong\u003e and \u003cstrong\u003e2F\u003c/strong\u003e), we explored dosage-specific effects of \u003cem\u003eNfix\u003c/em\u003e overexpression on days 2 and 6 of reprogramming. 
At day 2, increasing \u003cem\u003eNfix\u003c/em\u003e overexpression was associated with increased numbers of opening DARs (\u003cstrong\u003eFigure 2G\u003c/strong\u003e, \u003cstrong\u003eTable S3\u003c/strong\u003e). Interestingly, overlaying chromatin module scores of cCREs previously defined by us as stable during rMEF reprogramming\u003csup\u003e17\u003c/sup\u003e revealed a loss of accessibility within the NFIX day 2 cells (\u003cstrong\u003eFigure 2H\u003c/strong\u003e). This implies that the additional NFIX, in synergy with OKSM, induced chromatin opening and destabilised the somatic chromatin landscape in a more pronounced manner than OKSM alone (\u003cstrong\u003eFigure 2H\u003c/strong\u003e). At day 6 of reprogramming, the highest number of DARs relative to day 0 rMEFs were in the \u003cem\u003eNfix\u0026nbsp;\u003c/em\u003elow cells (\u003cstrong\u003eFigure 2I\u003c/strong\u003e). This indicates that only a subset of lower \u003cem\u003eNfix\u0026nbsp;\u003c/em\u003eoverexpressing cells can progress to pluripotency, while cells with high levels of \u003cem\u003eNfix\u003c/em\u003e are refractory to reprogramming. In support of this, module scores of mESC marker genes (relative to rMEFs) at day 6 showed that only the low \u003cem\u003eNfix\u003c/em\u003e cells downregulated somatic (\u003cstrong\u003eFigure 2J\u003c/strong\u003e), or upregulated mESC genes (\u003cstrong\u003eFigure 2K\u003c/strong\u003e), compared to day 6 BFP cells.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eOur DoseH-seq analysis suggests that different \u003cem\u003eNfix\u003c/em\u003e overexpression dosages early during reprogramming may act to either enhance or block reprogramming. 
To confirm this, we used\u0026nbsp;fluorescence-activated cell sorting\u0026nbsp;(FACS) to sort rMEFs overexpressing \u003cem\u003eNfix\u003c/em\u003e into low, mid or high stratifications (via TagBFP2 levels linked to the \u003cem\u003eNfix\u003c/em\u003e cassette via an internal ribosome entry site [IRES], \u003cstrong\u003eFigures 2L–N\u003c/strong\u003e) and assess reprogramming outcomes (\u003cstrong\u003eFigures 2O\u0026nbsp;\u003c/strong\u003eand\u003cstrong\u003e\u0026nbsp;2P\u003c/strong\u003e). Compared to BFP only cells, sorted \u003cem\u003eNfix\u003c/em\u003e-overexpressing cells revealed a progressive increase in \u003cem\u003eNfix\u003c/em\u003e transcript levels from low to high stratifications (~2-15-fold above native levels, \u003cstrong\u003eFigure 2N\u003c/strong\u003e). Evaluating the percentage of OCT4-GFP\u003csup\u003e+\u003c/sup\u003e cells at the end of reprogramming revealed that only the ‘mid’ \u003cem\u003eNfix\u003c/em\u003e population showed a sharp and significant increase in both the percentage of iPSCs (\u003cstrong\u003eFigures 2O\u0026nbsp;\u003c/strong\u003eand\u003cstrong\u003e\u0026nbsp;2P)\u0026nbsp;\u003c/strong\u003eand absolute iPSC numbers per well\u003cstrong\u003e\u0026nbsp;(Figure S3G)\u003c/strong\u003e, confirming that \u003cem\u003eNfix\u0026nbsp;\u003c/em\u003eexerts a dose-dependent effect on reprogramming as predicted by DoseH-seq. Despite starting with sorted BFP\u003csup\u003e+\u003c/sup\u003e cells, \u0026gt;85% of fully-reprogrammed OCT4-GFP\u003csup\u003e+\u003c/sup\u003e cells from the \u003cem\u003eNfix\u003c/em\u003e overexpression group had silenced expression of the fluorescent reporter, unlike the BFP only control (\u003cstrong\u003eFigures 2Q\u003c/strong\u003e and \u003cstrong\u003eS3H\u003c/strong\u003e). 
This suggests either that only cells that randomly silence their \u003cem\u003eNfix\u003c/em\u003e-expressing viral inserts can progress or that the more extreme chromatin rearrangements in response to OKSM + \u003cem\u003eNfix\u003c/em\u003e overexpression trigger viral silencing. Given that by day 6, it is the \u003cem\u003eNfix\u0026nbsp;\u003c/em\u003elow cells progressing to pluripotency, we surmised that lentiviral silencing (\u003cstrong\u003eFigure 2F\u003c/strong\u003e) results in the earlier ‘mid’ population transitioning to ‘low’ as part of pluripotency induction. Thus, \u003cem\u003eNfix\u003c/em\u003e overexpression can synergize with the Yamanaka factors in a dosage-dependent manner early during reprogramming to destabilise chromatin and drive pluripotency acquisition.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e\u003cem\u003eNfix\u003c/em\u003e\u003c/strong\u003e\u003cstrong\u003e\u0026nbsp;overexpression drives a limited program of developmental reversion in fibroblasts\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe used our DoseH-seq data to better understand the effect of \u003cem\u003eNfix\u003c/em\u003e overexpression stratifications on the starting fibroblast population. Fibroblasts in culture differentiate into myofibroblasts, and at higher frequencies when plated at low densities as with reprogramming\u003csup\u003e26\u003c/sup\u003e. A recent study showed that the Yamanaka factors reverse this so-called “mesenchymal-drift” during iPSC generation or in response to transient reprogramming\u003csup\u003e14\u003c/sup\u003e. Indeed, we observed that markers of myofibroblasts including \u003cem\u003eCthrc1\u003c/em\u003e, \u003cem\u003eActa2\u003c/em\u003e and \u003cem\u003ePtn\u003c/em\u003e\u003cem\u003e\u003csup\u003e27\u003c/sup\u003e\u003c/em\u003e were expressed in the BFP-only control rMEFs, but downregulated in \u003cem\u003eNfix\u003c/em\u003e-overexpressing rMEFs (\u003cstrong\u003eFigure S4A\u003c/strong\u003e). 
In contrast, markers of resting fibroblasts\u003csup\u003e27\u003c/sup\u003e were upregulated in NFIX rMEFs.\u0026nbsp;We therefore asked whether \u003cem\u003eNfix\u003c/em\u003e overexpression alone can also drive myofibroblast de-differentiation.\u003c/p\u003e\n\u003cp\u003eMotif enrichment of DARs (Nfix overexpressing fibroblasts vs BFP control) showed that in addition to NFI, opening DARs contained an enrichment of AP-1 motifs, including JUN/FOS, as well as STAT3 and cell identity TFs for fibroblast lineage cells including TEAD1, RUNX1 and TCF21 (\u003cstrong\u003eFigure 3A\u003c/strong\u003e). Interestingly, closing DARs showed a similar enrichment pattern for AP-1 and cell identity factors (\u003cstrong\u003eFigure 3B\u003c/strong\u003e). Both opening and closing motif signals became progressively stronger with increasing \u003cem\u003eNfix\u003c/em\u003e overexpression. This implies that NFIX-linked chromatin opening is causing redistribution of AP-1 and cell identity TFs away from cCREs of the starting cell state.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eWe correlated genes differentially expressed between NFIX low rMEFs and BFP rMEFs with marker genes from a pan-tissue fibroblast atlas\u003csup\u003e28\u003c/sup\u003e. 
The top hit was a strong negative correlation with myofibroblasts (\u003cstrong\u003eFigure 3C\u003c/strong\u003e), suggesting that even low levels of \u003cem\u003eNfix\u003c/em\u003e overexpression drive myofibroblast de-differentiation,\u0026nbsp;consistent with reversal of mesenchymal drift.\u0026nbsp;We likewise observed a loss of myofibroblast identity in BFP only cells at day 6 of reprogramming (\u003cstrong\u003eFigure S4B\u003c/strong\u003e), while the day 6 NFIX low cells gained a mesenchymal stromal cell (MSC) identity (\u003cstrong\u003eFigure S4C\u003c/strong\u003e).\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eOur results indicate that NFIX alone, a somatic TF, can drive a limited program of de-differentiation that is enhanced during reprogramming with OKSM to shed the somatic identity.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eShared regulatory\u003c/strong\u003e\u003cstrong\u003e\u0026nbsp;element activation by \u003cem\u003eNfix\u003c/em\u003e in reprogramming and PSC differentiation\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eNfix\u003c/em\u003e overexpression is largely silenced by the end of reprogramming, implying incompatibility with pluripotency maintenance (\u003cstrong\u003eFigures 2Q\u0026nbsp;\u003c/strong\u003eand \u003cstrong\u003eS3H\u003c/strong\u003e). Indeed, we overexpressed \u003cem\u003eNfix\u003c/em\u003e in mouse PSCs and found it drove differentiation towards a fibroblast-like state (\u003cstrong\u003eFigure\u0026nbsp;\u003c/strong\u003e\u003cstrong\u003e3D\u003c/strong\u003e). To understand the relationship between NFIX activity in differentiation from PSCs and during reprogramming, we performed a second DoseH-seq experiment (\u003cstrong\u003eFigures 3E\u0026nbsp;\u003c/strong\u003eand\u003cstrong\u003e\u0026nbsp;S4D–M\u003c/strong\u003e; see below for more details). 
Samples included an early differentiation time course of mESCs (cultured feeder-free) overexpressing \u003cem\u003eNfix\u003c/em\u003e or BFP only control (\u003cstrong\u003eFigure 3E\u003c/strong\u003e).\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eFollowing integration with the first DoseH-seq run, we correlated chromatin accessibility changes in differentiation (vs mESCs) to reprogramming time points (also vs mESCs) (\u003cstrong\u003eFigure 3F\u003c/strong\u003e). The strongest correlation was between the reprogramming day 2 NFIX cells and the differentiation day 2+3 NFIX cells, implying that NFIX drives a shared program of accessibility remodelling between these two processes (\u003cstrong\u003eFigures 3F\u0026nbsp;\u003c/strong\u003eand\u003cstrong\u003e\u0026nbsp;3G\u003c/strong\u003e). To investigate this further, we compared DARs (\u003cstrong\u003eTable S5\u003c/strong\u003e) for the differentiating NFIX D2+3 cells (vs mESCs) to NFIX reprogramming day 2 cells (vs BFP rMEFs), which revealed a strong positive correlation in overall chromatin accessibility changes (\u003cstrong\u003eFigure 3H\u003c/strong\u003e). We next focussed specifically on NFIX-induced changes during reprogramming by calculating day 2 reprogramming DARs relative to the \u003cem\u003eNfix\u003c/em\u003e-overexpressing rMEFs and overlapping the opening DARs with the differentiation DARs. We found that 670 (42%) of differentiation DARs overlapped with reprogramming DARs (\u003cstrong\u003eFigure 3I\u003c/strong\u003e), representing a significantly higher overlap than with the day 2 BFP DARs (2.5%) (\u003cstrong\u003eFigure\u0026nbsp;\u003c/strong\u003e\u003cstrong\u003eS4N\u003c/strong\u003e).\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eMotif enrichment among these different DAR sets revealed significant enrichment for NFI as expected (\u003cstrong\u003eFigure 3J\u003c/strong\u003e). 
However, another shared feature of these DARs was OCT, SOX and KLF motifs. We considered that NFIX could be targeting regulatory elements that contain low-affinity OSK binding sites and that this could be contributing to its ability to enhance reprogramming in synergy with OSK. We analysed motif scores for OCT4, SOX2 and the OCT4:SOX2 heterodimer (\u003cstrong\u003eFigures S4O–Q\u003c/strong\u003e), and found that \u003cem\u003eNfix\u003c/em\u003e overexpression in both reprogramming and differentiation was opening regions with lower affinity OCT4:SOX2 scores (\u003cstrong\u003eFigure S4O\u003c/strong\u003e), although this was less pronounced for SOX2 alone (\u003cstrong\u003eFigure S4P\u003c/strong\u003e). Gene ontology (GO) term analysis of genes linked to opening DARs in response to NFIX-mediated differentiation revealed enrichment for developmental processes (\u003cstrong\u003eFigure S4R\u003c/strong\u003e). A subset of these developmental genes was also linked to DARs opening on day 2 during OKSM reprogramming in combination with \u003cem\u003eNfix\u0026nbsp;\u003c/em\u003eoverexpression (\u003cstrong\u003eFigure 3K\u003c/strong\u003e).\u003c/p\u003e\n\u003cp\u003eIn summary, forced \u003cem\u003eNfix\u003c/em\u003e overexpression in PSCs drives differentiation by re-opening transient reprogramming elements with NFI/OCT/SOX/KLF motifs, suggesting they are early-developmental regulators.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e\u003cem\u003eNfix\u003c/em\u003e\u003c/strong\u003e\u003cstrong\u003e\u0026nbsp;overexpression synergizes with\u0026nbsp;\u003c/strong\u003e\u003cstrong\u003e\u003cem\u003eOct4\u003c/em\u003e\u003c/strong\u003e\u003cstrong\u003e,\u0026nbsp;\u003c/strong\u003e\u003cstrong\u003e\u003cem\u003eSox2\u003c/em\u003e\u003c/strong\u003e\u003cstrong\u003e\u0026nbsp;and \u003cem\u003eKlf4\u003c/em\u003e in a context-dependent manner\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eDoseH-seq revealed that \u003cem\u003eNfix\u003c/em\u003e can 
boost reprogramming in a dose-specific manner. We next sought to unravel the molecular mechanisms underlying this dosage effect. At day 2, NFIX mid cells had 77% more uniquely opening DARs than BFP only cells, compared to their uninduced rMEF controls (\u003cstrong\u003eFigure 4A\u003c/strong\u003e). Motif enrichment of opening DARs unique to the low, mid and high \u003cem\u003eNfix\u003c/em\u003e subsets revealed that NFIX mid cells displayed a more balanced enrichment pattern for both OSK and somatic TFs (\u003cstrong\u003eFigures S5A\u003c/strong\u003e and \u003cstrong\u003e4B\u003c/strong\u003e). Compared to BFP cells, NFIX mid cells opened increased numbers of cCREs positive for SOX2 and KLF4 motifs, while NFIX low cells showed fewer uniquely opening OCT4/SOX2 binding sites. In contrast, although NFIX high cells opened excess SOX2 and KLF4 binding sites, this was accompanied by very strong enrichment for reprogramming barrier AP-1\u003csup\u003e14,16\u003c/sup\u003e (\u003cstrong\u003eFigure 4B\u003c/strong\u003e). This motif-count analysis reflects high-confidence motif calls; however, day 2 NFIX-unique opening regions also contain lower-affinity OCT4:SOX2 and OCT4 sites (\u003cstrong\u003eFigure S4O–Q\u003c/strong\u003e).\u003c/p\u003e\n\u003cp\u003eBy day 6, \u003cem\u003eNfix\u003c/em\u003e overexpressing cells progressing to pluripotency (\u003cstrong\u003eFigures 2I-K\u003c/strong\u003e) express low levels of \u003cem\u003eNfix\u003c/em\u003e lenti barcodes (\u003cstrong\u003eFigure 2F\u003c/strong\u003e). Analysis of DARs uniquely opening in day 6 NFIX low cells compared to BFP control (\u003cstrong\u003eFigure 4C\u003c/strong\u003e) revealed strong enrichment and excess binding sites for OCT4, SOX2 and KLF4 motifs (\u003cstrong\u003eFigures 4D\u0026nbsp;\u003c/strong\u003eand\u003cstrong\u003e\u0026nbsp;S5B\u003c/strong\u003e). 
We generated chromatin module scores for DARs uniquely opening in day 6 NFIX low cells, which were highest in late-stage reprogramming THY1\u003csup\u003e-\u003c/sup\u003e cells and iPSCs (\u003cstrong\u003eFigure S5C\u003c/strong\u003e). In contrast, the unique BFP-opening DARs displayed highest activation in earlier reprogramming cell states (\u003cstrong\u003eFigure S5D\u003c/strong\u003e). In summary, mid-levels of \u003cem\u003eNfix\u003c/em\u003e overexpression early during reprogramming help open chromatin positive for OCT4, SOX2 and KLF4 binding sites and lead to more pronounced activation of late-stage reprogramming cCREs.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eSynergistic effects between \u003cem\u003eNfix\u003c/em\u003e and Yamanaka factors might be more pronounced when the reprogramming factors are expressed at low or insufficient levels. To explore this, we tested \u003cem\u003eNfix\u003c/em\u003e overexpression in a low-efficiency (~0.06%) reprogramming system based on an inducible polycistronic \u003cem\u003eSox2\u003c/em\u003e, \u003cem\u003eKlf4\u003c/em\u003e and cMyc (SKM) cassette\u003csup\u003e22,23\u003c/sup\u003e introduced into wild type (WT) MEFs along with the \u003cem\u003eNfix\u003c/em\u003e or \u003cem\u003eTagBFP2-\u003c/em\u003eonly overexpressing viruses (\u003cstrong\u003eFigure 4E\u003c/strong\u003e). \u003cem\u003eNfix\u003c/em\u003e overexpression resulted in ~10-fold more cells positive for pluripotency markers\u003csup\u003e29,30\u003c/sup\u003e at the end of SKM reprogramming compared to BFP control (\u003cstrong\u003eFigures 4F\u0026nbsp;\u003c/strong\u003eand\u003cstrong\u003e\u0026nbsp;4G\u003c/strong\u003e). 
This suggests that \u003cem\u003eNfix\u003c/em\u003e overexpression can also have reprogramming enhancing effects in the low efficiency SKM system lacking \u003cem\u003eOct4\u003c/em\u003e.\u003c/p\u003e\n\u003cp\u003eVitamin C is known to enhance reprogramming, including by boosting the activity of epigenetic remodeller complexes attracted by reprogramming factors\u003csup\u003e31-33\u003c/sup\u003e, a mechanism recently associated with NFI\u003csup\u003e34\u003c/sup\u003e. We therefore asked whether \u003cem\u003eNfix\u003c/em\u003e overexpression might enhance reprogramming through the activity of Vitamin C dependent epigenetic remodeller complexes. We performed a transcriptional similarity analysis between all samples (\u003cstrong\u003eFigure 1D),\u003c/strong\u003e including BFP control samples reprogrammed under Vitamin C-containing (Vc\u003csup\u003e+\u003c/sup\u003e) and Vitamin C-free (Vc\u003csup\u003e-\u003c/sup\u003e) conditions. This showed that both late-stage reprogramming BFP Vc\u003csup\u003e+\u003c/sup\u003e and \u003cem\u003eNfix\u003c/em\u003e overexpressing (Vc\u003csup\u003e-\u003c/sup\u003e) cells matched closest to mESCs, in contrast to BFP Vc\u003csup\u003e-\u003c/sup\u003e cells (\u003cstrong\u003eFigure 4H\u003c/strong\u003e). This indicates that \u003cem\u003eNfix\u003c/em\u003e overexpression mirrors the reprogramming enhancing effects of Vitamin C. Considering Vitamin C boosts the activity of OKSM/NFIX-attracted chromatin remodeller complexes, lower \u003cem\u003eNfix\u003c/em\u003e overexpression levels might be required for beneficial effects on reprogramming outcomes in the presence of Vitamin C. 
To test this hypothesis, we overexpressed \u003cem\u003eNfix\u003c/em\u003e in the presence of Vitamin C and used FACS to stratify cells into low, mid and high levels of \u003cem\u003eNfix\u003c/em\u003e overexpression for reprogramming experiments (\u003cstrong\u003eFigure\u0026nbsp;\u003c/strong\u003e\u003cstrong\u003e4I\u003c/strong\u003e) as before (\u003cstrong\u003eFigures 2L\u0026nbsp;\u003c/strong\u003eand\u003cstrong\u003e\u0026nbsp;2M\u003c/strong\u003e). Under Vitamin C-containing conditions, only cells overexpressing \u003cem\u003eNfix\u003c/em\u003e at low levels showed a significantly increased percentage of OCT4-GFP\u003csup\u003e+\u003c/sup\u003e cells (\u003cstrong\u003eFigures 4J\u0026nbsp;\u003c/strong\u003eand\u003cstrong\u003e\u0026nbsp;S5E\u003c/strong\u003e), unlike under Vitamin C-free conditions where mid \u003cem\u003eNfix\u003c/em\u003e overexpression enhanced reprogramming (\u003cstrong\u003eFigures 2O\u0026nbsp;\u003c/strong\u003eand\u003cstrong\u003e\u0026nbsp;2P\u003c/strong\u003e).\u003c/p\u003e\n\u003cp\u003eWe noted that high \u003cem\u003eNfix\u003c/em\u003e overexpression levels under Vitamin C conditions led to a significant reduction in reprogramming outcomes (\u003cstrong\u003eFigures 4J\u0026nbsp;\u003c/strong\u003eand\u003cstrong\u003e\u0026nbsp;S5E\u003c/strong\u003e). This is consistent with a previous report that \u003cem\u003eNfix\u003c/em\u003e overexpression using a TRE promoter under Vitamin C-containing conditions blocks reprogramming\u003csup\u003e22\u003c/sup\u003e. Comparing \u003cem\u003eNfix\u003c/em\u003e overexpression from two different promoters (TRE vs EF1α) during Vc\u003csup\u003e+\u003c/sup\u003e reprogramming showed that only the TRE construct blocked reprogramming (\u003cstrong\u003eFigure S5G\u003c/strong\u003e) and that it overexpressed \u003cem\u003eNfix\u003c/em\u003e at significantly higher levels than the EF1α promoter construct (\u003cstrong\u003eFigure S5F\u003c/strong\u003e). 
\u003cem\u003eNfix\u003c/em\u003e overexpression via both promoter constructs enhanced reprogramming under Vc\u003csup\u003e-\u003c/sup\u003e conditions, although this was most pronounced for the EF1α construct with lower overexpression levels (\u003cstrong\u003eFigures\u0026nbsp;\u003c/strong\u003e\u003cstrong\u003eS5G\u0026nbsp;\u003c/strong\u003eand\u003cstrong\u003e\u0026nbsp;S5H\u003c/strong\u003e). Overexpression of \u003cem\u003eNfi\u003c/em\u003e family member \u003cem\u003eNfia\u003c/em\u003e via the EF1α promoter (\u003cstrong\u003eTable S\u003c/strong\u003e\u003cstrong\u003e2.2\u003c/strong\u003e) likewise enhanced reprogramming outcomes, demonstrating that this effect is not \u003cem\u003eNfix\u003c/em\u003e-specific (\u003cstrong\u003eFigure S5I\u003c/strong\u003e).\u003c/p\u003e\n\u003cp\u003eOur data indicate that low \u003cem\u003eNfix\u003c/em\u003e overexpression can support reprogramming factors (at potentially limited levels) to open chromatin early during reprogramming (\u003cstrong\u003eFigures 2G\u0026nbsp;\u003c/strong\u003eand\u003cstrong\u003e\u0026nbsp;2I\u003c/strong\u003e). We hypothesised that additional \u003cem\u003eOct4\u003c/em\u003e, \u003cem\u003eSox2\u003c/em\u003e or \u003cem\u003eKlf4\u003c/em\u003e might counter this supporting effect. 
Indeed, overexpression of additional \u003cem\u003eOct4, Sox2 or Klf4\u003c/em\u003e in\u0026nbsp;rMEFs, with or without \u003cem\u003eNfix\u0026nbsp;\u003c/em\u003eoverexpression (EF1α\u0026nbsp;promoter driven),\u0026nbsp;abrogated the reprogramming-enhancing effects of \u003cem\u003eNfix\u003c/em\u003e overexpression\u0026nbsp;(\u003cstrong\u003eFigures 4K\u003c/strong\u003e and\u003cstrong\u003e\u0026nbsp;S5J\u003c/strong\u003e).\u0026nbsp;This suggests that under Vitamin C-free conditions, \u003cem\u003eNfix\u003c/em\u003e overexpression may compensate for insufficient levels of reprogramming factors, in particular \u003cem\u003eOct4\u003c/em\u003e, as its addition enhanced rMEF reprogramming outcomes (\u003cstrong\u003eFigure 4K\u003c/strong\u003e). Under Vc\u003csup\u003e+\u003c/sup\u003e conditions, elevated expression of \u003cem\u003eOct4\u003c/em\u003e, \u003cem\u003eSox2\u003c/em\u003e or \u003cem\u003eKlf4\u003c/em\u003e did not improve generation of iPSCs (\u003cstrong\u003eFigure S5J\u003c/strong\u003e), and\u0026nbsp;combined \u003cem\u003eOct4\u0026nbsp;\u003c/em\u003eplus \u003cem\u003eNfix\u003c/em\u003e overexpression\u0026nbsp;resulted in a significant\u0026nbsp;decrease\u0026nbsp;in OCT4-GFP\u003csup\u003e+\u003c/sup\u003e iPSCs\u0026nbsp;(\u003cstrong\u003eFigure S5J\u003c/strong\u003e). The mechanistic basis for this decrease is investigated in the next section.\u003c/p\u003e\n\u003cp\u003eCollectively, our data suggest that when OSKM are insufficient (particularly \u003cem\u003eOct4\u003c/em\u003e), low-to-intermediate \u003cem\u003eNfix\u003c/em\u003e overexpression enhances reprogramming by promoting chromatin opening. 
In contrast, under higher OSKM levels \u003cem\u003eNfix\u003c/em\u003e overexpression blocks reprogramming, likely by reinforcing the somatic state.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eMulti-perturb DoseH-seq reveals combinatorial TF interactions enhancing or blocking reprogramming\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eA key feature of DoseH-seq is its capacity for tracking multiple perturbations in the same cell. We next sought to demonstrate the multi-perturb capabilities of DoseH-seq using dual overexpression and overexpression/knockdown experiments during reprogramming, for which we selected two factor pairs. Firstly, we selected \u003cem\u003eOct4\u0026nbsp;\u003c/em\u003eto overexpress in combination with \u003cem\u003eNfix\u003c/em\u003e to unravel their combinatorial dosage role in blocking reprogramming under Vitamin C conditions (\u003cstrong\u003eFigure S5\u003c/strong\u003e). Secondly, given that high \u003cem\u003eNfix\u003c/em\u003e blocks reprogramming, we looked at genes induced in \u003cem\u003eNfix\u003c/em\u003e-overexpressing rMEFs that may contribute to this phenotype. The AP-1 subunit \u003cem\u003ec-Jun\u003c/em\u003e,\u0026nbsp;previously reported to block reprogramming\u003csup\u003e16,18\u003c/sup\u003e,\u0026nbsp;was specifically upregulated in \u003cem\u003eNfix\u003c/em\u003e high cells (\u003cstrong\u003eFigure 5A\u003c/strong\u003e).\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eWe tested the dosage effects of \u003cem\u003ec-Jun\u003c/em\u003e overexpression in Vc\u003csup\u003e+\u003c/sup\u003e and Vc\u003csup\u003e-\u003c/sup\u003e reprogramming (\u003cstrong\u003eFigure S6A\u003c/strong\u003e), finding that all \u003cem\u003ec-Jun\u003c/em\u003e overexpression levels displayed a strong blocking effect (\u003cstrong\u003eFigure S6B\u003c/strong\u003e). 
This may reflect AP-1's established role in driving cell aging processes\u003csup\u003e14\u003c/sup\u003e, given senescence is incompatible with successful reprogramming\u003csup\u003e35-39\u003c/sup\u003e. We next performed \u003cem\u003ec-Jun\u003c/em\u003e knockdown via lentiviral shRNAs against \u003cem\u003ec-Jun\u003c/em\u003e and selected a lentiviral multiplicity of infection (MOI) driving mild (~30%) knockdown in fibroblasts, to mitigate impact on general cellular processes dependent on AP-1 activity (\u003cstrong\u003eFigures S6C\u0026nbsp;\u003c/strong\u003eand\u003cstrong\u003e\u0026nbsp;S6D\u003c/strong\u003e). Concurrent knockdown of \u003cem\u003ec-Jun\u003c/em\u003e with \u003cem\u003eNfix\u003c/em\u003e overexpression significantly enhanced reprogramming under both Vc\u003csup\u003e-\u003c/sup\u003e and Vc\u003csup\u003e+\u003c/sup\u003e conditions (\u003cstrong\u003eFigure 5B\u003c/strong\u003e), supporting a role for \u003cem\u003ec-Jun\u003c/em\u003e in limiting the reprogramming-enhancing effects of \u003cem\u003eNfix\u003c/em\u003e overexpression.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eWe performed a DoseH-seq experiment, tracking multiple perturbations during reprogramming with Vitamin C (\u003cstrong\u003eFigure 5C\u003c/strong\u003e): 1) BFP control, 2) \u003cem\u003eNfix\u0026nbsp;\u003c/em\u003eoverexpression, 3) \u003cem\u003eNfix\u003c/em\u003e overexpression plus \u003cem\u003eOct4\u003c/em\u003e overexpression (‘O+X’) and 4) \u003cem\u003eNfix\u003c/em\u003e overexpression plus \u003cem\u003ec-Jun\u003c/em\u003e knockdown (‘J+X’). We integrated the dataset with the first run (including BFP Vc\u003csup\u003e+\u003c/sup\u003e and \u003cem\u003eNfix\u003c/em\u003e overexpressing cells), with the BFP control and perturb conditions following separate reprogramming trajectories as before (\u003cstrong\u003eFigures 5D\u0026nbsp;\u003c/strong\u003eand\u003cstrong\u003e\u0026nbsp;5E\u003c/strong\u003e). 
Furthermore, the Vc\u003csup\u003e+\u003c/sup\u003e ‘perturb’ cells displayed the same chromatin destabilizing effects at day 2 (\u003cstrong\u003eFigure 5F\u003c/strong\u003e), as for Vc\u003csup\u003e-\u003c/sup\u003e \u003cem\u003eNfix\u003c/em\u003e overexpression alone (\u003cstrong\u003eFigure 2H\u003c/strong\u003e).\u003c/p\u003e\n\u003cp\u003eFor the J+X perturbation, we confirmed in starting rMEF cells that \u003cem\u003ec-Jun\u003c/em\u003e expression was significantly downregulated in J+X cells compared to \u003cem\u003eNfix\u003c/em\u003e overexpression alone (\u003cstrong\u003eFigure S6E\u003c/strong\u003e). Top motifs enriched in closing DARs for J+X high rMEF cells corresponded to AP-1 family members (\u003cstrong\u003eFigure S6F\u003c/strong\u003e) and opening DARs were enriched most highly for NFI (\u003cstrong\u003eFigure S6G\u003c/strong\u003e), confirming that DoseH-seq could capture the dual knockdown/overexpression effects.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eWe previously found that \u003cem\u003eNfix\u003c/em\u003e-induced chromatin remodelling at day 2 of OKSM induction was the precursor for enhancing reprogramming outcomes. We therefore investigated the additive impact of\u0026nbsp;J+X\u0026nbsp;at this timepoint. We used lenti scores for \u003cem\u003eNfix\u003c/em\u003e overexpression alone (\u003cstrong\u003eFigure 5G\u003c/strong\u003e) or combined J+X (\u003cstrong\u003eFigure 5H\u003c/strong\u003e) to stratify cells into low, mid and high subsets and calculated DARs relative to their rMEF controls (\u003cstrong\u003eTable S6\u003c/strong\u003e). 
Compared to \u003cem\u003eNfix\u003c/em\u003e alone, J+X drove additional chromatin opening as well as closing for all stratifications (\u003cstrong\u003eFigure 5I\u003c/strong\u003e).\u003c/p\u003e\n\u003cp\u003eTo understand the mechanistic effect of chromatin opening at different levels of overexpression/knockdown, we identified the DARs that were uniquely opening in the low, mid or high cells for \u003cem\u003eNfix\u0026nbsp;\u003c/em\u003eoverexpression alone (\u003cstrong\u003eFigure S6H\u003c/strong\u003e) or J+X (\u003cstrong\u003eFigure S6I\u003c/strong\u003e). Motif enrichment analysis indicated a shift in the balance of OCT, SOX and KLF motifs to somatic TF motifs for both \u003cem\u003eNfix\u003c/em\u003e alone (\u003cstrong\u003eFigure S6J\u003c/strong\u003e) as well as J+X (\u003cstrong\u003eFigure S6K\u003c/strong\u003e) from low to high dose, respectively. Chromatin module scores of uniquely opening day 2 DAR subsets revealed that the DARs uniquely opening in low cells for both \u003cem\u003eNfix\u003c/em\u003e alone and J+X corresponded to cCREs activated in late reprogramming intermediates (\u003cstrong\u003eFigure\u0026nbsp;\u003c/strong\u003e\u003cstrong\u003e5J\u003c/strong\u003e). In contrast, the unique high and mid DARs corresponded to cCREs showing highest activity in day 2 cells themselves (\u003cstrong\u003eFigures 5J\u0026nbsp;\u003c/strong\u003eand\u003cstrong\u003e\u0026nbsp;S6L\u003c/strong\u003e). Thus, low levels of \u003cem\u003ec-Jun\u003c/em\u003e knockdown and \u003cem\u003eNfix\u003c/em\u003e overexpression appear to cooperate to drive greater opening of late-stage reprogramming regulatory elements and shutdown of AP-1 binding site-rich somatic gene regulatory elements.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eWe next examined the combinatorial impact of \u003cem\u003eOct4\u003c/em\u003e and \u003cem\u003eNfix\u0026nbsp;\u003c/em\u003eoverexpression. 
\u003cem\u003eOct4\u003c/em\u003e lenti scores were observed at day 2 (but not day 0 as \u003cem\u003eOct4\u003c/em\u003e overexpression was driven from a DOX inducible TRE promoter) (\u003cstrong\u003eFigure 5K\u003c/strong\u003e) and motif enrichment of day 2 \u003cem\u003eOct4\u003c/em\u003e high cells revealed OCT4 motifs (\u003cstrong\u003eFigure S6M\u003c/strong\u003e). To understand why combinatorial \u003cem\u003eOct4\u003c/em\u003e and \u003cem\u003eNfix\u0026nbsp;\u003c/em\u003eoverexpression is blocking Vc\u003csup\u003e+\u003c/sup\u003e reprogramming (\u003cstrong\u003eFigure S5J\u003c/strong\u003e), we calculated DARs at day 2 (\u003cstrong\u003eTable S6\u003c/strong\u003e) and compared opening cCREs between O+X high, \u003cem\u003eNfix\u003c/em\u003e high and \u003cem\u003eNfix\u0026nbsp;\u003c/em\u003elow cells. High O+X drove excessive chromatin opening compared to high \u003cem\u003eNfix\u003c/em\u003e overexpression alone (\u003cstrong\u003eFigure 5L\u003c/strong\u003e\u003cstrong\u003e)\u003c/strong\u003e and analysis of motif scores showed that this was due to opening of cCREs with low-affinity OCT4 binding sites (\u003cstrong\u003eFigure 5M\u003c/strong\u003e).\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eWe sought to understand the difference between the blocking effects of high O+X or high \u003cem\u003eNfix\u003c/em\u003e overexpression alone and the enhancing effects of low \u003cem\u003eNfix\u003c/em\u003e overexpression. We performed motif enrichment on O+X high or \u003cem\u003eNfix\u003c/em\u003e high DARs against \u003cem\u003eNfix\u003c/em\u003e low DARs as background, and vice versa. 
While top motif enrichment for \u003cem\u003eNfix\u003c/em\u003e low-specific DARs yielded OCT4 and SOX factors (\u003cstrong\u003eFigure S6N\u003c/strong\u003e), O+X specific DARs overrepresented motifs for p53 (\u003cstrong\u003eFigure 5N\u003c/strong\u003e), a known reprogramming barrier that can cause cell cycle arrest\u003csup\u003e36-38,40,41\u003c/sup\u003e. Indeed, cell cycle scoring showed the O+X high cells had significantly lower S phase (\u003cstrong\u003eFigure 5O\u003c/strong\u003e) and G2/M phase scores (\u003cstrong\u003eFigure S6O\u003c/strong\u003e) than \u003cem\u003eNfix\u003c/em\u003e low cells. While there was no significant difference between \u003cem\u003eNfix\u0026nbsp;\u003c/em\u003ehigh and low cells, \u003cem\u003eNfix\u003c/em\u003e high-specific DARs relative to Nfix low cells likewise overrepresented p53 motifs. We confirmed an elevated p53 pathway signature in O+X high and \u003cem\u003eNfix\u003c/em\u003e high cells, relative to \u003cem\u003eNfix\u003c/em\u003e low (\u003cstrong\u003eFigure S6P\u003c/strong\u003e). This was not reflected in stratifications of D2 BFP only cells (\u003cstrong\u003eFigure S6Q\u003c/strong\u003e), indicating that viral load alone is not driving elevated p53 pathway activity.\u003c/p\u003e\n\u003cp\u003eIn summary, multi-perturb DoseH-seq reveals that high \u003cem\u003eNfix\u003c/em\u003e constrains reprogramming by driving excessive AP-1 activation and sustaining somatic cCREs. 
When \u003cem\u003eOct4\u003c/em\u003e and \u003cem\u003eNfix\u003c/em\u003e are co-overexpressed at high levels, this stress response is exacerbated, triggering p53 activation and cell-cycle arrest, further blocking reprogramming.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eOCT4/NFI dosage levels in human tumours resemble partial dedifferentiation states\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eConsidering oncogenic transformation shares many molecular aspects with reprogramming\u003csup\u003e42\u003c/sup\u003e, and OCT4 is reactivated in a subset of tumours\u003csup\u003e43,44\u003c/sup\u003e, we interrogated the OCT4–NFI dosage relationship in clinical cancer RNA-seq from TCGA.\u0026nbsp;Comparing \u003cem\u003ePOU5F1\u003c/em\u003e/OCT4 with \u003cem\u003eNFI\u003c/em\u003e-family expression (\u003cstrong\u003eFigure 6A\u003c/strong\u003e) identified a subset of tumours with OCT4 activation and intermediate \u003cem\u003eNFI\u003c/em\u003e expression (OCT4\u003csup\u003e+\u003c/sup\u003e/NFI\u003csup\u003emid\u003c/sup\u003e). This state was most frequent in kidney tumours (84% of clear cell renal cell carcinoma [ccRCC]), followed by liver-related (49%) and pancreatic (48%) tumour types (\u003cstrong\u003eFigures 6B\u003c/strong\u003e and \u003cstrong\u003eS7A\u003c/strong\u003e). In contrast, OCT4\u003csup\u003ehigh\u003c/sup\u003e/NFI\u003csup\u003elow\u003c/sup\u003e tumours were exclusive to testicular cancers (\u003cstrong\u003eFigure S7A\u003c/strong\u003e), which displayed \u003cem\u003eNANOG\u003c/em\u003e expression (\u003cstrong\u003eFigure 6C\u003c/strong\u003e) and high CytoTRACE2 pluripotency scores (\u003cstrong\u003eFigure 6D\u003c/strong\u003e). 
OCT4\u003csup\u003e+\u003c/sup\u003e/NFI\u003csup\u003emid\u003c/sup\u003e tumours showed moderate CytoTRACE2 scores, consistent with partial dedifferentiation (\u003cstrong\u003eFigures 2D\u003c/strong\u003e and \u003cstrong\u003eS3D\u003c/strong\u003e), mirroring our reprogramming data in which sustained NFI expression is incompatible with fully pluripotent-like states. Motif analysis further showed that DARs opening in ccRCC kidney tumors\u003csup\u003e45\u003c/sup\u003e were enriched for NFI and OCT motifs (\u003cstrong\u003eFigure 6E\u003c/strong\u003e).\u003c/p\u003e\n\u003cp\u003eWe assessed \u003cem\u003ePOU5F1\u003c/em\u003e transcript isoforms and detected OCT4A-consistent transcripts, the canonical reprogramming-associated isoform. Among untransformed tissue samples, kidney displayed the highest percentage of OCT4A-positive samples (\u003cstrong\u003eFigure S7B\u003c/strong\u003e), with kidney tumour types KIRP (kidney renal papillary cell carcinoma) and KIRC (ccRCC) following testicular cancer as the highest (\u003cstrong\u003eFigure S7C\u003c/strong\u003e). We next examined whether OCT4/NFI-linked transient reprogramming pathways are engaged in kidney cancer by reanalysing a kidney cancer scRNA-seq dataset (\u003cstrong\u003eFigure S7D\u003c/strong\u003e)\u003csup\u003e46\u003c/sup\u003e. ccRCC tumour cells upregulated POU5F1 relative to normal proximal tubule (PT) cells (\u003cstrong\u003eFigures 6F\u003c/strong\u003e and \u003cstrong\u003eS7E\u003c/strong\u003e), alongside increased NFI-factor expression (\u003cstrong\u003eFigure 6F\u003c/strong\u003e).\u0026nbsp;We obtained genes both upregulated and linked to opening DARs in NFIX reprogramming day 2 cells (compared to control rMEFs) that were co-positive for NFI and low affinity OCT4 motifs (\u003cstrong\u003eFigure S4Q\u003c/strong\u003e). 
Seurat module scores showed that these genes were highly elevated in ccRCC tumour cells compared to both normal PT cells and adjacent tumour cells (\u003cstrong\u003eFigure 6G\u003c/strong\u003e), as well as a kidney chromophobe (chRCC) tumour that did not display \u003cem\u003ePOU5F1\u003c/em\u003e expression (\u003cstrong\u003eFigures S7F\u003c/strong\u003e and \u003cstrong\u003eS7G\u003c/strong\u003e).\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eTogether, these data suggest OCT4/NFI-associated dedifferentiation programs uncovered by DoseH-seq may also be engaged in select human tumour contexts.\u003c/p\u003e"},{"header":"DISCUSSION","content":"\u003cp\u003eWe present DoseH-seq, a single-cell multiome assay that allows profiling of graded perturbation levels in time-course experiments at an affordable cost by capitalising on BGI sequencing technology (\u003cb\u003eTable S1\u003c/b\u003e). Multi-perturb experiments including \u003cem\u003eNfix\u003c/em\u003e, \u003cem\u003eOct4\u003c/em\u003e and c-\u003cem\u003eJun\u003c/em\u003e modulation demonstrated that DoseH-seq can be applied to investigate dual perturbations. Our examination of TF dosage effects in fibroblasts, during reprogramming and in the context of PSC differentiation further highlights the flexibility and broad applicability of DoseH-seq.\u003c/p\u003e \u003cp\u003eAging and stress processes linked to \u003cem\u003ein vitro\u003c/em\u003e culture are known to induce myofibroblast programs in fibroblasts, with persistent activation contributing to age-related disorders\u003csup\u003e\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e,\u003cspan citationid=\"CR47\" class=\"CitationRef\"\u003e47\u003c/span\u003e\u003c/sup\u003e. 
Reprogramming or partial reprogramming with Yamanaka factors can attenuate these programs, termed mesenchymal drift\u003csup\u003e\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e\u003c/sup\u003e, a finding confirmed by our current data. We previously linked declining NFI TF activity to shutdown of youthful cCREs across cell types\u003csup\u003e\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e\u003c/sup\u003e. Here, we show that \u003cem\u003eNfix\u003c/em\u003e overexpression in fibroblasts suppresses myofibroblast programs, similar to OKSM. We propose this is driven by redistribution of AP-1 transcription factors towards chromatin exposed by graded Nfix overexpression, underpinning a limited dedifferentiation program.\u003c/p\u003e \u003cp\u003eBy tracking dose-dependent \u003cem\u003eNfix\u003c/em\u003e overexpression in reprogramming, we challenge the view that somatic TFs already expressed in the starting cell type necessarily inhibit de-differentiation during reprogramming\u003csup\u003e\u003cspan additionalcitationids=\"CR17 CR18 CR19 CR20 CR21\" citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e\u003c/sup\u003e. Rather, moderate-level \u003cem\u003eNfix\u003c/em\u003e overexpression early during reprogramming can synergize with insufficient Yamanaka factors\u003csup\u003e\u003cspan citationid=\"CR48\" class=\"CitationRef\"\u003e48\u003c/span\u003e,\u003cspan citationid=\"CR49\" class=\"CitationRef\"\u003e49\u003c/span\u003e\u003c/sup\u003e to promote transient chromatin opening and collapse of the fibroblast transcriptional network (Fig.\u0026nbsp;7). In contrast, high-level \u003cem\u003eNfix\u003c/em\u003e dosage blocks reprogramming by activating AP-1 to enforce the somatic state. 
Combinatorial high overexpression of \u003cem\u003eNfix\u003c/em\u003e and \u003cem\u003eOct4\u003c/em\u003e drives excessive opening of chromatin harbouring low-affinity OCT4 motifs and increased activation of p53 pathways, culminating in cell cycle decline (Fig.\u0026nbsp;7). Conversely, low levels of c-\u003cem\u003eJun\u003c/em\u003e knockdown cooperate with moderate-level \u003cem\u003eNfix\u003c/em\u003e overexpression to drive opening of late-stage reprogramming regulatory elements by countering \u003cem\u003eNfix\u003c/em\u003e-induced AP-1 activation. Thus, DoseH-seq can reveal the dose-dependent interactions of TFs during both overexpression and knockdown, enabling unparalleled high-resolution interrogation of shifts in TF networks.\u003c/p\u003e \u003cp\u003eTransiently accessible cCREs activated during OKSM overexpression are a hallmark of reprogramming and contribute to erasure of the somatic network by exposing competing somatic TF binding sites\u003csup\u003e\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e,\u003cspan additionalcitationids=\"CR18 CR19\" citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e\u003c/sup\u003e. These transient cCREs are thought to harbor both somatic TF binding sites and lower-affinity binding sites for Yamanaka factors. Our DoseH-seq experiments in PSCs reveal that NFI induction re-engages the same transient cCREs activated during exit from pluripotency in response to \u003cem\u003eNfix\u003c/em\u003e and OKSM overexpression. 
As these regions contain motifs for NFI and pluripotency-linked TFs (OCT, SOX, KLF), \u003cem\u003eNfix\u003c/em\u003e overexpression may promote collapse of pluripotency by exposing competing sites, analogous to how the somatic network is dismantled during reprogramming (Fig.\u0026nbsp;7)\u003csup\u003e\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e,\u003cspan additionalcitationids=\"CR18 CR19\" citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e\u003c/sup\u003e. This suggests that transient regulatory elements active during iPSC generation may have bi-directional roles to drive both de-differentiation and re-differentiation. Aligned with this notion, a recent preprint study implicated epigenetic derepression of viral elements with competing pluripotency TF sites in early PSC differentiation processes\u003csup\u003e\u003cspan citationid=\"CR50\" class=\"CitationRef\"\u003e50\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eOur dose-resolved results may help reconcile reports of NFI family members acting as oncogenes or tumour suppressors in a context-dependent manner\u003csup\u003e\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e,\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e\u003c/sup\u003e, and more broadly suggest that endogenous somatic TF levels can shape cellular responses to Yamanaka-factor exposure. In this light, it is notable that mouse studies report the kidney as among the most susceptible organs to tumour formation following transient OSKM/OKS (but not KMS) induction\u003csup\u003e\u003cspan citationid=\"CR51\" class=\"CitationRef\"\u003e51\u003c/span\u003e\u003c/sup\u003e and we detected OCT4A transcripts in ccRCC and in a subset of untransformed kidney samples at higher frequency than in other untransformed tissues (Figs. S7B and S7C). 
Consistent with engagement of reprogramming-like programs, OCT4/NFI-associated gene modules linked to early transient reprogramming intermediates were elevated in ccRCC tumours. These correlative observations raise the possibility that dose windows and cell-type context may be important considerations for epigenetic rejuvenation strategies using the Yamanaka factors (e.g., in settings with elevated NFI via copy-number gains\u003csup\u003e\u003cspan additionalcitationids=\"CR53\" citationid=\"CR52\" class=\"CitationRef\"\u003e52\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR54\" class=\"CitationRef\"\u003e54\u003c/span\u003e\u003c/sup\u003e). DoseH-seq provides a scalable framework to quantify such interaction rules and support more controlled, and potentially safer, cell-state transitions.\u003c/p\u003e \u003cp\u003eIn summary, we introduce DoseH-seq as a generalisable multiome platform for mapping gene-dosage effects during cell-state transitions. We leverage DoseH-seq to explore the role of \u003cem\u003eNfix\u003c/em\u003e, linked to youthful cell states, in driving developmental reversion, differentiation and its dose-dependent interaction with Yamanaka factor expression. In doing so, DoseH-seq uncovered a class of transient reprogramming cCREs with bi-directional accessibility, linking erasure of somatic identity with controlled exit from pluripotency. 
DoseH-seq constitutes a technology platform to dissect TF dosage logic in broad contexts including adult stem cell compartments, tumour lineage programs, and engineered immune cells (e.g., CAR-T/NK), where fine control of TF activity is likely to determine regenerative potential, lineage stability, and therapeutic efficacy.\u003c/p\u003e\n\u003ch3\u003eLimitations of the study\u003c/h3\u003e\n\u003cp\u003eOur two-step hashing/lysis preserved RNA integrity (\u003cb\u003eFigure S1E\u003c/b\u003e) and yielded highly robust multiome QC data (\u003cb\u003eFigures S2J\u0026ndash;R\u003c/b\u003e and \u003cb\u003eS4G\u0026ndash;M\u003c/b\u003e) as benchmarked against the existing Multi-perturb workflows (\u003cb\u003eTable S1\u003c/b\u003e), with coherent reprogramming/differentiation trajectories (Figs.\u0026nbsp;2A\u003cb\u003e\u0026ndash;B\u003c/b\u003e, \u003cb\u003e3G\u003c/b\u003e, \u003cb\u003e5D\u0026ndash;E\u003c/b\u003e). While we did not benchmark against standard 10x nuclei isolation (no hashing), our primary conclusions rely on within-run, dose-stratified comparisons under identical chemistry. For pooled multi-perturb samples, timepoints were assigned by label transfer; analyses therefore focus on high-confidence early-stage cells and within-dataset contrasts. Where cross-experiment comparisons were required (Fig.\u0026nbsp;3), we compared effect sizes anchored to each experiment\u0026rsquo;s matched reference controls (e.g., mESC or rMEF) to prevent batch-driven artefacts. Nfix barcode detection declines at late reprogramming stages, consistent with lentiviral silencing, limiting longitudinal dose tracking. 
We did not directly profile NFIX or OCT4/SOX2 occupancy; motif and accessibility analyses provide indirect evidence of potential TF binding site competition.\u003c/p\u003e"},{"header":"METHODS","content":"\u003cp\u003e\u003cstrong\u003eAnimal studies and use of human cell cultures\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eAll animal experiments were reviewed and approved by The University of Queensland Animal Ethics Committee and the Monash Animal Services Ethics Committee and were conducted in compliance with the specified ethical regulations. Biopsy sample collection for derivation of fibroblast cultures was approved by The University of Queensland Human Ethics Committee and conducted in compliance with the specified ethical regulations.\u003c/p\u003e\n\u003ch2 id=\"_Toc194348876\"\u003eCell culture\u003c/h2\u003e\n\u003cp\u003e\u003cstrong\u003eFibroblasts and 293T cell lines\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eHuman fibroblasts (human dermal fibroblasts, HDFs), 293T cell lines, and mouse embryonic fibroblasts (MEF) were cultured in MEF media (DMEM basal medium; Thermo Fisher Scientific, Cat#11995065) supplemented with 10% (v/v) fetal bovine serum (FBS; Thermo Fisher Scientific, Cat#10099141), 1% (v/v) Penicillin-Streptomycin (Thermo Fisher Scientific, Cat#15070063), 1% (v/v) MEM-NEAA (non-essential amino acids; Thermo Fisher Scientific, Cat#11140050), and 0.1% (v/v) 55 mM 2-mercaptoethanol (Thermo Fisher Scientific, Cat#21985023). Media change was performed every two to three days.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eHuman Pluripotent Stem Cells\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eTo culture human pluripotent stem cells (H9 hPSC cell line; WiCell, WA09), 6-well plates were first coated with 5 \u0026micro;g/mL Vitronectin (VTN-N; Thermo Fisher Scientific, Cat#A14700) in DPBS (Thermo Fisher Scientific, Cat#14190144) at room temperature for 1 hr. 
H9 hPSCs were cultured in Essential 8\u0026trade; basal medium supplemented with 2% E8 Supplement (Thermo Fisher Scientific, Cat#A1517001), with daily media replacement.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eMouse Pluripotent Stem Cells\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eMouse pluripotent stem cells (mESCs or iPSCs) were cultured as previously described\u003csup\u003e55\u003c/sup\u003e in ESC media (KnockOut\u0026trade; DMEM basal medium supplemented with 15% (v/v) FBS, 1% (v/v) Penicillin-Streptomycin, 1% (v/v) MEM-NEAA, 1% (v/v) GlutaMAX\u0026trade; Supplement (Thermo Fisher Scientific, Cat#35050061), 0.1% (v/v) 55 mM 2-mercaptoethanol (Thermo Fisher Scientific, Cat#21985023), and 10 ng/mL leukemia inhibitory factor (LIF; Sigma-Aldrich, Cat#ESG1107)).\u0026nbsp;For feeder-dependent culture of mouse pluripotent stem cells, growth-inactivated MEFs (iMEFs, produced as previously described\u003csup\u003e55\u003c/sup\u003e) were seeded onto cell culture vessels coated with 0.1% gelatin (Thermo Fisher Scientific, Cat#G1393) at a density of 50,000 per cm\u003csup\u003e2\u0026nbsp;\u003c/sup\u003eas a feeder layer one day before seeding mouse pluripotent stem cells. Media change was performed every other day. For feeder layer-free culture,\u0026nbsp;iMEF depletion was performed by leveraging differences in cell attachment speed between iMEFs and mESCs. Cells (mESC cultured on iMEFs) harvested from a 6-well were reseeded into a T75 culture flask coated with 0.1% gelatin and placed back into the incubator for ~40 mins. Afterwards, the flask was examined under the microscope to ensure most iMEFs were attached. The supernatant containing the mESCs was then carefully aspirated and transferred to vessels coated with 0.1% gelatin. The iMEF-depleted mESCs were cultured in iMEF-conditioned ESC media (i.e. ESC media collected from 6 wells with iMEFs after 24 hrs and then filtered through a 0.22 \u0026micro;m membrane (Sigma-Aldrich, Cat#SLGPR33RS)). 
iMEF-conditioned mESC media was changed every day.\u003c/p\u003e\n\u003cp\u003eAll cells were maintained at 37\u0026ordm;C in a humidified incubator with 5% CO\u003csub\u003e2\u003c/sub\u003e.\u003c/p\u003e\n\u003ch2 id=\"_Toc194348880\"\u003eLentiviral particle generation and titer determination\u003c/h2\u003e\n\u003cp\u003eLentiviral transfer vectors encoding viral inserts to knock down or overexpress genes of interest were designed and purchased from VectorBuilder Inc. (see \u003cstrong\u003eTable S2.2\u003c/strong\u003e). As a reporter, blue fluorescent protein (TagBFP2) was employed in nearly all lentiviral constructs (on its own or linked to genes of interest via an internal ribosome entry site [IRES]). Lentiviruses were generated using a 2nd generation lentivirus packaging system. Briefly, 8,500,000 293T cells were seeded into a T75 flask in 15 mL MEF media one day before viral packaging. On the following day, MEF media were replaced with Opti-MEM\u0026trade; medium (Thermo Fisher Scientific, Cat#31985070). For transfection, Lipofectamine\u0026trade; 3000 reagent mix (Thermo Fisher Scientific, Cat#L3000015) and DNA vector mix (5.9 \u0026mu;g transfer vector carrying gene of interest, 10.8 \u0026mu;g psPAX2 and 7 \u0026mu;g pMD2G) were prepared separately as per manufacturer\u0026rsquo;s instructions (packaging plasmids psPAX2 and pMD2G as per Nefzger, C. M. et al., \u003cem\u003eStem Cells\u003c/em\u003e, 2011\u003csup\u003e56\u003c/sup\u003e). The two solutions were then combined and incubated at room temperature for 15 mins. Lipid-DNA complexes were then added dropwise to the T75 flask previously overlaid with Opti-MEM\u0026trade; medium, followed by gentle rocking to ensure even distribution. 
After 6 hrs of transfection, media were replaced with fresh, pre-warmed virus production media (VPM; Advanced DMEM (Thermo Fisher Scientific, Cat#12491015) supplemented with 2% (v/v) FBS, 1% (v/v) GlutaMAX\u0026trade; Supplement, 1% (v/v) MEM-NEAA, and 1% (v/v)\u0026nbsp;P/S). VPM containing lentiviral particles was collected at 24 hrs and 48 hrs post transfection and concentrated ~150 times by Amicon\u0026reg; Ultra Centrifugal Filters (Merck Life Science, Cat# UFC910024) at 3,181xg for 15 mins. Concentrated lentiviruses were aliquoted and stored in a -80\u0026ordm;C freezer until usage.\u003c/p\u003e\n\u003cp\u003eTiters of each concentrated lentivirus stock were determined through serial dilution on fibroblast cultures followed by flow cytometry analysis for quantification as described\u003csup\u003e57\u003c/sup\u003e. Briefly, MEFs were seeded at a density of 20,000 per cm\u003csup\u003e2\u0026nbsp;\u003c/sup\u003e4\u0026ndash;24 hrs ahead of infection for titration. Duplicate wells were designed for each viral dilution. On the day of titration, lentiviral concentrates were serially diluted in 10-fold steps from 1:1,000 down to 1:1,000,000 into MEF culture media containing transfection reagent polybrene (Merck, Cat# TR-1003-G) at 5.8 \u0026micro;g/mL. MEFs were overlaid with the viral dilutions followed by a centrifugal inoculation at 610xg, room temperature for 60 mins. Cells were subsequently cultured in a 5% CO\u003csub\u003e2\u003c/sub\u003e, 37\u0026ordm;C incubator. The infected fibroblasts were harvested 72 hrs post transduction and stained with propidium iodide (PI; Sigma-Aldrich, Cat#P4864) at 1:1000 dilution for flow cytometry analysis. 
Cultures containing between 1% and 30% fluorescent reporter-positive cells were used for titer calculations.\u003c/p\u003e\n\u003ch2 id=\"_Toc194348881\"\u003eGene overexpression/knockdown using lentivirus\u003c/h2\u003e\n\u003cp\u003erMEFs or wild-type MEFs were plated at a density of 20,000 per cm\u003csup\u003e2\u003c/sup\u003e in 6-well plates one day before lentiviral transduction. Viral media were prepared using target cell type culture media supplemented with 5.8 \u0026micro;g/mL polybrene and lentivirus at the indicated MOI. For overexpression, cells were infected with lentiviruses encoding genes of interest (see vector information in \u003cstrong\u003eTable S2.2\u003c/strong\u003e) at an MOI of 2.5 unless specified otherwise. Lentiviruses carrying only the \u003cem\u003eTagBFP2\u003c/em\u003e reporter at matched MOI were used as experimental controls. For \u003cem\u003ec-Jun\u003c/em\u003e knockdown, lentiviruses carrying shRNAs against \u003cem\u003ec-Jun\u003c/em\u003e (see vector information in \u003cstrong\u003eTable S2.2\u003c/strong\u003e) were used at an MOI of 2.5 or 10. Lentiviruses carrying non-targeting shRNA with scrambled sequence were used as experimental controls. Cells overlaid with virus-containing media were centrifuged at 610xg, room temperature for 1 hr and afterwards maintained in a 5% CO\u003csub\u003e2\u003c/sub\u003e, 37\u0026ordm;C incubator. Spent media containing viral particles were removed after overnight incubation. The efficiency of gene knockdown or overexpression via lentiviruses was measured by qPCR at day 3 post transduction. 
For reprogramming experiments with TF perturbations, infected rMEFs were reseeded at the indicated density at\u0026nbsp;day 3 post transduction in ESC media on 0.1% gelatin-coated tissue culture plates and reprogramming was initiated by 1 \u0026micro;g/mL DOX.\u0026nbsp;For overexpression experiments with mouse pluripotent stem cells, mouse ESCs or iPSCs (the latter carrying a \u003cem\u003eGfp\u003c/em\u003e reporter in the endogenous \u003cem\u003eOct4\u003c/em\u003e locus) were thawed and cultured on iMEFs for 1-2 passages. Prior to lentiviral infection, PSCs were depleted of iMEFs (see \u0026ldquo;Cell culture - Mouse Pluripotent Stem Cells\u0026rdquo; section) and seeded at 20,000 per cm\u003csup\u003e2\u003c/sup\u003e on vessels coated with 0.1% gelatin in conditioned ESC media supplemented with an additional 10 ng/mL LIF and allowed to attach for at least 6 hrs. Lentiviral infection was performed\u0026nbsp;via centrifugal inoculation as described for MEFs. An MOI of 10 (note that titers were determined on MEFs) was employed for each lentivirus for efficient PSC infection.\u0026nbsp;\u003c/p\u003e\n\u003ch2\u003eImmunofluorescence staining\u003c/h2\u003e\n\u003cp\u003eTo confirm \u003cem\u003eNfix\u003c/em\u003e overexpression via lentivirus in MEFs at the protein level, MEFs infected with lentivirus overexpressing either \u003cem\u003eNfix\u003c/em\u003e or \u003cem\u003eTagBFP2\u003c/em\u003e only were subjected to immunofluorescence staining. MEFs at day 5 post viral infection were fixed on tissue culture plates with 4% (w/v) paraformaldehyde (PFA; ProSciTech, Cat#C004; diluted to 4% with DPBS) at room temperature for 15 mins. Following three DPBS washes, cells were permeabilized with 0.3% (v/v) Triton X-100 (Sigma-Aldrich, Cat#X100) in DPBS at room temperature for 30 mins. 
After three washes with PBST buffer (0.1% (v/v) Tween-20 (Sigma-Aldrich, Cat# P5927) in DPBS), cells were incubated with blocking solution (2% (w/v) BSA (Sigma-Aldrich, Cat#A9418) in PBST) at room temperature for 1 hr.\u0026nbsp;For antibody labelling, mouse monoclonal anti-NFIX antibody (Sigma-Aldrich, Cat#SAB1401263; clone 3D2) was employed at 1:800 dilution for overnight incubation at 4\u0026ordm;C. Following three PBST washes, cells were stained with donkey anti-mouse IgG conjugated with Alexa Fluor\u0026trade; Plus 488 (Thermo Fisher Scientific, Cat# A32766) at 1:500 dilution in the dark at room temperature for 1 hr. For nuclei visualization,\u0026nbsp;4\u0026prime;,6-diamidino-2-phenylindole (DAPI; Sigma-Aldrich, Cat#D9542) was used at 1:1000 dilution at room temperature for 10 mins. Following the DAPI labelling step, cells were washed with PBST again and left in PBS for microscopic inspection.\u003c/p\u003e\n\u003ch2 id=\"_Toc194348884\"\u003eQuantitative Polymerase Chain Reaction (qPCR)\u003c/h2\u003e\n\u003cp\u003eHarvested cells were pelleted and washed with DPBS. Cell pellets were then lysed using TRIzol reagent (Thermo Fisher Scientific, Cat#15596018). TRIzol-treated cell pellets were stored at -80\u0026ordm;C until processing for RNA extraction. Total RNA was extracted using the Direct-zol\u0026trade; RNA Microprep kit (Zymo Research, Cat#R2062) as per manufacturer\u0026rsquo;s instructions. RNA was quantified by Nanodrop Spectrophotometer (Thermo Fisher Scientific) or Qubit\u0026trade; RNA High Sensitivity assay (Thermo Fisher Scientific, Cat#Q32855) and stored at -80\u0026ordm;C until further processing. To synthesize cDNA, 200-600 ng RNA was reverse transcribed using the QuantiTect\u0026reg; reverse transcription kit (Qiagen, Cat#205311) according to the manufacturer\u0026apos;s recommendation. To assess expression levels, qPCR was performed on a ViiA\u0026trade; 7 qPCR platform using 384-well plates (Applied Biosystems). 
Each qPCR reaction mix contained 2.5 \u0026micro;L of 2X SYBR\u0026trade; Green qPCR Master Mix (Thermo Fisher Scientific, Cat#4312704), 0.8 \u0026micro;M of each forward primer and reverse primer, 2.5 ng of cDNA template from reverse-transcribed RNA, and PCR-grade water to a total volume of 5 \u0026micro;L. Housekeeper genes mouse HPRT and mouse \u0026beta;-actin were utilized as dual internal references for normalisation. qPCR reactions were performed in triplicate for each sample and thermal cycling for qPCR was run as follows: 1 cycle of denaturation at 95\u0026ordm;C for 10 mins, followed by 40 cycles of denaturation at 95\u0026ordm;C for 15 seconds and annealing/extension at 60\u0026ordm;C for 60 seconds. \u003cstrong\u003eTable S2.3\u0026nbsp;\u003c/strong\u003elists the primer pairs used in this study.\u003c/p\u003e\n\u003ch2\u003ePluripotency induction in fibroblasts from reprogrammable mouse model\u003c/h2\u003e\n\u003cp\u003erMEFs (from E13.5 embryos as per\u003csup\u003e58\u003c/sup\u003e) were isolated from a reprogrammable mouse strain\u003csup\u003e24\u003c/sup\u003e and were heterozygous for both a DOX-inducible polycistronic cassette harbouring \u003cem\u003eOct4\u003c/em\u003e, \u003cem\u003eSox2\u003c/em\u003e, \u003cem\u003eKlf4\u003c/em\u003e, and \u003cem\u003ec-Myc\u0026nbsp;\u003c/em\u003e(TRE-OKSM) at the \u003cem\u003eCol1a1\u003c/em\u003e locus and a reverse tetracycline transactivator (m2rtTA) cassette expressed from the \u003cem\u003eRosa26\u003c/em\u003e locus. rMEFs also harboured a heterozygous GFP reporter knocked in at the end of the genomic \u003cem\u003eOct4\u003c/em\u003e locus to allow monitoring of re-activation of the endogenous pluripotency network.\u0026nbsp;For reprogramming experiments, cryopreserved rMEFs were thawed and recovered in MEF media. 
rMEFs were seeded at a density of ~50,000 per cm\u003csup\u003e2\u003c/sup\u003e onto 12-well plates coated with 0.1% gelatin and maintained in ESC media supplemented with DOX (Sigma-Aldrich, Cat#D9891) at a final concentration of 1 \u0026micro;g/mL (reprogramming media). Reprogramming of fibroblasts was initiated no later than passage 3. Media was replaced every other day or as required. Ectopic OKSM expression through DOX was maintained for 10 days, unless otherwise specified. Afterwards, DOX was withdrawn for 3 days to allow formation of stable iPSC colonies independent of ectopic OKSM expression. Cells were subsequently used for quantitative analysis of reprogramming efficiency and/or, when indicated, cultured for additional passages. For reprogramming experiments in the presence of Vitamin C (L-ascorbic acid; Sigma-Aldrich, Cat#A92902), Vitamin C was freshly added to reprogramming media at a final concentration of 50 \u0026micro;g/mL.\u003c/p\u003e\n\u003ch2 id=\"_Toc194348886\"\u003eAlkaline Phosphatase Staining\u003c/h2\u003e\n\u003cp\u003eTo visualize stable iPSC colonies (in addition to assessing \u003cem\u003eOct4\u003c/em\u003e-GFP expression\u003csup\u003e24\u003c/sup\u003e), Alkaline Phosphatase (AP) staining was performed 3 days post DOX withdrawal using Vector\u0026reg; Black Substrate Kit (Vector Laboratories, Cat#SK-5200). Briefly, cells were washed with DPBS and fixed with 4% PFA at room temperature for 15 mins. AP staining solution was freshly prepared using 100 mM Tris-HCl buffer (Thermo Fisher Scientific, Cat#15567027) with pH adjusted to 9.5 as per manufacturer\u0026rsquo;s recommendation. Cultures exposed to staining reagent were incubated at room temperature in the dark for 20\u0026ndash;35 mins. 
After incubation, staining solution was rinsed off with DPBS and subsequently with distilled water, followed by imaging.\u0026nbsp;\u003c/p\u003e\n\u003ch2 id=\"_Toc194348887\"\u003eFlow Cytometry\u003c/h2\u003e\n\u003cp\u003eCell samples were dissociated from culture plates using 0.25% EDTA/Trypsin (Thermo Fisher Scientific, Cat#25200072) at 37\u0026ordm;C for 3 mins. After washing cells once with 2% FBS in PBS via a pelleting step, cells were counted and ready for flow cytometry experiments. To measure reprogramming efficiency in rMEF experiments, cell samples from reprogramming experiments were stained with PI and fully-reprogrammed cells were quantified by assessing the percentage of live cells that had activated the endogenous \u003cem\u003eOct4\u003c/em\u003e-GFP reporter as per previous work\u003csup\u003e30,59\u003c/sup\u003e. To assess the degree of differentiation in mouse ESC/PSCs, a two-step labelling protocol was performed using established pluripotency and somatic state associated cell surface markers\u003csup\u003e30,58\u003c/sup\u003e. Briefly, cells were labelled with monoclonal anti-mouse SSEA1-biotin (Invitrogen, Cat#13-8813-82) at 1:400, followed by a second labelling step with anti-mouse EpCAM-PeCy7 (BioLegend, Cat#118215) at 1:400, anti-mouse THY1.2-PE (BD Biosciences, Cat#553006) at 1:800 and Streptavidin-APC (BD Biosciences, Cat#554067) at 1:800. Labelling solutions were prepared in 2% FBS in PBS and antibody labelling was conducted on ice for 10 mins in a volume of 100 \u0026micro;L containing ~1 million cells. DAPI (Sigma-Aldrich, Cat#D9542) at 1:1000 dilution was used for live/dead cell discrimination. To account for spectral spillover in multi-colour labelling, for each flow cytometry run either single-colour compensation beads (BD Biosciences, Cat#552845, 552843) or single-colour stained cells were used for compensation. 
Prior to flow cytometry acquisition, labelled samples were filtered through a 35 \u0026mu;m mesh strainer (Corning, Cat#352235). For flow cytometry analysis, filtered samples were loaded on a Becton Dickinson LSRFortessa\u0026trade; X-20 (BD Biosciences) and data for more than 10,000 live cells were recorded. For fluorescence-activated cell sorting (FACS), filtered samples were loaded on either a BD FACSAria Fusion (BD Biosciences) or a MoFlo Astrios Cell Sorter (Beckman Coulter Life Sciences) to isolate specific cell populations. All FACS experiments were performed using a 100-\u0026mu;m nozzle setup. Flow cytometry data were processed with FlowJo (v10) software for analysis and visualization.\u003c/p\u003e\n\u003ch2 id=\"_Toc194348888\"\u003eSKM-mediated pluripotency induction\u003c/h2\u003e\n\u003cp\u003eTo perform SKM-mediated reprogramming, wild-type MEFs (isolated from E13.5 embryos as per\u003csup\u003e58\u003c/sup\u003e) at passage 1 were infected with lentiviruses carrying an inducible TRE-SKM cassette (\u003cstrong\u003eTable S2.2\u003c/strong\u003e, VB230504-1547mzs; a DOX-inducible polycistronic SKM cassette\u003csup\u003e48,49\u003c/sup\u003e) at an MOI of 10 and \u003cem\u003em2rtTA\u003c/em\u003e-expressing lentivirus (\u003cstrong\u003eTable S2.2\u003c/strong\u003e, VB220105-1463rbh) at an MOI of 3. For testing the effects of \u003cem\u003eNfix\u003c/em\u003e overexpression on SKM-induced reprogramming, cells were co-spin-infected with \u003cem\u003eNfix\u003c/em\u003e-overexpressing lentivirus (\u003cstrong\u003eTable S2.2\u003c/strong\u003e, VB201207-1202ujp). As a control, lentivirus carrying \u003cem\u003eTagBFP2\u003c/em\u003e only (\u003cstrong\u003eTable S2.2\u003c/strong\u003e, VB191219-1157xye) was used for coinfection. At 72 hrs after lentiviral infection, MEFs were reseeded onto 0.1% gelatin-coated 12-well plates at 15,000-30,000 cells per well. 
To start reprogramming by inducing SKM expression, 1 \u0026micro;g/mL DOX was added to ESC media containing 50 \u0026micro;g/mL Vitamin C (note that in our hands reprogramming with the inefficient SKM system was only possible in the presence of Vitamin C). Media were replaced every other day or as required. On day 11 or 12 post SKM induction, DOX was withdrawn for 3 days before quantification via flow cytometry. Due to the lack of an endogenous OCT4-GFP reporter in wild-type MEFs, surface marker labelling was performed (as per \u003cstrong\u003eFlow Cytometry\u0026nbsp;\u003c/strong\u003esection) to quantify iPSC generation on day 3 post DOX removal using an established cell surface marker panel\u003csup\u003e30,58\u003c/sup\u003e. Briefly, cells were labelled with monoclonal anti-mouse SSEA1-biotin (Invitrogen, Cat#13-8813-82) at 1:400, followed by a second labelling step with anti-mouse EpCAM-PeCy7 (BioLegend, Cat#118215) at 1:400, anti-mouse THY1.2-FITC (BD Biosciences, Cat#553003) at 1:400 and Streptavidin-APC-Cy\u0026trade;7 (BD Biosciences, Cat#554063) at 1:400. DAPI (Sigma-Aldrich, Cat#D9542) at 1:1000 dilution was used for live/dead cell discrimination.\u003c/p\u003e\n\u003ch2\u003eStatistical Analysis (Wet-lab experiments)\u003c/h2\u003e\n\u003cp\u003eStatistical analysis for wet-lab experiments and qPCR was performed with GraphPad Prism 9.0 (GraphPad Software Inc), and statistical significance was defined as P \u0026lt; 0.05. An unpaired or paired (as appropriate) two-tailed t-test was used to assess differences between two groups with \u0026ge;3 replicates. Statistical comparisons among multiple groups were performed using one-way ANOVA with Dunnett\u0026rsquo;s multiple comparison test. 
Data in bar charts are presented as Mean \u0026plusmn; Standard Error of the Mean (SEM) unless specified differently.\u003c/p\u003e\n\u003ch2 id=\"_Toc194348889\"\u003eDesign of barcoded lentiviral constructs\u003c/h2\u003e\n\u003cp\u003eFor barcoded lentiviral constructs in \u0026ldquo;sense\u0026rdquo; direction (used for DoseH-seq experiments), the barcoding cassette was placed right before the 3\u0026rsquo;LTR of the lentiviral expression constructs. For \u0026ldquo;antisense\u0026rdquo;/inverted barcoded lentivirus constructs, a pA element (polyadenylation; BGH pA) was inserted at the end of the antisense-oriented promoter-ORF cassette and\u0026nbsp;the barcoding cassette placed upstream of pA element. Below is the general sequence design for the uniquely barcoded cassettes integrated into sense or antisense lentiviral constructs. A primer binding site for library amplification is highlighted in cyan, and the variable part of a 15-bp barcoding cassette is highlighted in yellow with 4000 possible barcodes provided in \u003cstrong\u003eTable S2.4\u003c/strong\u003e.\u003c/p\u003e\n\u003cp\u003e\u003cu\u003eSequence design for barcoded cassettes in sense constructs:\u003c/u\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTTTCCCATGATTCCTTCATATTTGC\u003c/strong\u003eGTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG\u003cstrong\u003eXXXXXXXXXXXXXXX\u003c/strong\u003eGGCCCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCTGAAC\u003c/p\u003e\n\u003cp\u003e\u003cu\u003eSequence design for barcoded cassettes in antisense constructs:\u003c/u\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTTTCCCATGATTCCTTCATATTTGC\u003c/strong\u003eGTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG\u003cstrong\u003eXXXXXXXXXXXXXXX\u003c/strong\u003eGGCCCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCTGAACGGCGAGTGGCGCTTTGCCTGGTTTCCGGCACCAGAAGCGGTGCCGGAAAGCTGGCTGGAGTGCGATCTTCCTGAGGCCGATACTGTCGTCGTCCCCTCAAACTGGCAG\u003c/p\u003e\n\u003cp\u003eLentiviral vectors for both barcoded construction designs were purchased from VectorBuilder Inc. 
(\u003cstrong\u003eTable S2.2\u003c/strong\u003e).\u0026nbsp;The corresponding lentiviral titers are listed in \u003cstrong\u003eTable S2.5\u003c/strong\u003e.\u003c/p\u003e\n\u003ch2 id=\"_Toc194348890\"\u003eSingle-cell multiome DoseH-seq assay (scRNA-seq + scATAC-seq)\u003c/h2\u003e\n\u003ch3 id=\"_Toc194348891\"\u003eHuman and mouse cell sample collection\u003c/h3\u003e\n\u003cp\u003eTo validate nuclear hashtagging, HDFs from different donors were thawed and maintained in standard culture conditions as described in the cell culture section. Cells were harvested using 0.25% trypsin/EDTA, with HDFs collected and used for experiments at passage 8 and hPSCs (H9 line, karyotypically normal) at passage 69 (\u003cstrong\u003eTable S2.1\u003c/strong\u003e). Cell samples were subsequently processed for nuclei isolation (see Nuclei isolation / 1\u003csup\u003est\u003c/sup\u003e \u0026ldquo;soft\u0026rdquo; lysis-step).\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eTo study mouse reprogramming under different stages and conditions, cells were harvested using 0.25% trypsin/EDTA at indicated time points (see \u003cstrong\u003eFigure 1D, Figure 5E\u003c/strong\u003e, day 0 [= three days post infection with lentiviruses]) and immediately frozen down using cryopreservation solution composed of 90% FBS and 10% DMSO. To study \u003cem\u003eNfix\u003c/em\u003e-induced mESC differentiation (\u003cstrong\u003eFigure 3F\u003c/strong\u003e), mESCs cultured under feeder-free conditions were employed and immediately frozen down at day 0, day 1.5, day 2 and day 3 post lentiviral infection. To process cryopreserved samples for nuclei preparation (reprogramming/differentiation), samples were thawed and resuspended in cell culture medium. 
Cell sorting was performed to exclude dead cells and/or to enrich cells progressing towards pluripotency (based on cell surface marker profile) as described in the figure legends, followed by nuclei isolation (see next section).\u0026nbsp;\u003c/p\u003e\n\u003ch3\u003eNuclei isolation / 1st \u0026ldquo;soft\u0026rdquo; lysis-step\u0026nbsp;\u003c/h3\u003e\n\u003cp\u003eDetailed compositions for solutions in nuclei isolation are listed in \u003cstrong\u003eTable S2.6,\u0026nbsp;\u003c/strong\u003eunless otherwise indicated.\u003c/p\u003e\n\u003cp\u003eFor each sample, at least 100,000-750,000 cells were pelleted by centrifugation at 300xg for 5 mins at 4℃ and supernatant was carefully removed. To the cell pellet, 100 \u0026micro;L of cold Lysis Buffer was added and gently pipetted 6-8 times (the 1st \u0026ldquo;soft\u0026rdquo; lysis step). Human cell samples (HDFs and hPSCs) and mouse cell samples (rMEFs, reprogramming rMEF cultures and mESCs) were incubated on ice for 3 minutes. After incubation, 0.5 mL of chilled Wash Buffer was added to the lysed cells and mixed by gentle pipetting 5 times with a 1000 \u0026micro;L tip. Following lysis, permeabilised cells were pelleted by centrifugation. All centrifugations to pellet the cells/nuclei post lysis were performed at 500xg for 5 mins at 4℃. For DoseH-seq experiments, permeabilised cells/nuclei were immediately used for the steps described in the \u0026ldquo;Sample multiplexing with nuclear hashtag antibodies and 10x capture\u0026rdquo; section.\u003c/p\u003e\n\u003ch3\u003eTitration of anti-Nuclear Pore Complex Proteins Antibody\u003c/h3\u003e\n\u003cp\u003eAlexa Fluor\u0026reg;647 conjugated anti-Nuclear Pore Complex Proteins Antibody (mouse monoclonal antibody, clone: Mab414, BioLegend, Cat#682203) was used for initial antibody titration experiments on both HDFs and hPSCs. 
The antibody titration/optimisation was performed at four different concentrations (1 ng/\u0026micro;L, 2.5 ng/\u0026micro;L, 5 ng/\u0026micro;L, 10 ng/\u0026micro;L). Briefly, cell samples were lysed with Lysis Buffer 2 (\u003cstrong\u003eTable S2.6\u003c/strong\u003e). For each antibody concentration, 500,000-800,000 nuclei were resuspended in 100 \u0026micro;L of the chilled staining buffer solution (ST-SB, \u003cstrong\u003eTable S2.6\u003c/strong\u003e) containing AF647 anti-Nuclear\u0026nbsp;Pore Complex Antibody. Nuclei samples were incubated for 10 minutes on ice, followed by three washes with 1 mL Wash Buffer\u0026nbsp;(\u003cstrong\u003eTable S2.6\u003c/strong\u003e). Finally, strained/filtered nuclei samples, stained with PI (Sigma-Aldrich, Cat#P4864) at 1:1000, were used for flow cytometry analysis. AF647 signals of PI-positive nuclei/permeabilised cells were assessed to determine the optimal antibody concentration. RNase inhibitor was not added to the buffers used for these optimisation experiments. Note that the optimised antibody concentration (2.5 ng/\u0026micro;L) was also determined to be effective on mouse cell types of interest.\u0026nbsp;\u003c/p\u003e\n\u003ch3\u003eOptimisation of RNase inhibitor treatment conditions\u0026nbsp;\u003c/h3\u003e\n\u003cp\u003eTo optimise RNase inhibitor type and concentration (\u003cstrong\u003eFigure S1D\u003c/strong\u003e), HDFs were first subjected to the \u0026ldquo;soft\u0026rdquo; lysis step using cold Lysis Buffer (\u003cstrong\u003eTable S2.6\u003c/strong\u003e) containing different RNase inhibitors or no inhibitor (negative control). Two different brands of RNase inhibitor were selected for testing, purchased from Sigma-Aldrich (recommended by the 10x Genomics protocol; Cat#03335399001) and Takara Bio (used in a publication performing sample multiplexing via anti-Nuclear Pore Complex antibodies for transcriptional profiling\u003csup\u003e60\u003c/sup\u003e; Cat#2313A). 
For the Sigma-Aldrich inhibitor, 1 U/\u0026micro;L, 0.2 U/\u0026micro;L and 0.1 U/\u0026micro;L were tested. For the Takara Bio RNase inhibitor, 0.04 U/\u0026micro;L was tested as per\u003csup\u003e60\u003c/sup\u003e. To simulate the time required for the labelling with anti-Nuclear Pore Complex antibodies, the lysed cells/nuclei were left on ice for 60 mins. RNA extraction was performed for both freshly lysed cells and the lysed samples after 60-minute incubation on ice. To examine RNA integrity preservation throughout the procedure (\u003cstrong\u003eFigure S1E\u003c/strong\u003e), HDFs and hPSCs were subjected to the entire two-step lysis process with the Sigma RNase inhibitor at optimized concentrations (0.1 U/\u0026micro;L during and after the \u0026ldquo;soft\u0026rdquo; lysis step and 1 U/\u0026micro;L during and after the \u0026ldquo;harsh\u0026rdquo; lysis step). For these experiments, RNA extraction was performed for fresh cells and cell/nuclei samples collected throughout the procedure. Values of RNA integrity number (RIN) for extracted RNA were determined by Agilent Bioanalyzer.\u0026nbsp;\u003c/p\u003e\n\u003ch3 id=\"_Toc194348893\"\u003eSample multiplexing with hashtag antibodies and 10x capture\u003c/h3\u003e\n\u003cp\u003eNuclear hashtag antibody solutions were prepared by adding 0.25 \u0026micro;g of antibodies (BioLegend, see \u003cstrong\u003eTable S2.7\u003c/strong\u003e for details) into 125 \u0026micro;L of ST-SB solution (\u003cstrong\u003eTable S2.6\u003c/strong\u003e) per sample and kept on ice during the last spin of the nuclei isolation procedure (see \u003cstrong\u003eNuclei isolation\u003c/strong\u003e section). For 5 antibodies with lower labelling efficiency (see \u003cstrong\u003eTable S2.7\u003c/strong\u003e), 0.5 \u0026micro;g was used instead. 
Following the last spin of the first \u0026ldquo;soft\u0026rdquo; lysis-step (see section: Nuclei isolation / 1st \u0026ldquo;soft\u0026rdquo; lysis-step), all supernatant was carefully removed to leave behind the pellet. The pellet was resuspended in the prepared antibody solution by gentle pipetting and incubated on ice for 10 mins. After incubation, 0.5 mL of chilled Wash Buffer was added into the tubes and gently pipetted 5 times. Tubes were then spun down, and supernatant was carefully removed. Next, nuclei were washed two more times by resuspending the pellet in 0.5 mL of chilled Wash Buffer (\u003cstrong\u003eTable S2.6\u003c/strong\u003e); nuclei were then filtered through a 40 \u0026micro;m strainer (Corning, Cat# 352340) and nuclei counts were performed. Subsequently, nuclei were pooled at a desired ratio into one sample tube. After centrifugation, the pellet was resuspended with 100 \u0026micro;L of Lysis Buffer 2 (\u003cstrong\u003eTable S2.6\u003c/strong\u003e) by gently pipetting 5 times (the 2nd \u0026ldquo;harsh\u0026rdquo; lysis-step). Human samples (HDFs and hPSCs), rMEFs and reprogramming rMEFs were incubated on ice for 2 mins and mESCs and their early differentiation products were incubated on ice for 1 min. Right afterwards, 0.5 mL of chilled Wash Buffer 2 (\u003cstrong\u003eTable S2.6\u003c/strong\u003e) was added to the lysed cells and gently mixed by pipetting. The tubes were then centrifuged, and the supernatant was carefully removed. This washing step was repeated twice more. At the third wash, prior to centrifugation to pellet the cells, the nuclei were filtered through a 40 \u0026micro;m strainer. Nuclei counts were performed again to quantify nuclei concentration and ensure optimal cell lysis. 
Finally, pooled nuclei were pelleted and resuspended in NBuffer (\u003cstrong\u003eTable S2.6\u003c/strong\u003e) at a density of 15,000 nuclei/\u0026micro;L and processed immediately for Tn5 reaction and nuclei capture with 10x Genomics instrument according to the manufacturer\u0026rsquo;s instructions (Chromium Next GEM Single Cell Multiome ATAC + Gene Expression, CG000338 Rev F).\u003c/p\u003e\n\u003ch3 id=\"_Toc194348894\"\u003ePreparation of the four library types\u003c/h3\u003e\n\u003cp\u003eSingle cell multiome (RNA + ATAC) libraries were prepared using 10x Genomics Chromium Next GEM Single Cell Multiome ATAC + Gene Expression Reagents (PN-1000285) according to the User Guide (Document CG000338) with the following modifications:\u003c/p\u003e\n\u003col class=\"decimal_type\"\u003e\n \u003cli\u003eNuclei input concentration was 15,000/\u0026micro;L, with targeted nuclei recovery of 30,000.\u003c/li\u003e\n \u003cli\u003eFor Post-GEM Incubation Cleanup - SPRISelect (Step 3.2), bead volume was adjusted to 100 \u0026micro;L (2.0X bead ratio).\u003c/li\u003e\n \u003cli\u003e1 \u0026micro;L of 5 \u0026micro;M HTO_additive_v2 primer (GTGACTGGAGTTCAGACGTGTGCTCTTCCGAT*C*T) was added to Pre-amplification Mix (Step 4.1) for final reaction volume of 101 \u0026micro;L.\u003c/li\u003e\n \u003cli\u003eFor Pre-amplification SPRI Cleanup (Step 4.3), reaction was transferred to 1.5 mL tube for clean-up, and bead volume was adjusted to 202 \u0026micro;L (2.0X bead ratio).\u003c/li\u003e\n \u003cli\u003e1 \u0026micro;L of 5 \u0026micro;M HTO_additive_v2 primer (GTGACTGGAGTTCAGACGTGTGCTCTTCCGAT*C*T) and 1 \u0026micro;L of 5 \u0026micro;M Chromium_Read2N primer (CTCGTGGGCTCGGAGATGTGTATAAGAGAC) were added to cDNA Amplification Mix (Step 6.1) for final reaction volume of 102 \u0026micro;L.\u003c/li\u003e\n \u003cli\u003ecDNA Cleanup - SPRISelect (Step 6.2) was not performed. 
Instead, a SPRISelect was performed to separate fragment sizes as per Step 2.3A and 2.3B of Chromium Next GEM Single Cell 3\u0026rsquo; Reagent Kit v3.1 with Feature Barcoding Technology for Cell Surface Protein User Guide (Document CG000206). The cDNA eluted from the pellet fraction (2.3A) was used for gene expression library construction.\u003c/li\u003e\n \u003cli\u003eFor gene expression Sample Index PCR (Step 7.5), Dual Index TT Set A was not used. Instead, Single Index Kit T Set A and SI-PCR primer (AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTC) were used according to User Guide for single indexing gene expression workflow (Document CG000206; Step 3.5).\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003eNuclear hashtag antibody libraries were generated according to BioLegend protocol for Total-SeqA, using 2x Kapa Hifi HotStart Readymix (Roche, KK2601), 5\u0026micro;L of purified supernatant (2.3B) fraction from cDNA cleanup as input, and 12 cycles of PCR.\u003c/p\u003e\n\u003cp\u003eFor Lentiviral perturbation libraries, an additional 0.6X/0.9X double-sided SPRISelect cleanup was performed with 15\u0026micro;L of purified supernatant (2.3B) fraction from cDNA cleanup, eluted in 40\u0026micro;L Buffer EB. 12.5\u0026micro;L re-purified cDNA was used as input into the library prep reaction, with 2x Kapa HiFi HotStart Readymix (Roche, KK2601), 2.5\u0026micro;L 10\u0026micro;M SI-PCR primer (AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTC), and 2.5 \u0026micro;L Single Index Kit N Set A index in a final volume of 50 \u0026micro;L. 
14 cycles of 98\u0026deg;C denaturation, 62\u0026deg;C annealing and 72\u0026deg;C extension were performed, with the final product cleaned up using a 0.65X SPRISelect bead ratio.\u003c/p\u003e\n\u003cp\u003eLibrary size distribution of all four purified libraries (snRNA-seq, snATAC-seq, Hashtag antibody library, Lentiviral perturbation library) was assessed using a High Sensitivity DNA Kit (Agilent, 5067-4626) and a BioAnalyzer 2100.\u003c/p\u003e\n\u003ch3\u003eConcurrent sequencing of all 4 libraries with BGI T7 technology\u003c/h3\u003e\n\u003cp\u003eWe previously benchmarked BGI sequencing technology against Illumina sequencing technology and demonstrated comparable outcomes at reduced cost with BGI technology\u003csup\u003e61\u003c/sup\u003e. Library sequencing was performed on the BGI DNBSEQ-T7 instrument, for which we optimised a strategy to sequence all four libraries concurrently at a 30/60/5/5 ratio for Gene expression, ATAC, Hashtag antibody and Lentiviral perturbation libraries, respectively. Sequencing was performed by MGI Australia on a DNBSEQ-T7 instrument in 2x100bp paired-end mode using run configuration Read1 50, Read2 90, i7 Index 8, i5 Index 24 (an i7 length of 8bp is sufficient due to the single indexing strategy). As part of one T7 run yielding 5 billion read pairs, we concurrently sequenced the 8 libraries from two DoseH-seq 10x Genomics capture runs. 
The data sets introduced in \u003cstrong\u003eFigure 3E\u003c/strong\u003e and the multi-perturb data set introduced in \u003cstrong\u003eFigure 5C\u003c/strong\u003e were sequenced as described above, enabling discrete demultiplexing of the different library types and economical sequencing.\u003c/p\u003e\n\u003cp\u003ePrior to establishing the DNBSEQ-T7 strategy for concurrent sequencing of all 4 library types in a single run (used for the differentiation and multiperturb experiments that are part of \u003cstrong\u003eFigure 3\u003c/strong\u003e and \u003cstrong\u003eFigure 5\u003c/strong\u003e, respectively), the data sets introduced in \u003cstrong\u003eFigure 1B\u003c/strong\u003e and \u003cstrong\u003e1D\u003c/strong\u003e were sequenced through a strategy utilising BGI T7, BGI G400 and Illumina NextSeq500 instruments. Initially, Gene Expression (scRNA/GEX) libraries were sequenced together with lentiviral barcode and HTO libraries on an MGI DNBSEQ-T7 using a paired-end run configured as R1 50 cycles, R2 90 cycles, i5 16 cycles, i7 16 cycles, with demultiplexing performed using the i7 sample index; for human data set generation (\u003cstrong\u003eFigure 1B\u003c/strong\u003e) the HTO library was sequenced on a NextSeq500 instrument. Multiome ATAC libraries were sequenced separately on an MGI DNBSEQ-G400 instrument (R1 50, i7 8, i5 24, R2 49), where the i5 read captures the full ATAC barcode structure (16-bp 10x cell barcode + 8-bp spacer).\u003c/p\u003e\n\u003cp\u003eIn accordance with 10x guidance that index reads may be longer than required, any extra index bases beyond the required sample index were trimmed prior to downstream analysis.\u003c/p\u003e\n\u003ch2\u003eHuman fibroblast scATAC+RNA-seq processing\u003c/h2\u003e\n\u003cp\u003ePrior to mapping, we performed preprocessing to ensure compatibility between BGI sequencing Fastq files and CellRanger. 
To demultiplex the hashtag reads from BGI Fastq files (which included transcript reads), we first converted raw Fastqs from BGI format to Illumina format (https://github.com/powellgenomicslab/BGI_vs_Illumina_Benchmark). Next, R2 reads were trimmed to 101bp using trimmomatic\u003csup\u003e62\u003c/sup\u003e. For each lane, we filtered the trimmed R2 with the 15 antibodies at the beginning of each read. Finally, the reads were separated into Fastq files containing HTO or transcript reads. Fastq files were then processed using CellRanger ARC v 2.0.0 and mapped to the hg38 genome (10x CellRanger ARC reference GRCh38-2020-A-2.0.0). CellRanger-ready Fastq files are provided in the data repository.\u003c/p\u003e\n\u003cp\u003eDemultiplexing and removal of multiplets was performed using the deMULTIplex2 R package (v 1.0.1)\u003csup\u003e63\u003c/sup\u003e. The demultiplexTags function was run on the HTO counts with default settings, and fibroblast cells annotated as \u0026lsquo;multiplet\u0026rsquo; or \u0026lsquo;negative\u0026rsquo; were filtered out prior to further analysis. Individual fibroblast donors were determined based on deMULTIplex2 HTO assignments. To validate HTO demultiplexing, we compared the assignments with SNP-based demultiplexing using the Souporcell package\u003csup\u003e64\u003c/sup\u003e, which was run on the gene expression BAM file produced by CellRanger ARC.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eDownstream analysis of scATAC+RNA-seq data was performed using Seurat (v 5.1.0)\u003csup\u003e65\u003c/sup\u003e and Signac (v 1.14.0)\u003csup\u003e66\u003c/sup\u003e. We generated a UMAP plot of fibroblasts and ESCs based on the multimodal RNA and ATAC-seq data, following the Seurat/Signac protocol for multimodal analysis. For RNA data we first ran the Seurat SCTransform and RunPCA functions. 
For the ATAC data we used the RunTFIDF function, followed by FindTopFeatures with min.cutoff set to \u0026lsquo;q75\u0026rsquo; to capture the top quartile of peaks, and finally the RunSVD function. We ran the FindMultiModalNeighbors function on the PCA and LSI coordinates with 20 nearest neighbours, based on the first 15 dimensions of both PCA and LSI. The coverage track over \u003cem\u003eGAPDH\u003c/em\u003e was created using the Signac CoveragePlot function.\u003c/p\u003e\n\u003ch2\u003eMouse scATAC+RNA-seq processing\u003c/h2\u003e\n\u003cp\u003ePrior to mapping, preprocessing was performed to ensure compatibility between BGI Fastq files and CellRanger. The provided R2 reads were processed to extract the 16bp I2 read from the end of the read and generate updated R2 reads. Fastq files were then processed using CellRanger ARC v 2.0.2 and mapped to the mm10 genome (10x CellRanger ARC reference mm10-2020-A-2.0.0). The processed data were subsequently analysed using Seurat and Signac. For both DoseH-seq runs we required a minimum of 5000 RNA UMI and ATAC fragment counts per cell (see \u003cstrong\u003eTable S2.8\u003c/strong\u003e for full filtering criteria). Demultiplexing and removal of multiplets was performed using the deMULTIplex2 R package\u003csup\u003e63\u003c/sup\u003e as above, with cells annotated as \u0026lsquo;multiplet\u0026rsquo; or \u0026lsquo;negative\u0026rsquo; filtered out prior to further analysis. Sample conditions were determined based on deMULTIplex2 HTO assignments (see \u003cstrong\u003eTable S2.7\u003c/strong\u003e for HTO sample mapping). CellRanger-ready Fastq files are provided in the data repository.\u003c/p\u003e\n\u003cp\u003ePeak calling was performed in a condition-specific manner using the MACS2\u003csup\u003e67\u003c/sup\u003e wrapper in Signac (CallPeaks), and counts were regenerated against the updated peak set. We then generated UMAP coordinates in Signac based on the ATAC-seq data. 
The top 20% of peaks were selected with FindTopFeatures and min.cutoff = \u0026apos;q80\u0026apos;. The RunSVD function was run with default parameters and RunUMAP with reduction = \u0026lsquo;lsi\u0026rsquo;, dimensions = 1-15 and n.neighbors = 50. The UMAP coordinates were used as an initial input to SCANPY\u003csup\u003e68\u003c/sup\u003e to generate a force-directed layout (FDL) visualisation using 50 nearest neighbours calculated on the first 15 LSI dimensions. The FDL map revealed that the day 10+3 cells were largely indistinguishable from the day 6 cells, implying that many of the cells in this group were refractory to reprogramming. These were therefore removed from downstream analyses to mitigate potential confounding effects in UMAP and FDL calculations. UMAP and FDL were regenerated on the filtered cells using the same parameters as above.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eTo calculate dendrograms of average transcriptional similarity, we used the Seurat BuildClusterTree function, utilising the top 2000 variable genes for the subset of conditions without Vitamin C, and 3000 for the dendrogram on all conditions. The dendrogram was then visualised using the dendextend R package\u003csup\u003e69\u003c/sup\u003e.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eLenti score generation\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eLenti scores were obtained using logistic regression models trained to predict perturbation status based on lenti count data. First, the raw lenti counts were normalised using SCTransform\u003csup\u003e70\u003c/sup\u003e. For the first DoseH-seq run, which had one gene perturbation (\u003cem\u003eNfix\u003c/em\u003e OE), a model was trained to predict OE vs control samples. 
The R glm function was used with family = binomial(link=\u0026apos;logit\u0026apos;) and the model Condition ~ OE + Control, where Condition is a binary vector (1=OE, 0=BFP), OE is the normalised \u003cem\u003eNfix\u003c/em\u003e lenti counts and Control is the normalised BFP lenti counts. The log-odds scores from the model were then used to represent the lenti score.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eFor the multi-perturb dataset, we built logistic regression models as above, but incorporating intersections between the multiple perturbations. For O+X (\u003cem\u003eOct4\u003c/em\u003e + \u003cem\u003eNfix\u003c/em\u003e), we built a model predicting the O+X conditions vs the remainder using the formula: Condition ~ Oct4 + Control + shRNA +Nfix + Nfix:Oct4. For J+X (\u003cem\u003ec-Jun\u003c/em\u003e shRNA + \u003cem\u003eNfix\u003c/em\u003e), we built a model predicting the J+X conditions vs the remainder using the formula: Condition ~ shRNA + Oct4 + Control + Nfix + Nfix:shRNA. For \u003cem\u003eNfix\u0026nbsp;\u003c/em\u003e(which included \u003cem\u003eNfix\u003c/em\u003e by itself as well as O+X and J+X) we built a model predicting all \u003cem\u003eNfix\u003c/em\u003e OE conditions vs BFP control using the formula: Condition ~ Nfix + Control + Oct4 + shRNA+ Nfix:shRNA + Nfix:Oct4.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eLow, mid and high cell stratifications were determined by ranking the lenti score within a timepoint/condition and sub-setting the cells by thirds. Low cells were defined as the bottom 33% of ranked cells, high as the top 33% and mid as the remainder.\u0026nbsp;\u003c/p\u003e\n\u003ch2\u003eIntegration across DoseH-seq runs\u003c/h2\u003e\n\u003cp\u003eIntegration of conditions between the first and second DoseH-seq runs was performed on the scATAC-seq data. 
For integration of the reprogramming conditions, we included from the first run: the \u003cem\u003eNfix\u003c/em\u003e overexpression conditions (rMEF [day 0], day 2, day 6 and THY1\u003csup\u003e-\u003c/sup\u003e), the BFP +VitC samples (day 2, day 6 and THY1\u003csup\u003e-\u003c/sup\u003e) as well as the mESCs and BFP rMEF control. From the second run we included all reprogramming samples, as well as the mESCs and rMEF control.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eWe merged the peak sets from both runs using the GRanges\u003csup\u003e71\u003c/sup\u003e reduce function. The merged peak sets were then recounted using the Signac FeatureMatrix function. Integration analysis was performed on the scATAC-seq data in Signac by first running FindTopFeatures with min.cutoff = \u0026lsquo;q80\u0026rsquo; followed by RunTFIDF and RunSVD for both datasets. We identified integration anchors using FindIntegrationAnchors with reduction = \u0026lsquo;rlsi\u0026rsquo; and dims = 1:15. Finally, LSI embeddings were integrated using the IntegrateEmbeddings function, with dims.to.integrate set to 1:15. A UMAP of integrated conditions was then generated using the integrated LSI, with dims = 1:15.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eAs the second run \u0026lsquo;pool\u0026rsquo; reprogramming conditions contained pools of reprogramming timepoints, we mapped the originating timepoint of individual cells based on timepoints from run 1 using the Seurat/Signac label transfer approach. The mapping was performed separately for the BFP (+VitC) reprogramming timepoints and the \u003cem\u003eNfix\u003c/em\u003e OE reprogramming timepoints. In both cases, transfer anchors were first identified using the FindTransferAnchors function with the first 25 LSI dimensions and used as input to the MapQuery function.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eWe performed integration between the first run and the differentiation samples from run 2 using the same process as above. 
For peak merging, recounting and integration we used the full reprogramming dataset from the first DoseH-seq run and, from the second run, the samples from the differentiation time course as well as the mESCs and the BFP and \u003cem\u003eNfix\u003c/em\u003e OE rMEFs. For the UMAP of differentiation vs \u003cem\u003eNfix\u003c/em\u003e OE reprogramming samples, we retained only the \u003cem\u003eNfix\u003c/em\u003e overexpressing samples from the first DoseH-seq experiment, in addition to the mESC and BFP rMEF controls.\u0026nbsp;\u003c/p\u003e\n\u003ch2\u003eCytoTRACE2 analysis\u003c/h2\u003e\n\u003cp\u003eThe CytoTRACE2\u003csup\u003e25\u003c/sup\u003e v1.0.0 R package was used for CytoTRACE2 analyses. For the first DoseH-seq run, corrected UMI counts were generated using SCTransform, with percent mitochondrial content specified in the vars.to.regress parameter. The corrected UMI counts were then input to the cytotrace2 function with default parameters. For the TCGA samples, Ensembl IDs for unstranded, protein coding counts were first converted to HGNC gene symbols, then input to the cytotrace2 function with species = \u0026ldquo;human\u0026rdquo;.\u003c/p\u003e\n\u003ch2\u003eCalling differentially accessible regions\u003c/h2\u003e\n\u003cp\u003eDifferentially accessible region calling was performed using a pseudo-bulk approach adopted from a previous study\u003csup\u003e72\u003c/sup\u003e. Cells within groups to be tested were randomly pooled into 10 pseudo-bulk replicates and ATAC fragments were summed. Testing was performed using DESeq2\u003csup\u003e29\u003c/sup\u003e and peaks with FDR \u0026lt; 0.001 were considered significant.\u0026nbsp;\u003c/p\u003e\n\u003ch2\u003eDifferentially expressed genes\u003c/h2\u003e\n\u003cp\u003eDifferential gene expression testing was performed using the Seurat FindMarkers function with MAST testing\u003csup\u003e73\u003c/sup\u003e. 
Genes with adjusted p-value \u0026lt; 0.05 and absolute log\u003csub\u003e2\u003c/sub\u003e fold-change \u0026gt; 0.5 were considered significant. \u0026nbsp;\u003c/p\u003e\n\u003ch2\u003eMotif enrichment analysis\u003c/h2\u003e\n\u003cp\u003eMotif enrichment analysis was performed using the HOMER\u003csup\u003e74\u003c/sup\u003e findMotifsGenome.pl (mm10) function with parameters -mknown to match against known vertebrate motifs. Motif enrichment was assessed against the default HOMER GC-matched background unless otherwise indicated.\u0026nbsp;\u003c/p\u003e\n\u003ch2\u003ePeak:gene linkage analysis\u003c/h2\u003e\n\u003cp\u003eWe performed peak:gene linkage analysis for the run 2 samples. We used the LinkPeaks function in Signac, with distance set to 500,000, min.cells set to 20 and a p-value cutoff of 0.05.\u0026nbsp;\u003c/p\u003e\n\u003ch2\u003eGene ontology terms\u003c/h2\u003e\n\u003cp\u003eGO term tests were performed on genes linked to peaks for the indicated analysis. GO term over-representation testing was performed using the ViSEAGO\u003csup\u003e75\u003c/sup\u003e (v 1.11.0) R package. The BiomaRt package\u003csup\u003e76\u003c/sup\u003e was used to convert gene symbols to Uniprot/Swissprot identifiers and GO annotations for the genes were performed using the ViSEAGO Uniprot2GO and annotate functions. Over-representation testing was performed using the TopGO\u003csup\u003e77\u003c/sup\u003e (v 2.56.0) runTest function, with algorithm = \u0026ldquo;classic\u0026rdquo; and statistic = \u0026ldquo;fisher\u0026rdquo; and the full set of protein coding genes with potential links to peaks used as background. 
The returned p-values were adjusted for multiple testing using Benjamini \u0026amp; Hochberg correction, with GO terms considered significant if they obtained an adjusted p-value/FDR \u0026lt; 0.05.\u003c/p\u003e\n\u003ch2\u003eSTRING network analysis\u003c/h2\u003e\n\u003cp\u003eSTRING network analysis was performed on genes linked to the 670 opening DARs overlapping between the reprogramming day 2 \u003cem\u003eNfix\u003c/em\u003e (vs BFP rMEF) and differentiation days 2+3 \u003cem\u003eNfix\u003c/em\u003e (vs mESCs) conditions. The linked genes were submitted to the STRING (v 12)\u003csup\u003e78\u003c/sup\u003e web server and high confidence (score \u0026gt; 700) connections were retained. The connections were plotted in Cytoscape\u003csup\u003e79\u003c/sup\u003e, with connected subnetworks of over size 2 retained for visualisation and genes coloured according to annotation with the GO Biological Process term \u0026ldquo;Developmental process\u0026rdquo;.\u0026nbsp;\u003c/p\u003e\n\u003ch2\u003eMotif scoring\u003c/h2\u003e\n\u003cp\u003eWe mapped motif position weight matrix (PWM) scores using HOMER findMotifsGenome.pl with the -find parameter set to a PWM set of interest. For analysis of OCT4 PWM scores in O+X and \u003cem\u003eNfix\u003c/em\u003e high vs \u003cem\u003eNfix\u003c/em\u003e low cells, we used the HOMER OCT4 motif (Oct4(POU,Homeobox)/mES-Oct4-ChIP-Seq(GSE11431)/Homer). For comparison of PWM scores in iPSCs to reprogramming and differentiation, we added the same HOMER OCT4 motif as well as the SOX2 motif (Sox2(HMG)/mES-Sox2-ChIP-Seq(GSE11431)/Homer). For the OCT4:SOX2 heterodimer we used the JASPAR Pou5f1::Sox2 PWM (MA0142.1). 
To provide a consistent scoring threshold we set a cutoff at 25% of the maximum motif score to enable comparison of low affinity motif scores.\u0026nbsp;\u003c/p\u003e\n\u003ch2\u003eComparison with pan-tissue fibroblast atlas\u003c/h2\u003e\n\u003cp\u003eWe downloaded the mouse perturbed-state fibroblast atlas\u003csup\u003e28\u003c/sup\u003e as a Seurat object from https://www.fibroxplorer.com/download and calculated differentially expressed genes for each of the fibroblast subtypes compared to the remainder. We compared the log\u003csub\u003e2\u003c/sub\u003e fold-changes for each of the fibroblast subtypes to the reprogramming comparisons using the R cor.test function with method = \u0026ldquo;spearman\u0026rdquo;. The reprogramming comparisons were: 1) rMEF \u003cem\u003eNfix\u003c/em\u003e low vs rMEF BFP, 2) day 6 BFP vs rMEF BFP and 3) day 6 \u003cem\u003eNfix\u003c/em\u003e low vs rMEF \u003cem\u003eNfix\u003c/em\u003e.\u003c/p\u003e\n\u003ch2\u003eComparison with TCGA RNA-seq\u003c/h2\u003e\n\u003cp\u003eTCGA tumour and normal tissue samples were downloaded via the TCGAbiolinks R/Bioconductor package\u003csup\u003e80\u003c/sup\u003e. The GDCquery function was used to download samples annotated as \u0026ldquo;Primary Tumor\u0026rdquo; or \u0026ldquo;Solid Tissue Normal\u0026rdquo; and RNA-seq counts were obtained with data.category = \u0026ldquo;Transcriptome Profiling\u0026rdquo;, data.type = \u0026ldquo;Gene Expression Quantification\u0026rdquo; and workflow.type = \u0026ldquo;STAR - Counts\u0026rdquo;. Unstranded, protein coding counts were retained and normalised as log\u003csub\u003e2\u003c/sub\u003e(1 + counts per million [CPM]). The R biomaRt\u003csup\u003e76\u003c/sup\u003e package (v. 2.60.1) was used to convert Ensembl IDs to HGNC (HUGO Gene Nomenclature Committee) gene symbols. 
The \u0026lsquo;project_id\u0026rsquo; and \u0026lsquo;name\u0026rsquo; columns from the sample meta data were used to categorize the frequency of tumour types in the \u003cem\u003eNFI\u003c/em\u003e/\u003cem\u003ePOU5F1\u003c/em\u003e expression stratifications. For expression stratifications, we defined \u003cem\u003ePOU5F1\u003c/em\u003e activation (bottom bar in \u003cstrong\u003eFigure 6A\u003c/strong\u003e) as above top 95% of normal expression and high (top bar) as over top 99% of tumour expression. For each \u003cem\u003eNFI\u003c/em\u003e factor, the left and right bars in \u003cstrong\u003eFigure 6A\u003c/strong\u003e represent the 25\u003csup\u003eth\u003c/sup\u003e and 75\u003csup\u003eth\u003c/sup\u003e percentiles, respectively, of all samples.\u003c/p\u003e\n\u003cp\u003eFor analysis of \u003cem\u003ePOU5F1\u003c/em\u003e/OCT4 isoforms, Kallisto\u003csup\u003e81\u003c/sup\u003e transcript count estimates were downloaded from UCSC\u003csup\u003e82\u003c/sup\u003e (https://xenabrowser.net/datapages/?dataset=tcga_Kallisto_est_counts\u0026amp;host=https%3A%2F%2Ftoil.xenahubs.net\u0026amp;removeHub=https%3A%2F%2Fxena.treehouse.gi.ucsc.edu%3A443). After cross-referencing with meta data, the samples were grouped into subsets based on disease status (normal tissue/tumour) and tumour type. In total, 727 normal tissue samples and 9,634 tumour samples were obtained. To detect the counts of \u003cem\u003ePOU5F1\u003c/em\u003e isoforms, the HGNC symbols of the Ensembl transcript IDs were first annotated using biomaRt (v. 2.62.1)\u003csup\u003e83,84\u003c/sup\u003e. To evaluate percentages of sample types positive for OCT4A, we considered samples with an estimated count of the OCT4A isoform (Ensembl ID ENST00000259915.12) greater than 10 to be OCT4A positive.\u003c/p\u003e\n\u003ch2\u003eComparison with kidney scRNA-seq\u003c/h2\u003e\n\u003cp\u003eThe filtered_gene_bc_matrcies_h5.h5 files were downloaded from GEO (GSE159115) along with the annotation files. 
For joint UMAP visualisation, we followed the Seurat\u003csup\u003e65\u003c/sup\u003e integration pipeline employed by the original authors. For each individual sample, NormalizeData and FindVariableFeatures were run, followed by FindIntegrationAnchors with dims = 1:30 and IntegrateData with dims = 1:30. We then ran ScaleData, RunPCA and RunUMAP with dims = 1:30.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eFor analysis of module scores, we first identified peaks from the run 1/run 2 diff integration that contained both a match for the HOMER NFI motif \u0026ldquo;NF1-halfsite(CTF)/LNCaP-NF1-ChIP-Seq(Unpublished)/Homer\u0026rdquo; and the OCT4 motif \u0026ldquo;Oct4(POU,Homeobox)/mES-Oct4-ChIP-Seq(GSE11431)/Homer\u0026rdquo;. For the OCT4 motif, we allowed for low affinity motif occurrences as described in subsection \u0026lsquo;Motif scoring\u0026rsquo; above. Using the peak:gene links described above (subsection \u0026lsquo;Peak:gene linkage analysis\u0026rsquo;), we identified genes linked to the NFI/OCT4 positive peaks and overlapped these with genes upregulated in reprogramming day 2 NFIX cells relative to BFP rMEF cells (MAST testing; p\u003csub\u003eadj\u003c/sub\u003e \u0026lt; 0.05 \u0026amp; LFC \u0026gt; 0.5). These genes were converted to human gene symbols using the nichenetr\u003csup\u003e85\u003c/sup\u003e convert_mouse_to_human_symbols function and input to the Seurat AddModuleScore function.\u0026nbsp;\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eAcknowledgments\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe acknowledge the high-quality scientific and technical assistance of the Translational Research Institute Flow Cytometry facility. The authors thank the BGI Australia facility for sequencing on BGI instruments and the IMB/UQ sequencing facility for sequencing on Illumina instruments. We also thank UQ Biological Resources (Queensland Bioscience Precinct) and Monash Animal Services for help with animal husbandry. 
Imaging was performed at the IMB Microscopy Facility at UQ. This work was supported by an NHMRC Ideas Grant to C.M.N. (APP2013574), a collaborative project grant between C.M.N. and UQ’s Genome Innovation Hub, and through start-up funding committed by Prof Brandon Wainwright from the Institute of Molecular Bioscience, UQ. It received further support through an MGI Collaborative grant from BGI Australia and through two Innovations Connections grants from the Australian Government with C.M.N. as the academic lead and financial support through industry partner Scott Needham (Systemic Medicine). M.A. and W.A. extend their appreciation to the Deputyship for Research and Innovation, Ministry of Education, Saudi Arabia, for funding through project numbers 1-441-120 and 1-441-121. M.N-S. and X.C. received support through the Women’s Research Assistance Program from the Queensland state government. Results shown in this study include data generated by the TCGA Research Network: https://www.cancer.gov/tcga. Finally, we would like to thank Dr Geoffrey McDermott and Dr Cathrine King from 10x Genomics for practical advice as part of DoseH-seq assay development.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eContributions\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eC.M.N., Y.Y. and R.P. designed the experiments with additional support from S.A.; Y.Y. and R.P. contributed equally to this study. Y.Y. performed the study’s cell culture experiments and wet-lab based aspects of DoseH-seq assay development with support from X.C., S.A., J.Z., Y.H., M.E., M.G., D.P., C.M.S. and S.W. under supervision of C.M.N., J.B., F.J.S. and S.T.N.; R.P. performed the study’s computational analyses, with additional support from S.L., K.T. and J.C.M. under guidance of C.M.N.; BioRender schematics were created by Y.Y. and R.P.; W.A., M.J.S., M.P., S.C., S.S., and M.D.W. contributed intellectually. M.N-S., R.P., Y.Y. and C.M.N. prioritised NFI transcription factors as the study’s targets. 
Y.Y, R.P., and C.M.N. wrote the manuscript. All authors approved and contributed to the final version of the manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eDeclaration of interests\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare no competing interests.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eData availability\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eSingle-nucleus RNA+ATAC-seq DoseH-seq runs generated as part of this study will be made publicly available upon publication.\u0026nbsp;\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eNair, S., Ameen, M., Sundaram, L., Pampari, A., Schreiber, J., Balsubramani, A., Wang, Y.X., Burns, D., Blau, H.M., Karakikes, I., Wang, K.C., and Kundaje, A. (2023). Transcription factor stoichiometry, motif affinity and syntax regulate single-cell chromatin dynamics during fibroblast reprogramming to pluripotency. bioRxiv. 10.1101/2023.10.04.560808.\u003c/li\u003e\n\u003cli\u003eFei, L., Zhang, K., Poddar, N., Hautaniemi, S., and Sahu, B. (2023). Single-cell epigenome analysis identifies molecular events controlling direct conversion of human fibroblasts to pancreatic ductal-like cells. Dev Cell \u003cem\u003e58\u003c/em\u003e, 1701-1715 e1708. 10.1016/j.devcel.2023.08.023.\u003c/li\u003e\n\u003cli\u003eBoudreau-Pinsonneault, C., David, L.A., Lourenco Fernandes, J.A., Javed, A., Fries, M., Mattar, P., and Cayouette, M. (2023). Direct neuronal reprogramming by temporal identity factors. Proc Natl Acad Sci U S A \u003cem\u003e120\u003c/em\u003e, e2122168120. 10.1073/pnas.2122168120.\u003c/li\u003e\n\u003cli\u003eSchraivogel, D., Gschwind, A.R., Milbank, J.H., Leonce, D.R., Jakob, P., Mathur, L., Korbel, J.O., Merten, C.A., Velten, L., and Steinmetz, L.M. (2020). Targeted Perturb-seq enables genome-scale genetic screens in single cells. Nat Methods \u003cem\u003e17\u003c/em\u003e, 629-635. 
10.1038/s41592-020-0837-5.\u003c/li\u003e\n\u003cli\u003eReplogle, J.M., Norman, T.M., Xu, A., Hussmann, J.A., Chen, J., Cogan, J.Z., Meer, E.J., Terry, J.M., Riordan, D.P., Srinivas, N., et al. (2020). Combinatorial single-cell CRISPR screens by direct guide RNA capture and targeted sequencing. Nat Biotechnol \u003cem\u003e38\u003c/em\u003e, 954-961. 10.1038/s41587-020-0470-y.\u003c/li\u003e\n\u003cli\u003eLiu, W., Saelens, W., Rainer, P., Biocanin, M., Gardeux, V., Gralak, A.J., van Mierlo, G., Gebhart, A., Russeil, J., Liu, T., Chen, W., and Deplancke, B. (2025). Dissecting the impact of transcription factor dose on cell reprogramming heterogeneity using scTF-seq. Nat Genet \u003cem\u003e57\u003c/em\u003e, 2522-2535. 10.1038/s41588-025-02343-7.\u003c/li\u003e\n\u003cli\u003eJost, M., Santos, D.A., Saunders, R.A., Horlbeck, M.A., Hawkins, J.S., Scaria, S.M., Norman, T.M., Hussmann, J.A., Liem, C.R., Gross, C.A., and Weissman, J.S. (2020). Titrating gene expression using libraries of systematically attenuated CRISPR guide RNAs. Nat Biotechnol \u003cem\u003e38\u003c/em\u003e, 355-364. 10.1038/s41587-019-0387-5.\u003c/li\u003e\n\u003cli\u003eMetzner, E., Southard, K.M., and Norman, T.M. (2025). Multiome Perturb-seq unlocks scalable discovery of integrated perturbation effects on the transcriptome and epigenome. Cell Syst \u003cem\u003e16\u003c/em\u003e, 101161. 10.1016/j.cels.2024.12.002.\u003c/li\u003e\n\u003cli\u003eYan, R.E., Corman, A., Katgara, L., Wang, X., Xue, X., Gajic, Z.Z., Sam, R., Farid, M., Friedman, S.M., Choo, J., et al. (2024). Pooled CRISPR screens with joint single-nucleus chromatin accessibility and transcriptome profiling. Nat Biotechnol. 10.1038/s41587-024-02475-x.\u003c/li\u003e\n\u003cli\u003eLiu, X., Nefzger, C.M., Rossello, F.J., Chen, J., Knaupp, A.S., Firas, J., Ford, E., Pflueger, J., Paynter, J.M., Chy, H.S., et al. (2017). Comprehensive characterization of distinct states of human naive pluripotency generated by reprogramming. 