Untangling the Molecular Mechanism of SpCas9 Catalytic Activation: A Gear-and-Wedge Fitting Model

Shaoyong Lu, Xinyi Li, Jiacheng Wei, Feiying Chen, Mingyu Li, and 2 more

This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-6018412/v1 This work is licensed under a CC BY 4.0 License. Status: Posted. Version 1 posted.

Abstract

The CRISPR-associated endonuclease Streptococcus pyogenes Cas9 (SpCas9) enables site-specific DNA cleavage by transitioning from a pre-catalytic conformation to a catalytically active state, yet how its HNH catalytic domain undergoes an approximately 40 Å displacement towards the target DNA has remained elusive. Here, we combined extensive unbiased molecular dynamics simulations, spanning a cumulative timescale of 160 µs, with Markov state modeling to map the kinetic pathway of SpCas9 activation.
In vitro DNA cleavage assays and a cellular fluorescence reporter system further validated the atomic-level mechanisms revealed by our simulations. We found that the folding of the L1 linker and unfolding of the L2 linker serve as the principal driving force, inducing a “gear-and-wedge” cooperative motion within the HNH domain. Concurrently, the REC2 domain moved outward to accommodate the displaced HNH domain and formed transient stabilizing interactions with the HNH domain along the activation route. Site-directed mutagenesis of key L2 linker residues and REC2 loops markedly reduced SpCas9 cleavage efficiency in both HEK293T cells and biochemical assays, underscoring their critical role in SpCas9 ribonucleoprotein activation. Collectively, this study provides a high-resolution view of SpCas9 catalytic activation and opens up new avenues for the rational design of SpCas9 variants with enhanced performance and specificity.

Subjects: Biological sciences/Computational biology and bioinformatics/Protein function predictions; Biological sciences/Biophysics/Computational biophysics; Biological sciences/Microbiology/CRISPR-Cas systems/CRISPR-Cas9 genome editing

Keywords: CRISPR-Cas9; Molecular dynamics simulations; Conformational dynamics; Gene editing

Introduction

Clustered regularly interspaced short palindromic repeats (CRISPR-Cas) systems, derived from adaptive immune mechanisms in bacteria, have revolutionized genome editing by enabling precise target DNA cleavage guided by programmable RNA 1 – 5 . Currently, owing to its high editing efficiency and adaptability, Streptococcus pyogenes Cas9 (SpCas9) is one of the most widely employed endonucleases for CRISPR-Cas-relevant genome-editing applications 3 , 6 , 7 . Crystallographic or cryo-electron microscopy (cryo-EM) studies have revealed a bilobed structure adopted by SpCas9 (Fig. 1 ), consisting of a recognition (REC) lobe and a nuclease (NUC) lobe 8 – 10 . 
The REC lobe encompasses three α-helical domains, REC1, REC2 and REC3, all of which have been suggested to play an important role in mediating nucleotide interactions 11 . The NUC lobe harbors two nuclease domains, HNH and RuvC, followed by a C-terminal domain (CTD). The HNH and RuvC domains are connected by L1 and L2 linkers and are responsible for the concerted cleavage of target and non-target DNA strand 9 , 12 , 13 , respectively. The process of the SpCas9 endonuclease action has been elaborated in great detail 7 , 9 . In brief, SpCas9 first complexes with a single-guide RNA (sgRNA) containing the CRISPR sequence for the specific recognition of target DNA (tDNA) with complementarity. Recognition of the species-specific protospacer adjacent motif (PAM) on the non-target strand initiates this process, followed by the invasion of sgRNA into tDNA and the formation of an sgRNA:tDNA heteroduplex. To this stage, the ternary SpCas9:sgRNA:tDNA ribonucleoprotein complex (SpCas9 RNP) has gathered all necessary elements for cleavage, except that the HNH nuclease domain locates more than approximately 40 Å away from its target nucleotide, adopting a pre-catalytic/inactive conformation 8 , 10 , 14 (Fig. 1 B). For the cleavage reaction to occur, the H840 catalytic residue within the HNH domain must undergo a massive displacement of approximately 40 Å and rotation of approximately 110° to approach its target phosphate (Fig. 1 C). Much effort has been devoted to understanding the process of catalytic activation of the SpCas9 RNP, focusing on elucidating how the HNH domain fulfills such a dramatic relocation. Crystallographic and cryo-EM studies have attempted to capture the structure of SpCas9 RNP at various stages of activation 9 , 10 , 15 – 17 . However, owing to the highly dynamic and transient nature of the activation process, the majority of the obtained structures only represent the system in its endpoint pre-catalytic/catalytic structures. 
The critical molecular basis of HNH domain relocation towards tDNA within SpCas9 RNP during catalytic activation remains unknown. In this study, we combined extensive unbiased molecular dynamics (MD) simulations with an in vitro DNA cleavage assay and a green fluorescent protein reporter system in mammalian cells to disclose the conformational dynamics underlying SpCas9 RNP catalytic activation. The nudged elastic band method was used to generate intermediate conformations between the solved inactive and active states 18 , followed by extensive MD simulations reaching an unprecedented cumulative timescale of 160 µs. A Markov state model was constructed to elucidate the stepwise structural transition pattern of the SpCas9 RNP, where HNH domain relocation was predominantly driven by the intrinsic conformational plasticity of its adjacent L1 and L2 linkers. During activation, the L1 linker folds into a highly ordered helical structure, whereas the L2 linker disassembles into a disordered loop. The two concurrent structural changes drive the HNH domain to redock approximately 40 Å from the L2-proximal to the L1-proximal region. Additionally, we revealed the structural basis of HNH domain rotation during activation, identifying it as a result of rotational forces exerted on the HNH N-terminal α-helical units by the contraction of the L1 linker. This rotation propagates to the HNH C-terminal α-helical units through extensive interactions, resembling the mechanism of a rotating gear repositioning its fitted wedge. In addition to the primary driving force of the L1 and L2 linkers, the REC2 domain was found to relocate cooperatively with the HNH domain, enabling formation of distinctive interactions with the HNH domain to stabilize the transitional structures during activation. Furthermore, we designed a fluorescence-based system to evaluate SpCas9 RNP activation under various cellular conditions. 
Together with the in vitro DNA cleavage assay, we experimentally verified the contribution of the L1 and L2 linker structural flexibility and the REC2 domain to SpCas9 RNP catalytic activation, providing further support for our activation model. Taken together, by combining in silico, in vitro , and intracellular fluorescence-based reporter approaches, our work illuminated the detailed atomic mechanism for the catalytic activation of SpCas9 RNP and highlighted the contribution of the L1 and L2 linkers and the REC2 domain in the process. The resolved mechanism of SpCas9 RNP activation could advance our ability to rationally design SpCas9-based genome editing tools with enhanced functionality 1 , 3 , 19 , 20 .

Results

Extensive Unbiased MD Simulations Reveal Activation Pattern of SpCas9 RNP at Millisecond Timescale

To elucidate the SpCas9 RNP activation pathway, we employed the nudged elastic band (NEB) method to generate 30 intermediate conformations bridging the experimentally resolved inactive/pre-catalytic (PDB ID: 6O0Z) and active/catalytic (PDB ID: 6O0Y) states 10 , 18 . The two solved cryo-EM structures, which differed significantly in their domain organization, only represented static end snapshots during the activation process of SpCas9 RNP. The 30 intermediate states produced by the NEB method formed a hypothetical transition pathway connecting the two endpoint structures during SpCas9 RNP activation. These NEB-generated snapshots revealed progressive structural reorganization towards the active state, as exemplified by the relocation of the HNH domain towards its cleavage site on the tDNA strand ( Supplementary Fig. 1 ). To understand the molecular basis underlying SpCas9 RNP activation and to identify targetable transition states in the process, each of the 32 structures (30 NEB intermediates plus the two endpoints) was subjected to 500 ns of unbiased MD simulation, independently repeated 10 times per structure, gathering a cumulative timescale of 160 µs (MD simulation details provided in Methods ). 
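The cumulative sampling quoted above follows directly from the stated protocol. The short check below is only an arithmetic illustration; the variable names are ours and do not come from the original work.

```python
# Sampling budget implied by the protocol described in the text.
n_neb_intermediates = 30     # NEB-generated intermediate conformations
n_endpoints = 2              # cryo-EM endpoints (PDB 6O0Z and 6O0Y)
replicas_per_structure = 10  # independent repeats per starting structure
ns_per_replica = 500         # length of each unbiased MD run, in ns

n_structures = n_neb_intermediates + n_endpoints
total_ns = n_structures * replicas_per_structure * ns_per_replica
total_us = total_ns / 1000   # 1 µs = 1000 ns

print(n_structures, "starting structures,", total_us, "µs in total")
```

This reproduces the 32 starting structures and the 160 µs cumulative timescale reported in the text.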
According to previous studies 21 – 27 , the timescale reached in the current study is sufficient to investigate SpCas9 RNP activation kinetics. To capture the essential dynamics, we first performed time-lagged independent component analysis (TICA) on the obtained trajectories 28 . The first two time-lagged independent components (tIC1 and tIC2), which captured the principal (slowest) kinetic modes, were used to generate a TICA free-energy landscape (FEL) to provide preliminary insights into the activation dynamics of SpCas9 RNP (Fig. 2 A). The inactive and active starting structures were located within the two major energy basins at opposite ends of the TICA landscape (Fig. 2 A, Supplementary Fig. 2A, B ). The TICA FEL revealed a continuous conformational transition pathway connecting the two energy basins, confirming that our simulations efficiently recapitulated the activation pathway. To further investigate structural intermediates with biological relevance during SpCas9 RNP activation, two distance parameters were selected: the distance between the catalytic H840 residue in the HNH nuclease domain and the tDNA cleavage site (d HNH−N ) and the distance between the REC2 domain and the sgRNA:tDNA heteroduplex (d REC2−N ). By comparing the two distances in the cryo-EM structures of SpCas9 RNP in its pre-catalytic and catalytic states, we observed an approximately 6 Å displacement of REC2 away from the sgRNA:tDNA duplex and an approximately 40 Å movement of the HNH domain towards the cleavage site ( Supplementary Fig. 2C ). Various crystal or cryo-EM structures have captured the REC2 and HNH domains at distinct spatial positions 9 , 10 , 15 – 17 , further confirming the importance of the conformational dynamics of these domains during activation. 
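As an illustration of how a free-energy landscape can be obtained from two collective variables such as the distance parameters above, the sketch below Boltzmann-inverts a 2D histogram. It assumes the common estimate F = −k_B·T·ln P and a temperature of 300 K; neither is stated in the text, and the input data are synthetic placeholders rather than simulation results.

```python
import numpy as np

KB_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

def free_energy_landscape(x, y, bins=50, temperature=300.0):
    """Boltzmann-invert a 2D histogram: F = -kB*T*ln(P), shifted so min(F) = 0.

    Bins that were never visited get F = +inf.
    """
    counts, xedges, yedges = np.histogram2d(x, y, bins=bins)
    prob = counts / counts.sum()
    with np.errstate(divide="ignore"):  # log(0) -> -inf for empty bins
        free_energy = -KB_KCAL * temperature * np.log(prob)
    free_energy -= free_energy[np.isfinite(free_energy)].min()
    return free_energy, xedges, yedges

# Synthetic two-basin data standing in for two distance time series (in Å).
# The centers and widths are arbitrary and purely illustrative.
rng = np.random.default_rng(0)
d_a = np.concatenate([rng.normal(40.0, 2.0, 5000), rng.normal(5.0, 2.0, 15000)])
d_b = np.concatenate([rng.normal(20.0, 1.0, 5000), rng.normal(26.0, 1.0, 15000)])

fel, xedges, yedges = free_energy_landscape(d_a, d_b, bins=40)
```

Basins appear as low-F regions of `fel`; how many basins are resolved depends on the bin width, so the binning should be checked for robustness before interpreting minima as metastable states.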
Additionally, the FEL of the two distance parameters exhibited a continuous distribution of states with distinct energy basins, indicating the existence of potential metastable states along the activation pathway ( Supplementary Fig. 3A, B ). To construct a kinetic model of structural transitions during SpCas9 RNP activation, we first discretized the two distance parameters calculated from all 160 µs of simulations into 600 clusters using the k-means clustering algorithm 29 . The number of clusters chosen here maximized the kinetic variance within the chosen features, as estimated by the variational approach for Markov processes (VAMP2) score 30 ( Supplementary Fig. 3C ). These clusters are treated as discrete states within the system. As the system evolved, transitions between clusters over time formed the basis for establishing a Markov state model (MSM). We constructed an MSM for SpCas9 RNP activation with a lag time of 10 ns, and rigorously tested its Markovianity ( Supplementary Fig. 3D, E ). Perron Cluster Cluster Analysis (PCCA) 31 identified the presence of four metastable states (denoted as S0 , MS1 , MS2 , and S1 ; Fig. 2 B). S0 and S1 encompassed the majority of the simulated samples, occupying 21.72% and 52.83% of the conformational space, respectively. S0 and S1 also represented kinetically stable conformational ensembles of catalytically inactive and active SpCas9 RNP, as the majority of the solved structures fell into the two states ( Supplementary Fig. 3F, G ). The two minor metastable states, MS1 and MS2 , which represented 15.26% and 10.20% of the conformational space, respectively, constituted the key intermediate states during SpCas9 RNP activation. Based on the established MSM, transition path theory 32 disclosed that SpCas9 RNP catalytic activation followed a sequential pathway from S0 → MS1 → MS2 → S1 (Fig. 2 C). 
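The discretize-then-count idea behind an MSM can be sketched in a few lines. The example below is a deliberately simplified illustration, not the pipeline used in the study: it skips the k-means and PCCA steps, uses a hypothetical four-state toy chain with made-up transition probabilities, and applies plain row normalization rather than a reversible estimator.

```python
import numpy as np

def estimate_msm(dtrajs, n_states, lag):
    """Row-normalized transition matrix from sliding-window counts at lag `lag`."""
    counts = np.zeros((n_states, n_states))
    for dtraj in dtrajs:
        dtraj = np.asarray(dtraj)
        # Count every observed transition i -> j separated by `lag` frames.
        np.add.at(counts, (dtraj[:-lag], dtraj[lag:]), 1)
    # Assumes every state occurs at least once as a transition origin.
    return counts / counts.sum(axis=1, keepdims=True)

def stationary_distribution(T):
    """Left eigenvector of T with the largest eigenvalue, normalized to sum to 1."""
    evals, evecs = np.linalg.eig(T.T)
    pi = np.real(evecs[:, np.argmax(np.real(evals))])
    return pi / pi.sum()

# Hypothetical 4-state chain (think S0, MS1, MS2, S1); probabilities are invented.
T_true = np.array([
    [0.90, 0.10, 0.00, 0.00],
    [0.05, 0.85, 0.10, 0.00],
    [0.00, 0.05, 0.85, 0.10],
    [0.00, 0.00, 0.05, 0.95],
])
rng = np.random.default_rng(1)
state, dtraj = 0, [0]
for _ in range(20000):
    state = rng.choice(4, p=T_true[state])
    dtraj.append(state)

T_est = estimate_msm([dtraj], n_states=4, lag=1)
pi_est = stationary_distribution(T_est)
```

In this picture, `pi_est` plays the role of the state populations quoted in the text, and the choice of lag time is what a Markovianity test is meant to validate.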
The activation timescale was further estimated by calculating the mean first passage time (MFPT) between the metastable states S0 and S1 along the sequential transition path. The MFPT indicated a timescale of ~ 4.5 milliseconds for SpCas9 RNP catalytic activation (Fig. 2 D). The millisecond timescale of SpCas9 catalytic activation calculated using our model aligned well with previous MD simulations and experimental observations 11 , 24 , 33 – 35 . Taken together, our extensive 160 µs unbiased MD simulations enabled construction of an MSM based on two critical distance parameters. This model effectively recapitulated the kinetics of SpCas9 RNP activation, revealing stepwise activation dynamics that prime the complex for tDNA cleavage.

Coordinated Structural Dynamics of L1 and L2 Linkers Modulate HNH Domain Relocation

To further understand the key conformational rearrangement events during stepwise activation, representative structures were extracted from the metastable ensembles ( Supplementary Fig. 4A-D ). Consistent with previous studies 8 , 12 , our simulations also suggested that the overall bilobed architecture of SpCas9 was preserved during catalytic activation, with prominent structural dynamics observed in the REC2 and HNH domains ( Supplementary Fig. 4E-G ). The HNH domain of SpCas9 consists of three N-terminal helices (referred to as 3N, including helices αN1, αN2, and αN3) and two C-terminal helices (referred to as 2C, including helices αC1 and αC2) ( Supplementary Fig. 5 ). 3N and 2C are connected by a long, disordered loop harboring the catalytic residue H840. The 3N helical region directly connects to the L1 linker through αN1, while the 2C region connects to the L2 linker via αC1. Throughout our simulations, the 2C helices remained largely stable, whereas the 3N helical region, especially αN2, underwent noticeable secondary structural rearrangements during activation ( Supplementary Fig. 6C ). In the metastable state S0 (Fig. 3 A, Supplementary Fig. 4A ), the HNH domain was located near the PAM-distal region of the sgRNA:tDNA heteroduplex, away from the scissile phosphate, rendering SpCas9 RNP catalytically inactive. The molecular basis underlying this HNH domain location is the highly ordered helical structure of the L2 linker ( Supplementary Fig. 6B ), which anchors the HNH domain in the proximity of the RuvC domain through the 2C helical region. Residues A889 and Y882 of αC2 from the tethered 2C helices interact with R783 of αN1 and Y815 of αN3 from the HNH N-terminal 3N region. Under these circumstances, the disordered loop sandwiched between the N- and C-terminal helical regions is positioned away from the catalytic core. Additionally, K878 of αC2 interacts with M822 and D825 within the disordered loop, effectively concealing the catalytic H840 residue within the NUC lobe. Activation of SpCas9 RNP is initiated as S0 transitions towards MS1 . Key events in this process included the partial folding of the L1 linker accompanied by the unfolding of the L2 linker (Fig. 3 B, Supplementary Fig. 6A, B ). With the collapse of the L2 linker helical conformation, the tethered HNH domain gained increased flexibility, which was the basis for its subsequent translocation. The partially folded L1 linker directly exerts a pulling force on its adjacent helix αN1 in the 3N helical region, resulting in ~ 40° rotation of 3N while dragging the HNH domain towards the PAM-proximal region of the heteroduplex. The rotation of the 3N helical tuple by the L1 linker, together with the coordinated unfolding of the L2 linker, further propagates to the whole HNH domain through the emerging interaction between R783 of αN1 and K890 of αC2. The disordered loop between the two helical regions swings ~ 30° anticlockwise, enabling novel interactions with αC2. Together, these events contributed to a substantial ~ 16 Å displacement (approximately 22 Å for H840) and approximately 49° rotation of the HNH domain (around the SpCas9 RNP central axis, Supplementary Fig. 4B, E, 7A, B ). 
The subsequent transition from MS1 to MS2 involved further concerted folding of the L1 linker and unfolding of the L2 linker (Fig. 3 C, F, and Supplementary Fig. 6A, B ). Continued contraction of the L1 linker keeps spinning the 3N helical tuple while pulling it towards the tDNA, which further rotates 2C through structural complementarity. Compared with S0 , the 3N helical region undergoes an approximately 90° rotation, which further repositions 2C. A new interaction forms between Y882 of αC2 and N818 of the disordered region, facilitating coordinated movement of 3N and 2C. This results in a further approximately 17 Å movement (approximately 18 Å for H840) and approximately 40° rotation of the HNH domain (around the SpCas9 RNP central axis), relocating the HNH domain to the center of the SpCas9 RNP ( Supplementary Fig. 4C, 4F, 7A, 7C ). During the transition from metastable state S0 to MS1 and then to MS2 , the HNH domain mainly undergoes horizontal displacement to reach the center of the SpCas9 RNP. However, the HNH domain is still located ~ 10 Å away from its target in metastable state MS2 . The catalytically active metastable state S1 is reached when L1 adopts a fully helical structure that anchors the HNH domain, whose flexibility is conferred by a completely extended L2 linker (Fig. 3 D, E). The highly ordered L1 linker further spins 3N by approximately 30°, shifting 2C to its lower position and thus positioning the H840-harboring disordered loop even closer to the tDNA. This conformation is further “locked” by the interaction between A889 and K890 of αC2 and R778 of the L1 linker. Notably, apart from the approximately 27° rotation (around the SpCas9 RNP central axis), the transition from MS2 to S1 mainly involved an approximately 11 Å para-axis upward shift of the HNH domain (also approximately 11 Å for H840) to dock onto the sgRNA:tDNA heteroduplex ( Supplementary Fig. 7D ). 
Taken together, through analyzing the experimentally inaccessible transition states, we found that the HNH domain underwent synchronous relocation and rotation during activation. Compared to the catalytically incompetent metastable state S0 , the HNH N-terminal helical region 3N undergoes approximately 120° counterclockwise rotation ( Supplementary Fig. 7A ), which drives approximately 27 Å displacement (approximately 37 Å for the H840 catalytic residue), accompanied by approximately 104° rotation (around the SpCas9 RNP central axis) of the HNH domain to approach its target nucleotide and reach the catalytically competent state S1 . Interestingly, our model also suggested that stepwise relocation of the HNH domain begins with a horizontal motion to approach the central canal ( Supplementary Fig. 4E, F ), followed by a shift upward to approach the tDNA ( Supplementary Fig. 4G ). Moreover, our MD simulation-based study revealed a mechanism by which the cooperative structural dynamics of the L1 and L2 linkers drive relocation of the HNH domain. The contracting L1 linker rotates its adjacent HNH N-terminal helical region 3N, which, through structural complementarity, spins the HNH C-terminal helical region 2C, resembling the way a “gear” (3N) orients its matching “wedge” (2C). Meanwhile, the disassembling L2 linker releases its restraint on 2C to enable its displacement, further permitting the H840-harboring disordered loop sandwiched between the two helical regions to relocate near the nucleotide substrate. Hence, the structural dynamics of the L1 and L2 linkers play a pivotal role in SpCas9 RNP catalytic activation.

Coupled Motion between REC2 and HNH Domains Drives SpCas9 RNP Catalytic Activation

The REC lobe mainly facilitates the binding and recognition of nucleotide substrates; however, recent studies have suggested a role in mediating SpCas9 RNP catalytic activation. 
In line with experimental observations, in our model, as SpCas9 RNP moves towards activation along the S0 → MS1 → MS2 → S1 transition path, the REC2 domain gradually moves away from the sgRNA:tDNA heteroduplex, thus providing space for accommodating the HNH domain (Fig. 2 D). The pattern of motion correlation between the residues during catalytic activation was analyzed using a generalized cross-correlation matrix (GCCM) algorithm 36 . In general, the inter-residue correlation decreased when trending towards the active state, and the highest level of correlation was found in the S0 metastable state ( Supplementary Fig. 8A, B ), suggesting a tight information flow within the complex in the catalytically inactive state. The coordinated motion between the REC2 and HNH domains was retained throughout activation (Fig. 4 ), supporting the importance of the REC2 domain in regulating HNH domain relocation. As catalytic activation progressed, the correlated motion between the REC2 domain and the L2 linker gradually diminished, whereas that between REC2 and the L1 linker persisted. This observation aligns well with the folding of the L1 linker and the unfolding of the L2 linker during activation. To understand the essential structural events underlying the motional correlation between the REC2 and HNH domains, we looked more closely at the interfacing residues between the two domains (Fig. 3 ). In the inactive metastable state, S0 , the REC2 domain docked close to the central sgRNA:tDNA heteroduplex immediately upstream of the target nucleotide, thus occluding the HNH domain from accessing the tDNA. At this stage, E197 and E198 of REC2 interact with R780 and K782 of αN1 in the HNH domain N-terminal region 3N, stabilizing the HNH domain in its position distant from the cleavage site (Fig. 3 A, Supplementary Fig. 4A ). As activation is initiated, the HNH domain begins to rotate due to the structural plasticity of the L1 and L2 linkers, thereby changing its interfacing residues with REC2. 
In the intermediate metastable conformation MS1 (Fig. 3 B, Supplementary Fig. 4B, E ), HNH rotation results in displacement of αN1, weakening its interaction with the REC2 domain. The rotation of 3N brings αN2 close to REC2, allowing contacts to be established between E802 and Q805 of αN2 and R220 and S219 in the REC2 domain. With further contraction of the L1 linker, SpCas9 RNP transitions to metastable state MS2 (Fig. 3 C, Supplementary Fig. 4C, F ), in which αN1 rotates away from the REC2 domain. In this state, a transient interaction forms between E223 of REC2 and K797 of αN2, contributing to stabilization of the HNH domain in a transitional metastable state. Ultimately, the HNH domain approaches the tDNA, which is enabled by the approximately 11 Å outward relocation of the REC2 domain away from the central heteroduplex (Fig. 2 D). In the corresponding metastable ensemble S1 (Fig. 3 D, Supplementary Fig. 4D, G ), R221 of REC2 interacts with E798 of αN2. The REC2 domain undergoes its most prominent outward shift during the transition to metastable state S1 (Fig. 2 D), and this displacement is well coordinated with the ~ 11 Å vertical shift (parallel to the SpCas9 RNP central axis) of the HNH domain during the MS2 → S1 transition to dock onto the target scissile phosphate. Interestingly, αN2 provides the major interactions between the HNH and REC2 domains once the HNH domain starts to rotate under L1 linker contraction during catalytic activation. As αN2 is also the only component exhibiting prominent conformational flexibility within the HNH N-terminal region 3N ( Supplementary Fig. 6C ), it is likely that disassembly of the αN2 helical structure into a flexible loop enables its versatile interactions with REC2 interfacing residues. These transient interactions can assist in stabilizing the HNH domain at different spatial positions during catalytic activation. 
In summary, together with the structural plasticity of the L1 and L2 linkers, the REC2 domain relocated from the SpCas9 RNP center in a coordinated manner. Furthermore, the departing REC2 domain establishes different interactions with the HNH N-terminal helical regions, in which αN2 plays an important role, to dictate and guide the HNH domain to the cleavage site.

In vitro and Intracellular Experimental Validation of SpCas9 RNP Stepwise Activation Model

Based on the MSM established with the unprecedented 160 µs unbiased MD simulations, we disclosed the importance of the L1 and L2 linkers in driving HNH domain relocation, which was further dictated by the REC2 domain. Our results suggested that the synergistic folding and unfolding of the L1 and L2 linkers are key events that promote catalytic activation. The REC2 domain, located distant from the HNH domain on the other side of the heteroduplex, contributes to HNH domain relocation through cooperative motion and establishes distinctive interactions during stepwise activation. To further validate our proposed model, we designed a green fluorescent protein (GFP) reporter system in mammalian cells to evaluate the SpCas9 RNP activation capacity 37 – 39 . Briefly, a reporter plasmid (GFxFP reporter plasmid) was designed, in which the cDNA encoding GFP was separated into GF and FP segments by a fragment that contained an early stop codon. The GFxFP reporter plasmid was co-transfected into HEK293T cells with a plasmid encoding the SpCas9 endonuclease and an sgRNA sequence. The sgRNA sequence guided SpCas9 cleavage in the GFxFP reporter plasmid. Based on the overlapping homology of the GF and FP segments, the cleaved GFxFP reporter plasmid restored the full-length GFP cDNA via complementarity-directed single-strand annealing (Fig. 5 A), thus enabling the expression of GFP. Therefore, the SpCas9 catalytic capacity could be inferred from the population of GFP-positive HEK293T cells through fluorescence-activated cell sorting (FACS). 
In addition to the cellular fluorescence-based reporter plasmid, we also reconstituted the SpCas9 RNP cleavage reaction in vitro using the synthesized double-stranded DNA oligos as a substrate 40 . Activated SpCas9 RNP introduces a double-stranded break in the substrate DNA, which can be resolved by gel electrophoresis as cleaved and uncleaved nucleotide bands. Under these circumstances, the ratio between the cleaved and uncleaved nucleotide substrates directly reflects the catalytic activation potential of SpCas9 RNP (Fig. 5 B). According to our model, the L2 linker restrained HNH domain at its helical conformation, and the structural flexibility of the L2 linker enables HNH domain relocation. G906 and G907 of the L2 linker were highly dynamic during the simulation. As the least bulky amino acids, G906 and G907 are expected to provide the L2 linker structural flexibility during the unfolding process. Therefore, we generated SpCas9 mutants in which the two glycines were substituted with bulky phenylalanine/tryptophan (SpCas9 FF : G906F G907F, SpCas9 WW : G906W G907W; Fig. 5 C) to decrease L2 linker flexibility. As expected, HEK293T cells transfected with SpCas9 FF and SpCas9 WW mutants exhibited a reduced GFP-positive population compared to the wild-type (WT) SpCas9. We also observed more uncleaved tDNA substrates in the SpCas9 cleavage assay using these L2 linker mutants (Fig. 5 D-G). However, mutation of adjacent non-glycine residues (SpCas9 L908A and SpCas9 L909A ) did not induce a noticeable change in SpCas9 RNP catalytic activity, further validating the importance of L2 linker flexibility in the activation process ( Supplementary Fig. 9 ). The cellular and in vitro experiments confirmed that SpCas9 RNP activation was impaired when the flexibility of the L2 linker was restrained. Similarly, for the REC2 domain, we identified two loops, 196 FEENPIN 202 and 230 PGEKKN 235 connecting adjacent regions interfacing with the HNH domain during catalytic activation. 
Mutation (substituting all residues of the two loops with alanine, SpCas9 AA ) and truncation (deleting the two loops, SpCas9 trunc ) significantly inhibited SpCas9 RNP activation both in vitro and under cellular conditions (Fig. 5 D-G), confirming the importance of the REC2 domain in dictating the catalytic activation process of SpCas9 RNP. It is worth noting that mutations in the L2 linker and REC2 domain did not significantly change either the thermostability or secondary structure composition of the SpCas9 endonuclease ( Supplementary Fig. 10 ), indicating that the effects we observed resulted from the disrupted activation process of SpCas9 RNP per se . In summary, using both in vitro and cellular experimental approaches, we demonstrated the importance of the L2 linker structural plasticity and the REC2 domain in the catalytic activation of SpCas9 RNP, providing further support for our stepwise activation model of SpCas9 RNP.

Discussion

In this work, we integrated MD simulations with both an in vitro cleavage assay and a cellular fluorescence-based reporter to delineate how SpCas9 transitions from its pre-catalytic to the catalytic state. Through the construction of an MSM encompassing an unprecedented 160 µs of simulation data, we uncovered four metastable states ( S0 , MS1 , MS2 , and S1 ) that form a sequential activation pathway. A central feature of this activation process is the coupled folding and unfolding dynamics of the L1 and L2 linkers, which coordinate the approximately 40 Å large-scale relocation of the HNH domain towards the tDNA strand. Our MD simulations show that L1 folds into a more ordered α-helix, exerting a pulling and rotational force on the HNH N-terminal helical region 3N. Concurrently, L2 undergoes helix disassembly, relieving its restraint on the HNH C-terminal helical region 2C, thereby facilitating the concerted domain reorganization required for catalysis. 
This “gear-and-wedge” coupling of 3N and 2C — powered by the opposing folding states of L1 and L2 — accounts for the striking amplitude of HNH domain motion, directing HNH domain to first undergo horizontal movement followed by vertical shift to precisely relocate the catalytic residue H840 at the target nucleotide (Fig. 6 ). Our model underscores the critical role of the REC2 domain, which cooperates with the HNH domain during activation. Initially positioned in close proximity to the sgRNA:tDNA heteroduplex and sterically hindering the HNH domain from engaging the scissile phosphate, REC2 gradually moved outward in synchrony with the conformational changes in the HNH domain. Along this path, transient interactions between HNH αN2 and several REC2 loop residues stabilize intermediate metastable states MS1 and MS2 , ensuring a smooth, stepwise transition to the fully active S1 ensemble. The importance of these structural elements (L1, L2, and REC2) in regulating SpCas9 activation was further validated by complementary experiments. Mutations that disrupt L2 linker flexibility (SpCas9 FF and SpCas9 WW ) significantly impair HNH domain relocation, leading to reduced cleavage efficiency, both under cellular conditions and in vitro . Likewise, alanine replacement or truncation of key REC2 loops leads to the deleterious inhibition of SpCas9-mediated DNA cleavage. These observations confirm that perturbations in either linker plasticity or REC2 integrity abrogate proper SpCas9 RNP activation, while leaving the overall protein stability largely intact. Interestingly, we observed a small population of HEK293T cells with visibly elevated GFP signals (GFP-high), which were readily distinguishable by FACS (Fig. 5 D, Supplementary Fig. 9 ). The presence of this GFP-high population may reflect a heightened activation state that enables more rapid and efficient gene editing. 
Notably, this subgroup was absent in cells expressing SpCas9 mutants, even when GFP-positive cells were detectable overall. These observations raise the possibility that REC2 and the L1 and L2 linkers not only facilitate the initial catalytic transition but may also help sustain robust editing activity. Ongoing investigations in our laboratory aim to clarify whether the integrity of these regions is crucial for maintaining long-term or repeated rounds of SpCas9-mediated DNA cleavage. Taken together, our findings provide a high-resolution view of the SpCas9 RNP activation mechanism, highlight the dynamic roles of the L1 and L2 linkers and the REC2 domain, and demonstrate how their coordinated conformational dynamics enable catalytic activation. In addition to advancing the fundamental understanding of SpCas9 function, this mechanistic framework may inform targeted protein engineering strategies aimed at modulating Cas9’s kinetic profile and specificity. By tuning linker dynamics or recalibrating key domain interactions, it may be possible to develop Cas9 variants optimized for diverse applications ranging from gene therapy to synthetic biology. More broadly, our approach of integrating large-scale MD simulations with biochemical and cellular assays underscores the power of combining computational and experimental methods to elucidate complex transient processes in biomolecular machines.

Materials and Methods

Minimum Energy Path Exploration with Nudged Elastic Band Sampling

The cryo-EM structures of the SpCas9:sgRNA:tDNA ternary complex in its pre-catalytic/inactive (PDB ID: 6O0Z) and catalytic/active (PDB ID: 6O0Y) states were used in the current study 10, with missing residues built based on sequence homology with available structures or remodeled in SWISS-MODEL 41. The resulting systems were first subjected to a 10,000-step minimization using the steepest descent algorithm with the CHARMM force field to eliminate potential structural clashes 42.
To explore the conformational transition path between the inactive and active SpCas9 RNP endpoint structures, the nudged elastic band (NEB) method was employed 18. The NEB method represents the transition pathway from inactive to active SpCas9 RNP as an elastic band consisting of discretized snapshots interconnected by virtual springs. The springs impose forces on the snapshots they connect to ensure that the snapshots do not slide towards each other and remain evenly distributed along the reaction path, thereby forming a minimum energy path (MEP) for the conformational transition between the two designated end structures. The total force on intermediate snapshot \(i\) (\(F_i\)) can be orthogonally decomposed into parallel (\(F_i^{\parallel}\)) and perpendicular (\(F_i^{\perp}\)) components:

$$F_i = F_i^{\parallel} + F_i^{\perp} \tag{1}$$

For a system that consists of \(N\) atoms, the 3N-dimensional coordinate vector of intermediate snapshot \(i\) (\(R_i\)) and the 3N-dimensional tangent unit vector (\(\tau_i\)) can be calculated, which enables calculation of the parallel force \(F_i^{\parallel}\) and the perpendicular force \(F_i^{\perp}\) through the following equations:

$$F_i^{\parallel} = \left[\left(k_{i+1}\left(R_{i+1}-R_i\right) - k_i\left(R_i-R_{i-1}\right)\right)\cdot\tau_i\right]\tau_i \tag{2}$$

$$F_i^{\perp} = -\nabla V\left(R_i\right) + \left(\nabla V\left(R_i\right)\cdot\tau_i\right)\tau_i \tag{3}$$

where \(k_i\) denotes the elastic spring constant between intermediate snapshots \(i-1\) and \(i\), and \(\nabla V\left(R_i\right)\) denotes the gradient of the potential energy of the entire system with respect to the coordinate vector of intermediate snapshot \(i\).
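The force decomposition in Eqs. 1 to 3 can be illustrated with a small, self-contained sketch. This is not the AMBER implementation: the function name `neb_forces`, the helper `dot`, and the toy two-dimensional numbers are hypothetical, and the tangent is approximated here (as an assumption) by the normalized vector between the two neighbouring snapshots, whereas production NEB codes typically use more elaborate tangent estimates.

```python
import math

def dot(a, b):
    """Dot product of two equal-length coordinate vectors."""
    return sum(x * y for x, y in zip(a, b))

def neb_forces(r_prev, r_i, r_next, grad_v, k_i, k_next):
    """Return (F_parallel, F_perpendicular, tau) for one intermediate snapshot.

    r_prev, r_i, r_next: coordinates of snapshots i-1, i, i+1
    grad_v: potential energy gradient evaluated at snapshot i
    k_i, k_next: spring constants k_i and k_{i+1} as they appear in Eq. 2
    """
    # Assumed tangent estimate: unit vector from snapshot i-1 to snapshot i+1.
    t = [n - p for n, p in zip(r_next, r_prev)]
    norm = math.sqrt(dot(t, t))
    tau = [x / norm for x in t]

    # Eq. 2: net spring force, projected onto the tangent.
    spring = [k_next * (n - c) - k_i * (c - p)
              for p, c, n in zip(r_prev, r_i, r_next)]
    s = dot(spring, tau)
    f_par = [s * x for x in tau]

    # Eq. 3: true force (-grad V) with its tangential component removed.
    g = dot(grad_v, tau)
    f_perp = [-gv + g * x for gv, x in zip(grad_v, tau)]
    return f_par, f_perp, tau

# Toy 2-D example with made-up coordinates and gradient.
f_par, f_perp, tau = neb_forces(
    r_prev=(0.0, 0.0), r_i=(1.0, 0.5), r_next=(3.0, 0.0),
    grad_v=(0.1, 1.0), k_i=1.0, k_next=1.0,
)
total = [a + b for a, b in zip(f_par, f_perp)]  # Eq. 1
```

In this toy case the spring term only pushes the snapshot along the band, while the perpendicular term only relaxes it towards lower energy; the two components are orthogonal by construction, which is what keeps snapshots evenly spaced without dragging the band off the minimum energy path.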
In our study, the NEB suite within AMBER20 43,44 was used to generate 30 intermediate structures bridging inactive and active SpCas9 RNP. The process included heating the systems to 300 K with a 0.5 fs time step, a spring force constant of 1 kcal/mol/Å, and a Langevin collision frequency of 1 ns⁻¹. During the subsequent equilibration, annealing, and cooling steps, a spring force constant of 50 kcal/mol/Å was used. Equilibration runs of the replicas were performed at 300 K with a time step of 1 fs. During the annealing process, the systems were first heated to 500 K and then gradually cooled to 0 K with a 0.5 fs time step. Finally, the replicas were cooled completely at 0 K for 2 ns with a time step of 1 fs.

MD Simulation Setup

Thirty intermediate snapshots, together with the two end structures, were subjected to unbiased MD simulations. The LEaP program was used to prepare the structures, and the ff14SB force field was employed to describe the ribonucleoprotein complex 45. All systems were first solvated in an orthorhombic transferable intermolecular potential three-point (TIP3P) water box, followed by the addition of Na+ and Cl− counterions to neutralize the system while mimicking in vivo physiological cleavage conditions. Two rounds of energy minimization were carried out: the first with the whole protein scaffold fixed (maximum of 5000 steps) and the second with all restraints removed (maximum of 10,000 steps). Subsequently, all systems were heated from 0 K to 300 K within 300 ps and then equilibrated in the canonical (NVT) ensemble for 700 ps. Finally, 10 independent 500 ns classical unbiased MD simulations were performed on each of the 32 systems in the isothermal-isobaric (NPT) ensemble with periodic boundary conditions, generating 320 independent trajectories and accumulating 160 µs of conformational sampling in total. Langevin dynamics with a collision frequency of 1 ps⁻¹ was applied to control the temperature during the simulations.
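As a quick consistency check on the sampling totals quoted above, the arithmetic can be written out explicitly. The variable names are illustrative only; the 50 ps snapshot interval is the value stated in this simulation setup, and the frame counts derived from it are not reported by the authors.

```python
# Numbers taken from the simulation setup described in the text.
n_systems = 30 + 2            # 30 NEB intermediates plus the two end structures
replicas_per_system = 10      # independent runs per starting structure
length_ns = 500               # length of each run in nanoseconds
save_interval_ps = 50         # snapshot interval in picoseconds

n_trajectories = n_systems * replicas_per_system          # 320 trajectories
total_us = n_trajectories * length_ns / 1000               # 160 microseconds

# Derived quantities (not stated in the text): frames saved per run and overall.
frames_per_trajectory = length_ns * 1000 // save_interval_ps
total_frames = n_trajectories * frames_per_trajectory
```

Under these numbers each trajectory contributes 10,000 snapshots, so the full data set would hold on the order of 3.2 million frames before any subsampling.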
Long-range electrostatic interactions were treated using the Particle Mesh Ewald method, and a 10 Å nonbonded cutoff was applied for short-range electrostatic and van der Waals interactions. Covalent bonds involving hydrogen atoms were constrained using the SHAKE algorithm. Snapshots were written out every 50 ps 43, 44.

Markov State Model Construction

Integrating Markov state modeling (MSM) with MD simulations is gaining popularity for the efficiency and accuracy it offers in interpreting biomolecular dynamics, and results from this combination have been corroborated by experimental techniques. The Python library PyEMMA was used for the estimation, validation, and analysis of the MSM based on the simulation trajectories 46. Implied timescale tests indicated that the activation process of SpCas9 RNP was Markovian with an 800-microstate model and a lag time of 10 ns. The microstates were then clustered into four macrostates using the PCCA+ algorithm, and the resulting model was validated with the Chapman–Kolmogorov test. Using transition path theory, we analyzed the transition probability matrix of the MSM and computed the mean first-passage times between macrostates. Trajectories close to the microstate cluster centers were extracted using the mdtraj package as representative trajectories for each metastable ensemble. Representative conformations of the different metastable states were obtained from these representative trajectories 47.

Generalized Cross Correlation Analysis

Generalized cross-correlation matrix (GCCM) analysis, as proposed by Grubmüller and Lange 36, was employed to capture both linear and nonlinear correlated motions between residues.
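For intuition about how a mutual-information-based coefficient behaves, the sketch below uses the mapping that this section gives as Eq. 5, GC = sqrt(1 − exp(−2·MI/d)), together with the inter-domain sum of Eq. 6. It is an illustration under stated assumptions, not the g_correlation implementation: the function names are hypothetical, the mutual information is taken from the closed-form expression for two one-dimensional jointly Gaussian variables (so d = 1 here rather than the d = 3 used in the study), the toy matrix is made up, and applying the 0.65 threshold as a filter inside the sum is our reading of the text.

```python
import math

def gaussian_mi(rho):
    """Mutual information (in nats) of two 1-D jointly Gaussian variables
    with Pearson correlation rho (closed-form textbook result)."""
    return -0.5 * math.log(1.0 - rho ** 2)

def generalized_corr(mi, d):
    """Map mutual information onto a coefficient in [0, 1) as in Eq. 5."""
    return math.sqrt(1.0 - math.exp(-2.0 * mi / d))

def domain_correlation(gc, dom_x, dom_y, threshold=0.65):
    """Sum GC_ij over residue pairs i in X, j in Y (Eq. 6), keeping only
    pairs above the threshold (assumed interpretation of the cutoff)."""
    return sum(gc[i][j] for i in dom_x for j in dom_y if gc[i][j] > threshold)

# For Gaussian variables with d = 1 the coefficient reduces to |rho|.
gc_gauss = generalized_corr(gaussian_mi(0.8), d=1)

# Toy 3-residue generalized correlation matrix (made-up values).
gc_matrix = [
    [1.0, 0.7, 0.2],
    [0.7, 1.0, 0.9],
    [0.2, 0.9, 1.0],
]
gc_domain = domain_correlation(gc_matrix, dom_x=[0], dom_y=[1, 2])
```

The Gaussian case shows why the coefficient is a natural generalization of the Pearson correlation: it recovers the absolute linear correlation when the dependence is linear, while still responding to nonlinear dependence that a covariance-based measure would miss.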
GCCM adopts the fundamental definition of independence of random variables: variables \(x_i\) and \(x_j\) are independent only when their joint distribution \(p(x_i, x_j)\) equals the product of their marginal distributions \(p(x_i)\cdot p(x_j)\), and are correlated otherwise. The mutual information (\(MI\)) between \(x_i\) and \(x_j\) is thus defined as

$$MI\left(x_i, x_j\right) = \iint p\left(x_i, x_j\right)\ln\frac{p\left(x_i, x_j\right)}{p(x_i)\cdot p(x_j)}\,dx_i\,dx_j \tag{4}$$

With the g_correlation tool in Gromacs, the generalized correlation coefficient between residue \(i\) and residue \(j\) (\(GC_{ij}\)) can then be calculated through

$$GC_{ij} = \left\{1 - e^{-\frac{2MI\left(x_i, x_j\right)}{d}}\right\}^{\frac{1}{2}} \tag{5}$$

in which \(d\) represents the dimensionality of \(x_i\) and \(x_j\), which is equal to three in our study. To further represent the extent to which the domains are correlated with each other, we introduced the inter-domain correlation \(GC_{XY}^{domain}\) between domains X and Y, which can be calculated as

$$GC_{XY}^{domain} = \sum_{i\in X,\, j\in Y} GC_{ij} \tag{6}$$

In our study, only \(GC_{ij}\) values above the threshold of 0.65 were taken into account.

Cell Culture and Transfection

HEK293T cells were used in our study and maintained in Dulbecco's modified Eagle's medium (DMEM) supplemented with 10% FBS at 37 °C in 5% CO2. One day before transfection, cells were trypsinized and seeded at 1.0 × 10⁵ cells/well in 24-well plates. A total of 1 µg of plasmid DNA (0.7 µg of SpCas9/sgRNA plasmid and 0.3 µg of GFxFP reporter plasmid) was co-transfected into HEK293T cells at ~60% confluence with 2 µL of ExFect Transfection Reagent (Vazyme Biotech, Nanjing, China) according to the manufacturer's instructions.
Cells were harvested 24, 48, and 72 h after transfection, and a CytoFlex flow cytometer (Beckman) was used to analyze EGFP fluorescence.

Plasmids and Reagents

The cDNA for SpCas9 endonuclease purification from E. coli was purchased from Saiheng Biological Technology (Shanghai, China) and inserted into the pET28a vector after fusion with an N-terminal 6×His tag. Mutations were introduced into the SpCas9 endonuclease using the Mut Express II Fast Mutagenesis Kit V2 (Vazyme Biotech, Nanjing, China) and verified by DNA sequencing (Personalbio, Shanghai, China). The nucleotide sequences used in our study are available on request. The cDNAs for in vivo expression of SpCas9, sgRNA, and the PAM sequence (between the GF and FP segments) were also purchased from Saiheng Biological Technology and inserted into the pCMV or pBFP vector by Golden Gate assembly. The sgRNAs were synthesized by GENEWIZ (Suzhou, China). Target double-stranded DNA (tDNA) substrates for the in vitro cleavage assay were synthesized by Saiheng Biological Technology (Shanghai, China).

SpCas9 Endonuclease Purification

Chemically competent E. coli Rosetta (DE3) cells (Weidi Biotechnology) were transformed with wild-type or mutant SpCas9 endonuclease constructs. A single colony was picked to inoculate 2xYT medium containing 50 µg/mL kanamycin at 37 °C. IPTG (0.5 mM) was added to the bacterial culture when the OD600 reached 0.6–0.8, followed by induction of protein expression at 16 °C for 16–18 h. To harvest the proteins, the bacteria were lysed under high pressure in lysis buffer (20 mM HEPES, 500 mM NaCl, pH 7.5), and the lysate was loaded onto a nickel column (GE Healthcare, Buckinghamshire, UK). After washing with 40 mM imidazole, the proteins were eluted with 250 mM imidazole. The eluate was then dialyzed against storage buffer (20 mM HEPES, 100 mM NaCl, pH 7.5). The purified protein was snap frozen in liquid nitrogen and stored at −80 °C.
In Vitro DNA Cleavage Assays

The 1579-bp tDNA substrate for the in vitro cleavage assay contained the target and PAM sequences. In vitro cleavage reactions were performed in 20 µL of reaction buffer (20 mM HEPES, 100 mM KCl, 1 mM DTT, 10 mM MgCl2, pH 7.5) containing 5 nM linearized tDNA substrate, 200 nM purified wild-type or mutant SpCas9, and 200 nM sgRNA. The reaction was incubated at 37 °C for 60 min and quenched by the addition of 50 mM EDTA and 20 µg Proteinase K for 30 min at room temperature. The products were analyzed by electrophoresis on a 1% agarose, 0.5x TBE gel stained with 4S red plus dye (Sangon, Shanghai, China). The gels were imaged using a Tanon-3500 gel imaging system (Tanon, Shanghai, China) and quantified using ImageJ software.

Protein Thermal Shift Assays

Mixtures of 10 µM wild-type or mutant SpCas9 with 5x SYPRO orange dye (Sigma-Aldrich, St. Louis, MO, USA) were prepared in 1x PBS solution. Samples were analyzed using a LightCycler 480 real-time PCR instrument system II (Roche, Basel, Switzerland). The temperature was gradually increased at a rate of 0.05 °C/s over a range of 25–95 °C while the fluorescence was monitored through the SYPRO orange channel. The melting temperature (Tm) was calculated from the melting curve using LightCycler 480 software (Roche).

CD Spectroscopy

CD spectra were measured on a Chirascan and Chirascan-plus Circular Dichroism spectrometer (Applied Photophysics, Leatherhead, Surrey, UK) using a 1 mm path length quartz cuvette. The samples were prepared at 0.2 a concentration of ddH2O. Ten scans were performed for each sample, and three independent experiments were performed.

Data Analysis

Significance levels for comparisons between groups were determined using a paired two-tailed Student’s t-test in GraphPad Prism version 7.00 (La Jolla, CA, USA).
Declarations

Supporting Information

Cryo-EM structures and simulated snapshots for SpCas9 structures, Markovianity of the established model, representative structures extracted from the metastable ensembles, secondary structure analysis, structural dynamics of the HNH domain during activation, correlated motion within SpCas9 during activation, cellular experiments, protein thermal shift assays, and CD spectroscopy.

Acknowledgements

This study was supported by grants from the National Key R&D Program of China (No. 2023YFC3404700), the National Natural Science Foundation of China (No. 22077082 and No. 81925034), and the Innovative Research Team of High-Level Local Universities in Shanghai.

Conflicts of interest

The authors declare no conflicts of interest regarding this manuscript.

Data availability

Starting structures (PDB ID: 6O0Z, 6O0Y) were obtained from the RCSB PDB database [https://www.rcsb.org/]. NEB calculations within the AMBER suite were performed to obtain the initial intermediate structures. MD simulations were performed with the AMBER suite [https://ambermd.org/]. All 32 structures for MD simulation, as well as scripts for setting simulation parameters, are provided in the Source data file. The analysis protocol for the Markov state model followed PyEMMA [http://www.emma-project.org/latest/].

References

1. Doudna JA (2020) The promise and challenge of therapeutic genome editing. Nature 578:229–236
2. Anzalone AV, Koblan LW, Liu DR (2020) Genome editing with CRISPR–Cas nucleases, base editors, transposases and prime editors. Nat Biotechnol 38:824–844
3. Mali P et al (2013) RNA-guided human genome engineering via Cas9. Science 339:823–826
4. Deveau H, Garneau JE, Moineau S (2010) CRISPR/Cas system and its role in phage-bacteria interactions. Annu Rev Microbiol 64:475–493
5. Doudna JA, Charpentier E (2014) The new frontier of genome engineering with CRISPR-Cas9. Science 346:1258096
6. Li T et al (2023) CRISPR/Cas9 therapeutics: progress and prospects. Signal Transduct Target Ther 8:189
7. Hsu PD, Lander ES, Zhang F (2014) Development and applications of CRISPR-Cas9 for genome engineering. Cell 157:1262–1278
8. Nishimasu H et al (2014) Crystal structure of Cas9 in complex with guide RNA and target DNA. Cell 156:935–949
9. Anders C, Niewoehner O, Duerst A, Jinek M (2014) Structural basis of PAM-dependent target DNA recognition by the Cas9 endonuclease. Nature 513:569–573
10. Zhu X et al (2019) Cryo-EM structures reveal coordinated domain motions that govern DNA cleavage by Cas9. Nat Struct Mol Biol 26:679–685
11. Palermo G et al (2018) Key role of the REC lobe during CRISPR-Cas9 activation by sensing, regulating, and locking the catalytic HNH domain. Q Rev Biophys 51:e9
12. Jinek M et al (2014) Structures of Cas9 endonucleases reveal RNA-mediated conformational activation. Science 343:1247997
13. Babu K et al (2021) Coordinated actions of Cas9 HNH and RuvC nuclease domains are regulated by the bridge helix and the target DNA sequence. Biochemistry 60:3783–3800
14. Jiang F et al (2016) Structures of a CRISPR-Cas9 R-loop complex primed for DNA cleavage. Science 351:867–871
15. Jiang F, Zhou K, Ma L, Gressel S, Doudna JA (2015) A Cas9-guide RNA complex preorganized for target DNA recognition. Science 348:1477–1481
16. Bravo JPK et al (2022) Structural basis for mismatch surveillance by CRISPR–Cas9. Nature 603:343–347
17. Pacesa M et al (2022) R-loop formation and conformational activation mechanisms of Cas9. Nature 609:191–196
18. Bergonzo C, Campbell AJ, Walker RC, Simmerling C (2009) A partial nudged elastic band implementation for use with large or explicitly solvated systems. Int J Quantum Chem 109:3781–3790
19. Ran FA et al (2013) Double nicking by RNA-guided CRISPR Cas9 for enhanced genome editing specificity. Cell 154:1380–1389
20. Cong L et al (2013) Multiplex genome engineering using CRISPR/Cas systems. Science 339:819–823
21. Li M et al (2024) Delineating the stepwise millisecond allosteric activation mechanism of the class C GPCR dimer mGlu5. Nat Commun 15:7519
22. Li X et al (2021) Atomic-scale insights into allosteric inhibition and evolutional rescue mechanism of Streptococcus thermophilus Cas9 by the anti-CRISPR protein AcrIIA6. Comput Struct Biotechnol J 19:6108–6124
23. Lu S et al (2021) Activation pathway of a G protein-coupled receptor uncovers conformational intermediates as targets for allosteric drug design. Nat Commun 12:4721
24. Palermo G et al (2017) Protospacer adjacent motif-induced allostery activates CRISPR-Cas9. J Am Chem Soc 139:16028–16031
25. Palermo G, Miao Y, Walker RC, Jinek M, McCammon JA (2016) Striking plasticity of CRISPR-Cas9 and key role of non-target DNA, as revealed by molecular simulations. ACS Cent Sci 2:756–763
26. Palermo G, Miao Y, Walker RC, Jinek M, McCammon JA (2017) CRISPR-Cas9 conformational activation as elucidated from enhanced molecular simulations. Proc Natl Acad Sci USA 114:7260–7265
27. Lu S et al (2019) Deactivation pathway of Ras GTPase underlies conformational substates as targets for drug design. ACS Catal 9:7188–7196
28. Schultze S, Grubmüller H (2021) Time-lagged independent component analysis of random walks and protein dynamics. J Chem Theory Comput 17:5766–5776
29. Ikotun AM, Ezugwu AE, Abualigah L, Abuhaija B, Heming J (2023) K-means clustering algorithms: a comprehensive review, variants analysis, and advances in the era of big data. Inf Sci 622:178–210
30. Wu H, Noé F (2020) Variational approach for learning Markov processes from time series data. J Nonlinear Sci 30:23–66
31. Deuflhard P, Weber M (2005) Robust Perron cluster analysis in conformation dynamics. Linear Algebra Appl 398:161–184
32. Kube S, Weber M (2007) A coarse graining method for the identification of transition rates between molecular conformations. J Chem Phys 126:024103
33. Shibata M et al (2017) Real-space and real-time dynamics of CRISPR-Cas9 visualized by high-speed atomic force microscopy. Nat Commun 8:1430
34. Ricci CG et al (2019) Deciphering off-target effects in CRISPR-Cas9 through accelerated molecular dynamics. ACS Cent Sci 5:651–662
35. East KW et al (2020) Allosteric motions of the CRISPR-Cas9 HNH nuclease probed by NMR and molecular dynamics. J Am Chem Soc 142:1348–1358
36. Lange OF, Grubmüller H (2006) Generalized correlation for biomolecular dynamics. Proteins 62:1053–1061
37. Wu Z et al (2021) Programmed genome editing by a miniature CRISPR-Cas12f nuclease. Nat Chem Biol 17:1132–1138
38. Zhang H et al (2023) An engineered xCas12i with high activity, high specificity, and broad PAM range. Protein Cell 14:538–543
39. Nierzwicki Ł et al (2021) Enhanced specificity mutations perturb allosteric signaling in CRISPR-Cas9. eLife 10:e73777
40. Hsu PD et al (2013) DNA targeting specificity of RNA-guided Cas9 nucleases. Nat Biotechnol 31:827–832
41. Waterhouse A et al (2018) SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Res 46:W296–W303
42. Brooks BR et al (2009) CHARMM: the biomolecular simulation program. J Comput Chem 30:1545–1614
43. Case DA et al (2023) AmberTools. J Chem Inf Model 63:6183–6191
44. Roe DR, Cheatham TE (2013) Ptraj and cpptraj: software for processing and analysis of molecular dynamics trajectory data. J Chem Theory Comput 9:3084–3095
45. Maier JA et al (2015) ff14SB: improving the accuracy of protein side chain and backbone parameters from ff99SB. J Chem Theory Comput 11:3696–3713
46. Scherer MK et al (2015) PyEMMA 2: a software package for estimation, validation, and analysis of Markov models. J Chem Theory Comput 11:5525–5542
47. McGibbon RT et al (2015) MDTraj: a modern open library for the analysis of molecular dynamics trajectories. Biophys J 109:1528–1532

Additional Declarations

There is NO Competing Interest.

Supplementary Files

Supplementarymaterial20250212.docx
As a division of Research Square Company, we’re committed to making research communication faster, fairer, and more useful. We do this by developing innovative software and high quality services for the global research community. Our growing team is made up of researchers and industry professionals working together to solve the most critical problems facing scientific publishing. Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-6018412","acceptedTermsAndConditions":true,"allowDirectSubmit":true,"archivedVersions":[],"articleType":"Article","associatedPublications":[],"authors":[{"id":423443202,"identity":"2e9d20fc-247b-405f-8773-1e9a669a5dca","order_by":0,"name":"Shaoyong Lu","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAA40lEQVRIiWNgGAWjYBACPgYGNiiT+QCQOEBYCxtCC1sCVAsz0Vp4DIjUIpH+7MGPisNy5vxrvkl8qLnDYM7ej991bBI55oY9Zw4bW854u01yxrFnDJY9hwnZksMmwdt2OHHDjbPbpHnYDjMY3Egm7DDJv2AtZ55J//kH1HL/MSEtCWbSYFvO97BJM7aBbCHkfZ43ZtIyZ9KNDW6wGVv29h3mMTiTbIBXCz870GFvKqzlDM4ffnjjx7fDcgbHDz7Abw0ENDMwSCSwSABZPMQoB4E6oH0HmD8Qq3wUjIJRMApGFgAAsRRKH3LIG4kAAAAASUVORK5CYII=","orcid":"https://orcid.org/0000-0002-1334-6292","institution":"Shanghai Jiao Tong University School of Medicine","correspondingAuthor":true,"prefix":"","firstName":"Shaoyong","middleName":"","lastName":"Lu","suffix":""},{"id":423443203,"identity":"4d03e071-55cd-4613-8375-564faa468518","order_by":1,"name":"Xinyi Li","email":"","orcid":"","institution":"Shanghai Jiao Tong University School of 
Medicine","correspondingAuthor":false,"prefix":"","firstName":"Xinyi","middleName":"","lastName":"Li","suffix":""},{"id":423443204,"identity":"1430449e-8fdc-4621-9e6b-48a11035e9a9","order_by":2,"name":"Jiacheng Wei","email":"","orcid":"","institution":"Shanghai Jiao Tong University","correspondingAuthor":false,"prefix":"","firstName":"Jiacheng","middleName":"","lastName":"Wei","suffix":""},{"id":423443205,"identity":"ca6ed215-14c4-4065-87c4-1496721f2621","order_by":3,"name":"Feiying Chen","email":"","orcid":"","institution":"Shanghai Jiao Tong University School of Medicine","correspondingAuthor":false,"prefix":"","firstName":"Feiying","middleName":"","lastName":"Chen","suffix":""},{"id":423443206,"identity":"21f57caf-9e2a-4015-be20-259442906fb3","order_by":4,"name":"Mingyu Li","email":"","orcid":"","institution":"Shanghai Jiao Tong University School of Medicine","correspondingAuthor":false,"prefix":"","firstName":"Mingyu","middleName":"","lastName":"Li","suffix":""},{"id":423443207,"identity":"dc02bc2d-34d2-4e23-abe1-df42cffbea15","order_by":5,"name":"Ning Liu","email":"","orcid":"","institution":"Ningxia Medical University","correspondingAuthor":false,"prefix":"","firstName":"Ning","middleName":"","lastName":"Liu","suffix":""},{"id":423443208,"identity":"7988692f-75f7-4734-8112-f14ce615fba0","order_by":6,"name":"Jian Zhang","email":"","orcid":"https://orcid.org/0000-0002-6558-791X","institution":"Shanghai Jiao Tong University School of Medicine","correspondingAuthor":false,"prefix":"","firstName":"Jian","middleName":"","lastName":"Zhang","suffix":""}],"badges":[],"createdAt":"2025-02-13 00:10:21","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-6018412/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-6018412/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":77697739,"identity":"b3c1f284-8411-45de-acfb-4c58d685098c","added_by":"auto","created_at":"2025-03-04 
10:46:15","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":1337875,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eStructure overview of SpCas9 RNP in its pre-catalytic/catalytic state\u003c/strong\u003e. (\u003cstrong\u003eA)\u003c/strong\u003e Domain architecture of SpCas9 endonuclease. Structural overview of SpCas9 RNP in pre-catalytic (inactive) (\u003cstrong\u003eB)\u003c/strong\u003e and catalytic (active) (\u003cstrong\u003eC)\u003c/strong\u003e state. SpCas9 is shown in molecular surface (left), with each domain color-coded as in Fig. 1A. HNH domain and L1, L2 linker are highlighted with cartoon representation in the middle, followed by a zoom-in view of L1 and L2 linker on the right. sgRNA (hotpink) and tDNA (purple) are shown as cartoon. The nucleotide sequence of sgRNA (magenta) and DNA (purple) is shown in (\u003cstrong\u003eD)\u003c/strong\u003e. Cyan arrow indicates the site of cleavage, and PAM is highlighted with red rectangle.\u003c/p\u003e","description":"","filename":"floatimage1.png","url":"https://assets-eu.researchsquare.com/files/rs-6018412/v1/551a49c8ad539a3b27cf9cf0.png"},{"id":77698820,"identity":"3e70c82d-5506-4ff9-96af-72973cf63fe7","added_by":"auto","created_at":"2025-03-04 10:54:14","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":615575,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eStepwise activation of SpCas9 RNP unveiled by Markov state model.\u003c/strong\u003e (\u003cstrong\u003eA)\u003c/strong\u003e Free-energy landscape (FEL) of the first two time-lagged independent components (tIC1 and tIC2). Relative energy level is color-coded by the colormap on the right. Colored round disks denoted the location of representative structures from identified metastable states in Fig. 2B. 
(\u003cstrong\u003eB)\u003c/strong\u003e Four metastable states \u003cem\u003eS0\u003c/em\u003e, \u003cem\u003eMS1\u003c/em\u003e, \u003cem\u003eMS2\u003c/em\u003e, \u003cem\u003eS1\u003c/em\u003e on the FEL of two distance parameters d\u003csub\u003eHNH-N\u003c/sub\u003e and d\u003csub\u003eREC2-N\u003c/sub\u003e. Percentage of each metastable state is shown next to the corresponding colormap. (\u003cstrong\u003eC)\u003c/strong\u003e Transition path theory committor map representing the forward committor probability for transition from \u003cem\u003eS0\u003c/em\u003e to \u003cem\u003eS1\u003c/em\u003e. Committor value is color-coded by the colormap shown on the right. Revealed a sequential pathway from \u003cem\u003eS0\u003c/em\u003e → \u003cem\u003eMS1\u003c/em\u003e → \u003cem\u003eMS2\u003c/em\u003e → \u003cem\u003eS1.\u003c/em\u003e (\u003cstrong\u003eD)\u003c/strong\u003e An overview of SpCas9 RNP stepwise activation following\u003cem\u003e S0\u003c/em\u003e → \u003cem\u003eMS1\u003c/em\u003e → \u003cem\u003eMS2\u003c/em\u003e → \u003cem\u003eS1\u003c/em\u003e transition. Representative structures were extracted from each metastable state for demonstration. Mean first passage time (MFPT) between metastable structures indicated transition timescale. d\u003csub\u003eHNH-N\u003c/sub\u003e and d\u003csub\u003eREC2-N\u003c/sub\u003e are measured in each structure. Domains are color-coded as in \u003cstrong\u003eFig. 
1\u003c/strong\u003e.\u003c/p\u003e","description":"","filename":"floatimage2.png","url":"https://assets-eu.researchsquare.com/files/rs-6018412/v1/fe08f29211d0247d01e571ba.png"},{"id":77698822,"identity":"fd8c6175-0074-4b4f-8e04-fe6bbd490a62","added_by":"auto","created_at":"2025-03-04 10:54:15","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":2270761,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eSpCas9 RNP stepwise activation model.\u003c/strong\u003eSpCas9 RNP catalytic activation follows transition pathway from metastable state \u003cem\u003eS0 \u003c/em\u003e(\u003cstrong\u003eA)\u003c/strong\u003e, to \u003cem\u003eMS1\u003c/em\u003e\u003cstrong\u003e (B)\u003c/strong\u003e, then to \u003cem\u003eMS2\u003c/em\u003e\u003cstrong\u003e (C)\u003c/strong\u003e, finally reaching \u003cem\u003eS1\u003c/em\u003e\u003cstrong\u003e (D)\u003c/strong\u003e. In each panel, upper left graph shows the secondary structures of REC2, L1 linker, HNH domain and L2 linker (from left to right) from representative trajectories in the corresponding metastable ensemble. Secondary structures are encoded by different colors as the colormap shows. The upper middle graph shows the major interactions between REC2 and HNH domain. The violin graph next to it calculates the distances between bonding residues in REC2 and HNH domain in representative trajectories from the corresponding metastable ensemble. The background of the violin graph is colored by the prevalence of each interaction in different metastable states: grey for \u003cem\u003eS0\u003c/em\u003e, green for \u003cem\u003eMS1\u003c/em\u003e, blue for \u003cem\u003eMS2\u003c/em\u003e and pink for \u003cem\u003eS1\u003c/em\u003e. The lower middle graph shows the bonds within HNH domain, between the N-terminal 3N and C-terminal 2C helical regions. Catalytic residue H840 is shown as sphere and highlighted with red arrow. 
The distance between the implicated bonding residues within HNH domain is also shown as violin plot in lower right panel. The lower left is a schematic “\u003cem\u003egear”\u003c/em\u003e (3N) and \u003cem\u003e“wedge”\u003c/em\u003e (2C) model summarizing the metastable state, where the position of “\u003cem\u003egear”\u003c/em\u003e (3N), manipulated by contraction of its adjacent L1 linker, orientates \u003cem\u003e“wedge”\u003c/em\u003e(2C) to determine HNH domain relocation. Structural superimposition between \u003cem\u003eS0\u003c/em\u003eand \u003cem\u003eMS1\u003c/em\u003e\u003cstrong\u003e (E)\u003c/strong\u003e, \u003cem\u003eMS1\u003c/em\u003e and \u003cem\u003eMS2\u003c/em\u003e\u003cstrong\u003e (F)\u003c/strong\u003e, \u003cem\u003eMS2\u003c/em\u003e and \u003cem\u003eS1\u003c/em\u003e (\u003cstrong\u003eG)\u003c/strong\u003e representative structures to show the stepwise conformational change, with \u003cem\u003eS0\u003c/em\u003e, \u003cem\u003eMS1\u003c/em\u003e and \u003cem\u003eMS2\u003c/em\u003e rendering semi-transparent to indicate the starting structure of SpCas9 RNP during each transition process. 
The structural differences suggested that HNH domain first moved horizontally during \u003cem\u003eS0 -\u0026gt; MS1\u003c/em\u003e and \u003cem\u003eMS1 -\u0026gt; MS2\u003c/em\u003etransition, then shifted vertically \u003cem\u003eMS2 -\u0026gt; S1\u003c/em\u003e to approach target.\u003c/p\u003e","description":"","filename":"floatimage3.png","url":"https://assets-eu.researchsquare.com/files/rs-6018412/v1/35d9a7ec8a433f32520ea811.png"},{"id":77698821,"identity":"d9f20094-e8d9-4aea-8986-901d19db6119","added_by":"auto","created_at":"2025-03-04 10:54:15","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":779060,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eCorrelated motion of REC2, L1, HNH and L2 domains during catalytic activation.\u003c/strong\u003e Inter-residue (upper triangle) and inter-domain (lower triangle) correlation are calculated with GCCM based on representative trajectories from metastable state \u003cem\u003eS0\u003c/em\u003e\u003cstrong\u003e (A)\u003c/strong\u003e, \u003cem\u003eMS1\u003c/em\u003e\u003cstrong\u003e(B)\u003c/strong\u003e, \u003cem\u003eMS2\u003c/em\u003e (\u003cstrong\u003eC)\u003c/strong\u003e and \u003cem\u003eS1\u003c/em\u003e\u003cstrong\u003e (D)\u003c/strong\u003e. Colormap for inter-residue and inter-domain correlation is shown on the right. 
Generalized correlation between all the residues/domains is shown in \u003cstrong\u003eSupplementary Figure 8\u003c/strong\u003e.\u003c/p\u003e","description":"","filename":"floatimage4.png","url":"https://assets-eu.researchsquare.com/files/rs-6018412/v1/f79360ec1f434fec505d0457.png"},{"id":77697737,"identity":"7c27ea76-2361-45d5-bd1a-171aa6092258","added_by":"auto","created_at":"2025-03-04 10:46:15","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":668907,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eExperimental validation of the SpCas9 RNP stepwise activation model.\u003c/strong\u003e (\u003cstrong\u003eA) \u003c/strong\u003eSchematic representation of the fluorescence-based system for evaluating SpCas9 RNP activation in HEK293T cells. (\u003cstrong\u003eB)\u003c/strong\u003e Flowchart of the \u003cem\u003ein vitro\u003c/em\u003e SpCas9 cleavage reaction. (\u003cstrong\u003eC)\u003c/strong\u003e Summary of REC2 and L2 linker mutants used in the current study. Results for two additional L2 linker mutants, SpCas9\u003csup\u003eL908A\u003c/sup\u003e and SpCas9\u003csup\u003eL909A\u003c/sup\u003e, are shown in \u003cstrong\u003eSupplementary Figure 9\u003c/strong\u003e. (\u003cstrong\u003eD)\u003c/strong\u003e FACS dot plot showing gates for FITC (x-axis) and SSC (y-axis) for HEK293T cells co-transfected with wildtype/mutant SpCas9, sgRNA and GFxFP reporter plasmid. The GFP-positive group is highlighted with a magenta rectangle, and the percentage of this group is indicated above the rectangle. The GFP-positive group is further divided into GFP-low and GFP-high subgroups by black rectangles, and the percentages of the subgroups are indicated below. 
(\u003cstrong\u003eE)\u003c/strong\u003e Quantification of GFP-positive population in HEK293T cells expressing WT SpCas9, SpCas9\u003csup\u003eFF\u003c/sup\u003e, SpCas9\u003csup\u003eWW\u003c/sup\u003e, SpCas9\u003csup\u003eAA\u003c/sup\u003e, SpCas9\u003csup\u003etrunc\u003c/sup\u003e. (\u003cstrong\u003eF)\u003c/strong\u003e Gel electrophoresis of tDNA substrate from \u003cem\u003ein vitro\u003c/em\u003e SpCas9 cleavage experiments after 0, 1, 5, 15, 30, 60, 120 min reaction. The top band is the uncleaved substrate, and the lower two bands are cleaved products. (\u003cstrong\u003eG)\u003c/strong\u003e Percentage of cleaved products with respect to initial amount at different time. Values are mean (three biologically independent experiments) ± S.E.M. Significance value is calculated with unpaired two-tail t-test. * p \u0026lt; 0.05, ** p \u0026lt; 0.01, *** p \u0026lt; 0.001, **** p \u0026lt; 0.0001.\u003c/p\u003e","description":"","filename":"floatimage5.png","url":"https://assets-eu.researchsquare.com/files/rs-6018412/v1/c8f895be1cd098be82fe5106.png"},{"id":77697736,"identity":"e3631ea3-94df-49ef-bbb4-666eccc48a53","added_by":"auto","created_at":"2025-03-04 10:46:15","extension":"png","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":236064,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eSchematic summary of the “\u003c/strong\u003e\u003cem\u003e\u003cstrong\u003egear (3N)-and-wedge (2C)\u003c/strong\u003e\u003c/em\u003e\u003cstrong\u003e” model for stepwise catalytic activation of SpCas9 RNP.\u003c/strong\u003eSpCas9 RNP activation is driven by the coordinated folding of the L1 linker and unfolding of the L2 linker, which jointly reposition and rotate the HNH domain to approach target DNA. Concurrently, the REC2 domain shifts away from the sgRNA:tDNA heteroduplex, creating space to accommodate the incoming HNH domain. 
Additionally, the departing REC2 domain forms transient stabilizing interactions with HNH domain that support the transitional states along the activation pathway.\u003c/p\u003e","description":"","filename":"floatimage6.png","url":"https://assets-eu.researchsquare.com/files/rs-6018412/v1/5836ed5ec1e02d19a8869437.png"},{"id":97673430,"identity":"49414f30-a0c9-4b52-85f3-b833cfd4bdad","added_by":"auto","created_at":"2025-12-08 09:40:10","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":7551967,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-6018412/v1/8ec879eb-4b06-4ffe-bdb8-b37b1ad48c58.pdf"},{"id":77697744,"identity":"61b0896d-e176-4445-aa20-36d34d4dee55","added_by":"auto","created_at":"2025-03-04 10:46:15","extension":"docx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":16048214,"visible":true,"origin":"","legend":"","description":"","filename":"Supplementarymaterial20250212.docx","url":"https://assets-eu.researchsquare.com/files/rs-6018412/v1/dedf0a065d4ba5b62466a6d4.docx"}],"financialInterests":"There is \u003cb\u003eNO\u003c/b\u003e Competing Interest.","formattedTitle":"Untangling the Molecular Mechanism of SpCas9 Catalytic Activation: A Gear-and-Wedge Fitting Model","fulltext":[{"header":"Introduction","content":"\u003cp\u003eClustered regularly interspaced short palindromic repeats (CRISPR-Cas) systems, derived from adaptive immune mechanisms in bacteria, have revolutionized genome editing by enabling precise target DNA cleavage guided by programmable RNA\u003csup\u003e\u003cspan additionalcitationids=\"CR2 CR3 CR4\" citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e\u003c/sup\u003e. 
Currently, owing to its high editing efficiency and adaptability, \u003cem\u003eStreptococcus pyogenes\u003c/em\u003e Cas9 (SpCas9) is one of the most widely employed endonucleases for CRISPR-Cas-relevant genome-editing applications\u003csup\u003e\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e,\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e,\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e\u003c/sup\u003e. Crystallographic or cryo-electron microscopy (cryo-EM) studies have revealed a bilobed structure adopted by SpCas9 (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e), consisting of a recognition (REC) lobe and a nuclease (NUC) lobe\u003csup\u003e\u003cspan additionalcitationids=\"CR9\" citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e\u003c/sup\u003e. The REC lobe encompasses three α-helical domains, REC1, REC2 and REC3, all of which have been suggested to play an important role in mediating nucleotide interactions\u003csup\u003e\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e\u003c/sup\u003e. The NUC lobe harbors two nuclease domains, HNH and RuvC, followed by a C-terminal domain (CTD). 
The HNH and RuvC domains are connected by L1 and L2 linkers and are responsible for the concerted cleavage of target and non-target DNA strand\u003csup\u003e\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e,\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e,\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e\u003c/sup\u003e, respectively.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe process of the SpCas9 endonuclease action has been elaborated in great detail\u003csup\u003e\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e,\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e\u003c/sup\u003e. In brief, SpCas9 first complexes with a single-guide RNA (sgRNA) containing the CRISPR sequence for the specific recognition of target DNA (tDNA) with complementarity. Recognition of the species-specific protospacer adjacent motif (PAM) on the non-target strand initiates this process, followed by the invasion of sgRNA into tDNA and the formation of an sgRNA:tDNA heteroduplex. To this stage, the ternary SpCas9:sgRNA:tDNA ribonucleoprotein complex (SpCas9 RNP) has gathered all necessary elements for cleavage, except that the HNH nuclease domain locates more than approximately 40 \u0026Aring; away from its target nucleotide, adopting a pre-catalytic/inactive conformation\u003csup\u003e\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e,\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e,\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e\u003c/sup\u003e (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eB). 
For the cleavage reaction to occur, the H840 catalytic residue within the HNH domain must undergo a massive displacement of approximately 40 \u0026Aring; and rotation of approximately 110\u0026deg; to approach its target phosphate (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eC). Much effort has been devoted to understanding the process of catalytic activation of the SpCas9 RNP, focusing on elucidating how the HNH domain accomplishes such a dramatic relocation. Crystallographic and cryo-EM studies have attempted to capture the structure of SpCas9 RNP at various stages of activation\u003csup\u003e\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e,\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e,\u003cspan additionalcitationids=\"CR16\" citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e\u003c/sup\u003e. However, owing to the highly dynamic and transient nature of the activation process, the majority of the obtained structures represent the system only at its pre-catalytic or catalytic endpoints. The critical molecular basis of HNH domain relocation towards tDNA within SpCas9 RNP during catalytic activation remains unknown.\u003c/p\u003e \u003cp\u003eIn this study, we combined extensive unbiased molecular dynamics (MD) simulations with an \u003cem\u003ein vitro\u003c/em\u003e DNA cleavage assay and a green fluorescent protein reporter system in mammalian cells to disclose the conformational dynamics underlying SpCas9 RNP catalytic activation. The nudged elastic band method was used to generate intermediate conformations between the solved inactive and active states\u003csup\u003e\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e\u003c/sup\u003e, followed by extensive MD simulations reaching an unprecedented cumulative timescale of 160 \u0026micro;s. 
A Markov state model was constructed to elucidate the stepwise structural transition pattern of the SpCas9 RNP, where HNH domain relocation was predominantly driven by the intrinsic conformational plasticity of its adjacent L1 and L2 linkers. During activation, the L1 linker folds into a highly ordered helical structure, whereas the L2 linker disassembles into a disordered loop. The two concurrent structural changes drive the HNH domain to redock approximately 40 \u0026Aring; from L2-proximal to L1-proximal region. Additionally, we revealed the structural basis of HNH domain rotation during activation, identifying it as a result of rotational forces exerted on the HNH N-terminal α-helical units by the contraction of L1 linker. This rotation propagates to the HNH C-terminal α-helical units through extensive interactions, resembling the mechanism of a rotating gear repositioning its fitted wedge. In addition to the primary driving force of the L1 and L2 linkers, the REC2 domain was found to relocate cooperatively with the HNH domain, enabling formation of distinctive interactions with the HNH domain to stabilize the transitional structures during activation. Furthermore, we designed a fluorescence-based system to evaluate SpCas9 RNP activation under various cellular conditions. Together with \u003cem\u003ein vitro\u003c/em\u003e DNA cleavage assay, we experimentally verified the contribution of the L1 and L2 linker structural flexibility and the REC2 domain to SpCas9 RNP catalytic activation, providing further support for our activation model.\u003c/p\u003e \u003cp\u003eTaken together, by combining \u003cem\u003ein silico, in vitro\u003c/em\u003e, and intracellular fluorescence-based reporter systems, our work illuminated the detailed atomic mechanism for the catalytic activation of SpCas9 RNP and highlighted the contribution of the L1, L2 linker, and REC2 domains in the process. 
The resolved mechanism of SpCas9 RNP activation could advance our ability to rationally design SpCas9-based genome editing tools with enhanced functionality\u003csup\u003e\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e,\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e,\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e,\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e"},{"header":"Results","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003eExtensive Unbiased MD Simulations Reveal Activation Pattern of SpCas9 RNP at Millisecond Timescale\u003c/h2\u003e \u003cp\u003eTo elucidate the SpCas9 RNP activation pathway, we employed the nudged elastic band (NEB) method to generate 30 intermediate conformations bridging the experimentally resolved inactive/pre-catalytic (PDB ID: 6O0Z) and active/catalytic (PDB ID: 6O0Y) states\u003csup\u003e\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e,\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e\u003c/sup\u003e. The two solved cryo-EM structures, which differed significantly in their domain organization, only represented static end snapshots during the activation process of SpCas9 RNP. The 30 intermediate states produced by NEB method formed a hypothetical transition pathway connecting the two endpoint structures during SpCas9 RNP activation. These NEB-generated snapshots revealed progressive structural reorganization towards the active state, as exemplified by the relocation of the HNH domain towards its cleavage site on the tDNA strand (\u003cb\u003eSupplementary Fig.\u0026nbsp;1\u003c/b\u003e). 
To understand the molecular basis underlying SpCas9 RNP activation and to identify targetable transition states in the process, each of the 32 structures (the two cryo-EM endpoints and the 30 NEB intermediates) was subjected to 10 independent 500 ns unbiased MD simulations, giving a cumulative simulation time of 160 \u0026micro;s (32 \u0026times; 10 \u0026times; 500 ns; MD simulation details provided in \u003cb\u003eMethods\u003c/b\u003e). According to previous studies\u003csup\u003e\u003cspan additionalcitationids=\"CR22 CR23 CR24 CR25 CR26\" citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e\u003c/sup\u003e, the timescale reached in the current study is sufficient to investigate SpCas9 RNP activation kinetics.\u003c/p\u003e \u003cp\u003eTo capture the essential dynamics, we first performed time-lagged independent component analysis (TICA) on the obtained trajectories\u003csup\u003e\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e\u003c/sup\u003e. The first two time-lagged independent components (tIC1 and tIC2), which captured the principal kinetic modes, were used to generate a TICA free-energy landscape (FEL) to provide preliminary insights into the activation dynamics of SpCas9 RNP (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eA). The inactive and active starting structures were located within the two major energy basins at opposite ends of the TICA landscape (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eA, \u003cb\u003eSupplementary Fig.\u0026nbsp;2A, B\u003c/b\u003e). 
The TICA FEL revealed a continuous conformational transition pathway connecting the two energy basins, confirming that our simulations efficiently recapitulated the activation pathway.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eTo further investigate structural intermediates with biological relevance during SpCas9 RNP activation, two distance parameters were selected: the distance between the catalytic H840 residue in the HNH nuclease domain and the tDNA cleavage site (d\u003csub\u003eHNH\u0026minus;N\u003c/sub\u003e) and the distance between the REC2 domain and the sgRNA:tDNA heteroduplex (d\u003csub\u003eREC2\u0026minus;N\u003c/sub\u003e). By comparing the two distances in the cryo-EM structures of SpCas9 RNP in its pre-catalytic and catalytic states, we observed an approximately 6 \u0026Aring; displacement of REC2 away from the sgRNA:tDNA duplex and an approximately 40 \u0026Aring; movement of the HNH domain towards the cleavage site (\u003cb\u003eSupplementary Fig.\u0026nbsp;2C\u003c/b\u003e). Various crystal or cryo-EM structures have captured the REC2 and HNH domains at distinct spatial positions\u003csup\u003e\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e,\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e,\u003cspan additionalcitationids=\"CR16\" citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e\u003c/sup\u003e, further confirming the importance of the conformational dynamics of these domains during activation. 
Additionally, the FEL of the two distance parameters exhibited a continuous distribution of states with distinct energy basins, indicating the existence of potential metastable states along the activation pathway (\u003cb\u003eSupplementary Fig.\u0026nbsp;3A, B\u003c/b\u003e).\u003c/p\u003e \u003cp\u003eTo construct a kinetic model of structural transitions during SpCas9 RNP activation, we first discretized the two distance parameters calculated from the full 160 \u0026micro;s of simulations into 600 clusters using the K-means clustering algorithm\u003csup\u003e\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e\u003c/sup\u003e. The number of clusters chosen here maximized the kinetic variance within the chosen features, as estimated by the variational approach for Markov processes (VAMP2) score\u003csup\u003e\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e\u003c/sup\u003e (\u003cb\u003eSupplementary Fig.\u0026nbsp;3C\u003c/b\u003e). These clusters were treated as discrete states within the system. As the system evolved, transitions between clusters over time formed the basis for establishing a Markov state model (MSM). We constructed an MSM for SpCas9 RNP activation with a lag time of 10 ns, and rigorously tested its Markovianity (\u003cb\u003eSupplementary Fig.\u0026nbsp;3D, E\u003c/b\u003e). Perron Cluster Cluster Analysis (PCCA)\u003csup\u003e\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e\u003c/sup\u003e identified the presence of four metastable states (denoted as \u003cem\u003eS0\u003c/em\u003e, \u003cem\u003eMS1\u003c/em\u003e, \u003cem\u003eMS2\u003c/em\u003e, and \u003cem\u003eS1\u003c/em\u003e; Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eB). \u003cem\u003eS0\u003c/em\u003e and \u003cem\u003eS1\u003c/em\u003e encompassed the majority of the simulated samples, occupying 21.72% and 52.83% of the conformational space, respectively. 
\u003cem\u003eS0\u003c/em\u003e and \u003cem\u003eS1\u003c/em\u003e also represented kinetically stable conformational ensembles of catalytically inactive and active SpCas9 RNP, as the majority of the solved structures fell into the two states (\u003cb\u003eSupplementary Fig.\u0026nbsp;3F, G\u003c/b\u003e). The two minor metastable states, \u003cem\u003eMS1\u003c/em\u003e and \u003cem\u003eMS2\u003c/em\u003e, which represented 15.26% and 10.20% of the conformational space, respectively, constituted the key intermediate states during SpCas9 RNP activation. Based on the established MSM, transition path theory\u003csup\u003e\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e\u003c/sup\u003e disclosed that SpCas9 RNP catalytic activation followed the sequential pathway \u003cem\u003eS0\u003c/em\u003e \u0026rarr; \u003cem\u003eMS1\u003c/em\u003e \u0026rarr; \u003cem\u003eMS2\u003c/em\u003e \u0026rarr; \u003cem\u003eS1\u003c/em\u003e (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eC). The timescale of activation was further estimated by calculating the mean first passage time (MFPT) between the metastable states \u003cem\u003eS0\u003c/em\u003e and \u003cem\u003eS1\u003c/em\u003e along the sequential transition path. The MFPT indicated a timescale of ~\u0026thinsp;4.5 milliseconds for SpCas9 RNP catalytic activation (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eD). 
The millisecond timescale of SpCas9 catalytic activation calculated using our model aligned well with previous MD simulations and experimental observations\u003csup\u003e\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e,\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e,\u003cspan additionalcitationids=\"CR34\" citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eTaken together, our extensive 160 \u0026micro;s unbiased MD simulations enabled construction of an MSM based on two critical distance parameters. This model effectively recapitulated the kinetics of SpCas9 RNP activation, revealing stepwise activation dynamics that prime the complex for tDNA cleavage.\u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003eCoordinated Structural Dynamics of L1 and L2 Linkers Modulate HNH Domain Relocation\u003c/h3\u003e\n\u003cp\u003eTo further understand the key conformational rearrangement events during stepwise activation, representative structures were extracted from the metastable ensembles (\u003cb\u003eSupplementary Fig.\u0026nbsp;4A-D\u003c/b\u003e). Consistent with previous studies\u003csup\u003e\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e,\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e\u003c/sup\u003e, our simulations also suggested that the overall bilobed architecture of SpCas9 was preserved during catalytic activation, with prominent structural dynamics observed in the REC2 and HNH domains (\u003cb\u003eSupplementary Fig.\u0026nbsp;4E-G\u003c/b\u003e). The HNH domain of SpCas9 consists of three N-terminal helices (referred as 3N, including helix αN1, αN2, and αN3) and two C-terminal helices (referred as 2C, including helix αC1 and αC2) (\u003cb\u003eSupplementary Fig.\u0026nbsp;5\u003c/b\u003e). 
3N and 2C are connected by a long, disordered loop harboring the catalytic residue H840. The 3N helical region directly connects to the L1 linker through αN1, while the 2C region connects to the L2 linker via αC1. Throughout our simulations, 2C helices remained largely stable, whereas the 3N helical regions, especially αN2, underwent noticeable secondary structural rearrangements during activation (\u003cb\u003eSupplementary Fig.\u0026nbsp;6C\u003c/b\u003e).\u003c/p\u003e \u003cp\u003eIn the metastable state \u003cem\u003eS0\u003c/em\u003e (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eA, \u003cb\u003eSupplementary Fig.\u0026nbsp;4A\u003c/b\u003e), the HNH domain was located near the PAM-distal region of the sgRNA:tDNA heteroduplex, away from the scissile phosphate, rendering SpCas9 RNP catalytically inactive. The molecular basis underlying such HNH domain location is the highly ordered helical structure of the L2 linker (\u003cb\u003eSupplementary Fig.\u0026nbsp;6B\u003c/b\u003e), which anchors the HNH domain in the proximity of RuvC domain through the 2C helical region. Residue A889 and Y882 of αC2 from the tethered 2C helices interact with R783 of αN1 and Y815 of αN3 from the HNH N-terminal 3N region. Under these circumstances, the disordered loop sandwiched between the N- and C-terminal helical regions is positioned away from the catalytic core. Additionally, K878 of αC2 interacts with M822 and D825 within the disordered loop, effectively concealing the catalytic H840 residue within the NUC lobe.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eActivation of SpCas9 RNP is initiated as \u003cem\u003eS0\u003c/em\u003e transitions towards \u003cem\u003eMS1\u003c/em\u003e. 
Key events in this process included the partial folding of the L1 linker accompanied by the unfolding of the L2 linker (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eB, \u003cb\u003eSupplementary Fig.\u0026nbsp;6A, B\u003c/b\u003e). With the collapse of the L2 linker helical conformation, the tethered HNH domain gained increased flexibility, which was the basis for its subsequent translocation. The partially folded L1 linker directly exerts a pulling force on its adjacent helix αN1 in 3N helical region, resulting in ~\u0026thinsp;40\u0026deg; rotation of 3N while dragging HNH domain towards the PAM-proximal region of the heteroduplex. The rotation of 3N tuple by L1 linker, together with coordinated unfolding of L2 linker, further propagates to the whole HNH domain through the emerging interaction between R783 of αN1 with K890 of αC2. The disordered loop between two helical regions swings\u0026thinsp;~\u0026thinsp;30\u0026deg; anticlockwise, enabling novel interactions with αC2. Together, these events contributed to a substantial\u0026thinsp;~\u0026thinsp;16 \u0026Aring; displacement (approximately 22 \u0026Aring; for H840) and approximately 49\u0026deg; rotation of the HNH domain (around the SpCas9 RNP central axis, \u003cb\u003eSupplementary Fig.\u0026nbsp;4B, E, 7A, B\u003c/b\u003e).\u003c/p\u003e \u003cp\u003eThe subsequent transition from \u003cem\u003eMS1\u003c/em\u003e to \u003cem\u003eMS2\u003c/em\u003e involved further concerted folding of the L1 linker and unfolding of the L2 linker (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eC, F, \u003cb\u003eand Supplementary Fig.\u0026nbsp;6A, B\u003c/b\u003e). Continued contraction of the L1 linker keeps spinning the 3N helical tuple while pulling it towards the tDNA, which further rotates 2C through structural complementarity. 
Compared with \u003cem\u003eS0\u003c/em\u003e, the 3N helical region undergoes an approximately 90\u0026deg; rotation, which further repositions 2C. A new interaction forms between Y882 of αC2 and N818 of the disordered region, facilitating coordinated movement of 3N and 2C. This results in a further approximately 17 \u0026Aring; movement (approximately 18 \u0026Aring; for H840) and approximately 40\u0026deg; rotation of the HNH domain (around the SpCas9 RNP central axis), relocating the HNH domain to the center of the SpCas9 RNP (\u003cb\u003eSupplementary Fig.\u0026nbsp;4C, 4F, 7A, 7C\u003c/b\u003e).\u003c/p\u003e \u003cp\u003eDuring the transition from metastable state \u003cem\u003eS0\u003c/em\u003e to \u003cem\u003eMS1\u003c/em\u003e and then to \u003cem\u003eMS2\u003c/em\u003e, the HNH domain mainly undergoes horizontal displacement to reach the center of the SpCas9 RNP. However, the HNH domain is still located\u0026thinsp;~\u0026thinsp;10 \u0026Aring; away from its target in metastable state \u003cem\u003eMS2\u003c/em\u003e. The catalytically active metastable state \u003cem\u003eS1\u003c/em\u003e is reached when L1 adopts a fully helical structure that anchors the HNH domain, whose flexibility is conferred by a completely extended L2 linker (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eD, E). The highly ordered L1 linker further spins 3N approximately 30\u0026deg;, shifting 2C to its lower position and thus positioning the H840-harboring disordered loop even closer to the tDNA. This conformation is further \u0026ldquo;locked\u0026rdquo; by the interaction between A889 and K890 of αC2 and R778 of the L1 linker. 
Notably, apart from the approximately 27\u0026deg; rotation (around the SpCas9 RNP central axis), the transition from \u003cem\u003eMS2\u003c/em\u003e to \u003cem\u003eS1\u003c/em\u003e mainly involved an approximately 11 \u0026Aring; para-axis upward shift of the HNH domain (also approximately 11 \u0026Aring; for H840) to dock onto the sgRNA:tDNA heteroduplex (\u003cb\u003eSupplementary Fig.\u0026nbsp;7D\u003c/b\u003e).\u003c/p\u003e \u003cp\u003eTaken together, through analyzing the experimentally inaccessible transition states, we found that the HNH domain underwent synchronous relocation and rotation during activation. Compared to the catalytically incompetent metastable state \u003cem\u003eS0\u003c/em\u003e, the HNH N-terminal helical region 3N undergoes approximately 120\u0026deg; counterclockwise rotation (\u003cb\u003eSupplementary Fig.\u0026nbsp;7A\u003c/b\u003e), which drives approximately 27 \u0026Aring; displacement (approximately 37 \u0026Aring; for the H840 catalytic residue), accompanied by approximately 104\u0026deg; rotation (around the SpCas9 RNP central axis) of the HNH domain to approach its target nucleotide to reach the catalytically competent state \u003cem\u003eS1\u003c/em\u003e. Interestingly, our model also suggested that stepwise relocation of the HNH domain begins with a horizontal motion to proximate the central canal (\u003cb\u003eSupplementary Fig.\u0026nbsp;4E, F\u003c/b\u003e), followed by a shift upward to approach the tDNA (\u003cb\u003eSupplementary Fig.\u0026nbsp;4G\u003c/b\u003e). Moreover, our MD simulation-based study revealed a mechanism by which the cooperative structural dynamics of the L1 and L2 linkers drive relocation of the HNH domain. 
The contracting L1 linker rotates its adjacent HNH N-terminal helical region 3N, which, through structural complementarity, spins the HNH C-terminal helical region 2C, resembling the way a \u0026ldquo;\u003cem\u003egear\u0026rdquo;\u003c/em\u003e (3N) orientates its matching \u003cem\u003e\u0026ldquo;wedge\u0026rdquo;\u003c/em\u003e (2C). Meanwhile, the disassembling L2 linker releases its restraint on 2C to enable its displacement, further permitting the H840-harboring disordered loop sandwiched between the two helical regions to relocate near the nucleotide substrate. Hence, the structural dynamics of the L1 and L2 linkers play a pivotal role in SpCas9 RNP catalytic activation.\u003c/p\u003e\n\u003ch3\u003eCoupled Motion between REC2 and HNH Domains Drives SpCas9 RNP Catalytic Activation\u003c/h3\u003e\n\u003cp\u003eThe REC lobe mainly facilitates the binding and recognition of nucleotide substrates; however, recent studies have suggested a role in mediating SpCas9 RNP catalytic activation. In line with experimental observations, in our model, as the SpCas9 RNP progresses towards activation along the \u003cem\u003eS0\u003c/em\u003e -\u0026gt; \u003cem\u003eMS1\u003c/em\u003e -\u0026gt; \u003cem\u003eMS2\u003c/em\u003e -\u0026gt; \u003cem\u003eS1\u003c/em\u003e transition path, the REC2 domain gradually moves away from the sgRNA:tDNA heteroduplex, thus providing space to accommodate the HNH domain (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eD). The pattern of motion correlation between the residues during catalytic activation was analyzed using a generalized cross-correlation matrix (GCCM) algorithm\u003csup\u003e\u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e\u003c/sup\u003e. 
In general, the inter-residue correlation decreased as the complex progressed towards the active state, and the highest level of correlation was found in the \u003cem\u003eS0\u003c/em\u003e metastable state (\u003cb\u003eSupplementary Fig.\u0026nbsp;8A, B\u003c/b\u003e), suggesting a tight information flow within the complex in the catalytically inactive state. The coordinated motion between the REC2 and HNH domains was retained throughout activation (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e), supporting the importance of the REC2 domain in regulating HNH domain relocation. As catalytic activation progressed, the correlated motion between the REC2 domain and the L2 linker gradually diminished, whereas that between REC2 and the L1 linker persisted. This observation aligns well with the folding of the L1 linker and the unfolding of the L2 linker during activation. To understand the essential structural events underlying the motional correlation between the REC2 and HNH domains, we looked more closely at the interfacing residues between the two domains (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eIn the inactive metastable state, \u003cem\u003eS0\u003c/em\u003e, the REC2 domain docked close to the central sgRNA:tDNA heteroduplex immediately upstream of the target nucleotide, thus blocking the HNH domain from accessing the tDNA. At this stage, E197 and E198 of REC2 interact with R780 and K782 of αN1 in the HNH domain N-terminal region 3N, stabilizing the HNH domain in its position distant from the cleavage site (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eA, \u003cb\u003eSupplementary Fig.\u0026nbsp;4A\u003c/b\u003e). As activation is initiated, the HNH domain begins to rotate due to the structural plasticity of the L1 and L2 linkers, thereby changing its interfacing residues with REC2. 
In the intermediate metastable conformation \u003cem\u003eMS1\u003c/em\u003e (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eB, \u003cb\u003eSupplementary Fig.\u0026nbsp;4B, E\u003c/b\u003e), HNH rotation results in displacement of αN1, weakening its interaction with the REC2 domain. The rotation of 3N brings αN2 close to REC2, allowing contacts to be established between E802 and Q805 of αN2 and R220 and S219 in the REC2 domain. With further contraction of the L1 linker, SpCas9 RNP transitions to metastable state \u003cem\u003eMS2\u003c/em\u003e (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eC, \u003cb\u003eSupplementary Fig.\u0026nbsp;4C, F\u003c/b\u003e), in which αN1 rotates away from the REC2 domain. In this state, a transient interaction is established between E223 of REC2 and K797 of αN2, contributing to stabilization of the HNH domain in a transitional metastable state. Ultimately, the HNH domain approaches the tDNA, which is enabled by the approximately 11 \u0026Aring; outward relocation of the REC2 domain away from the central heteroduplex (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eD). In the corresponding metastable ensemble \u003cem\u003eS1\u003c/em\u003e (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eD, \u003cb\u003eSupplementary Fig.\u0026nbsp;4D, G\u003c/b\u003e), R221 of REC2 interacts with E798 of αN2. 
The REC2 domain undergoes its most prominent outward shift while transitioning to metastable state \u003cem\u003eS1\u003c/em\u003e (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eD), and this displacement is well coordinated with the ~\u0026thinsp;11 \u0026Aring; vertical shift (parallel to the SpCas9 RNP central axis) of the HNH domain during the \u003cem\u003eMS2\u003c/em\u003e -\u0026gt; \u003cem\u003eS1\u003c/em\u003e transition to dock onto the target scissile phosphate.\u003c/p\u003e \u003cp\u003eInterestingly, αN2 provides the major interactions between the HNH and REC2 domains once the HNH domain starts to rotate through L1 linker contraction during catalytic activation. As αN2 is also the only component exhibiting prominent conformational flexibility within the HNH N-terminal region 3N (\u003cb\u003eSupplementary Fig.\u0026nbsp;6C\u003c/b\u003e), it is likely that disassembly of the αN2 helical structure into a flexible loop enables its versatile interaction with REC2 interfacing residues. These transient interactions can assist in stabilizing the HNH domain at different spatial positions during catalytic activation. In summary, together with the structural plasticity of the L1 and L2 linkers, the REC2 domain relocates away from the SpCas9 RNP center in a coordinated manner. Furthermore, the departing REC2 domain establishes different interactions with the HNH N-terminal helical regions, in which αN2 plays an important role, to dictate and guide the HNH domain to the cleavage site.\u003c/p\u003e \u003cp\u003e \u003cb\u003eIn vitro\u003c/b\u003e \u003cb\u003eand Intracellular Experimental Validation of SpCas9 RNP Stepwise Activation Model\u003c/b\u003e\u003c/p\u003e \u003cp\u003eBased on the MSM established with the unprecedented 160 \u0026micro;s of unbiased MD simulations, we disclosed the importance of the L1 and L2 linkers in driving HNH domain relocation, which was further dictated by the REC2 domain. 
Our results suggested that the synergistic folding and unfolding of the L1 and L2 linkers are key events that promote catalytic activation. The REC2 domain, located distant from the HNH domain on the other side of the heteroduplex, contributes to HNH domain relocation through cooperative motion and establishes distinctive interactions during stepwise activation.\u003c/p\u003e \u003cp\u003eTo further validate our proposed model, we designed a green fluorescent protein (GFP) reporter system in mammalian cells to evaluate the SpCas9 RNP activation capacity\u003csup\u003e\u003cspan additionalcitationids=\"CR38\" citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e\u003c/sup\u003e. Briefly, a reporter plasmid (GFxFP reporter plasmid) was designed, in which the cDNA encoding GFP was separated into GF and FP segments by a fragment that contained an early stop codon. The GFxFP reporter plasmid was co-transfected into HEK293T cells with a plasmid encoding the SpCas9 endonuclease and an sgRNA sequence. The sgRNA sequence guided SpCas9 cleavage in the GFxFP reporter plasmid. Based on the overlapping homology of the GF and FP segments, the cleaved GFxFP reporter plasmid restored the intact GFP cDNA via complementarity-directed single-strand annealing (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eA), thus enabling the expression of GFP. Therefore, the SpCas9 catalytic capacity could be inferred from the population of GFP-positive HEK293T cells through fluorescence-activated cell sorting (FACS). In addition to the cellular fluorescence-based reporter plasmid, we also reconstituted the SpCas9 RNP cleavage reaction \u003cem\u003ein vitro\u003c/em\u003e using synthesized double-stranded DNA oligos as a substrate\u003csup\u003e\u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e\u003c/sup\u003e. 
Activated SpCas9 RNP introduces a double-stranded break in the substrate DNA, which can be resolved by gel electrophoresis as cleaved and uncleaved nucleotide bands. Under these circumstances, the ratio between the cleaved and uncleaved nucleotide substrates directly reflects the catalytic activation potential of SpCas9 RNP (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eB).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eAccording to our model, the L2 linker restrained HNH domain at its helical conformation, and the structural flexibility of the L2 linker enables HNH domain relocation. G906 and G907 of the L2 linker were highly dynamic during the simulation. As the least bulky amino acids, G906 and G907 are expected to provide the L2 linker structural flexibility during the unfolding process. Therefore, we generated SpCas9 mutants in which the two glycines were substituted with bulky phenylalanine/tryptophan (SpCas9\u003csup\u003eFF\u003c/sup\u003e: G906F G907F, SpCas9\u003csup\u003eWW\u003c/sup\u003e: G906W G907W; Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eC) to decrease L2 linker flexibility. As expected, HEK293T cells transfected with SpCas9\u003csup\u003eFF\u003c/sup\u003e and SpCas9\u003csup\u003eWW\u003c/sup\u003e mutants exhibited a reduced GFP-positive population compared to the wild-type (WT) SpCas9. We also observed more uncleaved tDNA substrates in the SpCas9 cleavage assay using these L2 linker mutants (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eD-G). However, mutation of adjacent non-glycine residues (SpCas9\u003csup\u003eL908A\u003c/sup\u003e and SpCas9\u003csup\u003eL909A\u003c/sup\u003e) did not induce a noticeable change in SpCas9 RNP catalytic activity, further validating the importance of L2 linker flexibility in the activation process (\u003cb\u003eSupplementary Fig.\u0026nbsp;9\u003c/b\u003e). 
The cellular and \u003cem\u003ein vitro\u003c/em\u003e experiments confirmed that SpCas9 RNP activation was impaired when the flexibility of the L2 linker was restrained.\u003c/p\u003e \u003cp\u003eSimilarly, for the REC2 domain, we identified two loops, \u003csup\u003e196\u003c/sup\u003eFEENPIN\u003csup\u003e202\u003c/sup\u003e and \u003csup\u003e230\u003c/sup\u003ePGEKKN\u003csup\u003e235\u003c/sup\u003e, connecting adjacent regions that interface with the HNH domain during catalytic activation. Mutation (substitution of all residues of the two loops with alanine, SpCas9\u003csup\u003eAA\u003c/sup\u003e) and truncation (deletion of the two loops, SpCas9\u003csup\u003etrunc\u003c/sup\u003e) significantly inhibited SpCas9 RNP activation both \u003cem\u003ein vitro\u003c/em\u003e and in cellular conditions (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eD-G), confirming the importance of the REC2 domain in dictating the catalytic activation process of SpCas9 RNP.\u003c/p\u003e \u003cp\u003eIt is worth noting that mutations in the L2 linker and REC2 domain did not significantly change either the thermostability or secondary structure composition of the SpCas9 endonuclease (\u003cb\u003eSupplementary Fig.\u0026nbsp;10\u003c/b\u003e), indicating that the effects we observed resulted from the disrupted activation process of SpCas9 RNP \u003cem\u003eper se\u003c/em\u003e. 
In summary, using both \u003cem\u003ein vitro\u003c/em\u003e and cellular experimental approaches, we demonstrated the importance of the L2 linker structural plasticity and the REC2 domain in the catalytic activation of SpCas9 RNP, providing further support for our stepwise activation model of SpCas9 RNP.\u003c/p\u003e"},{"header":"Discussion","content":"\u003cp\u003eIn this work, we integrated MD simulations with both an \u003cem\u003ein vitro\u003c/em\u003e cleavage assay and a cellular fluorescence-based reporter to delineate how SpCas9 transitions from its pre-catalytic to its catalytic state. Through the construction of an MSM encompassing an unprecedented 160 \u0026micro;s of simulation data, we uncovered four metastable states (\u003cem\u003eS0\u003c/em\u003e, \u003cem\u003eMS1\u003c/em\u003e, \u003cem\u003eMS2\u003c/em\u003e, and \u003cem\u003eS1\u003c/em\u003e) that form a sequential activation pathway.\u003c/p\u003e \u003cp\u003eA central feature of this activation process is the coupled folding and unfolding dynamics of the L1 and L2 linkers, which coordinate the approximately 40 \u0026Aring; large-scale relocation of the HNH domain towards the tDNA strand. Our MD simulations show that L1 folds into a more ordered α-helix, exerting a pulling and rotational force on the HNH N-terminal helical region 3N. Concurrently, L2 undergoes helix disassembly, relieving its restraint on the HNH C-terminal helical region 2C, thereby facilitating the concerted domain reorganization required for catalysis. 
This \u0026ldquo;gear-and-wedge\u0026rdquo; coupling of 3N and 2C \u0026mdash; powered by the opposing folding states of L1 and L2 \u0026mdash; accounts for the striking amplitude of HNH domain motion, directing HNH domain to first undergo horizontal movement followed by vertical shift to precisely relocate the catalytic residue H840 at the target nucleotide (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eOur model underscores the critical role of the REC2 domain, which cooperates with the HNH domain during activation. Initially positioned in close proximity to the sgRNA:tDNA heteroduplex and sterically hindering the HNH domain from engaging the scissile phosphate, REC2 gradually moved outward in synchrony with the conformational changes in the HNH domain. Along this path, transient interactions between HNH αN2 and several REC2 loop residues stabilize intermediate metastable states \u003cem\u003eMS1\u003c/em\u003e and \u003cem\u003eMS2\u003c/em\u003e, ensuring a smooth, stepwise transition to the fully active \u003cem\u003eS1\u003c/em\u003e ensemble.\u003c/p\u003e \u003cp\u003eThe importance of these structural elements (L1, L2, and REC2) in regulating SpCas9 activation was further validated by complementary experiments. Mutations that disrupt L2 linker flexibility (SpCas9\u003csup\u003eFF\u003c/sup\u003e and SpCas9\u003csup\u003eWW\u003c/sup\u003e) significantly impair HNH domain relocation, leading to reduced cleavage efficiency, both under cellular conditions and \u003cem\u003ein vitro\u003c/em\u003e. Likewise, alanine replacement or truncation of key REC2 loops leads to the deleterious inhibition of SpCas9-mediated DNA cleavage. 
These observations confirm that perturbations in either linker plasticity or REC2 integrity abrogate proper SpCas9 RNP activation, while leaving the overall protein stability largely intact.\u003c/p\u003e \u003cp\u003eInterestingly, we observed a small population of HEK293T cells with visibly elevated GFP signals (GFP-high), which were readily distinguishable by FACS (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eD, \u003cb\u003eSupplementary Fig.\u0026nbsp;9\u003c/b\u003e). The presence of this GFP-high population may reflect a heightened activation state that enables more rapid and efficient gene editing. Notably, this subgroup was absent in cells expressing SpCas9 mutants, even when GFP-positive cells were detectable overall. These observations raise the possibility that REC2 and the L1 and L2 linkers not only facilitate the initial catalytic transition, but may also help sustain robust editing activity and efficiency. Ongoing investigations in our laboratory aim to clarify whether the integrity of these regions is crucial for maintaining long-term or repeated rounds of SpCas9-mediated DNA cleavage.\u003c/p\u003e \u003cp\u003eTaken together, our findings provide a high-resolution view of the SpCas9 RNP activation mechanism, highlight the dynamic roles of the L1 and L2 linkers and the REC2 domain, and demonstrate how their coordinated conformational dynamics enable catalytic activation. In addition to advancing the fundamental understanding of SpCas9 function, this mechanistic framework may inform targeted protein engineering strategies aimed at modulating Cas9\u0026rsquo;s kinetic profile and specificity. By tuning linker dynamics or recalibrating key domain interactions, it may be possible to develop Cas9 variants optimized for diverse applications ranging from gene therapy to synthetic biology. 
More broadly, our approach of integrating large-scale MD simulations with biochemical and cellular assays underscores the power of combining computational and experimental methods to elucidate complex transient processes in biomolecular machines.\u003c/p\u003e"},{"header":"Materials and Methods","content":"\u003cdiv id=\"Sec8\" class=\"Section2\"\u003e \u003ch2\u003eMinimum Energy Path Exploration with Nudged Elastic Band Sampling\u003c/h2\u003e \u003cp\u003eThe cryo-EM structures of the SpCas9:sgRNA:tDNA ternary complex in its pre-catalytic/inactive (PDB ID: 6O0Z) and catalytic/active (PDB ID: 6O0Y) states were used in the current study\u003csup\u003e\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e\u003c/sup\u003e, with missing amino acids filled in based on sequence homology with available structures or remodeled in SWISS-MODEL\u003csup\u003e\u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e\u003c/sup\u003e. The obtained systems were first subjected to a 10,000-step minimization using the steepest descent algorithm with the CHARMM force field to eliminate potential structural conflicts\u003csup\u003e\u003cspan citationid=\"CR42\" class=\"CitationRef\"\u003e42\u003c/span\u003e\u003c/sup\u003e. To explore the underlying conformational transition path between the inactive and active SpCas9 RNP endpoint structures, the nudged elastic band (NEB) method was employed\u003csup\u003e\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e\u003c/sup\u003e. The NEB method first predicted the transition pathway from inactive to active SpCas9 RNP as an elastic band consisting of discretized snapshots interconnected by virtual springs. 
The springs impose forces on their connecting snapshots to ensure that they do not slide towards each other and are evenly distributed along the reaction path, thereby forming a minimum energy path (MEP) for the conformational transition between two designated end structures. The total force on intermediate snapshot \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:i\\)\u003c/span\u003e\u003c/span\u003e (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{F}_{i}\\)\u003c/span\u003e\u003c/span\u003e) can be orthogonally decomposed into parallel (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{F}_{i}^{//}\\)\u003c/span\u003e\u003c/span\u003e) and perpendicular (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{F}_{i}^{\\perp\\:}\\)\u003c/span\u003e\u003c/span\u003e) components:\u003cdiv id=\"Equa\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equa\" name=\"EquationSource\"\u003e\n$$\\:\\begin{array}{c}{F}_{i}={F}_{i}^{//}+{F}_{i}^{\\perp\\:}\\#\\left(1\\right)\\end{array}$$\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003eFor a system that consists of \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:N\\)\u003c/span\u003e\u003c/span\u003e atoms, the 3N-dimensional coordinate vector of intermediate snapshot \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:i\\)\u003c/span\u003e\u003c/span\u003e (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{R}_{i}\\)\u003c/span\u003e\u003c/span\u003e) and the 3N-dimensional tangent unit vector (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{\\tau\\:}_{i}\\)\u003c/span\u003e\u003c/span\u003e) could be calculated, which enables calculation of the parallel force \u003cspan 
class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{F}_{i}^{//}\\)\u003c/span\u003e\u003c/span\u003e and perpendicular force \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{F}_{i}^{\\perp\\:}\\)\u003c/span\u003e\u003c/span\u003e through the following equations:\u003c/p\u003e \u003cp\u003e \u003cspan class=\"InlineEquation\"\u003e \u003cspan class=\"mathinline\"\u003e\\(\\:\\begin{array}{c}{F}_{i}^{//}=\\left[\\left({k}_{i+1}\\left({R}_{i+1}-{R}_{i}\\right)-{k}_{i}\\left({R}_{i}-{R}_{i-1}\\right)\\right)\\bullet\\:{\\tau\\:}_{i}\\right]{\\tau\\:}_{i}\\#\\left(2\\right)\\end{array}\\)\u003c/span\u003e \u003c/span\u003e\u003cbr\u003e \u003cspan class=\"InlineEquation\"\u003e \u003cspan class=\"mathinline\"\u003e\\(\\:\\begin{array}{c}{F}_{i}^{\\perp\\:}=-\\nabla\\:V\\left({R}_{i}\\right)+\\left(\\nabla\\:V\\left({R}_{i}\\right)\\bullet\\:{\\tau\\:}_{i}\\right){\\tau\\:}_{i}\\#\\left(3\\right)\\end{array}\\)\u003c/span\u003e \u003c/span\u003e\u003cbr\u003ewhere \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{k}_{i}\\)\u003c/span\u003e\u003c/span\u003e denotes the elastic spring constant between intermediate snapshots \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:i-1\\)\u003c/span\u003e\u003c/span\u003e and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:i\\)\u003c/span\u003e\u003c/span\u003e, and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\nabla\\:V\\left({R}_{i}\\right)\\)\u003c/span\u003e\u003c/span\u003e indicates the potential energy gradient with respect to the coordinate vector of the entire system at intermediate snapshot \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:i\\)\u003c/span\u003e\u003c/span\u003e.\u003c/p\u003e \u003cp\u003eIn our study, the NEB suite within AMBER20\u003csup\u003e43,44\u003c/sup\u003e was used to generate 30 intermediate structures bridging 
inactive and active SpCas9 RNP. The process included heating the systems to 300 K with a 0.5 fs time step, a spring force constant of 1 kcal/mol/\u0026Aring;, and a Langevin collision frequency of 1 ns\u003csup\u003e\u0026minus;\u0026thinsp;1\u003c/sup\u003e. During the subsequent equilibration, annealing, and cooling steps, a spring force constant of 50 kcal/mol/\u0026Aring; was used. Equilibration runs of the replicas were performed at 300 K with a time step of 1 fs. During the annealing process, the systems were first heated to 500 K and then gradually cooled to 0 K with a 0.5 fs time step. Finally, the replicas were cooled completely at 0 K for 2 ns with a time step of 1 fs.\u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003eMD Simulation Setup\u003c/h3\u003e\n\u003cp\u003eThirty intermediate snapshots, together with the two end structures, were subjected to unbiased MD simulations. The LEaP program was used to prepare the structures, and the ff14SB force field was employed to describe the ribonucleoprotein complex\u003csup\u003e\u003cspan citationid=\"CR45\" class=\"CitationRef\"\u003e45\u003c/span\u003e\u003c/sup\u003e. All systems were first solvated in an orthorhombic transferable intermolecular potential three-point (TIP3P) water box, followed by the addition of Na\u003csup\u003e+\u003c/sup\u003e and Cl\u003csup\u003e\u0026minus;\u003c/sup\u003e counterions to neutralize the system while mimicking \u003cem\u003ein vivo\u003c/em\u003e physiological cleavage conditions. Two rounds of energy minimization were then carried out, the first with the whole protein scaffold fixed and the second with all restraints removed, using maximum cycles of 5,000 and 10,000 steps, respectively. Subsequently, all systems were heated from 0 K to 300 K within 300 ps and then equilibrated in a canonical ensemble for 700 ps. 
Finally, 10 independent 500 ns classical unbiased MD simulations were performed on each of the 32 systems in an isothermal and isobaric ensemble with periodic boundaries, generating 320 independent trajectories and accumulating 160 \u0026micro;s of conformational sampling in total. Langevin dynamics with a collision frequency of 1 ps\u003csup\u003e\u0026minus;\u0026thinsp;1\u003c/sup\u003e was applied to control the temperature during the simulation. Long-range electrostatic interactions were treated using the Particle Mesh Ewald method, and a 10 \u0026Aring; nonbonded cutoff was introduced for short-range electrostatic and van der Waals interactions. Covalent bonds involving hydrogen atoms were constrained using the SHAKE algorithm. Snapshots were written out every 50 ps\u003csup\u003e\u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e43\u003c/span\u003e,\u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\n\u003ch3\u003eMarkov State Model Construction\u003c/h3\u003e\n\u003cp\u003eIntegrating Markov state modeling (MSM) with MD simulations is gaining increasing popularity for the efficiency and accuracy that can be reached when interpreting biomolecular dynamics, and this combination has been proven reproducible when verified with experimental techniques. The Python library PyEMMA was used for the estimation, validation, and analysis of the MSM based on simulation trajectories\u003csup\u003e\u003cspan citationid=\"CR46\" class=\"CitationRef\"\u003e46\u003c/span\u003e\u003c/sup\u003e. The implied timescale test confirmed that the activation process of SpCas9 RNP was Markovian and reliable with an 800-microstate model and a lag time of 10 ns. The microstates were then clustered into four macrostates using the PCCA+ algorithm, which was confirmed by the Chapman\u0026ndash;Kolmogorov test. 
Using transition path theory, we measured the transition probability matrix of the MSMs and computed the mean first-passage time between macrostates. Trajectories close to the microstate cluster centers were extracted using the mdtraj package as representative trajectories for each metastable ensemble. Representative conformations of different metastable states were obtained based on representative trajectories\u003csup\u003e\u003cspan citationid=\"CR47\" class=\"CitationRef\"\u003e47\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e \u003cdiv id=\"Sec11\" class=\"Section2\"\u003e \u003ch2\u003eGeneralized Cross Correlation Analysis\u003c/h2\u003e \u003cp\u003eGeneralized cross-correlation matrix (GCCM) analysis, as proposed by Grubm\u0026uuml;ller and Lange\u003csup\u003e\u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e\u003c/sup\u003e, was employed to understand both linear and nonlinear correlated motions between residues. GCCM adopts the fundamental definition of independence of random variables and treats variables \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{x}_{i},\\:{x}_{j}\\)\u003c/span\u003e\u003c/span\u003e as uncorrelated only when their joint distribution \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:p({x}_{i},\\:{x}_{j})\\)\u003c/span\u003e\u003c/span\u003e equals the product of their marginal distributions \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:p({x}_{i})\\bullet\\:p({x}_{j})\\)\u003c/span\u003e\u003c/span\u003e; any deviation from this equality indicates correlation. 
Thus, mutual information (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:MI\\)\u003c/span\u003e\u003c/span\u003e) between \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{x}_{i}\\)\u003c/span\u003e\u003c/span\u003e and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{x}_{j}\\)\u003c/span\u003e\u003c/span\u003e is defined as\u003c/p\u003e \u003cp\u003e \u003cspan class=\"InlineEquation\"\u003e \u003cspan class=\"mathinline\"\u003e\\(\\:\\begin{array}{c}MI\\left({x}_{i},\\:{x}_{j}\\right)=\\iint\\:p\\left({x}_{i},\\:{x}_{j}\\right)\\ln\\frac{p\\left({x}_{i},\\:{x}_{j}\\right)}{p\\left({x}_{i}\\right)\\bullet\\:p\\left({x}_{j}\\right)}d{x}_{i}d{x}_{j}\\#\\left(4\\right)\\end{array}\\)\u003c/span\u003e \u003c/span\u003e\u003cbr\u003eWith the g_correlation tool in Gromacs, generalized correlation coefficients between residue \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:i\\)\u003c/span\u003e\u003c/span\u003e and residue \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:j\\)\u003c/span\u003e\u003c/span\u003e (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{GC}_{ij}\\)\u003c/span\u003e\u003c/span\u003e) could be calculated through:\u003c/p\u003e \u003cp\u003e \u003cspan class=\"InlineEquation\"\u003e \u003cspan class=\"mathinline\"\u003e\\(\\:\\begin{array}{c}{GC}_{ij}={\\left\\{1-{e}^{-\\frac{2MI\\left({x}_{i},\\:{x}_{j}\\right)}{d}}\\right\\}}^{\\frac{1}{2}}\\#\\left(5\\right)\\end{array}\\)\u003c/span\u003e \u003c/span\u003e\u003cbr\u003ewhere \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:d\\)\u003c/span\u003e\u003c/span\u003e represents the dimensionality of \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{x}_{i}\\)\u003c/span\u003e\u003c/span\u003e and \u003cspan 
class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{x}_{j}\\)\u003c/span\u003e\u003c/span\u003e, which is equal to three in our study.\u003c/p\u003e \u003cp\u003eTo further represent the extent to which the domains are correlated with each other, we introduced the inter-domain correlation \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{GC}_{XY}^{domain}\\)\u003c/span\u003e\u003c/span\u003e between domains X and Y, which can be calculated as\u003c/p\u003e \u003cp\u003e \u003cspan class=\"InlineEquation\"\u003e \u003cspan class=\"mathinline\"\u003e\\(\\:\\begin{array}{c}{GC}_{XY}^{domain}=\\sum\\:_{i\\in\\:X,\\:j\\in\\:Y}{GC}_{ij}\\#\\left(6\\right)\\end{array}\\)\u003c/span\u003e \u003c/span\u003e\u003cbr\u003eIn our study, only \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{GC}_{ij}\\)\u003c/span\u003e\u003c/span\u003e above the threshold value of 0.65 is calculated.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec12\" class=\"Section2\"\u003e \u003ch2\u003eCell Culture and Transfection\u003c/h2\u003e \u003cp\u003eHEK293T cells were used in our study and maintained in Dulbecco's modified Eagle's medium (DMEM) supplemented with 10% FBS at 37\u0026deg;C in 5% CO2. One day before transfection, cells were trypsinized and seeded at 1.0 \u0026times; 10\u003csup\u003e5\u003c/sup\u003e cells/well in 24-well plates. 1\u0026micro;g of plasmids (0.7\u0026micro;g of SpCas9/sgRNA plasmid and 0.3\u0026micro;g of GFxFP reporter plasmid) were co-transfected into HEK293T cells at ~\u0026thinsp;60% confluence with 2\u0026micro;L of ExFect Transfection Reagent (Vazyme Biotech, Nanjing, China) according to the manufacturer's instructions. 
Cells were harvested 24h, 48h and 72h after transfection, and a CytoFlex flow cytometer (Beckman) was used to analyze EGFP fluorescence.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec13\" class=\"Section2\"\u003e \u003ch2\u003ePlasmids and Reagents\u003c/h2\u003e \u003cp\u003eThe cDNA for SpCas9 endonuclease purification from \u003cem\u003eE. coli\u003c/em\u003e was purchased from Saiheng Biological Technology (Shanghai, China) and inserted into the pET28a vector after fusion with an N-terminal Hisx6 tag. Mutations were introduced into the SpCas9 endonuclease using the Mut Express II Fast Mutagenesis Kit V2 (Vazyme Biotech, Nanjing, China) and verified by DNA sequencing (Personalbio, Shanghai, China). The nucleotide sequences used in our study are provided on request.\u003c/p\u003e \u003cp\u003eThe cDNA for \u003cem\u003ein vivo\u003c/em\u003e expression of SpCas9, sgRNA, and the PAM sequence (between the GF and FP segments) were also purchased from Saiheng Biological Technology and inserted into the pCMV or pBFP vector by Golden Gate assembly.\u003c/p\u003e \u003cp\u003eThe sgRNAs were synthesized by GENEWIZ (Suzhou, China). Target double-stranded DNA (tDNA) substrates for \u003cem\u003ein vitro\u003c/em\u003e cleavage assay were synthesized by Saiheng Biological Technology (Shanghai, China).\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec14\" class=\"Section2\"\u003e \u003ch2\u003eSpCas9 Endonuclease Purification\u003c/h2\u003e \u003cp\u003eChemically competent E. coli Rosetta (DE3) cells (Weidi Biotechnology) were transformed with wild-type or mutant SpCas9 endonucleases. A single colony was picked to inoculate 2xYT medium containing 50\u0026micro;g/mL kanamycin at 37\u0026deg;C. 0.5mM IPTG was added to the bacteria culture when OD\u003csub\u003e600\u003c/sub\u003e reached 0.6\u0026ndash;0.8, followed by induction of protein expression at 16\u0026deg;C for 16\u0026ndash;18 h. 
To harvest proteins, the bacteria were lysed under high pressure using a lysis buffer (containing 20mM HEPES and 500 mM NaCl at pH 7.5) and then loaded onto a nickel column (GE Healthcare, Buckinghamshire, UK). After washing with 40 mM imidazole, the proteins were eluted with 250 mM imidazole. The elution was then dialyzed against storage buffer (consisting of 20mM HEPES, 100 mM NaCl, pH 7.5). The purified protein was snap frozen in liquid nitrogen and stored at \u0026minus;\u0026thinsp;80\u0026deg;C.\u003c/p\u003e \u003cp\u003e \u003cb\u003ein vitro\u003c/b\u003e \u003cb\u003eDNA Cleavage Assays\u003c/b\u003e\u003c/p\u003e \u003cp\u003eThe 1579-bp tDNA substrate for \u003cem\u003ein vitro\u003c/em\u003e cleavage assay contained the target and PAM sequences. \u003cem\u003ein vitro\u003c/em\u003e cleavage reactions were performed in 20 \u0026micro;L reaction buffer (20 mM HEPES, 100 mM KCl, 1 mM DTT, 10 mM MgCl2, at pH 7.5) containing 5nM linearized tDNA substrates, 200 nM purified wildtype/mutant SpCas9, and 200 nM sgRNA. The reaction was incubated at 37\u0026deg;C for 60 min and quenched by the addition of 50 mM EDTA, 20 \u0026micro;g Proteinase K for 30 min at room temperature. The products were analyzed by electrophoresis on a 1% agarose, 0.5x TBE gel stained with 4S red plus dye (Sangon, Shanghai, China). The gels were imaged using a Tanon-3500 gel imaging system (Tanon, Shanghai, China) and quantified using ImageJ software.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec15\" class=\"Section2\"\u003e \u003ch2\u003eProtein Thermal Shift Assays\u003c/h2\u003e \u003cp\u003eMixtures of 10\u0026micro;M wildtype/mutant SpCas9 with 5x SYPRO orange dye (Sigma-Aldrich, St. Louis, MO, USA) were prepared in 1x PBS solution. Samples were analyzed using a Light Cycler 480 real-time PCR instrument system II (Roche, Basel, Switzerland). 
The temperature was gradually increased at a rate of 0.05\u0026deg;C/s over a range of 25\u0026ndash;95\u0026deg;C while the fluorescence was monitored through the SYPRO orange channel. The melting temperature (Tm) was calculated from the melting curve using LightCycler 480 software (Roche).\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec16\" class=\"Section2\"\u003e \u003ch2\u003eCD Spectroscopy\u003c/h2\u003e \u003cp\u003eCD spectra were measured on Chirascan and Chirascan-plus circular dichroism spectrometers (Applied Photophysics, Leatherhead, Surrey, UK) using a 1 mm path-length quartz cuvette. The samples were prepared at 0.2 a concentration of ddH2O. Ten scans were recorded for each sample, and three independent experiments were performed.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec17\" class=\"Section2\"\u003e \u003ch2\u003eData Analysis\u003c/h2\u003e \u003cp\u003eSignificance levels for comparisons between groups were determined with a paired two-tailed Student\u0026rsquo;s \u003cem\u003et\u003c/em\u003e-test in GraphPad Prism version 7.00 (La Jolla, CA, USA).\u003c/p\u003e \u003c/div\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eSupporting Information\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eCryo-EM structures and simulated snapshots of SpCas9, Markovianity of the established model, representative structures extracted from the metastable ensemble, secondary structure analysis, structural dynamics of the HNH domain during activation, correlated motion within SpCas9 during activation, cellular experiments, protein thermal shift assays, and CD spectroscopy.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAcknowledgements\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis study was supported by grants from the National Key R\u0026amp;D Program of China (No. 2023YFC3404700), the National Natural Science Foundation of China (No. 22077082 and No. 
81925034), and the Innovative Research Team of High-Level Local Universities in Shanghai.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConflicts of interest\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare no conflicts of interest regarding this manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eData availability\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eStarting structures (PDB IDs: 6O0Z and 6O0Y) were obtained from the RCSB PDB database [https://www.rcsb.org/]. NEB calculations within the AMBER suite were performed to obtain initial intermediate structures. MD simulations were performed with the AMBER suite [https://ambermd.org/]. All 32 structures used for MD simulation, as well as the scripts for setting simulation parameters, are provided in the Source Data file. The Markov state model analysis protocol followed PyEMMA [http://www.emma-project.org/latest/].\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eDoudna JA (2020) The promise and challenge of therapeutic genome editing. Nature 578:229\u0026ndash;236\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAnzalone AV, Koblan LW, Liu DR (2020) Genome editing with CRISPR\u0026ndash;Cas nucleases, base editors, transposases and prime editors. Nat Biotechnol 38:824\u0026ndash;844\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMali P et al (2013) RNA-guided human genome engineering via Cas9. Science 339:823\u0026ndash;826\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDeveau H, Garneau JE, Moineau S (2010) CRISPR/Cas system and its role in phage-bacteria interactions. Annu Rev Microbiol 64:475\u0026ndash;493\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDoudna JA, Charpentier E (2014) The new frontier of genome engineering with CRISPR-Cas9. Science 346:1258096\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLi T et al (2023) CRISPR/Cas9 therapeutics: progress and prospects. 
Signal Transduct Target Ther 8:189\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHsu PD, Lander ES, Zhang F (2014) Development and applications of CRISPR-Cas9 for genome engineering. Cell 157:1262\u0026ndash;1278\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eNishimasu H et al (2014) Crystal structure of Cas9 in complex with guide RNA and target DNA. Cell 156:935\u0026ndash;949\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAnders C, Niewoehner O, Duerst A, Jinek M (2014) Structural basis of PAM-dependent target DNA recognition by the Cas9 endonuclease. Nature 513:569\u0026ndash;573\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhu X et al (2019) Cryo-EM structures reveal coordinated domain motions that govern DNA cleavage by Cas9. Nat Struct Mol Biol 26:679\u0026ndash;685\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePalermo G et al (2018) Key role of the REC lobe during CRISPR-Cas9 activation by sensing, regulating, and locking the catalytic HNH domain. Q Rev Biophys 51:e9\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eJinek M et al (2014) Structures of Cas9 endonucleases reveal RNA-mediated conformational activation. Science 343:1247997\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBabu K et al (2021) Coordinated actions of Cas9 HNH and RuvC nuclease domains are regulated by the bridge helix and the target DNA sequence. Biochemistry 60:3783\u0026ndash;3800\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eJiang F et al (2016) Structures of a CRISPR-Cas9 R-loop complex primed for DNA cleavage. Science 351:867\u0026ndash;871\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eJiang F, Zhou K, Ma L, Gressel S, Doudna JA (2015) A Cas9-guide RNA complex preorganized for target DNA recognition. 
Science 348:1477\u0026ndash;1481\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBravo JPK et al (2022) Structural basis for mismatch surveillance by CRISPR\u0026ndash;Cas9. Nature 603:343\u0026ndash;347\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePacesa M et al (2022) R-loop formation and conformational activation mechanisms of Cas9. Nature 609:191\u0026ndash;196\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBergonzo C, Campbell AJ, Walker RC, Simmerling C (2009) A partial nudged elastic band implementation for use with large or explicitly solvated systems. Int J Quantum Chem 109:3781\u0026ndash;3790\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRan FA et al (2013) Double nicking by RNA-guided CRISPR Cas9 for enhanced genome editing specificity. Cell 154:1380\u0026ndash;1389\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCong L et al (2013) Multiplex genome engineering using CRISPR/Cas systems. Science 339:819\u0026ndash;823\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLi M et al (2024) Delineating the stepwise millisecond allosteric activation mechanism of the class C GPCR dimer mGlu5. Nat Commun 15:7519\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLi X et al (2021) Atomic-scale insights into allosteric inhibition and evolutional rescue mechanism of \u003cem\u003eStreptococcus thermophilus\u003c/em\u003e Cas9 by the anti-CRISPR protein AcrIIA6. Comput Struct Biotechnol J 19:6108\u0026ndash;6124\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLu S et al (2021) Activation pathway of a G protein-coupled receptor uncovers conformational intermediates as targets for allosteric drug design. Nat Commun 12:4721\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePalermo G et al (2017) Protospacer adjacent motif-induced allostery activates CRISPR-Cas9. 
J Am Chem Soc 139:16028\u0026ndash;16031\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePalermo G, Miao Y, Walker RC, Jinek M, McCammon JA (2016) Striking plasticity of CRISPR-Cas9 and key role of non-target DNA, as revealed by molecular simulations. ACS Cent Sci 2:756\u0026ndash;763\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePalermo G, Miao Y, Walker RC, Jinek M, McCammon JA (2017) CRISPR-Cas9 conformational activation as elucidated from enhanced molecular simulations. Proc Natl Acad Sci USA 114:7260\u0026ndash;7265\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLu S et al (2019) Deactivation pathway of Ras GTPase underlies conformational substates as targets for drug design. ACS Catal 9:7188\u0026ndash;7196\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSchultze S, Grubm\u0026uuml;ller H (2021) Time-lagged independent component analysis of random walks and protein dynamics. J Chem Theory Comput 17:5766\u0026ndash;5776\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eIkotun AM, Ezugwu AE, Abualigah L, Abuhaija B, Heming J (2023) K-means clustering algorithms: a comprehensive review, variants analysis, and advances in the era of big data. Inf Sci 622:178\u0026ndash;210\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWu H, No\u0026eacute; F (2020) Variational approach for learning Markov processes from time series data. J Nonlinear Sci 30:23\u0026ndash;66\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDeuflhard P, Weber M (2005) Robust Perron cluster analysis in conformation dynamics. Linear Algebra Appl 398:161\u0026ndash;184\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKube S, Weber M (2007) A coarse graining method for the identification of transition rates between molecular conformations. 
J Chem Phys 126:024103\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eShibata M et al (2017) Real-space and real-time dynamics of CRISPR-Cas9 visualized by high-speed atomic force microscopy. Nat Commun 8:1430\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRicci CG et al (2019) Deciphering off-target effects in CRISPR-Cas9 through accelerated molecular dynamics. ACS Cent Sci 5:651\u0026ndash;662\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eEast KW et al (2020) Allosteric motions of the CRISPR-Cas9 HNH nuclease probed by NMR and molecular dynamics. J Am Chem Soc 142:1348\u0026ndash;1358\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLange OF, Grubm\u0026uuml;ller H (2006) Generalized correlation for biomolecular dynamics. Proteins 62:1053\u0026ndash;1061\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWu Z et al (2021) Programmed genome editing by a miniature CRISPR-Cas12f nuclease. Nat Chem Biol 17:1132\u0026ndash;1138\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhang H et al (2023) An engineered xCas12i with high activity, high specificity, and broad PAM range. Protein Cell 14:538\u0026ndash;543\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eNierzwicki Ł et al (2021) Enhanced specificity mutations perturb allosteric signaling in CRISPR-Cas9. eLife 10:e73777\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHsu PD et al (2013) DNA targeting specificity of RNA-guided Cas9 nucleases. Nat Biotechnol 31:827\u0026ndash;832\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWaterhouse A et al (2018) SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Res 46:W296\u0026ndash;W303\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBrooks BR et al (2009) CHARMM: the biomolecular simulation program. 
J Comput Chem 30:1545\u0026ndash;1614\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCase DA et al (2023) AmberTools. J Chem Inf Model 63:6183\u0026ndash;6191\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRoe DR, Cheatham TE (2013) Ptraj and cpptraj: software for processing and analysis of molecular dynamics trajectory data. J Chem Theory Comput 9:3084\u0026ndash;3095\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMaier JA et al (2015) ff14SB: improving the accuracy of protein side chain and backbone parameters from ff99SB. J Chem Theory Comput 11:3696\u0026ndash;3713\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eScherer MK et al (2015) PyEMMA 2: a software package for estimation, validation, and analysis of Markov models. J Chem Theory Comput 11:5525\u0026ndash;5542\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMcGibbon RT et al (2015) MDTraj: a modern open library for the analysis of molecular dynamics trajectories. 
Biophys J 109:1528\u0026ndash;1532\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":true,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"CRISPR-Cas9, Molecular dynamics simulations, Conformational dynamics, Gene editing","lastPublishedDoi":"10.21203/rs.3.rs-6018412/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-6018412/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eThe CRISPR-associated endonuclease \u003cem\u003eStreptococcus pyogenes\u003c/em\u003e Cas9 (SpCas9) enables site-specific DNA cleavage by transitioning from a pre-catalytic conformation to a catalytically active state, yet how its HNH catalytic domain undergoes an approximately 40 \u0026Aring; displacement towards the target DNA has remained elusive. Here, we combined extensive unbiased molecular dynamics simulations, spanning a cumulative timescale of 160 \u0026micro;s, with Markov state modeling to map the kinetic pathway of SpCas9 activation. 
\u003cem\u003eIn vitro\u003c/em\u003e DNA cleavage assays and a cellular fluorescence reporter system further validated the atomic-level mechanisms revealed by our simulations. We found that the folding of the L1 linker and unfolding of the L2 linker serve as the principal driving force, inducing a \u0026ldquo;gear-and-wedge\u0026rdquo; cooperative motion within the HNH domain. Concurrently, the REC2 domain moved outward to accommodate the displaced HNH domain and formed transient stabilizing interactions with the HNH domain along the activation route. Site-directed mutagenesis of key L2 linker residues and REC2 loops markedly reduced SpCas9 cleavage efficiency in both HEK293T cells and biochemical assays, underscoring their critical role in SpCas9 ribonucleoprotein activation. Collectively, this study provides a high-resolution view of SpCas9 catalytic activation and opens up new avenues for the rational design of SpCas9 variants with enhanced performance and specificity.\u003c/p\u003e","manuscriptTitle":"Untangling the Molecular Mechanism of SpCas9 Catalytic Activation: A Gear-and-Wedge Fitting Model","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-03-04 10:46:10","doi":"10.21203/rs.3.rs-6018412/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"a573a06f-2ae7-45a1-b19b-333c715531d1","owner":[],"postedDate":"March 4th, 
2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[{"id":45125139,"name":"Biological sciences/Computational biology and bioinformatics/Protein function predictions"},{"id":45125140,"name":"Biological sciences/Biophysics/Computational biophysics"},{"id":45125141,"name":"Biological sciences/Microbiology/CRISPR-Cas systems/CRISPR-Cas9 genome editing"}],"tags":[],"updatedAt":"2025-12-05T20:30:35+00:00","versionOfRecord":[],"versionCreatedAt":"2025-03-04 10:46:10","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-6018412","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-6018412","identity":"rs-6018412","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}

