Structure and function of the Si3 insertion integrated into the trigger loop/helix of cyanobacterial RNA polymerase

preprint OA: gold CC-BY-4.0

Keywords

cyanobacteria, RNA polymerase, transcription, cryo-EM

This PDF file includes: Main Text; Figures 1 to 5; Supplemental Figures 1-7, Tables 1-2, Supplemental Movie Legends 1-2.

bioRxiv preprint, doi: https://doi.org/10.1101/2024.01.11.575193; this version posted January 11, 2024. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

Abstract

Cyanobacteria and the evolutionarily related chloroplasts of algae and plants possess unique RNA polymerases (RNAPs) with characteristics that distinguish them from canonical bacterial RNAPs. The largest subunit of cyanobacterial RNAP (cyRNAP) is divided into two polypeptides, β’1 and β’2, and contains the largest known lineage-specific insertion domain, Si3, which is located in the middle of the trigger loop and spans approximately half of the β’2 subunit. In this study, we present the X-ray crystal structure of Si3 and the cryo-EM structures of the cyRNAP transcription elongation complex plus the NusG factor with and without incoming nucleoside triphosphate (iNTP) bound at the active site. Si3 has a well-ordered and elongated shape that exceeds the length of the main body of cyRNAP, fits into cavities of cyRNAP and shields the binding site of secondary channel-binding proteins such as Gre and DksA. A small transition from the trigger loop to the trigger helix upon iNTP binding at the active site results in a large swing motion of Si3; however, this motion does not affect the catalytic activity of cyRNAP because Si3 makes only minimal contact with cyRNAP, NusG or DNA. This study provides a structural framework for understanding the evolutionary significance of these features unique to cyRNAP and chloroplast RNAP and may provide insights into the molecular mechanism of transcription in the specific environment of photosynthetic organisms.

Significance statement:
Cellular RNA polymerase (RNAP) carries out RNA synthesis and proofreading reactions utilizing a mobile catalytic domain known as the trigger loop/helix. In cyanobacteria, this essential domain acquired a large Si3 insertion during the course of evolution.
Despite its elongated shape and the large swinging motion associated with the transition between the trigger loop and helix, Si3 is effectively accommodated within cyRNAP, with no impact on the fundamental functions of the trigger loop. Understanding the significance of Si3 in cyanobacteria and chloroplasts is expected to reveal unique transcription mechanisms in photosynthetic organisms.

Introduction

Cyanobacteria and chloroplasts of algae and higher plants are characterized by oxygen-evolving photosynthesis and are phylogenetically closely related. Their genomes are transcribed by a bacterial-type RNA polymerase (cyRNAP and plastid-encoded RNAP, PEP, respectively) aided by transcription initiation σ factors for recognition of specific promoters (1-3). Although cyRNAPs and chloroplast PEPs retain the fundamental functions of bacterial RNAPs, they possess several characteristics that distinguish them from canonical bacterial RNAPs.

First, the largest subunit of cyRNAP is separated into two polypeptides, β’1 and β’2, which are encoded by the rpoC1 and rpoC2 genes, respectively (Fig. 1A). In Synechococcus elongatus, which is the cyanobacterium used for the cryo-EM structural study of RNAP described herein, the 624 residue β’1 and 1,318 residue β’2 subunits correspond to the amino (N)-terminal one-third and the carboxy (C)-terminal two-thirds of the 1,407 residue β’ subunit in Escherichia coli, respectively. The junction between the β’1 and β’2 subunits is positioned before the conserved region E (4, 5). The β’1 subunit contains the clamp and the catalytic double-psi-β-barrel domain coordinating a Mg2+ ion; the β’2 subunit contains the rim helix, bridge helix, trigger loop and jaw domain.

Second, cyRNAP contains the largest known lineage-specific insertion domain, Si3 (645 residues), which spans approximately half of the β’2 subunit and is located in the middle of the trigger loop (Fig. 1A) (6, 7). The trigger loop plays a central role in nucleotide selection, RNA synthesis and RNA cleavage during proofreading by cellular RNAPs (8). In the absence of nucleoside triphosphate (NTP) substrate, the tip of the trigger loop is located away from the active site (9).
Upon binding of a complementary incoming NTP (iNTP) at the active site, the trigger loop folds to form a trigger helix containing two α-helices, which interact extensively with the base and triphosphate groups of the iNTP and facilitate the nucleotidyl transfer reaction (8). The Si3 insertion is found in RNAPs of Gram-negative bacteria in the middle of the trigger loop (evolutionarily conserved region G; Fig. 1A). Si3 is composed of repeats of the conserved sandwich-barrel hybrid motif (SBHM). Escherichia coli (E. coli) RNAP contains two copies of SBHM (SFig. 1A) (10), and sequence analysis indicates that up to seven copies of SBHM are present in the Si3 insertion of cyRNAP (6). The structure and function of the Si3 insertion in E. coli RNAP have been well characterized; it is involved in stabilizing the open complex and in RNA hairpin-dependent (his) and -independent (ops) transcription pausing (11, 12) and is highly mobile, with its conformation being dependent on the folded/unfolded state of the trigger loop/helix and binding of transcription factors (Gre, DksA) at the secondary channel of RNAP (13, 14). In addition, structural and functional analyses of Si3 in cyRNAP have recently been initiated. According to the cryo-electron microscopy (cryo-EM) structure of the cyRNAP promoter complex (15), Si3 forms an “arch” with region 2 of the σ factor, the element involved in opening the DNA duplex at the -10 position of the promoter. This arch stabilizes the promoter complex, and its removal affects the fitness and stress resistance of cyanobacteria.
Notably, the Si3-σ contact remains intact upon trigger loop refolding into the trigger helix after iNTP addition to the initiation complex with the short RNA transcript. It is unknown whether, after the transition to the elongation phase, Si3 becomes mobile in the presence of transcription elongation factors such as NusG, or how Si3 affects refolding of the trigger helix and the catalytic activity of RNAP.

In this work, we structurally and biochemically analyzed the cyRNAP elongation complex (EC) to understand the functional importance of Si3 in the elongation phase of transcription. We solved the X-ray crystal structure of Si3 and cryo-EM structures of the cyRNAP EC with NusG in the presence and absence of iNTP bound at the active site.
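
The residue counts quoted in the Introduction can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text (624, 1,318, 1,407 and 645 residues); the constant and function names are illustrative and are not taken from the study.

```python
# Residue counts as quoted in the Introduction.
BETA_PRIME_1 = 624        # S. elongatus β'1 (rpoC1 product)
BETA_PRIME_2 = 1318       # S. elongatus β'2 (rpoC2 product)
ECOLI_BETA_PRIME = 1407   # E. coli β'
SI3 = 645                 # Si3 insertion within β'2


def si3_fraction_of_beta_prime_2() -> float:
    """Fraction of the β'2 subunit occupied by Si3."""
    return SI3 / BETA_PRIME_2


def split_subunit_total() -> int:
    """Combined length of the two polypeptides that replace a single β'."""
    return BETA_PRIME_1 + BETA_PRIME_2


if __name__ == "__main__":
    total = split_subunit_total()
    print(f"Si3 / β'2 = {si3_fraction_of_beta_prime_2():.1%}")
    print(f"β'1 + β'2 = {total} residues")
    print(f"Excess over E. coli β' = {total - ECOLI_BETA_PRIME} residues")
```

With these numbers Si3 accounts for roughly 49% of β’2, in line with the statement that it spans approximately half of that subunit.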

Results

X-ray crystal structure of Thermosynechococcus elongatus BP-1 Si3 (TelSi3)

We investigated the structure of the isolated Si3 protein of the thermophilic cyanobacterium Thermosynechococcus elongatus BP-1 (TelSi3) by X-ray crystallography. The DNA sequence encoding Si3 (residues 345-983) was cloned and inserted into a vector for expression in E. coli cells, and the resulting protein was purified to homogeneity. Initial attempts to crystallize TelSi3 were unsuccessful. Limited trypsinolysis revealed that the amino-terminal (N-terminal) 91 residues of TelSi3 are sensitive to proteolysis (SFig. 2A), indicating flexibility, which potentially hindered crystallization. We then cloned and expressed a TelSi3 construct lacking the N-terminal 91 residues (TelSi3ΔN, residues 435 to 983), which formed large crystals (Fig. 1B) belonging to the P3(2)21 space group (six TelSi3ΔN copies per asymmetric unit; Fig. 1C). We were unable to generate a TelSi3ΔN model suitable for molecular replacement based on the protein sequence (e.g., by SWISS-MODEL; SFig. 2C). Therefore, experimental phases were obtained by the single-wavelength anomalous dispersion (SAD) method using selenomethionine (SeMet)-labeled TelSi3ΔN protein (SFig. 2B). The 3.2 Å resolution experimental density map allowed us to build four full-length and two partial models of TelSi3ΔN in the asymmetric unit (STable 1). The AlphaFold (20) structural prediction for TelSi3ΔN was in close agreement with the X-ray structure, with an RMSD of 1.08 Å (SFig. 2C).
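
For readers unfamiliar with the RMSD figure quoted above, the following minimal sketch shows how a root-mean-square deviation is computed from paired atomic coordinates. It assumes the two models have already been superimposed and that atoms are matched one-to-one; the text does not state which atoms or software were used for the 1.08 Å value, and the coordinates below are hypothetical toy values, not data from either structure.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(model_a: Sequence[Coord], model_b: Sequence[Coord]) -> float:
    """RMSD (in the input units) between paired, already-superimposed atoms."""
    if len(model_a) != len(model_b) or not model_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(model_a, model_b):
        # Squared Euclidean distance between the matched atom pair.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(model_a))


if __name__ == "__main__":
    # Hypothetical toy coordinates in Å (illustration only).
    predicted = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    observed = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
    print(f"RMSD = {rmsd(predicted, observed):.2f} Å")
```

In practice the superposition step (for example a least-squares fit) is performed first by a structural biology package; only the final distance calculation is sketched here.
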
TelSi3ΔN (150 Å in length and 50 Å in width) is longer than the canonical bacterial RNAP (e.g., 110 × 130 Å for E. coli RNAP) (SFig. 1B). TelSi3ΔN comprises seven SBHMs (SBHM-2 to SBHM-8). The X-ray crystal structure of the N-terminal region (81 residues) of Si3 from S. elongatus PCC 7942 (15) showed an independently folded SBHM (SBHM-1). This region corresponds to the 91 N-terminal residues of TelSi3 (missing in the crystallized TelSi3ΔN), indicating that cyRNAP contains eight copies of SBHM within Si3 (Fig. 1C, SFig. 3). TelSi3 has a swordfish-shaped profile, with distinct “tail”, “fin”, “body” and “head” subdomains formed by SBHM-1, SBHM-2/8, SBHM-3/4/5 and SBHM-6/7, respectively (Fig. 1D). Notably, the SBHMs in TelSi3 are not structured in a simple tandem arrangement (Fig. 2D and SFig. 3), in contrast to E. coli Si3, which contains two independently folded SBHMs connected by a short linker (SFig. 1A) (10). Although each SBHM has a core antiparallel β-sheet topology, connections between the β-sheets vary as the polypeptide chain folds over itself (Fig. 1D). In addition, the sequences of SBHM-1, -6, -7 and -8 are continuous; the others (SBHM-2, -3, -4 and -5) contain structural elements from distant regions of the polypeptide sequence. We assessed conformational flexibility by comparing the four full-length TelSi3ΔN structures from the asymmetric unit using the Si3-fin as a reference for superimposition. This showed substantial conformational variation in the Si3-head, allowing for a 24 Å displacement associated with an 11° rotation (Fig. 1E).

Cryo-EM structure of the Synechococcus elongatus RNAP elongation complex with NusG

To investigate the structure of cyRNAP and the dynamics of Si3 at the elongation stage, we determined the cryo-EM single-particle reconstruction structure of the cyRNAP EC (SFig. 4, STable 2).
NusG was also included in the EC, as most ECs contain NusG under physiological transcription conditions (16, 17); this also allowed us to investigate physical contact between Si3 and NusG.

We used recombinant double affinity-tagged S. elongatus cyRNAP to avoid isolation of any chimeric cyRNAP containing E. coli RNAP subunits. The EC was assembled by mixing cyRNAP, NusG and the DNA/RNA scaffold (Fig. 2A). The preferred particle orientation issue of EC-NusG was resolved by adding CHAPSO (final concentration of 0.8 mM) to the sample before application to the cryo-EM grid (18). The cryo-EM structure was determined at an overall resolution of 3 Å, revealing well-defined cryo-EM densities for cyRNAP, the N-terminal domain of NusG (residues 19-138) and the DNA/RNA hybrid (Fig. 2B). The densities of the single-stranded nontemplate DNA in the transcription bubble and the single-stranded RNA within the RNA exit channel were traceable due to their respective interactions with NusG and the RNA exit channel (Fig. 2C). The carboxyl-terminal (C-terminal) domains of the α subunits and the Kyrpides-Ouzounis-Woese (KOW) domain of NusG were disordered.

By contacting both the upstream and downstream DNA duplexes, NusG seemed to maintain a 90° bend in the DNA centered at the RNAP active site (SFig. 5A), which may stabilize the hold of cyRNAP on the DNA/RNA. To evaluate the role of NusG, we immobilized reconstituted ECs on agarose beads and challenged the complex with 300 mM NaCl in the absence or presence of NusG.
In the presence of NusG, the proportion of RNA released from the complex was significantly reduced compared with the EC lacking NusG (SFig. 5B), indicating its stabilizing effect. Notably, compared with its orthologs from E. coli, Bacillus subtilis and Mycobacterium tuberculosis, cyanobacterial NusG possesses a longer and more positively charged loop (residues 110-122) within the N-terminal domain. This loop extends toward the downstream DNA and single-stranded non-template DNA within the transcription bubble (SFig. 5A). Deletion of this cyanobacteria-specific loop (NusGΔ110-122) significantly reduced the stabilizing effect of NusG (SFig. 5B).

Si3 runs along the cavities of cyRNAP and shields the binding site of DksA/Gre factors

By fitting the models of RNAP (without Si3), NusG and the DNA/RNA scaffold, we identified a density corresponding to Si3, which extends from the trigger loop, passes below the rim helix (β’2 subunit), runs along the lobe/protrusion domains (β subunit) and nearly reaches the upstream DNA (Fig. 2B, SMovie 1). The overall structure of cyRNAP is nearly identical to the structures of other bacterial RNAPs, including those of E. coli and M. tuberculosis (19, 20), indicating that Si3 runs along the cavities of RNAP without influencing its general shape or conformation. The crystal structures of Si3 covering SBHM-2 to SBHM-8 and SBHM-1 were fitted to their corresponding cryo-EM density. The cryo-EM density of the Si3-head was weak and had a low resolution (Fig. 2B, SFig. 4E), suggesting its mobility.

Si3-tail is positioned in front of the rim helix (Fig. 3A). Si3-fin is positioned below the rim helix, and the extended SBHM-2 loop (residues 463-471) fills a gap between the β’2 jaw and β lobe domains. Si3-body is located beside the lobe and protrusion domains of the β subunit, and Si3-head reaches the upstream DNA (Figs. 2B and 3A). Si3-fin contacts the bottom part of the rim helix, but only a few amino acid residues of Si3 contact the main body of RNAP and NusG, suggesting that Si3-tail and Si3-body/head can change their positions without restraint. Si3 spans the entire length of cyRNAP, reaching from the secondary channel to the upstream DNA. However, it likely does not interfere with any basic function of cyRNAP (i.e., DNA binding, RNA elongation, binding of initiation factor σ, or binding of elongation factors NusA and NusG), as it runs along the sidewall of cyRNAP (Figs. 2D and 3A).

During transcription, the secondary channel of all cellular RNAPs, including bacterial RNAPs, serves as the only access route between the active site found in the center of RNAP and the external milieu, serving as an entry point for substrate NTPs and an exit route for the RNA 3’-end during backtracking (prior to RNA cleavage). In cyRNAP, the secondary channel appears to be open enough to allow these functions. In addition to these basic functions, the secondary channel serves as a binding platform for proofreading factors such as Gre and regulatory factors such as DksA, known as secondary channel binding factors (13, 27). These factors use the RNAP rim helix as a primary binding site, after which the coiled-coil domain is inserted to access the active site of RNAP (Fig. 3B).
In cyRNAP, Si3-tail and -fin occupy the front and bottom sides of the rim helix, respectively, thereby preventing any potential association of secondary channel binding factors (Fig. 3A).

Dynamic motion of Si3 associated with the transition between the trigger loop and helix during iNTP binding at the active site

To investigate the Si3 conformational change associated with trigger helix refolding, we prepared an iNTP-bound form of the EC by extending the RNA with 3’-deoxyadenosine triphosphate (3’-dATP), which arrested further RNA extension, followed by addition of cytidine triphosphate (CTP) as the iNTP (SFig. 6). The resulting cryo-EM structure was determined at 2.79 Å resolution (SFig. 6). Although an excess amount of CTP was added to the EC, a substantial population of ECs (~40%) remained unbound to iNTP. However, the iNTP-bound EC could be clearly distinguished from the iNTP-free EC during 3D classification of the cryo-EM data due to its unique Si3 orientation relative to the main body of cyRNAP associated with iNTP binding (SFig. 6B and 6D). This allowed for a well-defined density map of the cyRNAP active site. In the iNTP-bound EC, the B-site Mg2+ (known as the nucleotide-binding metal) was present at the active site. However, the A-site Mg2+ (known as the catalytic metal) was absent, likely due to the lack of a hydroxyl group at the 3’-end of the RNA.
Trigger helix folding establishes several essential contacts between the iNTP and amino acid residues, including β’2-M339 in contact with the nucleobase and β’2-H343 in contact with the β-phosphate group (SFig. 6D).

Trigger helix folding induces significant motion of Si3 relative to the main body of cyRNAP. Specifically, trigger helix formation pulls a linker connecting the C-terminal half of the trigger helix and the Si3-fin, and during this process, the tip of the rim helix acts as a pivot point, converting the lateral motion of the linker (~10 Å) into rotational motion of Si3, resulting in an ~50 Å displacement and a 24° swing of Si3-head (Figs. 4A and B, SMovie 2). Si3-body/head swings down from the main body of cyRNAP; thus, the β protrusion domain no longer contacts Si3-body/head in the iNTP-bound EC (Fig. 4A). Remarkably, the large swinging of Si3, which is coupled to trigger helix formation (Fig. 4B), did not markedly alter the catalytic properties of cyRNAP (Fig. 4C). Three ECs containing 14, 15 and 16 nucleotide long RNAs (EC14, 15 and 16) were prepared by extending the initial 5’-labelled 13 nt long RNA in the nucleic acid scaffold shown above the summary table. Nucleotide addition, its direct reversal by pyrophosphorolysis, and transcript cleavage were performed for the ECs formed with either wild-type (WT) or Si3-lacking (ΔSi3) cyRNAP. Rates of NTP addition, pyrophosphorolysis and RNA hydrolysis were similar between the WT and ΔSi3 cyRNAPs (Fig. 4C and SFig. 7). The relative rates of these reactions also allowed us to attribute a predominant translocation state to each EC tested because nucleotide addition proceeds from the post-translocated state, pyrophosphorolysis from the pre-translocated state and hydrolysis from the backtracked state (scheme on Fig. 4C).
Comparison of the rates of these reactions for the three complexes used in the present study suggested that EC14 is mainly stabilized in a post-translocated state (characterized by fast NTP addition), EC15 is mainly pre-translocated (fast pyrophosphorolysis), and EC16 is mainly backtracked/paused (faster hydrolysis), similar to the ECs formed on this template by Thermus aquaticus RNAP (21), which does not contain Si3. These results imply that Si3 does not influence the catalysis or translocation equilibrium of cyRNAP.

The cryo-EM structure of the cyRNAP-promoter DNA complex containing σA (both from Synechocystis sp. PCC 6803, which is closely related to the S. elongatus PCC 7942 used in this study), promoter DNA and 4-mer RNA was determined by Shen et al. (15); the results showed that Si3-head contacts σA domain 2. This interaction clamps the single-stranded DNA around the -10 region, stabilizing the open complex and facilitating transcription initiation. Comparison of the structures of the cyRNAP promoter complex (15) with those of the EC (this study) revealed that Si3-body and -head move toward σA domain 2 for interaction but that the other cyRNAP structures, including Si3-tail and -fin and the main body of the RNAP, are nearly identical (Fig. 5A).

Si3 wraps around the main body of cyRNAP, which may facilitate RNAP folding, subunit assembly and/or maturation to form an active and mature form of RNAP, much as DNA and a σ factor enhance reconstitution of E. coli RNAP (22).
To test the function of Si3 during cyRNAP assembly and maturation, we performed a refolding experiment with WT cyRNAP, ΔSi3 cyRNAP, and ΔSi3 cyRNAP in combination with the separately expressed and purified Si3 protein (ΔSi3+Si3) (Fig. 5B). The proteins were denatured with 6 M guanidine-HCl and renatured by gradual removal of guanidine-HCl via dialysis against renaturation buffer. The activities of the reconstituted ΔSi3 cyRNAP in the absence and presence of the Si3 protein, as judged by their ability to extend 13 nt long RNA in the assembled duplex with template DNA oligonucleotide, were nearly the same as those of the WT cyRNAP, indicating that Si3 does not play a role in cyRNAP assembly and maturation. This conclusion is supported by the similar yields of recombinant WT and ΔSi3 cyRNAPs routinely isolated from E. coli. Remarkably, however, the separate Si3 protein binds ΔSi3 cyRNAP but not the WT cyRNAP when it is added externally to cyRNAP (Fig. 5C). When complex formation between Si3 and ΔSi3 cyRNAP was assessed by blue native polyacrylamide gel electrophoresis, a band with a lower mobility similar to that of the WT cyRNAP was observed (Fig. 5C, Lane 4). Interaction between WT cyRNAP and Si3 was not detected, i.e., no complex with lower mobility than that of WT cyRNAP was detected (Lane 5).
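
The swing geometry reported earlier in the Results (a ~10 Å linker movement converted into a ~50 Å, 24° swing of Si3-head about the tip of the rim helix) can be explored with a simple rigid-body estimate. The sketch below is only a back-of-the-envelope model: it assumes a rigid rotation about a single fixed pivot and treats both quoted displacements as chords of that rotation, neither of which is stated in the text. The function name `lever_arm` is hypothetical.

```python
import math


def lever_arm(chord: float, angle_deg: float) -> float:
    """Radius of a rigid rotation that moves a point by `chord` over `angle_deg`.

    Uses the chord relation: chord = 2 * r * sin(angle / 2).
    """
    if not 0.0 < angle_deg < 360.0:
        raise ValueError("angle_deg must be between 0 and 360 degrees")
    return chord / (2.0 * math.sin(math.radians(angle_deg) / 2.0))


if __name__ == "__main__":
    swing_deg = 24.0                         # reported Si3-head swing
    head_arm = lever_arm(50.0, swing_deg)    # ~50 Å Si3-head displacement
    linker_arm = lever_arm(10.0, swing_deg)  # ~10 Å linker displacement
    print(f"Si3-head distance from pivot: {head_arm:.0f} Å")
    print(f"linker distance from pivot:   {linker_arm:.0f} Å")
    print(f"amplification:                {head_arm / linker_arm:.1f}x")
```

Under these assumptions the Si3-head would sit roughly 120 Å from the pivot, which is of the same order as the 150 Å length measured for TelSi3ΔN, and the linker movement is amplified about fivefold at the Si3-head.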

Discussion

In this study, we determined the structures of cyRNAP Si3 by X-ray crystallography (Fig. 1) and of cyRNAP EC-NusG with and without iNTP by cryo-EM (Figs. 2 and 4). We investigated the function of Si3 by comparing the catalytic activities of WT and ΔSi3 cyRNAPs. The results of structural and biochemical investigations of cyRNAP showed that Si3 is accommodated within the cavities of cyRNAP without compromising its basic activities, that it shields the site of secondary channel binding proteins, and that it moves within cyRNAP upon binding of iNTP in the active site. Remarkably, a minor structural transition between the trigger loop and trigger helix causes a major swinging motion of Si3 (Fig. 4 and SMovie 2). The presence of Si3 in the middle of the trigger loop/helix did not affect cyRNAP catalysis under our experimental conditions (Fig. 4C). Because of the large conformational change that occurs during the transcription reaction, changes in cyRNAP activity might be observed when the motion of Si3 is hindered, such as by binding of external factors. Further studies, including proteomic searches for Si3-binding factors and structural, single-molecule and biochemical analyses, are required to elucidate its role in regulating transcription by cyRNAP, such as by sensing environmental signals (e.g., trafficking of RNAP or transcription-translation coupling) to optimize cyRNAP activity. Alternatively, the oscillating motion of Si3 might function as a regulatory signal for cellular processes.
Photosynthetic cyanobacteria synchronize their gene expression patterns with diurnal light cycles (23). Conceivably, the lack of Si3 movement might trigger initiation of cyRNAP hibernation through binding to cellular factors or its oligomerization during the night. Additionally, Si3 movement might help RNAP propel itself through the densely packed cytoplasm of cyanobacteria during transcription.

The primary proofreading mechanism employed by RNAP involves backtracking followed by hydrolysis of misincorporated nucleotides at the 3’-end of nascent RNA. This process is significantly enhanced by elongation factors that bind to the RNAP secondary channel, such as Gre in bacteria, TFS in archaea, and TFIIS in eukaryotes (24). However, unlike the vast majority of living organisms, cyanobacteria lack a Gre factor. The intracellular concentration of Mn2+ is two orders of magnitude greater in cyanobacteria than in other bacteria to support photosynthesis (16). It is possible that Mn2+ replaces the catalytic Mg2+ of RNAP and thus promotes misincorporation of NTPs (25, 26). Potentially as a compensating mechanism, cyRNAP has been shown to possess proficient intrinsic proofreading activity (7, 27). However, this intrinsic activity is still approximately 10 times lower than the Gre-stimulated activity of E. coli RNAP. Gre-like factors either emerged after the split of cyanobacteria from their last common ancestor with other bacteria or were subsequently lost.
The distinctive characteristics of cyRNAP (the absence of Gre/DksA factors and the split of the largest subunit) may be intrinsically linked to Si3 acquisition. The Si3-tail/fin position around the rim helix of RNAP prevents association of secondary channel binding proteins, such as GreA and DksA, with cyRNAP (Fig. 3). As secondary channel binding proteins play critical roles in transcription fidelity and regulation in bacteria, the Si3-GreA/DksA trade-off in cyanobacteria might be advantageous but remains to be fully understood. With Si3 acquisition, β’ increased to 210 kDa in size, and separation of the original rpoC gene into two genes was perhaps beneficial to facilitate expression of such a large protein. The observed change in the position and mobility of Si3 in cyRNAP ECs compared to those in the promoter complex (Fig. 5A) raises questions about the role of Si3 in promoter escape. Si3 may complicate promoter escape by binding to the σ factor; conversely, its large-range movement upon RNA synthesis may contribute to weakening σ association with the core and/or promote σ release at the transition to the elongation stage.

A structure corresponding to Si3 of cyRNAP has not been found in other bacterial RNAPs. However, the structure and arrangement of the Rpb9 subunit in eukaryotic RNAPII show remarkable similarity to those of the Si3 domain of cyRNAP (Fig. 3C). Rpb9 is positioned within a cavity between the rim helix and the lobe domain of RNAPII, akin to the Si3-fin of cyRNAP (highlighted in red in cyRNAP and RNAPII). Rpb9 is a unique subunit found only in RNAPII and plays a critical role in enhancing the accuracy of transcription (28).
Although both Rpb9 and Si3-tail are located away from the active site of RNAP, their presence may enhance transcription fidelity by coordinating RNAP conformational changes such as RNAP swiveling and/or movement of the rim helix during the nucleotide addition cycle (20). The presence of these unique structural features in different types of RNAPs suggests a common mechanism for enhancing transcriptional accuracy and specificity across different organisms. Further investigation of Si3 function at different stages of transcription and under several growth conditions in cyanobacteria will be required to determine the full array of its biological functions.

Experimental Procedures

Protein preparation

The DNA fragment encoding Thermosynechococcus elongatus BP-1 Si3 in the β’2 subunit (TelSi3, RpoC2 residues 345-983, 69 kDa) was cloned between the NdeI and BamHI sites of the pET15b expression vector to introduce an N-terminal His6-tag, and the protein was overexpressed in E. coli BL21(DE3)/pLysS cells. Transformants were grown in LB medium supplemented with ampicillin (100 μg/ml) and chloramphenicol (25 μg/ml) at 37 °C until the OD600 reached ~0.5, after which protein expression was induced by adding 0.5 mM IPTG for 10 h at 4 °C. The harvested cells were lysed by sonication, and proteins in the soluble fraction were purified by Ni-affinity column chromatography (HisTrap 5 ml column, GE Healthcare).
The His6-tag was removed by thrombin digestion (1 μg of thrombin per mg of TelSi3 protein) for 20 h at 4 °C, and the protein was further purified by Q Sepharose column chromatography (GE Healthcare) and gel-filtration column chromatography (HiLoad Superdex75 16/60, GE Healthcare). The purified protein was concentrated to 15 mg/ml and exchanged into buffer containing 10 mM Tris-HCl (pH 8.0), 50 mM NaCl and 0.1 mM EDTA.

Limited trypsinolysis

Limited trypsinolysis was used to remove flexible regions from TelSi3, and N-terminal amino acid sequencing was used to identify protein fragments suitable for crystallization. The trypsin digests were carried out in 10 mM Tris-HCl (pH 8), 100 mM NaCl, 5% (v/v) glycerol, 0.1 mM EDTA and 1 mM DTT. TelSi3 (10 mg/ml) was digested in a 10 µl volume with different amounts of trypsin (5 nM to 5 µM) for 10 min at 25 °C. The reactions were terminated by addition of PMSF. The trypsinized fragments were separated by SDS-PAGE and blotted onto PVDF membranes, and the N-terminal sequences were determined by Edman-based protein sequencing. The TelSi3 fragment containing residues 435-983 (TelSi3ΔN, 60 kDa) was PCR subcloned and inserted into the pET15b expression vector between the NdeI and BamHI sites. The protein was overexpressed and purified as described above for full-length TelSi3.

Crystallization

Initial crystals of TelSi3ΔN were obtained by the hanging-drop vapor diffusion method by mixing equal volumes of the protein solution (20 mg/ml) and crystallization solution (0.1 M sodium citrate [pH 3.5], 0.2 M MgCl2 and 10% PEG6000) and incubating at 4 °C over the same crystallization solution.
The large crystals (0.5 × 0.2 × 0.2 mm) used for X-ray data collection were prepared by microseeding, mixing 2 µl of protein solution, 2 µl of crystallization solution (0.1 M sodium citrate [pH 5], 0.4–0.6 M MgCl2, 4–6% PEG3350 and 50 µg/ml heparin) and 0.2 µl of seed solution. The crystals were then dehydrated by transfer to crystallization solution (without heparin) with increasing concentrations of PEG3350 (in 5% steps) to a final concentration of 20% and incubated for 5-10 h. Crystal preparation, growth and dehydration were all performed at 4 °C. The crystals were transferred to a crystallization solution with 25% (v/v) propylene glycol as a cryoprotective solution and flash frozen in liquid nitrogen. Selenomethionine-substituted proteins were prepared for SAD analysis by suppressing methionine biosynthesis.

X-ray data collection and crystal structure determination

In addition to the four original methionine residues found in TelSi3ΔN (including an N-terminal methionine residue resulting from cloning into the pET15b vector), three methionine residues were introduced by replacing the leucine residues at positions 508, 738 and 922 by site-directed mutagenesis to obtain experimental phases via single-wavelength anomalous dispersion (SAD) experiments using SeMet-labeled proteins. The TelSi3ΔN protein with seven selenomethionine residues (TelSi3ΔNMet7) was generated by suppressing methionine biosynthesis during overexpression of the TelSi3ΔNMet7 protein. The protein was purified as described above.
Diffraction data were collected at the National Synchrotron Light Source (NSLS) beamline X25. There are six TelSi3ΔNMet7 molecules in the asymmetric unit of the crystal, which belongs to the P3₂21 space group. The crystallographic datasets were processed using HKL2000 (29). With the anomalous signal from SeMet, the experimental phase (figure of merit: 0.273) was calculated using automated structure solution (AutoSol) in PHENIX (30). Density modification yielded a map suitable for manual model building with Coot (31), followed by structure refinement using PHENIX. The final coordinates and structure factors have been deposited in the Protein Data Bank (PDB) under the accession codes listed in Supplementary Table 1.

Expression and isolation of the Synechococcus elongatus RNAP

The core enzyme of cyRNAP was overexpressed in E. coli T7Express cells (New England Biolabs) transformed with a pET28a expression vector containing the α, β, β’1, β’2 and ω encoding genes (β and β’2 contain a Strep-tag and a His-tag, respectively) (32). The cells were grown in LB medium supplemented with kanamycin (50 μg/ml) at 37 °C until the OD600 was ~0.6. Afterward, the cells were induced with IPTG (1 mM) and grown overnight at 22 °C. The biomass was harvested and suspended in lysis buffer (50 mM Tris-HCl [pH 8.0], 250 mM NaCl, 10% glycerol, 20 mM imidazole, 1 mM β-mercaptoethanol and protease inhibitors from Roche used according to the manufacturer’s instructions). The cells were sonicated, the lysate was centrifuged at 18,000 × g, and the supernatant was collected.
The protein was purified at 4 °C sequentially through a HisTrap (5 mL) column and a Strep-Tactin XT (1 mL) column (both from Cytiva). The latter column was washed with 3 column volumes (CVs) of Buffer W (100 mM Tris-HCl pH 8.0, 150 mM NaCl, 1 mM EDTA). The bound protein was eluted by applying 1 CV of Buffer E (100 mM Tris-HCl pH 8.0, 150 mM NaCl, 1 mM EDTA, 2.5 mM desthiobiotin). The purified cyRNAP (20 μM) was assessed using SDS-PAGE, dialyzed against Storage Buffer (40 mM Tris–HCl pH 8.0, 200 mM KCl, 1 mM EDTA, 1 mM DTT, and 5% glycerol), and stored at -80 °C.

Cloning, expression and isolation of the Synechococcus elongatus NusG and Si3 proteins

NusG was overexpressed in E. coli T7Express cells (New England Biolabs) transformed with a pET28a expression vector carrying the gene for C-terminally His6-tagged Synechococcus elongatus NusG. Cells were grown in LB medium supplemented with kanamycin (50 μg/ml) at 37 °C until the OD600 reached ~0.5, then induced with IPTG (1 mM) and grown overnight at 22 °C. Culture pellets were sonicated in 50 ml of Lysis Buffer (10 mM Tris-HCl pH 7.9, 300 mM KCl, protease inhibitors from Roche used according to the manufacturer's instructions), and the lysate was spun at 18,000 rpm and filtered through a 0.22 μm syringe filter. The filtered supernatant was subjected to Ni-NTA affinity chromatography in 10 mM Tris-HCl pH 7.9, 600 mM KCl, 5% glycerol, with 50 mM imidazole washes and 100 mM imidazole elution. The eluted protein (in 600 mM KCl) was diluted (to ~100 mM KCl) and applied to a 5 ml Resource Q column (Cytiva) pre-equilibrated with 10 mM Tris-HCl pH 8.0, 100 mM KCl and 5% glycerol. The column was washed with 5 CV of equilibration buffer, and the protein was eluted by applying a linear salt gradient (100 mM to 1 M KCl) over 10 CV. The purified NusG (90 μM) was checked using SDS-PAGE and stored in Storage Buffer (40 mM Tris-HCl pH 8.0, 200 mM KCl, 1 mM EDTA, 1 mM DTT, 5% glycerol) at -80 °C.
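The dilution and gradient steps in the NusG purification reduce to simple arithmetic. The Python sketch below is illustrative only and is not part of the authors' protocol: it computes the fold dilution needed to bring the Ni-NTA eluate from 600 mM to ~100 mM KCl and the nominal KCl concentration at a given point of the 100 mM to 1 M linear gradient over 10 CV. The function names are hypothetical, and the default diluent is assumed to contain no KCl, which the text does not specify.

```python
def fold_dilution(c_start_mM, c_target_mM, c_diluent_mM=0.0):
    """Return final volume / initial volume needed to reach c_target_mM.

    Assumes ideal mixing with a diluent of known KCl concentration
    (0 mM by default, an assumption not stated in the protocol).
    """
    if not (c_diluent_mM < c_target_mM <= c_start_mM):
        raise ValueError("target must lie between diluent and start concentration")
    return (c_start_mM - c_diluent_mM) / (c_target_mM - c_diluent_mM)


def gradient_concentration(cv_elapsed, c_start_mM=100.0, c_end_mM=1000.0,
                           gradient_cv=10.0):
    """Nominal KCl concentration (mM) after cv_elapsed column volumes
    of a linear gradient; values outside the gradient are clamped."""
    fraction = min(max(cv_elapsed / gradient_cv, 0.0), 1.0)
    return c_start_mM + fraction * (c_end_mM - c_start_mM)


if __name__ == "__main__":
    print("fold dilution 600 -> 100 mM KCl:", fold_dilution(600.0, 100.0))
    for cv in (0, 2.5, 5, 10):
        print(f"{cv} CV into gradient: {gradient_concentration(cv)} mM KCl")
```

The nominal gradient value ignores column dead volume and mixer delay, so the conductivity actually measured at the outlet would lag behind it.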
The cyanobacteria-specific loop of NusG (residues 110-122) was deleted by site-directed mutagenesis, and the mutant NusG was isolated in the same way as the WT protein.

The open reading frame encoding the separate full-size Si3 domain was cloned into the pET28 vector, overexpressed in E. coli T7Express cells (New England Biolabs) as an N-terminally His6-tagged protein, and isolated via Ni-NTA affinity chromatography on a HisTrap column (Cytiva), similarly to the NusG protein. After affinity chromatography, the protein was dialyzed against storage buffer (20 mM Tris-HCl pH 8.0, 200 mM KCl, 1 mM EDTA, 1 mM DTT, 50% glycerol).

Sample preparation for cryo-EM

The cyRNAP EC with NusG was reconstituted in vitro by mixing 5 μM cyRNAP with equimolar amounts of template DNA and RNA (Fig. 2A) in storage buffer at 37 °C for 10 minutes, followed by mixing with 7 μM nontemplate DNA and incubating for a further 10 minutes. The resulting EC was mixed with 7 μM NusG and incubated for 10 min at 37 °C. CHAPSO (8 mM) was added to the sample just before vitrification. The iNTP-bound EC was prepared by adding 1 mM 3’-deoxyATP or CTP to the EC with NusG and incubating for 5 min at 37 °C. The EC and the iNTP-bound EC also differ in the nontemplate DNA used in the scaffold: the scaffold of the latter contains a complementary transcription bubble.
Grid preparation for cryo-EM

C-flat Cu grids (CF-1.2/1.3 400 mesh, Protochips, Morrisville, NC) were glow-discharged for 40 seconds using the PELCO easiGlow™ system prior to application of 3.5 μl of the sample (2.5–3.0 mg/ml protein concentration) and plunge-freezing in liquid ethane using a Vitrobot Mark IV (FEI, Hillsboro, OR) with 100% chamber humidity at 5 °C.

Cryo-EM data acquisition and processing

Data were collected using a Titan Krios (Thermo Fisher) microscope equipped with a Falcon IV direct electron detector (Gatan) at the Penn State Cryo-EM Facility. Sample grids were imaged at 300 kV, with an intended defocus range of -2.5 to -0.75 μm and a magnification of 75,000X in electron counting mode (0.87 Å per pixel). Movies were collected with a total dose of 45 electrons/Å2. Downstream processing was performed with CryoSPARC (33). The movies were corrected and aligned using patch motion correction followed by patch CTF correction. Particles were picked using a template-based autopicker and subjected to multiple rounds of 2D classification to discard bad particles. The 2D classes with EC-NusG particles were selected and used for training the Topaz model (34). The Topaz-extracted particles were subjected to multiple rounds of heterogeneous refinement to remove junk particles. Finally, a nonuniform refinement operation was run on the final set of particles to yield the reconstruction (SFigs. 4 and 6).
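Two quantities follow directly from the acquisition parameters quoted above (0.87 Å per pixel, 45 electrons/Å2 total dose): the Nyquist limit of the sampling and the number of electrons accumulated per pixel. The short Python sketch below works through that arithmetic; it is an illustration added for orientation, the function names are hypothetical, and it says nothing about the resolution actually achieved in the reconstructions.

```python
def nyquist_limit_angstrom(pixel_size_angstrom):
    """Finest spacing that can be sampled: two pixels per period."""
    return 2.0 * pixel_size_angstrom


def electrons_per_pixel(total_dose_e_per_A2, pixel_size_angstrom):
    """Electrons accumulated in one pixel over the whole movie,
    given the total dose per square angstrom and the pixel edge length."""
    return total_dose_e_per_A2 * pixel_size_angstrom ** 2


if __name__ == "__main__":
    pixel_size = 0.87   # angstrom per pixel, from the text
    total_dose = 45.0   # electrons per square angstrom, from the text
    print(f"Nyquist limit: {nyquist_limit_angstrom(pixel_size):.2f} A")
    print(f"Electrons per pixel: {electrons_per_pixel(total_dose, pixel_size):.2f}")
```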
Structure refinement and model building

A model of the cyRNAP core enzyme was constructed by homology modeling using core RNAP from the cryo-EM structure of the Syn6803 RNAP-σA promoter DNA open complex as a reference model. A model of cyNusG was constructed with AlphaFold2 (35). DNA and RNA models were constructed using the E. coli RNAP elongation complex (PDB: 7MKO) as a guide. The cyRNAP model was manually fitted into the cryo-EM density map using Chimera (36), followed by rigid body and real-space refinement using Coot (31) and Phenix (37).

In vitro transcription in the assembled elongation complexes

ECs were assembled and immobilized as described (38). Sequences of the oligonucleotides used for the assembly of ECs are shown in Fig. 4C. For assembly of the ECs used in the experiments in Fig. 4C, the 13 nt RNA was radiolabelled at the 5’ end with [γ-32P] ATP and T4 polynucleotide kinase (New England Biolabs) prior to complex assembly. Stalled elongation complexes EC14, EC15 and EC16 were obtained by extension of the initial RNA13 in EC13 with 10 μM NTP sets according to the sequence for 5 min and were then washed with TB to remove Mg2+ and NTPs. Reactions were initiated by the addition of 10 mM MgCl2 with or without either 1 μM NTPs or 250 μM PPi. Single nucleotide addition and pyrophosphorolysis experiments were performed at 30 °C in transcription buffer (TB) containing 20 mM Tris–HCl pH 6.8, 40 mM KCl and 10 mM MgCl2; transcript hydrolysis was performed in the same buffer except at pH 7.9. After incubation for the time intervals specified in the figures, reactions were stopped with formamide-containing buffer. Products were resolved by denaturing 23% polyacrylamide gel electrophoresis (PAGE) (8 M urea), revealed by PhosphorImaging (Cytiva) and visualized using ImageQuant (Cytiva) software. Kinetics data were fitted to a single-exponential equation, y = y0 + a·e^(−bx), by non-linear regression using SigmaPlot software to determine the rate constants of the reactions.
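For readers without SigmaPlot, a single-exponential model of the form y = y0 + a·exp(−b·x) can be fitted with a few lines of dependency-free Python. The sketch below is not the authors' procedure: it swaps in a variable-projection approach, in which y0 and a are solved by linear least squares for each trial rate b and b itself is found by golden-section search on a log scale. The function names, the search bounds (1e-4 to 1e2 in whatever time units x uses) and the synthetic data are all assumptions made for illustration, and the search presumes that the residual has a single minimum within the bounds.

```python
import math


def _solve_linear_part(xs, ys, b):
    """For a fixed rate b, find y0 and a minimizing the squared error
    of y = y0 + a*exp(-b*x); returns (y0, a, sum_of_squared_errors)."""
    us = [math.exp(-b * x) for x in xs]
    n = len(xs)
    mean_u = sum(us) / n
    mean_y = sum(ys) / n
    s_uu = sum((u - mean_u) ** 2 for u in us)
    s_uy = sum((u - mean_u) * (y - mean_y) for u, y in zip(us, ys))
    a = s_uy / s_uu if s_uu > 0.0 else 0.0  # degenerate case: flat regressor
    y0 = mean_y - a * mean_u
    sse = sum((y0 + a * u - y) ** 2 for u, y in zip(us, ys))
    return y0, a, sse


def fit_single_exponential(xs, ys, b_min=1e-4, b_max=1e2, iterations=200):
    """Fit y = y0 + a*exp(-b*x); returns (y0, a, b).

    Golden-section search over log(b) between the assumed bounds.
    """
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = math.log(b_min), math.log(b_max)
    c = hi - ratio * (hi - lo)
    d = lo + ratio * (hi - lo)
    f_c = _solve_linear_part(xs, ys, math.exp(c))[2]
    f_d = _solve_linear_part(xs, ys, math.exp(d))[2]
    for _ in range(iterations):
        if f_c < f_d:
            # Minimum lies in [lo, d]; reuse c as the new upper probe.
            hi, d, f_d = d, c, f_c
            c = hi - ratio * (hi - lo)
            f_c = _solve_linear_part(xs, ys, math.exp(c))[2]
        else:
            # Minimum lies in [c, hi]; reuse d as the new lower probe.
            lo, c, f_c = c, d, f_d
            d = lo + ratio * (hi - lo)
            f_d = _solve_linear_part(xs, ys, math.exp(d))[2]
    b = math.exp((lo + hi) / 2.0)
    y0, a, _ = _solve_linear_part(xs, ys, b)
    return y0, a, b


if __name__ == "__main__":
    # Synthetic, noise-free time course (hypothetical values, not paper data).
    times = [0.0, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
    fractions = [0.9 - 0.8 * math.exp(-0.5 * t) for t in times]
    y0, a, b = fit_single_exponential(times, fractions)
    print(f"y0={y0:.4f}  a={a:.4f}  b={b:.4f}")
```

With real gel quantifications, replicate fits would be run separately so that a mean and standard deviation of b can be reported, as is done for the three independent experiments summarized in Fig. 4C.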
Denaturation and renaturation of cyRNAP

Denaturation of cyRNAP was performed by incubating the purified protein for 20 min at 30 °C in denaturing buffer containing 20 mM Tris-HCl (pH 7.9), 6 M guanidine-HCl, 5% glycerol, 1 mM EDTA, and 10 mM DTT, in a 100 µl volume and at a cyRNAP concentration of 0.5 mg/ml. Recombinant Si3 was included in 2.5-fold molar excess. The proteins were renatured via overnight dialysis at 7 °C against renaturing buffer containing 20 mM Tris-HCl (pH 7.9), 200 mM KCl, 10% glycerol, 2 mM MgCl2, 10 µM ZnCl2, 1 mM EDTA, and 1 mM DTT. Aliquots of the renaturation mixture and their serial dilutions were used for nucleotide addition experiments on assembled constructs containing template DNA and RNA oligonucleotides. A 13 nt RNA oligonucleotide was radiolabeled at the 5’ end with [γ-32P] ATP and T4 polynucleotide kinase (New England Biolabs) prior to EC assembly. The amount of assembly mixture indicated in Fig. 5C was incubated with the RNA-DNA duplex for 5 min at room temperature, and then 10 µM GTP was added for 10 minutes at 30 °C. Reactions were stopped and products analyzed as before.
Complex formation between the Si3 protein and core cyRNAP

For the binding experiment, 150 nM core enzymes and 1.5 µM Si3 proteins were incubated for 10 minutes at 4 °C in 20 mM Tris-HCl pH 7.9 and 40 mM KCl, mixed with loading dye (final concentrations: 50 mM BisTris pH 7.2, 50 mM NaCl, 10% glycerol, 0.001% Ponceau S) and resolved on a NativePAGE 3-12% Bis-Tris gel (Invitrogen) for 90 minutes at 150 V using running buffers prepared according to the manufacturer's instructions. The gel was fixed with a 50% methanol, 10% acetic acid solution and additionally destained by boiling in 8% acetic acid.

Salt stability of elongation complexes

Elongation complexes were assembled using the oligonucleotides shown in SFig. 5B. The 14 nt RNA in the ECs was radiolabelled at the 3’ end by incorporation of [α-32P] GTP into the original 13 nt RNA. To examine their stability, ECs bound to streptavidin sepharose beads (Cytiva) via the Strep-tag on the β subunit of cyRNAP were incubated in TB containing 300 mM KCl at 30 °C for the times specified in SFig. 5B. WT NusG or mutant NusG Δ110-122 was added where specified at a final concentration of 1 μM. Supernatant and total fractions were collected for analysis. Reactions were stopped and products analyzed as before.

Figure legends

Fig. 1. X-ray crystal structure of TelSi3ΔN. (A) The thick bars represent the primary sequences of the largest subunits of the bacterial, chloroplast and archaeal RNAPs. Domains (Si3, green boxes) and structural motifs (RH, rim helix; BH, bridge helix; TL, trigger loop) are labeled.
The lettered boxes represent evolutionarily conserved regions. The split ends of the two polypeptides are indicated by black triangles. (B) Crystals of TelSi3ΔN. (C) Structure of TelSi3ΔN. Six molecules of TelSi3ΔN (I~VI) are present in the asymmetric unit. Molecules are depicted as cartoon models with transparent surfaces, and each molecule is denoted by a unique color and labeled. (D) The backbone is colored as a ramp from the N-terminus (blue) to the C-terminus (red) through cyan, green, yellow and orange. SBHMs are labeled 1 to 8, and subdomains (tail, fin, body and head) are indicated. The TelSi3ΔN structure lacks SBHM-1, and the trigger loops (TLN and TLC) are depicted as a blue oval and pink cylinders, respectively, with black lines showing their connections with TelSi3ΔN. (E) Molecules 1 and 3 of TelSi3ΔN are superimposed via their fin subdomains, revealing flexibility in the orientation between the fin and body/head subdomains.

Fig. 2. Cryo-EM image of the cyRNAP elongation complex with NusG. (A) The sequence of the DNA/RNA scaffold used for the EC-NusG assembly (template DNA, green; nontemplate DNA, yellow; RNA, red). DNA and RNA regions lacking cryo-EM density are underlined. (B) Orthogonal views of the cryo-EM density map. Subunits and domains of cyRNAP, DNA, RNA and NusG are colored and labeled (RH, rim helix; prot, protrusion; downDNA, downstream DNA; upDNA, upstream DNA). The split ends of the β’1 and β’2 subunits are indicated by white circles. The SBHMs in Si3 are labeled 1 to 8. (C) Cryo-EM densities of DNA, RNA and NusG are shown with a transparent RNAP density map (ntDNA, nontemplate DNA; ssRNA, single-stranded RNA). The 5’ and 3’ ends of the RNA are indicated. The cryo-EM density map is colored according to B. (D) Efficient storage of an elongated and large Si3 molecule on the surface of cyRNAP.
The structure of EC-NusG is shown as a transparent surface, and the Si3, DNA/RNA and trigger loop (TLN, TLC) regions are shown as cartoon models. SBHMs are labeled 1 to 8, and subdomains (tail, fin, body and head) are indicated. The active site of RNAP is designated by the catalytic Mg2+ (magenta sphere).

Fig. 3. Comparison of the structures of cyRNAP, E. coli RNAP and eukaryotic RNAPII. The structures of cyRNAP (A), E. coli RNAP with GreB (PDB: 6RIN, B) and yeast RNAPII (PDB: 7ML0, C) are shown as transparent surfaces with domains, subunits and a factor described in the main text.

Fig. 4. Si3 movement during the trigger helix folding. (A) Cryo-EM maps of the iNTP-bound (gray) and iNTP-free (light blue) states of EC-NusG (RH, rim helix; prot, protrusion; upDNA, upstream DNA). Arrows indicate movement of Si3 upon trigger helix folding. (B) Conformational change in Si3 during the transition from the trigger loop (TL) to the trigger helix (TH) upon iNTP (blue stick model) binding. The red and black arrows indicate movements of the TL/TH-Si3 linker and Si3, respectively. A pivot point for converting the movement of the linker into the swing motion of Si3 is shown as a blue transparent circle. (C) Si3 does not influence catalysis by cyRNAP. Scheme and sequence of the assembled elongation complex used for experiments with WT and ΔSi3 RNAPs. The table summarizes the reaction rate constants of single nucleotide addition (kNTP), pyrophosphorolysis (kPPi) and transcript hydrolysis (kOH-) in EC14, EC15 and EC16 by WT and ΔSi3 RNAPs.
The values that follow the ± sign are standard deviations derived from three independent experiments. The shade of green in the cells reflects the value of the constant, i.e., the darkest shade corresponds to the highest rate. The right column shows the predominant translocation states of the elongation complexes, as deduced from the relative rates of reaction. Scheme of RNAP oscillation in translocation equilibrium and the architecture of the nucleic acid scaffold of the elongation complex in post-translocation, pre-translocation and backtracked states, as adapted from (21). The template DNA, the nontemplate DNA and the RNA are green, yellow and pink, respectively. Catalytic Mg2+ ions and the i+1 site of the RNAP active center are shown by a red circle and a blue rectangle, respectively.

Fig. 5. Si3 functions. (A) The cryo-EM structures of cyRNAP in the EC (left) and the promoter complex (right, PDB: 8GZG). The contact between Si3-head and σA in the promoter complex is indicated by a black circle. (B) Si3 is not required for cyRNAP assembly or maturation. WT and ΔSi3 cyRNAPs were denatured and subsequently renatured, after which their activity was tested on the construct mimicking the DNA template–RNA transcript duplex structure by their ability to incorporate the next nucleotide, G, dictated by the template. Twofold serial dilutions of the assembly mixture with the indicated initial amounts of core enzymes were tested. The vertical lines indicate the positions where the parts of the same gel were combined.
(C) The recombinant Si3 protein can bind ΔSi3 cyRNAP but not WT cyRNAP. The complex formation between the indicated proteins was analyzed by blue native polyacrylamide gel electrophoresis. The vertical line indicates the position where two parts of the same gel were combined.

Data, Materials, and Software Availability. The X-ray crystallographic density map and the refined model have been deposited in the Protein Data Bank (www.rcsb.org) under accession number 8EMB. The cryo-EM density maps and the refined models have been deposited in the Electron Microscopy Data Bank (www.ebi.ac.uk/emdb/) under accession numbers EMD-40874 (iNTP-free EC-NusG) and EMD-42502 (iNTP-bound EC-NusG) and in the Protein Data Bank (www.rcsb.org) under accession numbers 8SYI (iNTP-free EC-NusG) and 8URW (iNTP-bound EC-NusG). All study data are included in the article and/or SI Appendix.

ACKNOWLEDGMENTS

We thank Jean-Paul Armache at Penn State for technical support. We thank the National Synchrotron Light Source (NSLS) at Brookhaven National Laboratory for X-ray data collection. We would like to acknowledge the Penn State Huck Life Science Institutes Cryo-EM Core Facility for use of the Talos Arctica G2 TEM and the Vitrobot Mark IV, and Sung Hyun Cho for data collection. We thank Yu Zhang at the Shanghai Institute of Plant Physiology and Ecology for kindly sharing the coordinates of Synechocystis sp. PCC 6803 RNAP. This work was supported by a National Institutes of Health grant (R35 GM131860 to K. S. M.) and a Biotechnology and Biological Sciences Research Council grant (BB/W017385/1 to Y.Y.).

References

1. T. Borner, A. Y. Aleynikova, Y. O. Zubo, V. V. Kusnetsov, Chloroplast RNA polymerases: Role in chloroplast biogenesis. Biochim Biophys Acta 1847, 761-769 (2015).
2. T. Pfannschmidt et al., Plastid RNA polymerases: orchestration of enzymes with different evolutionary origins controls chloroplast biogenesis during the plant life cycle. J Exp Bot 66, 6957-6973 (2015).
3. G. J. Schneider, N. E. Tumer, C. Richaud, G. Borbely, R. Haselkorn, Purification and characterization of RNA polymerase from the cyanobacterium Anabaena 7120. J Biol Chem 262, 14633-14639 (1987).
4. W. Q. Xie, K. Jager, M. Potts, Cyanobacterial RNA polymerase genes rpoC1 and rpoC2 correspond to rpoC of Escherichia coli. J Bacteriol 171, 1967-1973 (1989).
5. G. J. Schneider, R. Haselkorn, RNA polymerase subunit homology among cyanobacteria, other eubacteria and archaebacteria. J Bacteriol 170, 4136-4140 (1988).
6. W. J. Lane, S. A. Darst, Molecular evolution of multisubunit RNA polymerases: sequence analysis. J Mol Biol 395, 671-685 (2010).
7. A. Riaz-Bradley, K. James, Y. Yuzenkova, High intrinsic hydrolytic activity of cyanobacterial RNA polymerase compensates for the absence of transcription proofreading factors. Nucleic Acids Res 48, 1341-1352 (2020).
8. M. Z. Qayyum, Y. Shin, K. S. Murakami, Encyclopedia of Biological Chemistry III. 10.1016/b978-0-12-819460-7.00252-8, 358-364 (2021).
9. A. C. Cheung, P. Cramer, A movie of RNA polymerase II transcription. Cell 149, 1431-1437 (2012).
10. M. Chlenov et al., Structure and function of lineage-specific sequence insertions in the bacterial RNA polymerase beta' subunit. J Mol Biol 353, 138-154 (2005).
11. J. Y. Kang et al., RNA Polymerase Accommodates a Pause RNA Hairpin by Global Conformational Rearrangements that Prolong Pausing. Mol Cell 69, 802-815.e5 (2018).
12. I. Artsimovitch, V. Svetlov, K. S. Murakami, R. Landick, Co-overexpression of Escherichia coli RNA polymerase subunits allows isolation and analysis of mutant enzymes lacking lineage-specific sequence insertions. J Biol Chem 278, 12344-12355 (2003).
13. M. Abdelkareem et al., Structural Basis of Transcription: RNA Polymerase Backtracking and Its Reactivation. Mol Cell 75, 298-309.e4 (2019).
14. Y. Shin et al., Structural basis of ribosomal RNA transcription regulation. Nat Commun 12, 528 (2021).
15. L. Shen et al., An SI3-sigma arch stabilizes cyanobacteria transcription initiation complex. Proc Natl Acad Sci U S A 120, e2219290120 (2023).
16. R. A. Mooney et al., Regulator trafficking on bacterial transcription units in vivo. Mol Cell 33, 97-108 (2009).
17. A. V. Yakhnin, M. Kashlev, P. Babitzke, NusG-dependent RNA polymerase pausing is a frequent function of this universally conserved transcription elongation factor. Crit Rev Biochem Mol Biol 55, 716-728 (2020).
18. J. Chen, A. J. Noble, J. Y. Kang, S. A. Darst, Eliminating effects of particle adsorption to the air/water interface in single-particle cryo-electron microscopy: Bacterial RNA polymerase and CHAPSO. J Struct Biol X 1 (2019).
19. M. Z. Qayyum, V. Molodtsov, A. Renda, K. S. Murakami, Structural basis of RNA polymerase recycling by the Swi2/Snf2 family of ATPase RapA in Escherichia coli. J Biol Chem 297, 101404 (2021).
20. R. K. Vishwakarma, M. Z. Qayyum, P. Babitzke, K. S. Murakami, Allosteric mechanism of transcription inhibition by NusG-dependent pausing of RNA polymerase. Proc Natl Acad Sci U S A 120, e2218516120 (2023).
21. A. Bochkareva, Y. Yuzenkova, V. R. Tadigotla, N. Zenkin, Factor-independent transcription pausing caused by recognition of the RNA-DNA hybrid sequence. EMBO J 31, 630-639 (2012).
22. R. Fukuda, A. Ishihama, Subunits of RNA polymerase in function and structure; Maturation in vitro of core enzyme from Escherichia coli. J Mol Biol 87, 523-540 (1974).
23. G. Dong, S. S. Golden, How a cyanobacterium tells time. Curr Opin Microbiol 11, 541-546 (2008).
24. R. C. Conaway, S. E. Kong, J. W. Conaway, TFIIS and GreB: two like-minded transcription elongation factors with sticky fingers. Cell 114, 272-274 (2003).
25. J. Hurwitz, L. Yarbrough, S. Wickner, Utilization of deoxynucleoside triphosphates by DNA-dependent RNA polymerase of E. coli. Biochem Biophys Res Commun 48, 628-635 (1972).
26. S. K. Niyogi, R. P. Feldman, Effect of several metal ions on misincorporation during transcription. Nucleic Acids Res 9, 2615-2627 (1981).
27. M. Imashimizu, K. Tanaka, N. Shimamoto, Comparative Study of Cyanobacterial and E. coli RNA Polymerases: Misincorporation, Abortive Transcription, and Dependence on Divalent Cations. Genet Res Int 2011, 572689 (2011).
28. N. K. Nesser, D. O. Peterson, D. K. Hawley, RNA polymerase II subunit Rpb9 is important for transcriptional fidelity in vivo. Proc Natl Acad Sci U S A 103, 3268-3273 (2006).
29. Z. Otwinowski, W. Minor, Processing of X-ray diffraction data collected in oscillation mode. Methods Enzymol 276, 307-326 (1997).
30. D. Liebschner et al., Macromolecular structure determination using X-rays, neutrons and electrons: recent developments in Phenix. Acta Crystallogr D Struct Biol 75, 861-877 (2019).
31. P. Emsley, B. Lohkamp, W. G. Scott, K. Cowtan, Features and development of Coot. Acta Crystallogr D Biol Crystallogr 66, 486-501 (2010).
32. L. Shen et al., A SI3-σ arch stabilizes cyanobacteria transcription initiation complex. bioRxiv 10.1101/2022.10.06.511230 (2022).
33. A. Punjani, J. L. Rubinstein, D. J. Fleet, M. A. Brubaker, cryoSPARC: algorithms for rapid unsupervised cryo-EM structure determination. Nat Methods 14, 290-296 (2017).
34. T. Bepler et al., Positive-unlabeled convolutional neural networks for particle picking in cryo-electron micrographs. Nat Methods 16, 1153-1160 (2019).
35. J. Jumper et al., Applying and improving AlphaFold at CASP14. Proteins 89, 1711-1721 (2021).
36. E. F. Pettersen et al., UCSF Chimera--a visualization system for exploratory research and analysis. J Comput Chem 25, 1605-1612 (2004).
37. P. V. Afonine et al., Real-space refinement in PHENIX for cryo-EM and crystallography. Acta Crystallogr D Struct Biol 74, 531-544 (2018).
38. Y. Yuzenkova et al., Stepwise mechanism for transcription fidelity. BMC Biol 8, 54 (2010).
[Fig. 1 graphic, panels A–E. Recoverable labels: subunit schemes for E. coli, cyanobacteria (T. elongatus, S. elongatus), chloroplast (A. thaliana) and archaeal RNAP with conserved regions A–H, BH, RH, TL, TLN, TLC, jaw and Mg2+; crystal size 0.5 × 0.2 × 0.2 mm; SBHMs 1–8; tail, fin, body and head; Mg (active site); 24 Å, 11°.]

[Fig. 2 graphic, panels A–D. Recoverable labels: Si3, β, β’1, β’2, α, ω, NusG, RH, prot, lobe, flap, clamp, jaw, RNA ch, 2nd ch, upDNA, downDNA, ntDNA, ssRNA, TLN, TLC, Mg, SBHMs 1–8, split ends. Scaffold sequence fragments from panel A: CCTCTCCATG; 5'-GGGCGCATGCTGCTCTA ACGGCGACTGCCC-3’; 3'-CCCGCGTACGACGAGATCCTCTCCATGTGCCGCTGACGGG-5’; GGAGAGGUA; 5'-GCAUUCAAAGC.]

[Fig. 3 graphic, panels A–C: cyRNAP, E. coli RNAP-GreB and yeast RNAPII. Recoverable labels: Si3-tail, Si3-fin, Si3-body/head, GreB, Rpb9, lobe, BH, RH, Mg, 2nd ch, acidic residues.]

[Fig. 4 graphic, panels A–C. Recoverable labels: prot, lobe, RH, BH, upDNA, downDNA, tDNA, ntDNA, RNA, NusG, Si3 (tail, fin, body, head), TLN, TLC, THN, THC, iNTP, MgA, MgB, 2nd ch, pivot point; 50 Å, 24°.]

[Fig. 5 graphic, panels A–C. Recoverable labels: cyRNAP EC with NusG; cyRNAP-σA holoenzyme promoter DNA complex; Si3 (tail, fin, body, head), β, α, β’1, β’2, σA, NusG, upDNA, pDNA, -35, -10.]
