Keywords
cyanobacteria, RNA polymerase, transcription, cryo-EM
This PDF file includes:
Main Text
Figures 1 to 5
Supplemental Figures 1-7, Tables 1-2, Supplemental Movie Legends 1-2
bioRxiv preprint doi: https://doi.org/10.1101/2024.01.11.575193; this version posted January 11, 2024. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Abstract
Cyanobacteria and the evolutionarily related chloroplasts of algae and plants possess unique RNA polymerases (RNAPs) with characteristics that distinguish them from canonical bacterial RNAPs. The largest subunit of cyanobacterial RNAP (cyRNAP) is divided into two polypeptides, β’1 and β’2, and contains the largest known lineage-specific insertion domain, Si3, which is located in the middle of the trigger loop and spans approximately half of the β’2 subunit. In this study, we present the X-ray crystal structure of Si3 and the cryo-EM structures of the cyRNAP transcription elongation complex plus the NusG factor with and without incoming nucleoside triphosphate (iNTP) bound at the active site. Si3 has a well-ordered and elongated shape that exceeds the length of the main body of cyRNAP, fits into cavities of cyRNAP and shields the binding site of secondary channel-binding proteins such as Gre and DksA. A small transition from the trigger loop to the trigger helix upon iNTP binding at the active site results in a large swing motion of Si3; however, this transition does not affect the catalytic activity of cyRNAP because Si3 makes only minimal contact with cyRNAP, NusG or DNA. This study provides a structural framework for understanding the evolutionary significance of these features unique to cyRNAP and chloroplast RNAP and may provide insights into the molecular mechanism of transcription in the specific environment of photosynthetic organisms.

Significance statement:
Cellular RNA polymerase (RNAP) carries out RNA synthesis and proofreading reactions utilizing a mobile catalytic domain known as the trigger loop/helix. In cyanobacteria, this essential domain acquired a large Si3 insertion during the course of evolution. Despite its elongated shape and large swinging motion associated with the transition between the trigger loop and helix, Si3 is effectively accommodated within cyRNAP, with no impact on the fundamental functions of the trigger loop. Understanding the significance of Si3 in cyanobacteria and chloroplasts is expected to reveal unique transcription mechanisms in photosynthetic organisms.
Introduction
Cyanobacteria and chloroplasts of algae and higher plants are characterized by oxygen-evolving photosynthesis and are phylogenetically closely related. Their genomes are transcribed by bacterial-type RNA polymerases (cyRNAP and plastid-encoded RNAP, PEP, respectively) aided by transcription initiation σ factors for recognition of specific promoters (1-3). Although cyRNAPs and chloroplast PEPs retain the fundamental functions of bacterial RNAPs, they possess several characteristics that distinguish them from canonical bacterial RNAPs.
First, the largest subunit of cyRNAP is separated into two polypeptides, β’1 and β’2, which are encoded by the rpoC1 and rpoC2 genes, respectively (Fig. 1A). In Synechococcus elongatus, which is the cyanobacterium used for the cryo-EM structural study of RNAP described herein, the 624-residue β’1 and 1,318-residue β’2 subunits correspond to the amino (N)-terminal one-third and the carboxy (C)-terminal two-thirds of the 1,407-residue β’ subunit in Escherichia coli, respectively. The junction between the β’1 and β’2 subunits is positioned before the conserved region E (4, 5). The β’1 subunit contains the clamp and the catalytic double-psi-β-barrel domain coordinating a Mg2+ ion; the β’2 subunit contains the rim helix, bridge helix, trigger loop and jaw domain.
Second, cyRNAP contains the largest known lineage-specific insertion domain, Si3 (645 residues), which spans approximately half of the β’2 subunit and is located in the middle of the trigger loop (Fig. 1A) (6, 7). The trigger loop plays a central role in nucleotide selection, RNA synthesis and RNA cleavage during proofreading by cellular RNAPs (8). In the absence of nucleoside triphosphate (NTP) substrate, the tip of the trigger loop is located away from the active site (9). Upon binding of a complementary incoming NTP (iNTP) at the active site, the trigger loop folds to form a trigger helix containing two α-helices, which interacts extensively with the base and triphosphate groups of the iNTP and facilitates the nucleotidyl transfer reaction (8). The Si3 insertion is found in RNAPs of gram-negative bacteria in the middle of the trigger loop (evolutionarily conserved region G; Fig. 1A). Si3 is composed of repeats of the conserved sandwich-barrel hybrid motif (SBHM). Escherichia coli (E. coli) RNAP contains two copies of SBHM (SFig. 1A) (10), and sequence analysis indicates that up to seven copies of SBHM are present in the Si3 insertion of cyRNAP (6). The structure and function of the Si3 insertion in E. coli RNAP have been well characterized; it is involved in stabilizing the open complex and in RNA hairpin-dependent (his) and -independent (ops) transcription pausing (11, 12) and is highly mobile, with its conformation being dependent on the folded/unfolded state of the trigger loop/helix and binding of transcription factors (Gre, DksA) at the secondary channel of RNAP (13, 14). In addition, structural and functional analyses of Si3 in cyRNAP have recently been initiated. According to the cryo-electron microscopy (cryo-EM) structure of the cyRNAP promoter complex (15), Si3 forms an “arch” with region 2 of the σ factor, the element involved in opening the DNA duplex at the -10 position of the promoter. This arch stabilizes the promoter complex, and its removal affects the fitness and stress resistance of cyanobacteria. Notably, the Si3-σ contact remains intact upon trigger loop refolding into the trigger helix after iNTP addition to the initiation complex with the short RNA transcript. After transition to the elongation phase, it is unknown whether Si3 becomes mobile in the presence of transcription elongation factors such as NusG and how Si3 affects refolding of the trigger helix and the catalytic activity of RNAP.
In this work, we structurally and biochemically analyzed the cyRNAP elongation complex (EC) to understand the functional importance of Si3 in the elongation phase of transcription. We solved the X-ray crystal structure of Si3 and cryo-EM structures of the cyRNAP EC with NusG in the presence and absence of iNTP bound at the active site.
Results
X-ray crystal structure of Thermosynechococcus elongatus BP-1 Si3 (TelSi3)
We investigated the structure of the isolated Si3 protein of the thermophilic cyanobacterium Thermosynechococcus elongatus BP-1 (TelSi3) by X-ray crystallography. The DNA sequence encoding Si3 (residues 345-983) was cloned and inserted into a vector for expression in E. coli cells, and the resulting protein was purified to homogeneity. Initial attempts to crystallize TelSi3 were unsuccessful. Limited trypsinolysis revealed that the amino-terminal (N-terminal) 91 residues of TelSi3 are sensitive to proteolysis (SFig. 2A), indicating flexibility, which potentially hindered crystallization. We then cloned and expressed a TelSi3 variant lacking the N-terminal 91 residues (TelSi3ΔN, residues 435 to 983), which formed large crystals (Fig. 1B) belonging to the P3(2)21 space group (six TelSi3ΔN copies per asymmetric unit; Fig. 1C). We were unable to generate a TelSi3ΔN model suitable for molecular replacement based on the protein sequence (e.g., by SWISS-MODEL; SFig. 2C). Therefore, experimental phases were obtained by the single-wavelength anomalous dispersion (SAD) method using selenomethionine (SeMet)-labeled TelSi3ΔN protein (SFig. 2B). The 3.2 Å resolution experimental density map allowed us to build four full-length and two partial models of TelSi3ΔN in the asymmetric unit (STable 1). The AlphaFold (20) structural prediction for TelSi3ΔN was in close agreement with the X-ray structure, with an RMSD of 1.08 Å (SFig. 2C).
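The RMSD quoted above summarizes how far paired atoms in two models lie from each other after superposition. As a minimal illustration (not the authors' procedure, which the text does not describe), the Python sketch below computes the RMSD for two lists of paired coordinates that are assumed to be already superimposed; the function name `rmsd` and the toy coordinates are hypothetical.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(model_a: Sequence[Coord], model_b: Sequence[Coord]) -> float:
    """Root-mean-square deviation between paired atoms, in the input units.

    Assumes both models are already superimposed and list atoms in the same order.
    """
    if len(model_a) != len(model_b) or not model_a:
        raise ValueError("models must be non-empty and contain the same number of atoms")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(model_a, model_b)
    )
    return math.sqrt(squared / len(model_a))


# Toy coordinates (hypothetical, not taken from the TelSi3ΔN models):
# every atom is shifted by exactly 1 Å along y.
predicted = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
observed = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]
print(f"RMSD = {rmsd(predicted, observed):.2f} Å")
```

In practice the two models would first be optimally superimposed (for example with a least-squares fit over Cα atoms) before this sum is evaluated.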
TelSi3ΔN (150 Å in length and 50 Å in width) is longer than the canonical bacterial RNAP (e.g., 110 × 130 Å: E. coli RNAP) (SFig. 1B). TelSi3ΔN comprises seven SBHMs (SBHM-2 to SBHM-8). The X-ray crystal structure of the N-terminal region (81 residues) of Si3 from S. elongatus PCC 7942 (15) showed an independently folded SBHM (SBHM-1). This region corresponds to the 91 N-terminal residues of TelSi3 (missing in the crystallized TelSi3ΔN), indicating that cyRNAP contains eight copies of SBHM within Si3 (Fig. 1C, SFig. 3).
TelSi3 has a swordfish-shaped profile, with distinct “tail”, “fin”, “body” and “head” subdomains formed by SBHM-1, SBHM-2/8, SBHM-3/4/5 and SBHM-6/7, respectively (Fig. 1D). Notably, the SBHMs in TelSi3 are not structured in a simple tandem arrangement (Fig. 2D and SFig. 3), in contrast to E. coli Si3, which contains two independently folded SBHMs connected by a short linker (SFig. 1A) (10). Although each SBHM has a core antiparallel β-sheet topology, connections between the β-sheets vary as the polypeptide chain folds over itself (Fig. 1D). In addition, the sequences of SBHM-1, -6, -7 and -8 are continuous; the others (SBHM-2, -3, -4 and -5) contain structural elements from distant regions of the polypeptide sequence. We assessed conformational flexibility by comparing the four full-length TelSi3ΔN structures from the asymmetric unit using the Si3-fin as a reference for superimposition. This showed substantial conformational variation in the Si3-head, allowing for a 24 Å displacement associated with an 11° rotation (Fig. 1E).
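The relation between the reported rotation and displacement can be checked with simple geometry. If the Si3-head is treated as a rigid body rotating about a fixed axis (an assumption made here for illustration, not a claim derived from the structures), a point at distance r from the axis that rotates by an angle θ is displaced along a chord of length d = 2r·sin(θ/2). The sketch below inverts this relation; the function name `lever_arm` is hypothetical.

```python
import math


def lever_arm(displacement: float, rotation_deg: float) -> float:
    """Distance from the rotation axis implied by a chord displacement and an angle.

    Uses d = 2 * r * sin(theta / 2) for a rigid rotation about a fixed axis.
    """
    if rotation_deg <= 0:
        raise ValueError("rotation must be positive")
    half_angle = math.radians(rotation_deg) / 2.0
    return displacement / (2.0 * math.sin(half_angle))


# Values reported in the text: a 24 Å displacement with an 11° rotation.
radius = lever_arm(24.0, 11.0)
print(f"implied lever arm: {radius:.0f} Å")
```

Under this simplification the implied lever arm is roughly 125 Å, which is of the same order as the 150 Å length of TelSi3ΔN.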
Cryo-EM structure of the Synechococcus elongatus RNAP elongation complex with NusG
To investigate the structure of cyRNAP and the dynamics of Si3 at the elongation stage, we determined the cryo-EM single-particle reconstruction of the cyRNAP EC (SFig. 4, STable 2). NusG was included in the EC because most ECs contain NusG under physiological transcription conditions (16, 17) and because this allowed us to examine physical contact between Si3 and NusG.
We used recombinant double affinity-tagged S. elongatus cyRNAP to avoid isolation of any chimeric cyRNAP containing E. coli RNAP subunits. The EC was assembled by mixing cyRNAP, NusG and the DNA/RNA scaffold (Fig. 2A). The preferred particle orientation issue of EC-NusG was resolved by adding CHAPSO (final concentration of 0.8 mM) to the sample before application to the cryo-EM grid (18). The cryo-EM structure was determined at an overall resolution of 3 Å, revealing well-defined cryo-EM densities for cyRNAP, the N-terminal domain of NusG (residues 19-138) and the DNA/RNA hybrid (Fig. 2B). The densities of the single-stranded nontemplate DNA in the transcription bubble and the single-stranded RNA within the RNA exit channel were traceable due to their respective interactions with NusG and the RNA exit channel (Fig. 2C). The carboxyl-terminal (C-terminal) domains of the α subunits and the Kyrpides-Ouzounis-Woese (KOW) domain of NusG were disordered.
By contacting both the upstream and downstream DNA duplexes, NusG appears to maintain a 90° bend in the DNA centered at the RNAP active site (SFig. 5A), which may stabilize the hold of cyRNAP on the DNA/RNA scaffold. To evaluate the role of NusG, we immobilized reconstituted ECs on agarose beads and challenged the complexes with 300 mM NaCl in the presence or absence of NusG. The proportion of RNA released from the complex was significantly lower in the presence of NusG than in its absence (SFig. 5B), indicating that NusG stabilizes the EC. Notably, compared with its orthologs from E. coli, Bacillus subtilis and Mycobacterium tuberculosis, cyanobacterial NusG possesses a longer and more positively charged loop (residues 110-122) within the N-terminal domain. This loop extends toward the downstream DNA and the single-stranded nontemplate DNA within the transcription bubble (SFig. 5A). Deletion of this cyanobacteria-specific loop (NusGΔ110-122) significantly reduced the stabilizing effect of NusG (SFig. 5B).
Si3 runs along the cavities of cyRNAP and shields the binding site of DksA/Gre factors
By fitting the models of RNAP (without Si3), NusG and the DNA/RNA scaffold into the cryo-EM map, we identified a density corresponding to Si3, which extends from the trigger loop, passes below the rim helix (β’2 subunit), runs along the lobe/protrusion domains (β subunit) and nearly reaches the upstream DNA (Fig. 2B, SMovie 1). The overall structure of cyRNAP is nearly identical to the structures of other bacterial RNAPs, including those of E. coli and M. tuberculosis (19, 20), indicating that Si3 runs along the cavities of RNAP without influencing its general shape or conformation. The crystal structures of the Si3 fragments containing SBHM-2 to SBHM-8 and SBHM-1 were fitted to their corresponding cryo-EM density. The cryo-EM density of the Si3-head was weak and had a low resolution (Fig. 2B, SFig. 4E), suggesting its mobility.
Si3-tail is positioned in front of the rim helix (Fig. 3A). Si3-fin is positioned below the rim helix, and the extended SBHM-2 loop (residues 463-471) fills a gap between the β’2 jaw and β lobe domains. Si3-body is located beside the lobe and protrusion domains of the β subunit, and Si3-head reaches the upstream DNA (Figs. 2B and 3A). Si3-fin contacts the bottom part of the rim helix, but only a few amino acid residues of Si3 contact the main body of RNAP and NusG, suggesting that Si3-tail and Si3-body/head can change their positions without restraint. Si3 spans the entire length of cyRNAP, reaching from the secondary channel to the upstream DNA. However, it likely does not interfere with any basic function of cyRNAP (i.e., DNA binding, RNA elongation, or binding of the initiation factor σ or the elongation factors NusA and NusG), as it runs along the sidewall of cyRNAP (Figs. 2D and 3A).
During transcription, the secondary channel of all cellular RNAPs, including bacterial RNAPs, provides the only access route between the active site in the center of RNAP and the external milieu, serving as an entry point for substrate NTPs and an exit route for the RNA 3’-end during backtracking (prior to RNA cleavage). In cyRNAP, the secondary channel appears to be open enough to allow these functions. In addition to these basic functions, the secondary channel serves as a binding platform for proofreading factors such as Gre and regulatory factors such as DksA, known as secondary channel binding factors (13, 27). These factors use the RNAP rim helix as a primary binding site and then insert their coiled-coil domain to access the active site of RNAP (Fig. 3B). In cyRNAP, Si3-tail and -fin occupy the front and bottom sides of the rim helix, respectively, thereby preventing any potential association of secondary channel binding factors (Fig. 3A).
Dynamic motion of Si3 associated with the transition between the trigger loop and helix during iNTP binding at the active site
To investigate the Si3 conformational change associated with trigger helix refolding, we prepared an iNTP-bound form of the EC by extending the RNA with 3’-deoxyadenosine triphosphate (3’-dATP), which arrested further RNA extension, followed by addition of cytidine triphosphate (CTP) as the iNTP (SFig. 6). The resulting cryo-EM structure was determined at 2.79 Å resolution (SFig. 6). Although an excess amount of CTP was added to the EC, a substantial population of ECs (~40%) remained iNTP-free. However, the iNTP-bound EC could be clearly distinguished from the iNTP-free EC during 3D classification of the cryo-EM data owing to its unique Si3 orientation relative to the main body of cyRNAP associated with iNTP binding (SFig. 6B and 6D). This allowed us to obtain a well-defined density map of the cyRNAP active site. In the iNTP-bound EC, the B-site Mg2+ (known as the nucleotide-binding metal) was present at the active site. However, the A-site Mg2+ (known as the catalytic metal) was absent, likely due to the lack of a hydroxyl group at the 3’-end of the RNA. Trigger helix folding establishes several essential contacts between the iNTP and amino acid residues, including β’2-M339 in contact with the nucleobase and β’2-H343 in contact with the β-phosphate group (SFig. 6D).
Trigger helix folding induces significant motion of Si3 relative to the main body of cyRNAP. Specifically, trigger helix formation pulls a linker connecting the C-terminal half of the trigger helix and the Si3-fin, and during this process, the tip of the rim helix acts as a pivot point, converting the lateral motion of the linker (~10 Å) into a rotational motion of Si3, resulting in an ~50 Å displacement and a 24° swing of Si3-head (Figs. 4A and B, SMovie 2). Si3-body/head swings down from the main body of cyRNAP; thus, the β protrusion domain no longer contacts Si3-body/head in the iNTP-bound EC (Fig. 4A). Remarkably, the large swinging of Si3, which is coupled to trigger helix formation (Fig. 4B), did not markedly alter the catalytic properties of cyRNAP (Fig. 4C). Three ECs containing 14, 15 and 16 nucleotide long RNAs (EC14, EC15 and EC16) were prepared by extending the initial 5’-labeled 13 nt long RNA in the nucleic acid scaffold shown above the summary table. Nucleotide addition, its direct reversal by pyrophosphorolysis, and transcript cleavage were performed for the ECs formed with either wild-type (WT) or Si3-lacking (ΔSi3) cyRNAP. The rates of NTP addition, pyrophosphorolysis and RNA hydrolysis were similar between the WT and ΔSi3 cyRNAPs (Fig. 4C and SFig. 7). The relative rates of these reactions also allowed us to attribute a predominant translocation state to each EC tested because nucleotide addition proceeds from the post-translocated state, pyrophosphorolysis from the pre-translocated state and hydrolysis from the backtracked state (scheme in Fig. 4C). Comparison of the rates of these reactions for the three complexes used in the present study suggested that EC14 is mainly stabilized in a post-translocated state (characterized by fast NTP addition), EC15 is mainly pre-translocated (fast pyrophosphorolysis), and EC16 is mainly backtracked/paused (faster hydrolysis), similar to the ECs formed on this template by Thermus aquaticus RNAP (21), which does not contain Si3. These results imply that Si3 does not influence the catalysis or translocation equilibrium of cyRNAP.
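The assignment logic described above can be written as a small lookup: each reaction reports on one translocation state, so the reaction that is relatively fastest for a given EC points to its predominant state. The sketch below is illustrative only; the function name, the normalization (each rate divided by the highest rate observed for that reaction among the ECs compared) and the example numbers are assumptions, not values from Fig. 4C.

```python
from typing import Dict

# Reaction -> translocation state it proceeds from, as described in the text.
STATE_FOR_REACTION: Dict[str, str] = {
    "ntp_addition": "post-translocated",
    "pyrophosphorolysis": "pre-translocated",
    "hydrolysis": "backtracked",
}


def predominant_states(rates: Dict[str, Dict[str, float]]) -> Dict[str, str]:
    """Assign each complex the state whose reporter reaction is relatively fastest.

    rates maps complex name -> {reaction: rate}. Each rate is normalized to the
    highest rate of that reaction among the complexes, so that reactions with
    different absolute time scales can be compared.
    """
    max_per_reaction = {
        reaction: max(per_complex[reaction] for per_complex in rates.values())
        for reaction in STATE_FOR_REACTION
    }
    assignment = {}
    for name, per_complex in rates.items():
        best = max(
            STATE_FOR_REACTION,
            key=lambda reaction: per_complex[reaction] / max_per_reaction[reaction],
        )
        assignment[name] = STATE_FOR_REACTION[best]
    return assignment


# Placeholder rates in arbitrary units (hypothetical, NOT measured values).
example = {
    "EC14": {"ntp_addition": 10.0, "pyrophosphorolysis": 0.1, "hydrolysis": 0.01},
    "EC15": {"ntp_addition": 1.0, "pyrophosphorolysis": 2.0, "hydrolysis": 0.02},
    "EC16": {"ntp_addition": 0.5, "pyrophosphorolysis": 0.2, "hydrolysis": 0.30},
}
print(predominant_states(example))
```

With these placeholder inputs the function reproduces the qualitative pattern stated in the text (EC14 post-translocated, EC15 pre-translocated, EC16 backtracked); real data would require the measured rates for each complex.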
The cryo-EM structure of the promoter complex containing cyRNAP and σA (both from Synechocystis sp. PCC 6803, which is closely related to the S. elongatus PCC 7942 used in this study), promoter DNA and a 4-mer RNA was determined by Shen et al. (15); the results showed that Si3-head contacts σA domain 2. This interaction clamps the single-stranded DNA around the -10 region, stabilizing the open complex and facilitating transcription initiation. Comparison of the structure of the cyRNAP promoter complex (15) with those of the EC (this study) revealed that, in the promoter complex, Si3-body and -head move toward σA domain 2 to interact with it, whereas the rest of cyRNAP, including Si3-tail and -fin and the main body of the RNAP, is nearly identical (Fig. 5A).
Si3 wraps around the main body of cyRNAP and may therefore facilitate RNAP folding, subunit assembly and/or maturation to form an active and mature RNAP, in the same way that DNA and a σ factor enhance reconstitution of E. coli RNAP (22). To test the function of Si3 during cyRNAP assembly and maturation, we performed a refolding experiment with WT cyRNAP, ΔSi3 cyRNAP, and ΔSi3 cyRNAP in combination with the separately expressed and purified Si3 protein (ΔSi3+Si3) (Fig. 5B). The proteins were denatured with 6 M guanidine-HCl and renatured by gradual removal of guanidine-HCl via dialysis against renaturation buffer. The activities of the reconstituted ΔSi3 cyRNAP in the absence and presence of the Si3 protein, as judged by their ability to extend a 13 nt long RNA in the assembled duplex with the template DNA oligonucleotide, were nearly the same as those of the WT cyRNAP, indicating that Si3 does not play a role in cyRNAP assembly and maturation. This conclusion is supported by the similar yields of recombinant WT and ΔSi3 cyRNAPs routinely isolated from E. coli. Remarkably, however, the separate Si3 protein binds ΔSi3 cyRNAP but not the WT cyRNAP when it is added externally (Fig. 5C). When complex formation between Si3 and ΔSi3 cyRNAP was assessed by blue native polyacrylamide gel electrophoresis, a band with a lower mobility, similar to that of the WT cyRNAP, was observed (Fig. 5C, Lane 4). Interaction between WT cyRNAP and Si3 was not detected, i.e., no complex with lower mobility than that of WT cyRNAP was detected (Lane 5).
Discussion
In this study, we determined the structures of cyRNAP Si3 by X-ray crystallography (Fig. 1) and of cyRNAP EC-NusG with and without iNTP by cryo-EM (Figs. 2 and 4). We investigated the function of Si3 by comparing the catalytic activities of WT and ΔSi3 cyRNAPs. The structural and biochemical results showed that Si3 is accommodated within the cavities of cyRNAP without compromising its basic activities, that it shields the binding site of secondary channel binding proteins, and that it moves within cyRNAP upon binding of iNTP at the active site. Remarkably, a minor structural transition between the trigger loop and trigger helix causes a major swinging motion of Si3 (Fig. 4 and SMovie 2). The presence of Si3 in the middle of the trigger loop/helix did not affect cyRNAP catalysis under our experimental conditions (Fig. 4C). Because of the large conformational change that occurs during the transcription reaction, changes in cyRNAP activity might be observed when the motion of Si3 is hindered, such as by binding of external factors. Further proteomic searches for Si3-binding factors, together with structural, single-molecule and biochemical studies, are required to elucidate its role in regulating transcription by cyRNAP, such as by sensing environmental signals (e.g., trafficking of RNAP or transcription-translation coupling) to optimize cyRNAP activity. Alternatively, the oscillating motion of Si3 might function as a regulatory signal for cellular processes. Photosynthetic cyanobacteria synchronize their gene expression patterns with diurnal light cycles (23). Conceivably, the lack of Si3 movement might trigger initiation of cyRNAP hibernation through binding to cellular factors or its oligomerization during the night. Additionally, Si3 movement might help RNAP propel through the densely packed cytoplasm of cyanobacteria during transcription.
The primary proofreading mechanism employed by RNAP involves backtracking followed by hydrolysis of misincorporated nucleotides at the 3’-end of the nascent RNA. This process is significantly enhanced by elongation factors that bind to the RNAP secondary channel, such as Gre in bacteria, TFS in archaea, and TFIIS in eukaryotes (24). However, unlike the vast majority of living organisms, cyanobacteria lack a Gre factor. The intracellular concentration of Mn2+ is two orders of magnitude greater in cyanobacteria than in other bacteria to support photosynthesis (16). It is possible that Mn2+ replaces the catalytic Mg2+ of RNAP and thus promotes misincorporation of NTPs (25, 26). Potentially as a compensating mechanism, cyRNAP has been shown to possess proficient intrinsic proofreading activity (7, 27). However, this intrinsic activity is still approximately 10 times lower than the Gre-stimulated activity of E. coli RNAP. Gre-like factors either emerged after the split of cyanobacteria from their last common ancestor with other bacteria or were subsequently lost in cyanobacteria. The distinctive characteristics of cyRNAP, namely the absence of Gre/DksA factors and the split of the largest subunit, may be intrinsically linked to Si3 acquisition. The Si3-tail/fin position around the rim helix of RNAP prevents association of secondary channel binding proteins, such as GreA and DksA, with cyRNAP (Fig. 3). As secondary channel binding proteins play critical roles in transcription fidelity and regulation in bacteria, the Si3-GreA/DksA trade-off in cyanobacteria might be advantageous, but its basis remains to be fully understood. With Si3 acquisition, β’ increased to 210 kDa in size, and separation of the original rpoC gene into two genes was perhaps beneficial to facilitate expression of such a large protein. The observed change in the position and mobility of Si3 in cyRNAP ECs compared to those in the promoter complex (Fig. 5A) raises questions about the role of Si3 in promoter escape. Si3 may complicate promoter escape by binding to the σ factor; conversely, its large-range movement upon RNA synthesis may contribute to weakening σ association with the core and/or promote σ release at the transition to the elongation stage.
A structure corresponding to Si3 of cyRNAP has not been found in other bacterial RNAPs. However, the structure and arrangement of the Rpb9 subunit in eukaryotic RNAPII show remarkable similarity to those of Si3 in cyRNAP (Fig. 3C). Rpb9 is positioned within a cavity between the rim helix and the lobe domain of RNAPII, akin to the Si3-fin of cyRNAP (highlighted in red in cyRNAP and RNAPII). Rpb9 is a unique subunit found only in RNAPII and plays a critical role in enhancing the accuracy of transcription (28). Although both Rpb9 and Si3-tail are located away from the active site of RNAP, their presence may enhance transcription fidelity by coordinating RNAP conformational changes such as RNAP swiveling and/or movement of the rim helix during the nucleotide addition cycle (20). The presence of these unique structural features in different types of RNAPs suggests a common mechanism for enhancing transcriptional accuracy and specificity across different organisms. Further investigation of Si3 function at different stages of transcription and under several growth conditions in cyanobacteria will be required to determine the full array of its biological functions.
Experimental Procedures
Protein preparation
The DNA fragment encoding Thermosynechococcus elongatus BP-1 Si3 in the β’2 subunit (TelSi3, RpoC2 residues 345-983, 69 kDa) was cloned between the NdeI and BamHI sites of the pET15b expression vector to introduce an N-terminal His6-tag, and the protein was overexpressed in E. coli BL21(DE3)/pLysS cells. Transformants were grown in LB medium supplemented with ampicillin (100 μg/ml) and chloramphenicol (25 μg/ml) at 37 °C until the OD600 reached ~0.5, after which protein expression was induced by adding 0.5 mM IPTG for 10 h at 4 °C. The harvested cells were lysed by sonication, and proteins in the soluble fraction were purified by Ni-affinity column chromatography (HisTrap 5 ml column, GE Healthcare). The His6-tag was removed by thrombin digestion (1 μg of thrombin per mg of TelSi3 protein) for 20 h at 4 °C, and the protein was further purified by Q Sepharose column chromatography (GE Healthcare) and gel-filtration column chromatography (HiLoad Superdex75 16/60, GE Healthcare). The purified protein was concentrated to 15 mg/ml and exchanged into buffer containing 10 mM Tris-HCl (pH 8.0), 50 mM NaCl and 0.1 mM EDTA.
Limited trypsinolysis 352
Limited trypsinolysis was used to remove flexible regions from TelSi3, and N-terminal amino acid sequencing was used to identify protein fragments suitable for crystallization. The trypsin digests were carried out in 10 mM Tris-HCl (pH 8), 100 mM NaCl, 5% (v/v) glycerol, 0.1 mM EDTA and 1 mM DTT. TelSi3 (10 mg/ml) was digested in a 10 µl volume with different amounts of trypsin (5 nM to 5 µM) for 10 min at 25 °C. The reactions were terminated by addition of PMSF. The trypsinized fragments were separated by SDS-PAGE and blotted onto PVDF membranes, and the N-terminal sequences were determined by Edman-based protein sequencing. The TelSi3 fragment containing residues 435-938 (TelSi3ΔN, 60 kDa) was subcloned by PCR and inserted into the pET15b expression vector between the NdeI and BamHI sites. The protein was overexpressed and purified as described above for full-length TelSi3.
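The trypsin titration described above spans three orders of magnitude. As a minimal sketch, assuming tenfold steps between 5 nM and 5 µM (the number and spacing of the intermediate concentrations are not stated in the text, and `log_series` is a hypothetical helper introduced only for illustration), such a series could be laid out as follows:

```python
def log_series(low_nM, high_nM, fold):
    """Concentrations from low_nM up to high_nM (inclusive), each `fold` times the previous one."""
    series = []
    conc = low_nM
    # Small tolerance so that rounding does not drop the final point.
    while conc <= high_nM * (1 + 1e-9):
        series.append(conc)
        conc *= fold
    return series


# Tenfold spacing is an assumption; only the end points (5 nM, 5 µM) come from the text.
trypsin_nM = log_series(5.0, 5000.0, 10.0)
print(trypsin_nM)
```

A finer spacing (for example, a fold of about 3.16 for half-log steps) can be obtained by changing the `fold` argument.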

Crystallization
Initial crystals of TelSi3ΔN were obtained by the hanging-drop vapor diffusion method by mixing equal volumes of the protein solution (20 mg/ml) and crystallization solution (0.1 M sodium citrate [pH 3.5], 0.2 M MgCl2 and 10% PEG6000) and incubating at 4 °C over the same crystallization solution. The large crystals (0.5 × 0.2 × 0.2 mm) used for X-ray data collection
were prepared by microseeding, mixing 2 µl of protein solution, 2 µl of crystallization solution (0.1 M sodium citrate [pH 5], 0.4-0.6 M MgCl2, 4-6% PEG3350 and 50 µg/ml heparin) and 0.2 µl of seed solution. The crystals were then dehydrated by transfer to crystallization solution (without heparin) with increasing concentrations of PEG3350 (in 5% steps) to a final concentration of 20% and incubated for 5-10 h. Crystal preparation, growth and dehydration were all performed at 4 °C. The crystals were transferred to crystallization solution containing 25% (v/v) propylene glycol as a cryoprotectant and flash frozen in liquid nitrogen. Selenomethionine-substituted proteins were prepared for SAD analysis by suppressing methionine biosynthesis.

X-ray data collection and crystal structure determination
In addition to the four original methionine residues found in TelSi3ΔN (including an N-terminal methionine residue resulting from cloning into the pET15b vector), three methionine residues were introduced by replacing the leucine residues at positions 508, 738 and 922 by site-directed mutagenesis to obtain the experimental phase via single-wavelength anomalous dispersion (SAD) experiments using SeMet-labeled proteins. The TelSi3ΔN protein with seven selenomethionine residues (TelSi3ΔNMet7) was generated by suppressing methionine biosynthesis during overexpression of the TelSi3ΔNMet7 protein. The protein was purified as described above. Diffraction data were collected at National Synchrotron Light Source (NSLS) beamline X25. There are six TelSi3ΔNMet7 molecules in the asymmetric unit of the crystal, which belongs to the P3(2)21 space group. The crystallographic datasets were processed using HKL2000 (29). With the anomalous signal from SeMet, the experimental phase (figure of merit: 0.273) was calculated using automated structure solution (AutoSol) in PHENIX (30). Density modification yielded a map suitable for manual model building in Coot (31) followed by structure refinement using PHENIX. The final coordinates and structure factors have been deposited in the Protein Data Bank (PDB) under the accession codes listed in Supplementary Table 1.

Expression and isolation of the Synechococcus elongatus RNAP
The core enzyme of cyRNAP was overexpressed in E. coli T7Express cells (New England Biolabs) transformed with a pET28a expression vector containing the genes encoding α, β, β’1, β’2 and ω (β and β’2 contain a Strep-tag and His-tag, respectively) (32). The cells were
grown in LB medium supplemented with kanamycin (50 μg/ml) at 37 °C until the OD600 was ~0.6. Afterward, the cells were induced with IPTG (1 mM) and grown overnight at 22 °C. The biomass was harvested and suspended in lysis buffer (50 mM Tris-HCl pH 8.0, 250 mM NaCl, 10% glycerol, 20 mM imidazole, 1 mM β-mercaptoethanol and protease inhibitors from Roche used according to the manufacturer’s instructions). The cells were sonicated, the lysate was centrifuged at 18,000 × g, and the supernatant was collected. The protein was purified at 4 °C sequentially through a HisTrap (5 mL) column and a Strep-Tactin XT (1 mL) column (both from Cytiva). The latter column was washed with 3 column volumes (CVs) of Buffer W (100 mM Tris-HCl pH 8.0, 150 mM NaCl, 1 mM EDTA). The bound protein was eluted by applying 1 CV of Buffer E (100 mM Tris-HCl pH 8.0, 150 mM NaCl, 1 mM EDTA, 2.5 mM desthiobiotin). The purified cyRNAP (20 μM) was assessed using SDS-PAGE, dialyzed against Storage Buffer (40 mM Tris-HCl pH 8.0, 200 mM KCl, 1 mM EDTA, 1 mM DTT, and 5% glycerol), and stored at -80 °C.

Cloning, expression and isolation of the Synechococcus elongatus NusG and Si3 proteins
NusG was overexpressed in E. coli T7Express cells (New England Biolabs) transformed with a pET28a expression vector into which the gene for C-terminally His6-tagged Synechococcus elongatus NusG had been cloned. Cells were grown in LB medium supplemented with kanamycin (50 μg/ml) at 37 °C until the OD600 was ~0.5, then induced with IPTG (1 mM) and grown overnight at 22 °C. Culture pellets were sonicated in 50 ml Lysis Buffer (10 mM Tris-HCl pH 7.9, 300 mM KCl, protease inhibitors from Roche used according to the manufacturer’s instructions), spun at 18,000 rpm, and filtered through a 0.22 μm syringe filter. The filtered supernatant was subjected to Ni-NTA affinity chromatography in 10 mM Tris-HCl pH 7.9, 600 mM KCl, 5% glycerol with 50 mM imidazole washes and 100 mM imidazole elution. The eluted protein (in 600 mM KCl) was diluted (to ~100 mM KCl) and applied to a 5 ml Resource Q column (Cytiva) pre-equilibrated with 10 mM Tris-HCl pH 8.0, 100 mM KCl, 5% glycerol. The column was washed with 5 CV of equilibration buffer, and the protein was eluted by applying a linear salt gradient (100 mM to 1 M KCl) over 10 CV. The purified NusG (90 μM) was checked using SDS-PAGE and stored in Storage Buffer (40 mM Tris-HCl pH 8.0, 200 mM KCl, 1 mM EDTA, 1 mM DTT, 5% glycerol) at -80 °C. The cyanobacteria-specific loop of NusG (residues 110-122) was deleted by site-directed mutagenesis, and the mutant NusG was isolated in the same way as the WT protein.
The open reading frame encoding the separate full-size Si3 domain was cloned into the pET28 vector, overexpressed in E. coli T7Express cells (New England Biolabs) as an N-terminally His6-tagged protein, and isolated via Ni-NTA affinity chromatography on a HisTrap column (Cytiva), similarly to the NusG protein. After affinity chromatography, the protein was dialyzed against the storage buffer (20 mM Tris-HCl pH 8.0, 200 mM KCl, 1 mM EDTA, 1 mM DTT, 50% glycerol).

Sample preparation for cryo-EM
The cyRNAP EC with NusG was reconstituted in vitro by mixing 5 μM cyRNAP with equimolar amounts of template DNA and RNA (Fig. 2A) in storage buffer at 37 °C for 10 minutes, followed by mixing with 7 μM nontemplate DNA and incubating for a further 10 minutes. The resulting EC was mixed with 7 μM NusG and incubated for 10 min at 37 °C. CHAPSO (8 mM) was added to the sample just before vitrification. The iNTP-bound EC was prepared by adding 1 mM 3’-deoxyATP or CTP to the EC with NusG and incubating for 5 min at 37 °C. The iNTP-free and iNTP-bound ECs also differ in the nontemplate DNA used in the scaffold, which in the iNTP-bound EC contains a complementary transcription bubble.
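The volumes needed to reach the stated concentrations follow from C1·V1 = C2·V2. The sketch below is illustrative only: it assumes that the 20 μM cyRNAP and 90 μM NusG preparations described above serve as stocks, that the quoted 5 μM and 7 μM are concentrations in the final mixture, and that the final volume is 20 μl (a hypothetical value not given in the text). It ignores the small dilution caused by the stepwise additions.

```python
def stock_volume_ul(final_conc_uM, final_volume_ul, stock_conc_uM):
    """Volume of stock (µl) that gives final_conc_uM in final_volume_ul (C1*V1 = C2*V2)."""
    if stock_conc_uM < final_conc_uM:
        raise ValueError("stock is more dilute than the requested final concentration")
    return final_conc_uM * final_volume_ul / stock_conc_uM


FINAL_VOLUME_UL = 20.0  # hypothetical reaction volume, not stated in the text

cyrnap_ul = stock_volume_ul(5.0, FINAL_VOLUME_UL, 20.0)  # assumes the 20 µM cyRNAP prep is the stock
nusg_ul = stock_volume_ul(7.0, FINAL_VOLUME_UL, 90.0)    # assumes the 90 µM NusG prep is the stock

print(f"cyRNAP: {cyrnap_ul:.2f} µl, NusG: {nusg_ul:.2f} µl")
```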

Grid preparation for cryo-EM
C-flat Cu grids (CF-1.2/1.3 400 mesh, Protochips, Morrisville, NC) were glow-discharged for 40 seconds using the PELCO easiGlow™ system prior to application of 3.5 μl of the sample (2.5-3.0 mg/ml protein concentration) and plunge-freezing in liquid ethane using a Vitrobot Mark IV (FEI, Hillsboro, OR) with 100% chamber humidity at 5 °C.

Cryo-EM data acquisition and processing
Data were collected using a Titan Krios (Thermo Fisher) microscope equipped with a Falcon IV direct electron detector (Gatan) at the Penn State Cryo-EM Facility. Sample grids were imaged at 300 kV, with an intended defocus range of -2.5 to -0.75 μm and a magnification of 75,000X in electron counting mode (0.87 Å per pixel). Movies were collected with a total dose of 45 electrons/Å2. Downstream processing was performed with CryoSPARC (33). The movies were corrected and aligned using patch motion correction followed by patch CTF correction. Particles were picked using a template-based autopicker and subjected to multiple rounds of 2D classification to discard bad particles. The 2D classes with EC-NusG particles were selected and used for training
the Topaz model (34). The Topaz-extracted particles were subjected to multiple rounds of heterogeneous refinement to remove junk particles. Finally, a nonuniform refinement operation was run on the final set of particles to yield the reconstruction (SFigs. 4 and 6).
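Two quantities follow directly from the acquisition parameters quoted above (0.87 Å per pixel, total dose of 45 electrons/Å2): the Nyquist limit, which is twice the pixel size and bounds the attainable resolution, and the total dose per detector pixel. The short sketch below is only a worked calculation from those two numbers; the function names are illustrative, and the per-frame dose is not computed because the number of frames per movie is not stated.

```python
PIXEL_SIZE_A = 0.87          # Å per pixel, from the text
TOTAL_DOSE_E_PER_A2 = 45.0   # electrons per Å^2, from the text


def nyquist_limit_A(pixel_size_A):
    """Finest resolvable spacing (Å) for a given sampling: two pixels per period."""
    return 2.0 * pixel_size_A


def dose_per_pixel(total_dose_e_per_A2, pixel_size_A):
    """Total electrons per pixel: dose per unit area times the pixel area."""
    return total_dose_e_per_A2 * pixel_size_A ** 2


nyquist = nyquist_limit_A(PIXEL_SIZE_A)
per_pixel = dose_per_pixel(TOTAL_DOSE_E_PER_A2, PIXEL_SIZE_A)
print(f"Nyquist limit: {nyquist:.2f} Å; total dose: {per_pixel:.2f} e-/pixel")
```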

Structure refinement and model building
A model of the cyRNAP core enzyme was constructed by homology modeling using the core RNAP from the cryo-EM structure of the Syn6803 RNAP-σA promoter DNA open complex as a reference model. A model of cyNusG was constructed with AlphaFold2 (35). DNA and RNA models were constructed using the E. coli RNAP elongation complex (PDB: 7MKO) as a guide. The cyRNAP model was manually fitted into the cryo-EM density map using Chimera (36), followed by rigid-body and real-space refinement using Coot (31) and Phenix (37).

In vitro transcription in the assembled elongation complexes
ECs were assembled and immobilized as described (38). Sequences of the oligonucleotides used for the assembly of ECs are shown in Fig. 4C. For assembly of the ECs used in the experiments in Fig. 4C, the 13 nt RNA was radiolabelled at the 5’ end with [γ-32P] ATP and T4 polynucleotide kinase (New England Biolabs) prior to complex assembly. Stalled elongation complexes EC14, EC15 and EC16 were obtained by extension of the initial RNA13 in EC13 with 10 μM NTP sets according to the sequence for 5 min and were then washed with TB to remove Mg2+ and NTPs. Reactions were initiated by addition of 10 mM MgCl2 with or without either 1 μM NTPs or 250 μM PPi. Single nucleotide addition and pyrophosphorolysis experiments were performed at 30 °C in transcription buffer (TB) containing 20 mM Tris-HCl pH 6.8, 40 mM KCl and 10 mM MgCl2; transcript hydrolysis was performed in the same buffer except at pH 7.9. After incubation for the time intervals specified in the figures, reactions were stopped with formamide-containing buffer. Products were resolved by denaturing 23% polyacrylamide gel electrophoresis (PAGE) (8 M urea), revealed by PhosphorImaging (Cytiva) and visualized using ImageQuant (Cytiva) software. Kinetics data were fitted to a single exponential equation, y = y0 + a·exp(−bx), using SigmaPlot software by non-linear regression to determine the rate constants of the reactions.
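As an illustration of this kind of fit, the sketch below estimates a rate constant from a time course, assuming the single exponential has the three-parameter form y = y0 + a·exp(−bx) with b as the rate constant. It is not the SigmaPlot procedure used in the study: it relies on the fact that, for a fixed b, the model is linear in y0 and a, so b can be scanned and refined while y0 and a are solved in closed form. The function names and the time course are invented for illustration.

```python
import math


def solve_linear_part(xs, ys, b):
    """For a fixed rate constant b, least-squares solve for y0 and a.

    With b fixed, y = y0 + a*exp(-b*x) is linear in (y0, a), so the
    2x2 normal equations have a closed-form solution.
    Returns (y0, a, sum_of_squared_residuals).
    """
    es = [math.exp(-b * x) for x in xs]
    n = len(xs)
    se = sum(es)
    see = sum(e * e for e in es)
    sy = sum(ys)
    sey = sum(e * y for e, y in zip(es, ys))
    det = n * see - se * se
    a = (n * sey - se * sy) / det
    y0 = (sy - a * se) / n
    sse = sum((y0 + a * e - y) ** 2 for e, y in zip(es, ys))
    return y0, a, sse


def fit_single_exponential(xs, ys, b_min=1e-4, b_max=1e2, grid=400, refine=60):
    """Fit y = y0 + a*exp(-b*x) and return (y0, a, b)."""
    step = (b_max / b_min) ** (1.0 / (grid - 1))
    # Coarse scan of b on a logarithmic grid.
    best_b = min(
        (b_min * step ** i for i in range(grid)),
        key=lambda b: solve_linear_part(xs, ys, b)[2],
    )
    # Ternary search in log(b) between the neighbouring grid points.
    lo, hi = math.log(best_b / step), math.log(best_b * step)
    for _ in range(refine):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        sse1 = solve_linear_part(xs, ys, math.exp(m1))[2]
        sse2 = solve_linear_part(xs, ys, math.exp(m2))[2]
        if sse1 < sse2:
            hi = m2
        else:
            lo = m1
    b = math.exp((lo + hi) / 2.0)
    y0, a, _ = solve_linear_part(xs, ys, b)
    return y0, a, b


# Synthetic, noise-free time course (time vs. percent of starting RNA remaining).
# These numbers are invented for illustration and are not data from the study.
times = [0.0, 0.5, 1.0, 2.0, 5.0, 10.0, 20.0]
signal = [5.0 + 95.0 * math.exp(-0.3 * t) for t in times]

y0, a, b = fit_single_exponential(times, signal)
print(f"y0 = {y0:.3f}, a = {a:.3f}, rate constant b = {b:.4f}")
```

With real gel quantifications the recovered parameters carry experimental error, so replicate fits (as in the three independent experiments summarized in Fig. 4C) are needed to estimate a standard deviation.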

Denaturation and renaturation of cyRNAP
Denaturation of cyRNAP was performed by incubating the purified protein for 20 min at 30 °C in denaturing buffer containing 20 mM Tris-HCl (pH 7.9), 6 M guanidine-HCl, 5% glycerol, 1 mM EDTA, and 10 mM DTT in a 100 µl volume at a cyRNAP concentration of 0.5 mg/ml. Recombinant Si3 was included in 2.5-fold molar excess. The proteins were renatured via overnight dialysis at 7 °C against renaturing buffer containing 20 mM Tris-HCl (pH 7.9), 200 mM KCl, 10% glycerol, 2 mM MgCl2, 10 µM ZnCl2, 1 mM EDTA, and 1 mM DTT. Aliquots of the renaturation mixture and their serial dilutions were used for nucleotide addition experiments on assembled constructs containing template DNA and RNA oligonucleotides. A 13 nt RNA oligonucleotide was radiolabeled at the 5’ end with [γ-32P] ATP and T4 polynucleotide kinase (New England Biolabs) prior to EC assembly. The amount of assembly mixture indicated in Fig. 5C was incubated with the RNA-DNA duplex for 5 min at room temperature, then 10 µM GTP was added for 10 minutes at 30 °C. Reactions were stopped and products analyzed as before.
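The legend of Fig. 5 specifies that the serial dilutions of the assembly mixture were twofold. A minimal sketch of such a series is given below in relative units, where 1.0 is the undiluted mixture; the number of dilution points is a hypothetical choice, since the actual amounts are shown in the figure rather than in the text.

```python
def twofold_series(initial_amount, n_points):
    """Amounts for a twofold serial dilution, starting from initial_amount."""
    return [initial_amount / 2 ** i for i in range(n_points)]


# Relative units; four points is an assumption made only for illustration.
amounts = twofold_series(1.0, 4)
print(amounts)
```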

Complex formation between the Si3 protein and core cyRNAP
For the binding experiment, 150 nM core enzymes and 1.5 µM Si3 proteins were incubated for 10 minutes at 4 °C in 20 mM Tris-HCl pH 7.9, 40 mM KCl, mixed with loading dye (final concentrations: 50 mM BisTris pH 7.2, 50 mM NaCl, 10% glycerol, 0.001% Ponceau S) and resolved on a NativePAGE 3-12% Bis-Tris gel (Invitrogen) for 90 minutes at 150 V, using running buffers prepared according to the manufacturer’s instructions. The gel was fixed with 50% methanol, 10% acetic acid solution, and additionally de-stained by boiling in 8% acetic acid.

Salt stability of elongation complexes
The elongation complex was assembled using the oligonucleotides shown in SFig. 5B. The 14 nt RNA in the ECs was radiolabelled at the 3’ end by incorporation of [α-32P] GTP into the original 13 nt RNA. To examine the stability of ECs, ECs bound to streptavidin Sepharose beads (Cytiva) via the Strep-tag on the β subunit of cyRNAP were incubated in TB containing 300 mM KCl at 30 °C for the times specified in SFig. 5B. WT NusG or the mutant NusG Δ110-122 was added where specified at 1 μM final concentration. Supernatant and total fractions were collected for analysis. Reactions were stopped and products analyzed as before.
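One way to turn the supernatant and total fractions into a stability readout is to express the RNA released into the supernatant as a fraction of the total. The sketch below assumes that both fractions are quantified as band intensities from equal-volume aliquots and that the released fraction is supernatant/total; neither the quantification scheme nor the numbers are taken from the text.

```python
def fraction_retained(supernatant_signal, total_signal):
    """Fraction of labelled RNA still bound to the beads.

    Assumes supernatant and total aliquots of equal volume, so that
    released = supernatant / total and retained = 1 - released.
    """
    if total_signal <= 0:
        raise ValueError("total signal must be positive")
    return 1.0 - supernatant_signal / total_signal


# Hypothetical band intensities (arbitrary units), for illustration only.
retained = fraction_retained(250.0, 1000.0)
print(f"fraction of ECs retained on beads: {retained:.2f}")
```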

Figure legends
Fig. 1. X-ray crystal structure of TelSi3ΔN. (A) The thick bars represent the primary sequences of the largest subunits of the bacterial, chloroplast and archaeal RNAPs. Domains (Si3, green boxes) and structural motifs (RH, rim helix; BH, bridge helix; TL, trigger loop) are labeled. The lettered boxes represent evolutionarily conserved regions. The split ends of the two polypeptides are indicated by black triangles. (B) Crystals of TelSi3ΔN. (C) Structure of TelSi3ΔN. Six molecules of TelSi3ΔN (I-VI) are present in the asymmetric unit. Molecules are depicted as cartoon models with transparent surfaces, and each molecule is denoted by a unique color and labeled. (D) The backbone is colored as a ramp from the N-terminus to the C-terminus, through blue, cyan, green, yellow, orange and red. SBHMs are labeled 1 to 8, and subdomains (tail, fin, body and head) are indicated. The TelSi3ΔN structure lacks SBHM-1, and the trigger loops (TLN and TLC) are depicted as a blue oval and a pink cylinder, respectively, with black lines showing their connections with TelSi3ΔN. (E) Molecules I and III of TelSi3ΔN are superimposed via their fin subdomains, revealing flexibility in the orientation between the fin and body/head subdomains.

Fig. 2. Cryo-EM image of the cyRNAP elongation complex with NusG. (A) The sequence of the DNA/RNA scaffold used for the EC-NusG assembly (template DNA, green; nontemplate DNA, yellow; RNA, red). DNA and RNA regions lacking cryo-EM density are underlined. (B) Orthogonal views of the cryo-EM density map. Subunits and domains of cyRNAP, DNA, RNA and NusG are colored and labeled (RH, rim helix; prot, protrusion; downDNA, downstream DNA; upDNA, upstream DNA). The split ends of the β’1 and β’2 subunits are indicated by white circles. The SBHMs in Si3 are labeled 1 to 8. (C) Cryo-EM densities of DNA, RNA and NusG are shown with a transparent RNAP density map (ntDNA, nontemplate DNA; ssRNA, single-stranded RNA). The 5’ and 3’ ends of the RNA are indicated. The cryo-EM density map is colored as in B. (D) Efficient storage of an elongated and large Si3 molecule on the surface of cyRNAP. The structure of EC-NusG is shown as a transparent surface, and the Si3, DNA/RNA and trigger loop (TLN, TLC) regions are shown as cartoon models. SBHMs are labeled 1 to 8, and subdomains (tail, fin, body and head) are indicated. The active site of RNAP is marked by the catalytic Mg2+ (magenta sphere).

Fig. 3. Comparison of the structures of cyRNAP, E. coli RNAP and eukaryotic RNAPII. The structures of cyRNAP (A), E. coli RNAP with GreB (PDB: 6RIN, B) and yeast RNAPII (PDB: 7ML0, C) are shown as transparent surfaces with domains, subunits and a factor described in the main text.

Fig. 4. Si3 movement during the trigger helix folding. (A) Cryo-EM maps of the iNTP-bound (gray) and iNTP-free (light blue) states of the EC-NusG complex (RH, rim helix; prot, protrusion; upDNA, upstream DNA). Arrows indicate the movement of Si3 upon trigger helix folding. (B) Conformational change in Si3 during the transition from the trigger loop (TL) to the trigger helix (TH) upon binding of iNTP (blue stick model). The red and black arrows indicate movements of the TL/TH-Si3 linker and Si3, respectively. A pivot point for converting the movement of the linker to the swing motion of Si3 is shown as a blue transparent circle. (C) Si3 does not influence catalysis by cyRNAP. Scheme and sequence of the assembled elongation complex used for experiments with WT and ΔSi3 RNAPs. The table summarizes the reaction rate constants of single nucleotide addition (kNTP), pyrophosphorolysis (kPPi) and transcript hydrolysis (kOH−) in EC14, EC15 and EC16 by WT and ΔSi3 RNAPs. The values that follow the ± sign are standard deviations derived from three independent experiments. The shade of green in the cells reflects the value of the constant, i.e., the darkest shade corresponds to the highest rate. The right column shows the predominant translocation states of the elongation complexes, as deduced from the relative rates of reaction. Scheme of RNAP oscillation in translocation equilibrium and the architecture of the nucleic acid scaffold of the elongation complex in post-translocation, pre-translocation and backtracked states, as adapted from (21). The template DNA, the nontemplate DNA and the RNA are green, yellow and pink, respectively. Catalytic Mg2+ ions and the i+1 site of the RNAP active center are shown by a red circle and a blue rectangle, respectively.

Fig. 5. Si3 functions. (A) The cryo-EM structures of cyRNAP in the EC (left) and the promoter complex (right, PDB: 8GZG). The contact between Si3-head and σA in the promoter complex is indicated by a black circle. (B) Si3 is not required for cyRNAP assembly or maturation. WT and ΔSi3 cyRNAPs were denatured and subsequently renatured, after which their activity was tested on the construct mimicking the DNA template-RNA transcript duplex structure by their ability
to incorporate the next nucleotide, G, dictated by the template. Twofold serial dilutions of the assembly mixture with the indicated initial amounts of core enzymes were tested. The vertical lines indicate the positions where the parts of the same gel were combined. (C) The recombinant Si3 protein can bind ΔSi3 cyRNAP but not WT cyRNAP. The complex formation between the indicated proteins was analyzed by blue native polyacrylamide gel electrophoresis. The vertical line indicates the position where two parts of the same gel were combined.

Data, Materials, and Software Availability. The X-ray crystallographic density map and the refined model have been deposited in the Protein Data Bank (www.rcsb.org) under accession number 8EMB. The cryo-EM density maps and the refined models have been deposited in the Electron Microscopy Data Bank (www.ebi.ac.uk/emdb/) under accession numbers EMD-40874 (iNTP-free EC-NusG) and EMD-42502 (iNTP-bound EC-NusG) and in the Protein Data Bank (www.rcsb.org) under accession numbers 8SYI (iNTP-free EC-NusG) and 8URW (iNTP-bound EC-NusG). All study data are included in the article and/or SI Appendix.

ACKNOWLEDGMENTS
We thank Jean-Paul Armache at Penn State for technical support. We thank the National Synchrotron Light Source (NSLS), Brookhaven National Laboratory, for X-ray data collection. We would like to acknowledge the Penn State Huck Life Science Institutes Cryo-EM Core Facility for use of the Talos Arctica G2 TEM and the Vitrobot Mark IV and Sung Hyun Cho for data collection. We thank Yu Zhang at the Shanghai Institute of Plant Physiology and Ecology for kindly sharing the coordinates of Synechocystis sp. PCC 6803 RNAP. This work was supported by a National Institutes of Health grant (R35 GM131860 to K. S. M.) and a Biotechnology and Biological Sciences Research Council grant (BB/W017385/1 to Y.Y.).

References

1. T. Borner, A. Y. Aleynikova, Y. O. Zubo, V. V. Kusnetsov, Chloroplast RNA polymerases: Role in chloroplast biogenesis. Biochim Biophys Acta 1847, 761-769 (2015).
2. T. Pfannschmidt et al., Plastid RNA polymerases: orchestration of enzymes with different evolutionary origins controls chloroplast biogenesis during the plant life cycle. J Exp Bot 66, 6957-6973 (2015).
3. G. J. Schneider, N. E. Tumer, C. Richaud, G. Borbely, R. Haselkorn, Purification and characterization of RNA polymerase from the cyanobacterium Anabaena 7120. J Biol Chem 262, 14633-14639 (1987).
4. W. Q. Xie, K. Jager, M. Potts, Cyanobacterial RNA polymerase genes rpoC1 and rpoC2 correspond to rpoC of Escherichia coli. J Bacteriol 171, 1967-1973 (1989).
5. G. J. Schneider, R. Haselkorn, RNA polymerase subunit homology among cyanobacteria, other eubacteria and archaebacteria. J Bacteriol 170, 4136-4140 (1988).
6. W. J. Lane, S. A. Darst, Molecular evolution of multisubunit RNA polymerases: sequence analysis. J Mol Biol 395, 671-685 (2010).
7. A. Riaz-Bradley, K. James, Y. Yuzenkova, High intrinsic hydrolytic activity of cyanobacterial RNA polymerase compensates for the absence of transcription proofreading factors. Nucleic Acids Res 48, 1341-1352 (2020).
8. M. Z. Qayyum, Y. Shin, K. S. Murakami, Encyclopedia of Biological Chemistry III. 10.1016/b978-0-12-819460-7.00252-8, 358-364 (2021).
9. A. C. Cheung, P. Cramer, A movie of RNA polymerase II transcription. Cell 149, 1431-1437 (2012).
10. M. Chlenov et al., Structure and function of lineage-specific sequence insertions in the bacterial RNA polymerase beta' subunit. J Mol Biol 353, 138-154 (2005).
11. J. Y. Kang et al., RNA Polymerase Accommodates a Pause RNA Hairpin by Global Conformational Rearrangements that Prolong Pausing. Mol Cell 69, 802-815 e805 (2018).
12. I. Artsimovitch, V. Svetlov, K. S. Murakami, R. Landick, Co-overexpression of Escherichia coli RNA polymerase subunits allows isolation and analysis of mutant enzymes lacking lineage-specific sequence insertions. J Biol Chem 278, 12344-12355 (2003).
13. M. Abdelkareem et al., Structural Basis of Transcription: RNA Polymerase Backtracking and Its Reactivation. Mol Cell 75, 298-309 e294 (2019).
14. Y. Shin et al., Structural basis of ribosomal RNA transcription regulation. Nat Commun 12, 528 (2021).
15. L. Shen et al., An SI3-sigma arch stabilizes cyanobacteria transcription initiation complex. Proc Natl Acad Sci U S A 120, e2219290120 (2023).
16. R. A. Mooney et al., Regulator trafficking on bacterial transcription units in vivo. Mol Cell 33, 97-108 (2009).
17. A. V. Yakhnin, M. Kashlev, P. Babitzke, NusG-dependent RNA polymerase pausing is a frequent function of this universally conserved transcription elongation factor. Crit Rev Biochem Mol Biol 55, 716-728 (2020).
18. J. Chen, A. J. Noble, J. Y. Kang, S. A. Darst, Eliminating effects of particle adsorption to the air/water interface in single-particle cryo-electron microscopy: Bacterial RNA polymerase and CHAPSO. J Struct Biol X 1 (2019).
19. M. Z. Qayyum, V. Molodtsov, A. Renda, K. S. Murakami, Structural basis of RNA polymerase recycling by the Swi2/Snf2 family of ATPase RapA in Escherichia coli. J Biol Chem 297, 101404 (2021).
20. R. K. Vishwakarma, M. Z. Qayyum, P. Babitzke, K. S. Murakami, Allosteric mechanism of transcription inhibition by NusG-dependent pausing of RNA polymerase. Proc Natl Acad Sci U S A 120, e2218516120 (2023).
21. A. Bochkareva, Y. Yuzenkova, V. R. Tadigotla, N. Zenkin, Factor-independent transcription pausing caused by recognition of the RNA-DNA hybrid sequence. EMBO J 31, 630-639 (2012).
22. R. Fukuda, A. Ishihama, Subunits of RNA polymerase in function and structure; Maturation in vitro of core enzyme from Escherichia coli. J Mol Biol 87, 523-540 (1974).
23. G. Dong, S. S. Golden, How a cyanobacterium tells time. Curr Opin Microbiol 11, 541-546 (2008).
24. R. C. Conaway, S. E. Kong, J. W. Conaway, TFIIS and GreB: two like-minded transcription elongation factors with sticky fingers. Cell 114, 272-274 (2003).
25. J. Hurwitz, L. Yarbrough, S. Wickner, Utilization of deoxynucleoside triphosphates by DNA-dependent RNA polymerase of E. coli. Biochem Biophys Res Commun 48, 628-635 (1972).
26. S. K. Niyogi, R. P. Feldman, Effect of several metal ions on misincorporation during transcription. Nucleic Acids Res 9, 2615-2627 (1981).
27. M. Imashimizu, K. Tanaka, N. Shimamoto, Comparative Study of Cyanobacterial and E. coli RNA Polymerases: Misincorporation, Abortive Transcription, and Dependence on Divalent Cations. Genet Res Int 2011, 572689 (2011).
28. N. K. Nesser, D. O. Peterson, D. K. Hawley, RNA polymerase II subunit Rpb9 is important for transcriptional fidelity in vivo. Proc Natl Acad Sci U S A 103, 3268-3273 (2006).
29. Z. Otwinowski, W. Minor, Processing of X-ray diffraction data collected in oscillation mode. Methods Enzymol 276, 307-326 (1997).
30. D. Liebschner et al., Macromolecular structure determination using X-rays, neutrons and electrons: recent developments in Phenix. Acta Crystallogr D Struct Biol 75, 861-877 (2019).
31. P. Emsley, B. Lohkamp, W. G. Scott, K. Cowtan, Features and development of Coot. Acta Crystallogr D Biol Crystallogr 66, 486-501 (2010).
32. L. Shen et al., A SI3-σ arch stabilizes cyanobacteria transcription initiation complex. bioRxiv 10.1101/2022.10.06.511230, 2022.2010.2006.511230 (2022).
33. A. Punjani, J. L. Rubinstein, D. J. Fleet, M. A. Brubaker, cryoSPARC: algorithms for rapid unsupervised cryo-EM structure determination. Nat Methods 14, 290-296 (2017).
34. T. Bepler et al., Positive-unlabeled convolutional neural networks for particle picking in cryo-electron micrographs. Nat Methods 16, 1153-1160 (2019).
35. J. Jumper et al., Applying and improving AlphaFold at CASP14. Proteins 89, 1711-1721 (2021).
36. E. F. Pettersen et al., UCSF Chimera--a visualization system for exploratory research and analysis. J Comput Chem 25, 1605-1612 (2004).
37. P. V. Afonine et al., Real-space refinement in PHENIX for cryo-EM and crystallography. Acta Crystallogr D Struct Biol 74, 531-544 (2018).
38. Y. Yuzenkova et al., Stepwise mechanism for transcription fidelity. BMC Biol 8, 54 (2010).
[Fig. 1 image not reproduced; panels A-E. Recoverable labels: sequence bars for E. coli, cyanobacteria (T. elongatus, S. elongatus), chloroplast (A. thaliana) and archaeal RNAP with conserved regions A-H, BH, RH, TL (TLN, TLC), jaw, clamp and Mg2+; numeric labels 645, 793 and 188; crystal size 0.5 × 0.2 × 0.2 mm; molecules I, II, III and V; SBHMs 1-8; subdomains tail, fin, body and head; TLN, TLC and Mg (active site); 24 Å, 11°.]
Fig. 2 [figure image not reproduced. Labels recovered from extraction: panels A-D; views related by 90° and 180° rotations; α, β, β’1, β’2, ω, NusG, Si3 (head, body, fin, tail; elements 1-8), split ends, TLN, TLC, Mg, RH, prot, lobe, flap, clamp, jaw, RNA ch, 2nd ch, upDNA, downDNA, ntDNA, RNA, ssRNA, 5’ and 3’ ends.]
Nucleic acid scaffold sequences as printed in the figure (labels: upDNA, downDNA, RNA):
CCTCTCCATG
5'-GGGCGCATGCTGCTCTA ACGGCGACTGCCC-3’
3'-CCCGCGTACGACGAGATCCTCTCCATGTGCCGCTGACGGG-5’
GGAGAGGUA
5'-GCAUUCAAAGC
Fig. 3 [figure image not reproduced. Labels recovered from extraction: panels A-C; cyRNAP, E. coli RNAP-GreB and yeast RNAPII; Si3-body/head, Si3-fin, Si3-tail, lobe, BH, RH, Mg, 2nd ch, GreB, Rpb9, acidic residues; views related by a 90° rotation.]
Fig. 4 [figure image not reproduced. Labels recovered from extraction: panels A-C; Si3 head, body, fin and tail; "50 Å, 24º"; pivot point; TLN and TLC; THN and THC; MgA and MgB; iNTP; BH, RH, prot, lobe, 2nd ch; NusG; tDNA, ntDNA, RNA, upDNA, downDNA; views related by a 90° rotation.]
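Fig. 5 [figure image not reproduced. Labels recovered from extraction: panels A-C; cyRNAP-σA holoenzyme promoter DNA complex; cyRNAP EC with NusG; α, β, β’1, β’2, σA, NusG, Si3 (head, body, tail, fin), upDNA, pDNA, -35 and -10 elements.]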