Structure and rational engineering of the PglX methyltransferase and specificity factor for BREX phage defence

preprint OA: gold CC-BY-4.0

Keywords

BREX, phage defence, PglX, methyltransferase, Ocr

bioRxiv preprint doi: https://doi.org/10.1101/2024.04.12.589231; this version posted April 13, 2024. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

Abstract

Bacteria have evolved a broad range of systems that provide defence against their viral predators, bacteriophages. Bacteriophage Exclusion (BREX) systems recognize and methylate 6 bp non-palindromic motifs within the host genome, and prevent replication of non-methylated phage DNA that encodes these same motifs. How BREX recognizes cognate motifs has not been fully understood. We have characterised BREX from pathogenic Salmonella and generated the first X-ray crystallographic structures of the conserved BREX protein, PglX. The PglX N-terminal domain encodes the methyltransferase, whereas the C-terminal domain is for motif recognition. We also present the structure of PglX bound to the phage-derived DNA mimic, Ocr, an inhibitor of BREX activity. Our analyses propose modes for DNA-binding by PglX and indicate that larger BREX complexes are required for methyltransferase activity and defence. Through rational engineering of PglX, we broadened both the range of phages targeted, and the host motif sequences that are methylated by BREX. Our data demonstrate that PglX is the sole specificity factor for BREX activity, providing motif recognition for both phage defence and host methylation.

Introduction

Bacteria have evolved a diverse range of defences to protect from bacteriophages (phages) and mobile genetic elements 1,2. Classic examples of host defence mechanisms include restriction-modification (RM) 3, abortive infection 4,5 and CRISPR-cas 6. Genes encoding these systems tend to co-localise into “defence islands” 7. Analysis of defence islands using a “guilt-by-association” approach has resulted in significant expansion of predicted and validated defence systems 8,9, including Bacteriophage Exclusion (BREX) 10, CBASS 11, BstA 12, retrons 13, viperins 14, pycsar 15 and PARIS 16. Whilst the combinations of phage defence systems encoded in any island can differ, there is evidence that conserved regulatory systems, such as the BrxR family, control defence expression, perhaps mediating robust defence against a broad spectrum of invaders 17–19.

BREX genes are found in 10% of bacterial and archaeal genomes 10. BREX is related to Phage Growth Limitation (Pgl) (22) and was first identified through analysis of genes neighbouring pglZ, performed to locate likely defence genes 10. Together with gmrS/gmrD, which encode a Type IV restriction enzyme, BREX genes form one of the most common defence island pairings 7,21. We have recently demonstrated that a defence island encoded on a multidrug-resistant plasmid of Escherichia fergusonii provides complementary phage defence using BREX and a GmrSD homologue, BrxU 22. There are six BREX sub-types, and type I BREX contains six genes: brxA, brxB, brxC, pglX, pglZ and brxL 10. BrxA is a DNA-binding protein 23, and BrxL is a DNA-stimulated AAA+ ATPase 24. PglX has sequence and structural homology to methyltransferases and is hypothesised to methylate non-palindromic 6 bp sequences (BREX motifs) on the N6 adenine at the fifth position of the motif 10,22,25, allowing discrimination between self and non-self DNA. Interestingly, it has been shown that Ocr from phage T7, a protein that mimics dsDNA 26, can inhibit BREX activity through binding to PglX 27. Whilst reminiscent of RM systems, the mechanism of BREX activity remains unclear.

The stySA locus from Salmonella enterica serovar Typhimurium 28 (also known as SenLT2III) was recently re-constructed in an attenuated lab strain of S. Typhimurium (LT2) and shown to have BREX activity 29. In 2017, invasive non-typhoidal Salmonella (iNTS) disease was responsible for 77,500 deaths globally, of which 66,500 deaths occurred in sub-Saharan Africa 30. A high proportion of African iNTS cases are caused by S. Typhimurium ST313 31,32. Representative ST313 strain D23580 31 encodes a BREX locus that is closely related to the LT2 BREX locus (Fig. 1a), comprising a defence island formed from an amalgamation of the type I BREX system and PARIS 16. The D23580 BREX defence island lacks the additional upstream and regulatory genes observed in the E. fergusonii type I BREX defence island 22.

The relative simplicity of the Salmonella BREX system and the clinical relevance of the host strain prompted us to test the effects of the D23580 BREX defence island against environmental Salmonella phages. The D23580 BREX phage defence island was then characterised through systematic gene deletions in an E. coli background, to allow use of the Durham phage collection 33 in identifying the determinants of phage defence and PglX-dependent host methylation. We present the first X-ray crystallographic structural characterisation of PglX. We also present the first X-ray crystallographic structural characterisation of PglX bound by the DNA mimic Ocr. Through rational engineering of PglX it was possible to alter the BREX motif recognised for methylation and phage defence. Our structural and biochemical analyses support PglX being the BREX methyltransferase and suggest modes of DNA-binding. Our data also definitively show PglX is the sole specificity factor in BREX phage defence, providing motif recognition for both phage targeting and host methylation.
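
Because BREX motifs are non-palindromic, a motif and its reverse complement are different strings, so candidate sites have to be searched on both strands. The following is a minimal sketch of that search, assuming the GATCAG motif reported for the Salmonella system in the Results and a modified adenine at the fifth position. The function names and the short test sequence are hypothetical and are not taken from the authors' analysis pipeline.

```python
# Illustrative sketch only: locate non-palindromic motif sites on both strands.
# Function names and the example sequence are hypothetical.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_motif_sites(seq: str, motif: str = "GATCAG", mod_index: int = 4):
    """List (strand, forward-strand coordinate) of the modified base.

    mod_index is the zero-based position of the modified base within the
    motif (4 = fifth position). Coordinates are zero-based on the forward
    strand, so a '-' hit points at the T that pairs with the modified A.
    """
    seq = seq.upper()
    k = len(motif)
    rc_motif = reverse_complement(motif)
    sites = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window == motif:
            sites.append(("+", i + mod_index))
        if window == rc_motif:
            # Motif position j on the reverse strand maps to i + (k - 1 - j).
            sites.append(("-", i + k - 1 - mod_index))
    return sites


example = "TTGATCAGTTCTGATCTT"  # hypothetical sequence with one site per strand
sites = find_motif_sites(example)
print(sites)
```

Reporting all coordinates on the forward strand keeps the output easy to compare against per-base modification calls, which are usually indexed against the reference sequence.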

Results

The Salmonella D23580 BREX phage defence island provides protection against environmental Salmonella phages

The BREX phage defence island from Salmonella enterica serovar Typhimurium ST313 strain D23580 (referred to as D23580 from now on) encodes two phage defence systems, type I BREX 10 and PARIS 16, collectively “BREXSty” (Fig. 1a). The SalComD23580 RNA-seq-based gene expression compendium (http://bioinf.gen.tcd.ie/cgi-bin/salcom_v2.pl?_HL) shows that the defence island is expressed constitutively at the transcriptional level during exponential growth in LB and minimal media, and within murine macrophages 34. Differential RNA-seq (dRNA-seq) was used to identify a promoter upstream of brxA (STMMW_44431) at location 4773879 on the D23580 chromosome, which drives transcription of the BREX-PARIS island 34 (Fig. 1a).

Also known as StySA 28, the ~15.7 kb D23580 BREXSty phage defence island has two synonymous point mutations in pglX compared to the model S. Typhimurium ST19 strain LT2. The BREX island has recently been studied in the S. Typhimurium-derived strain ER3625. Phage transduction was used to construct ER3625 as a genetic hybrid between S. Abony 803 strain and S. Typhimurium in the 1960s, and the strain has recently been sequenced 35. In comparison to D23580, the defective BREX phage defence island of S. Typhimurium strain ER3625 had a further 12 point mutations, of which 7 were distributed throughout pglZ, and 5 in the 3′-terminal section of brxC 29.

The contiguous PARIS defence systems mediate an abortive infection response in the presence of the anti-BREX and anti-restriction protein Ocr 16. The co-localisation of the PARIS genes ariAB within BREXSty raises the possibility that the BREX and PARIS defences work together in S. Typhimurium. Our first aim was to confirm BREXSty activity in D23580.

To assess phage defence in D23580 we needed to isolate Salmonella phages. As phages isolated on D23580 wild type (WT) would be inherently resistant to BREXSty, we first used a genetic approach to generate a strain of D23580 that lacked BREXSty. The ST313 strain D23580 encodes 5 prophages that encode their own antiphage systems, including the prophage BTP1-encoded BstA 12. To reduce interference from other antiphage systems, we began with the D23580Δφ mutant strain that lacks the five major prophages. The entire BREXSty defence island, including PARIS, was then removed from D23580Δφ using scar-less λ red recombination (Fig. S1) 36, resulting in strain D23580ΔφΔBREX 37.

Sewage effluent was obtained direct from source with the assistance of Northumbrian Water, and was used for phage enrichment on D23580ΔφΔBREX. A range of plaques were obtained after these enrichments, and 8 phage lysates were prepared following rounds of purification from visually distinct plaques. Activity of the D23580 BREX defence island was confirmed using EOP assays with the 8 Salmonella phage isolates, testing the ability of the phages to plaque on D23580Δφ, with D23580ΔφΔBREX as the control (Fig. 1b). An EOP value of less than 1 indicates that a phage is less efficient at forming plaques on the test strain compared to the control. Phages KMP, SB58 and SL2K had an EOP of <1, with a reduction in plaquing of ~100-fold compared to controls, indicating sensitivity to BREXSty (Fig. 1b). Phage DB1 was more weakly affected, with an EOP of 0.13 (Fig. 1b).
The remaining four phages appeared unaffected by activity of BREXSty, with EOPs ~1 (Fig. 1b). These data confirm that the BREXSty defence island of D23580Δφ can provide active anti-phage activity in Salmonella.

Impact of Salmonella D23580 BREX phage defence island gene deletions on phage defence and methylation

Having investigated the impact of the D23580 BREX phage defence island, BREXSty, in the original Salmonella host, we investigated BREXSty in an E. coli background. The motivation for using this heterologous host was to allow direct comparison with the previously characterised BREX phage defence island from E. fergusonii 22, and use of our Durham collection of phages 33. E. coli is also a more tractable experimental model for future experiments within this study. BREXSty was sub-cloned in sections and then combined into plasmid pGGA by Golden Gate Assembly (GGA) 38, yielding plasmid pBrxXLSty that contained the entire BREX and PARIS defence island, namely the eight genes from brxA to brxL as depicted (Fig. 1a), under the control of the native promoters (Fig. S2). Plasmid pTRB507 is an equivalent empty vector control. Liquid cultures of E. coli DH5α WT, or cultures transformed with either pBrxXLSty or pTRB507, were infected with Durham phage TB34 33, or lab phage T7 (ATCC BAA-1025-B2) (Figs. 2a-c). Infected control cultures were lysed by both phages; the T7-infected cultures did not recover, whereas the TB34-infected cultures began to grow again at 10-12 hrs post-infection, presumably due to the selection of spontaneous TB34-resistant mutants (Figs. 2a and b). In the presence of pBrxXLSty, however, cultures infected with TB34 grew similarly to uninfected controls, whilst cultures infected with T7 were lysed (Fig. 2c). These findings show that BREXSty is active in an E. coli background, and demonstrate that pBrxXLSty provides defence against TB34, but not against T7.
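
The EOP values used throughout these experiments are ratios of plaque titres on a test strain and a control strain. A minimal sketch of that arithmetic follows, assuming titres are expressed in plaque-forming units per ml. The function names and the example titres are hypothetical and were chosen only so that the ratio reproduces an EOP of 0.13, the value quoted for phage DB1.

```python
# Illustrative sketch only: efficiency of plating (EOP) from plaque titres.
# The titres below are hypothetical, not measurements from this study.


def eop(test_titre: float, control_titre: float) -> float:
    """EOP = titre on the test strain / titre on the control strain."""
    if control_titre <= 0:
        raise ValueError("control titre must be positive")
    if test_titre < 0:
        raise ValueError("test titre cannot be negative")
    return test_titre / control_titre


def fold_reduction(eop_value: float) -> float:
    """Fold reduction in plaquing implied by an EOP (an EOP of 0.01 is 100-fold)."""
    if eop_value <= 0:
        raise ValueError("EOP must be positive to express a fold reduction")
    return 1.0 / eop_value


control_pfu_per_ml = 2.0e9   # hypothetical titre on the strain lacking the defence island
test_pfu_per_ml = 2.6e8      # hypothetical titre on the strain carrying it

example_eop = eop(test_pfu_per_ml, control_pfu_per_ml)
print(f"EOP = {example_eop:.2f}, fold reduction = {fold_reduction(example_eop):.1f}")
```

An EOP near 1 therefore means the defence island has no measurable effect on plaquing, while a ~100-fold reduction corresponds to an EOP of about 0.01.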

To investigate the role of each phage defence gene in protection against TB34 infection, we generated individual deletions of each D23580 BREX/PARIS gene in pBrxXLSty, and a double mutant that lacked both the ariA and ariB genes of the PARIS system (Fig. S2). E. coli DH5α cells were transformed with the mutant plasmids and liquid cultures of resulting strains were subsequently infected with TB34 and T7 (Figs. 2d-l). Deletion of brxA, brxB, brxC, pglX and pglZ abolished defence against TB34 (Figs. 2d-h). Our finding that deletion of brxL did not impact protection against TB34 revealed that BrxL is not required for the phage defence activity of BREXSty against TB34 (Fig. 2i). Deletion of ariA and ariB, either singly or together, also did not alter defence against TB34 (Figs. 2j-l).

Protection from infection by TB34 and T7 was then monitored using the quantitative EOP assay (Fig. 3a). BREXSty encoded on pBrxXLSty provided a moderate 100-fold reduction in TB34 plating efficiency and had no appreciable impact on T7 (Fig. 3a). The 100-fold reduction matches the scale of phage defence observed in Salmonella D23580Δφ against Salmonella phages (Fig. 1b).
Therefore, plasmid pBrxXLSty and BREXSty in the natural host chromosome provide a similar level of defence. Consistent with results obtained with liquid cultures, deletion of brxA, brxB, brxC, pglX and pglZ ablated phage defence in the EOP assay (Fig. 2; Fig. 3a). However, whereas deletion of brxL did not appear to impact protection in liquid cultures (Fig. 2i), the EOP measurements revealed a 10,000-fold enhancement of defence against TB34 in the absence of brxL compared to cells carrying pBrxXLSty WT (Fig. 3a). Individual deletion of PARIS genes ariA and ariB caused a 10-fold increase in phage defence, while the double ariA, ariB deletion had no additional impact (Fig. 3a). Collectively, these data demonstrate that TB34 is targeted by type I BREX in the BREXSty D23580 BREX defence island, and that unlike the E. coli and Acinetobacter BREX systems 17,25, BrxL is not necessarily a requirement for phage defence.

The EOP results of TB34 when tested against the brxL deletion and ariA, ariB double deletion strains prompted us to test a wider range of phages. Using the Durham collection of 12 coliphages 33, we re-tested all phages against pBrxXLSty, pBrxXLSty-ΔbrxL and pBrxXLSty-ΔariAΔariB (Fig. S3). Phages TB34, Alma, BB1, CS16, Mav and Sipho had 10- to 100-fold reduced EOPs on pBrxXLSty, compared to empty vector controls (Fig. S3). The brxL deletion caused a range of impacts. In some cases we observed enhanced defence (TB34, Alma, Sipho), but in other cases there was no difference to an already susceptible phage (BB1, CS16, Mav) (Fig. S3). With phage Pau, against which BREXSty WT had little effect, the brxL deletion enhanced defence (Fig. S3). Other phages unaffected by the WT pBrxXLSty plasmid were also not impacted by pBrxXLSty-ΔbrxL (Fig. S3).
In contrast, the pBrxXLSty-ΔariAΔariB construct generally produced similar EOP values compared to pBrxXLSty WT, though there was an approximate ten-fold further reduction in EOP for phages Alma and Sipho (Fig. S3), and there was one major difference where the ariA, ariB double deletion massively reduced the EOP of BB1 compared to pBrxXLSty WT (Fig. S3). These data show that the PARIS system was itself not active against any tested phage, and that deletion of brxL has phage-dependent impacts on defence (Fig. S3).

Having performed systematic analysis of gene deletions on phage defence, we then investigated a second BREX phenotype: DNA methylation. PglX methyltransferases from type I BREX loci generate N6-methylated adenines (N6mA) at the fifth position within 6-bp non-palindromic motif sequences of host DNA 10,22,25. Restoring active function of the Salmonella LT2 StySA BREX system identified GATCAG as the target motif sequence 29. We explored the use of the MinION next-generation sequencing system to detect N6mA methylation patterns. Previously, we performed this type of analysis using methylation-deficient E. coli ER2796 39 in order to reduce background methylation. However, we were unable to transform strain E. coli ER2796 with our pBrxXLSty constructs, perhaps because the defence island impacted upon bacterial fitness in the absence of methylation. We therefore used E. coli DH5α strains, noting that the background GATC methylation might interfere with detection of the proposed GATCAG BREX methylation motif. Total genomic DNA was extracted from each strain and sequenced by MinION. E. coli DH5α pBrxXLEferg, encoding the BREX phage defence island from E. fergusonii, was used as an initial positive control to ensure the methylation detection procedure was working. We successfully identified the GCTAAT methylation motif (Fig. S4a), as previously reported 22. To confirm the Salmonella BREX motif we used a baseline control, wherein the pBrxXLSty WT sample was subjected to whole genome amplification (WGA), which should remove DNA modifications. The WGA sample contained the lowest detectable level of methylated GATCAG sequences, 12.87%, whilst pBrxXLSty WT showed GATCAG methylation at 78.78% of sites, confirming that D23580 BREX produces N6mA at GATCAG sequences (Fig. 3b; Fig. S4b). The brxA, brxB, brxC, pglX and pglZ mutants showed reduced numbers of GATCAG methylation sites (Fig. 3b), indicating that all five gene products are required for methylation. This finding is consistent with results involving the Acinetobacter BREX 17, but differs from those obtained with E. coli BREX; the E. coli brxA was not required for methylation in conditions of arabinose-induced BREX expression 25. In S. Typhimurium BREX, deletion of brxL did not reduce methylation (Fig. 3b) and the ariA, ariB and double mutants showed approximately WT levels of methylation (Fig. 3b).

The observed changes in methylation levels identified the genetic requirements for BREX-mediated methylation.
However, the data did not agree with quantitative data on BREX methylation obtained previously from Pacific Biosciences (PacBio) sequencing 22. To perform a direct comparison, we used the same 12 strains to generate samples for PacBio sequencing (Fig. 3b). The PacBio results were more robust than those from MinION, with 0% of motifs modified in the WGA sample and 100% of motifs modified with pBrxXLSty WT. The BREX mutants also showed either no, or near-saturated, methylation (Fig. 3b). The PARIS deletions resulted in close to WT levels of methylation by PacBio (Fig. 3b), indicating that PARIS is not involved in the observed methylation. These data show the genetic requirements for D23580 BREX-dependent host methylation and demonstrate the utility of two sequencing platforms when examining N6mA modifications.

Structure of PglX shows SAM binding for methyltransferase activity

It has not been understood how BREX systems recognize their cognate motifs. The likely candidate protein, shown to be essential for methylation and defence, was the conserved PglX putative methyltransferase. The closest structural homologue to the AlphaFold predicted structure of PglX in the PDB database is the Type IIL RM enzyme, MmeI 40, though domains are missing. As a result, in order to learn more about BREX motif recognition, the structure of Salmonella PglX was sought through X-ray crystallography. Following crystallization and data collection, an AlphaFold model of PglX was used as a search model for molecular replacement, assisting the solution and refinement of the crystallographic structure of Salmonella PglX bound to S-adenosyl-L-methionine (SAM), a co-factor for methylation, to 3.4 Å (Fig. 4; Table 1).

The crystal structure contains two copies of PglX in the asymmetric unit, the smallest repeating unit of the crystal.
However, the arrangement of the two copies allows only weak interactions that are likely formed due to interactions within the crystal rather than being biologically significant. The architecture of PglX presents two distinct domains, N-terminal and C-terminal, linked by a central short hinge region (residues 659 – 654) (Figs. 4a and b). Due to absence of available density, two short loop regions were unable to be modelled (residues 53 – 56 and 418 – 420), but otherwise the full PglX protein was resolved. SAM was also resolved bound within PglX (Fig. 4c).

The closest structural homologue for the solved PglX structure, as designated by the DALI server 41, remains the Type IIL restriction-modification system, MmeI (PDB 5HR4; Z-score 20.3). MmeI demonstrates both N6mA DNA methyltransferase and DNA restriction activity 40, but the MmeI structure only has 60.8% sequence coverage against PglX (1225 residues and 745 residues for PglX and MmeI, respectively), and aligns to PglX with an RMSD of 7.13 Å (Fig. S5a). The majority of this alignment falls within the N-terminal domain of PglX and bridges the hinge region, extending into the C-terminal domain.
The MmeI structure shows a methyltransferase domain bound to the SAM analog sinefungin 40, and in our PglX structure SAM binds within the same pocket (Fig. 4). Within this homologous domain of PglX (residues 227 – 661) sit the amino-methyltransferase motif I GxG residues implicated in SAM binding (residues 315 – 317), and adenine-specific motif IV responsible for interacting with a flipped-out adenine base from the target DNA (NPPY; residues 509 – 512) (Fig. 4; Fig. S5b). The presence and organisation of these motifs around the SAM molecule (Fig. 4c) is indicative of a γ-class amino-methyltransferase 42, consistent with its homology to MmeI 40. Though MmeI has both methyltransferase and restriction activities, the MmeI nuclease domain (residues 1 – 155) was not resolved in the MmeI structure 40. The nuclease domain of MmeI is separated by a helical linker. The N-terminal domain of PglX contains a similar linker and an N-terminal helical bundle (residues 1 – 227), but no nuclease domain (Figs. 4a and b). Assessing conservation between homologs in the UniRef database using ConSurf 43, the MmeI-like DNA methyltransferase region of PglX appears highly conserved compared to the N-terminal helical bundle domain (Fig. S5c). Using DALI to search for structural homologues of the C-terminal domain alone (residues 672 – 1221) returns Type I RM specificity subunits. The immediate section of the C-terminal domain of PglX aligns with target recognition domains (TRD) required for motif binding (residues 662 – 849). This is followed by two long spacer helices (residues 850 – 960) that mimic dimerized spacers found in specificity factors of Type I DNA methyltransferases such as EcoKI 44 (Fig. 4a and b). The spacers lead to a final C-terminal region of unknown function (residues 961 – 1225).
Interestingly, the spacer and C-terminal regions extend 320 residues beyond the end of the alignment with MmeI and show a high degree of conservation (Fig. 4a and b; Fig. S5a and c). This might suggest a specialised function conserved to allow BREX activity, perhaps as a binding surface for other BREX components. Overall, the PglX structure, together with the lack of nuclease motifs and of potential aligned catalytic residues, supports PglX acting as a methyltransferase only, and not as a restriction enzyme.

With expression and purification methods established, and the structure supporting PglX as the BREX methyltransferase (Fig. 4), a SAM-dependent methyltransferase assay was performed to assess the ability of purified PglX to methylate DNA in vitro. Using E. coli DH5α genomic DNA known to contain the target BREXSty motif as a substrate, PglX was added and incubated for 30 min at room temperature in a buffer containing SAM. Methyltransferase activity was measured indirectly via the reaction product, S-adenosyl-L-homocysteine (SAH). No methylation was apparent from PglX under these conditions (Fig. S6). We hypothesize that PglX methyltransferase activity likely requires the presence of other BREX components.

Salmonella BREX can be inhibited by Ocr homologues through binding PglX

Ocr is the T7-encoded restriction system inhibitor that blocks phage defence activity of the E. coli BREX system 27. Additionally, Ocr triggers Abi by the type II PARIS phage defence system 16. BREXSty also encodes a homolog of PARIS (Fig. 1a).
Notably, though, no activity was observed for BREXSty against phage T7 (Fig. 2 and Fig. 3a). Following the production of individual gene knockouts, it was possible to individually assay inhibition of BREX and activation of PARIS by Ocr. To determine whether Ocr inhibited BREX, vector pBAD30-ocr was generated. EOP assays were then carried out with E. coli DH5α pBrxXLSty-ΔariAΔariB pBAD30-ocr and showed that expression of Ocr fully inhibited BREX defence (Fig. 5a). As Ocr is a product of T7, a coliphage, this experiment was also repeated using an Ocr homologue, Gp5, encoded by Salmonella phage Sp6 45. Homology was inferred by protein sequence searches using BLAST (NP_853565.1: 78.6% sequence similarity, 88% coverage) followed by predictive modelling from protein sequence using AlphaFold 46. The structures of Ocr and Gp5 aligned with an RMSD of 0.91 Å. We again selected TB34 as a model phage and tested Gp5 activity. Results showed that Gp5 also fully inhibited the phage defence mediated by pBrxXLSty (Fig. 5a).

As we had demonstrated inhibition of BREX by overexpression of the inhibitors Ocr and Gp5, it was postulated that the same experimental system might elicit phage defence mediated by the PARIS system. This time, the pBrxXLSty-ΔpglX strain was used for co-expression of Ocr or Gp5, as this strain is deficient for BREX phage defence but retains the PARIS system. The resulting EOP assays did not show PARIS-dependent defence activity against TB34 (Fig. S7). We are therefore yet to find conditions that stimulate activity of the Salmonella PARIS system.

We then aimed to recreate a PglX:Ocr complex 27, using our purified Salmonella PglX. The solution state of native PglX was determined using analytical SEC. PglX eluted from the SEC column at 15.55 ml (Fig. S8a), which indicated a size of ~150 kDa, matching the 143 kDa calculated weight of PglX.
These data indicate that PglX exists as a monomer in solution, supporting our conclusions from the PglX-SAM structure (Fig. 4). Analytical SEC was then performed to determine whether Ocr directly interacts with the Salmonella PglX. The Ocr sample was first examined by analytical SEC in isolation (Fig. S8a). Whilst the Ocr SEC profile appeared to have multiple species, there was a dominant peak at 15.9 ml and a shoulder at 18 ml. Ocr is known to be a dimer in solution 26,47, which would be 27.6 kDa and correspond to the 18 ml peak, leaving the 15.9 ml peak unidentified. Purity of the Ocr sample had previously been confirmed by mass spectrometry and SDS-PAGE (Fig. S8b and c). PglX and Ocr were then combined at a 1:2 molar ratio prior to SEC (Fig. S8a). The combined sample produced additional peaks beyond those from the individual PglX and Ocr samples (Fig. S8a). Of particular interest was the peak at an elution volume of 14.2 ml that indicated a large complex of approximately 379 kDa, potentially composed of at least two copies of PglX, and Ocr dimers (Fig. S8a). Elution volume is dependent on protein molecular weight, and can also reflect the shape and size of the protein molecule itself.
The hydrodynamic radius of the PglX-Ocr complex seen by analytical SEC can be calculated from the observed Kav value 48, allowing comparison to the calculated hydrodynamic radius of predicted PglX:Ocr complex models produced by AlphaFold 49. An AlphaFold model of two PglX monomers and one Ocr dimer gave a predicted hydrodynamic radius of 58.3 Å, compared to a calculated hydrodynamic radius of 63.9 Å for the observed analytical SEC peak. This suggested that the additional peak eluting at 14.2 ml represented a PglX-Ocr heterotetramer in solution.

PglX forms a heterotetrameric complex with inhibitor Ocr

To investigate the mechanism of BREX inhibition by Ocr, efforts were made to produce a structural model via X-ray crystallography. PglX-SAM and Ocr were mixed at a 1:2 molar ratio and incubated prior to setting crystallisation trials. After data collection and merging, and using our previously derived PglX-SAM structure (Fig. 4) and the PDB structure of Ocr (1S7Z) as search models, the PglX-SAM:Ocr structure was solved to 3.5 Å (Figs. 5b and c; Table 1).

Within the asymmetric unit, PglX-SAM binds to a protomer of Ocr as a 1:1 complex, with the single protomer of Ocr binding along the negatively charged region of the C-terminal domain of PglX. Data on the solution state of Ocr (a dimer), coupled with our predictions of complex size by analytical SEC, indicated that PglX:Ocr should form a larger complex. Indeed, when we searched for crystallographic symmetry mates that showed packing of PglX-SAM:Ocr, the predicted complex was visible (Figs. 5b and c). In this complex, the Ocr protomers perfectly align and abut each other, forming the equivalent of a solution state dimer, and the size matches our analytical SEC. We therefore concluded that this heterotetrameric form represented the solution state of the PglX-SAM:Ocr complex (Figs. 5b and c).
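The size estimate from analytical SEC referred to above rests on the partition coefficient, Kav = (Ve - V0) / (Vt - V0), where Ve is the elution volume, V0 the void volume and Vt the total column volume. The following is a minimal sketch of that calculation, together with one commonly used calibration in which sqrt(-log10(Kav)) is treated as linear in the hydrodynamic (Stokes) radius of the standards. The exact procedure of reference 48 is not reproduced here; the function names, the void and total volumes, and any calibration standards are illustrative assumptions, not values from this study.

```python
import math


def k_av(elution_ml, void_ml, total_ml):
    """Partition coefficient K_av = (Ve - V0) / (Vt - V0)."""
    if not void_ml < total_ml:
        raise ValueError("void volume must be smaller than total column volume")
    return (elution_ml - void_ml) / (total_ml - void_ml)


def _calibration_x(kav):
    """Transform K_av to sqrt(-log10(K_av)); only defined for 0 < K_av < 1."""
    if not 0.0 < kav < 1.0:
        raise ValueError("K_av must lie strictly between 0 and 1")
    return math.sqrt(-math.log10(kav))


def fit_line(xs, ys):
    """Ordinary least-squares fit returning (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def stokes_radius(elution_ml, void_ml, total_ml, standards):
    """Estimate a hydrodynamic radius (in Å) from a standards calibration.

    standards: list of (elution_ml, radius_angstrom) pairs measured on the
    same column. These must be supplied by the user; none are assumed here.
    """
    xs = [_calibration_x(k_av(ve, void_ml, total_ml)) for ve, _ in standards]
    ys = [radius for _, radius in standards]
    slope, intercept = fit_line(xs, ys)
    return slope * _calibration_x(k_av(elution_ml, void_ml, total_ml)) + intercept


if __name__ == "__main__":
    # 14.2 ml is the elution volume reported in the text.
    # V0 = 8.0 ml and Vt = 24.0 ml are placeholder column parameters (assumed).
    print(k_av(14.2, void_ml=8.0, total_ml=24.0))
```

With real calibration standards run on the same column, `stokes_radius` would return a value that can be compared with the radius predicted from a structural model.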
Within PglX, there were again two regions of the sequence which could not be modelled due to insufficient density (residues 54–55 and 413–420). The latter is an extended gap in the same region as a smaller gap in the PglX-SAM structure (D418–F420), suggesting flexibility in this region. Also visible in the PglX-SAM:Ocr structure is a bound SAM molecule, in the same ligand binding position as seen in the PglX-SAM structure (Figs. 4 and 5). The exact orientation of ribose and methionine components of the molecule varied slightly, though this is likely due to variation in manual positioning of the molecule during refinement, as well as the resolution. The PglX molecules from the PglX-SAM and PglX-SAM:Ocr structures align closely with an RMSD of 1.34 Å, suggesting that binding of Ocr does not elicit any substantive domain movement (Fig. S9). Important residue interactions for Ocr binding were inferred using EMBL PISA 50. The complex is stabilised by a number of hydrogen bonds between Ocr and the C-terminal domain of PglX (Fig. 5d). Six salt bridges are produced between R79, N35, N42, N62, N76 and Q109 of Ocr and N1213, K1201, K1097, K1070, K1110, and K516 of PglX, respectively (Fig. 5d). Though no movement is observed in PglX, the binding of Ocr to Type I RM complexes elicits domain movement similar to DNA binding, suggesting either that PglX domain movement is reliant on interactions with other BREX components, or that DNA binding occurs along the C-terminal domain prior to movement towards the methyltransferase N-terminal domain.
If other BREX components are required for such movement, the finding would be consistent with the lack of methyltransferase activity in vitro in the absence of other BREX components (Fig. S6) or the lack of methyltransferase activity from PglX alone in vivo 25. Collectively, these data suggest that Ocr acts as a DNA mimic, capable of sequestering PglX and therefore blocking BREX activity by preventing recognition of target DNA.

Structural comparisons show multiple potential modes of DNA binding by PglX

Ocr mimics the structure of 20-24 bp of bent B-form DNA 26, as shown by the binding of both molecules to the EcoKI methyltransferase complex 44. Using the DNA-bound (PDB 2Y7H) and Ocr-bound (PDB 2Y7C) complexes of EcoKI, the Ocr and DNA molecules were superimposed onto each other. The Ocr molecule in the PglX-SAM:Ocr structure was then aligned with the Ocr molecule in 2Y7C, effectively aligning the B-form DNA from 2Y7H to the Ocr molecule in the PglX-SAM:Ocr structure (Fig. S10a). There does appear to be enough space for an extended DNA molecule to pass through the groove in the hinge region in this orientation, but Ocr is not long enough to extend through this region (Figs. 6a and b; Fig. S10b). This implicates the C-terminal domain in DNA binding, though raises the possibility of an alternative DNA binding orientation.

The surface charge of PglX was calculated using the APBS software plugin 51 and modelled in PyMOL 52 to attempt to predict alternate DNA binding positions (Fig. S11a). Notably, PglX displayed a large positively charged surface area in the hinge region between the N- and C-terminal domains, extending further along the inside of the C-terminal domain. As MmeI was solved in a DNA-bound state (PDB 5HR4), we could superpose these two structures and remove MmeI, leaving the DNA molecule positioned within the positively charged hinge region of PglX (Fig.
6b; Fig. S11b). Notably, the angle of the superimposed DNA molecule from the MmeI structure (PDB 5HR4) differs from the previously identified angle of the 2Y7H DNA molecule (Fig. 6b). Further to this, the DNA molecule from the MmeI structure contained an adenine base which had been flipped out of the DNA molecule, in preparation for methyl transfer. Looking at the position of the superimposed MmeI DNA molecule, this adenine base is positioned close to the SAM molecule in PglX (Fig. S11b). Together, these data suggest that PglX might bind DNA within the hinge region in a similar conformation to that seen with MmeI, though the exact orientation of the DNA molecule may shift around the position of the adenine base. In support of this prediction, the donated methyl group of the SAM is not quite positioned correctly for transfer to the flipped adenine (Fig. S11b). In this model, unlike for Ocr mimicking DNA, the distal C-terminal region of PglX remains largely removed from the DNA molecule, though binding of DNA may require, or produce, a conformational change in PglX that brings this domain closer to the DNA.
PglX can be rationally engineered to alter phage target and methylation motif

Rational engineering of PglX could potentially allow for a BREX system to be targeted against a different set of phages, and for the generation of specific methylation patterns. To this end, protein sequences from BREX-related methyltransferases with assigned DNA recognition motifs were collected and added to the sequences of BREX methyltransferases identified in the REBASE RM database 53. BLASTp was then used to find 32 distinct sequences that displayed high sequence similarity scores to PglX (<E100) (Fig. S12). Most of the predicted motifs from REBASE were inferred by matching the BREX methyltransferase to an N6mA modification observed in genomic sequencing data. MmeI is the closest structural homologue of PglX and the residues essential for motif recognition have been identified from structural data 40. As with PglX, MmeI recognises a 6 bp motif (TCCRAC) and produces N6mA modifications at the adenine in the fifth position. Structural alignments of MmeI and PglX allowed identification of the residues of PglX that aligned with the residues involved in MmeI motif recognition and suggested regions in which to focus the search for covariation in BREX methyltransferase sequence alignments. Candidate residues and alterations were then chosen based on these alignments. For example, for motif position -1 (relative to the modified adenine base), lysine was conserved at residue 802 for enzymes recognising cytosine at this position, histidine was conserved at residue 838 for enzymes recognising guanosine at this position, and asparagine was conserved at residue 838 for enzymes recognising adenine at this position (Fig. S12).
We designed 23 mutants that altered all five of the non-modified base positions in the PglX recognition motif (Supplementary Table S1). The regions targeted for mutation were overlaid on our structures and shown to cluster mainly within the TRD of PglX (between residues 684–838), with one additional loop (residues 591–600) within the methyltransferase domain (Fig. 6c).

Following the design of the PglX mutants, an assay system was required to test function. Generating each of the mutants individually in the 17.9 kb pBrxXLSty plasmid would have been costly and time consuming. Instead, a complementation system was designed that utilized the pBrxXLSty-ΔpglX construct. The BREXSty pglX gene was cloned into pBAD30. Complementation of the pBrxXLSty-ΔpglX construct with the pBAD30-pglX plasmid in EOP assays provided phage defence against TB34, albeit slightly lower than that seen from the E. coli DH5α pBrxXLSty construct (Fig. 7a). Next, a marker was required to indicate whether the recognition motif had been modified. Again, it was preferable to initially test this through functional EOP assays, as sequencing for methylation changes caused by all 23 mutants would be laborious and expensive. Fortunately, the activity of pBrxXLSty had already been characterised against the Durham Phage Collection and phages in this collection had been sequenced to allow enumeration of BREX recognition motifs 33. This allowed the identification of one phage, Trib, which was susceptible to E. coli and E. fergusonii BREX systems but contained no native Salmonella D23580 BREX recognition motifs and therefore was not impacted by BREXSty (Fig. 7a) 33. Trib did, however, encode all of the predicted re-engineered motifs (Supplementary Table S1).
This finding allowed us to first screen all mutants for phage defence activity against phage Trib before determination of the recognition motif of any active mutants by sequencing.

EOP assays were carried out in triplicate for all 23 pBAD30-pglX mutants co-expressed with the pBrxXLSty-ΔpglX construct in E. coli DH5α (data not shown). Mutant 3 appeared to provide around 10-fold protection against Trib (Fig. 7a), similar to phage defence levels provided by BREXEferg against this phage 33. Mutants 8, 10, 15 and 22 showed sporadic reductions in EOP, usually around two-fold. Mutant 4 consistently produced poor overnight growth and failed to provide sufficient bacterial lawns for plaque enumeration, even after increasing the inoculum volume. The remaining mutants showed no noticeable reduction in plaquing efficiency. To confirm the BREX system remained functional against other targets, mutants 3, 8, 10, 15 and 22 were also assayed against phage TB34. Mutant 3 caused a reduction in EOP for TB34 similar to that shown against Trib, though around two-fold higher than produced by the E. coli DH5α pBrxXLSty strain (Fig. 7a).
The remaining 18 mutants did not show any reduction in EOP against TB34, despite TB34 encoding the expected re-engineered motifs, and were deemed to be inactive. There was also a small reduction in BREX activity in the complemented system (Fig. 7a). Accordingly, the T802A and S838N mutations of mutant 3 were also generated directly within the pglX gene of pBrxXLSty, resulting in pBrxXLSty(pglX mut.3) that did not require complementation. This new construct was assayed against both TB34 and Trib. Now in context within the BREX locus, EOP values were reduced further for both TB34 and Trib against E. coli DH5α pBrxXLSty(pglX mut.3), though still not quite as low as shown by the activity of the WT BREX system against TB34 (Fig. 7a).

Next, the genomes of the E. coli DH5α pBrxXLSty(pglX mut.3) and E. coli DH5α pBrxXLSty-ΔpglX + pBAD30-pglX(mut.3) strains were sequenced by PacBio and genomic methylation levels were assessed, alongside the WT strains (Fig. 7b). The E. coli DH5α pBrxXLSty-ΔpglX + pBAD30-pglX control had almost 100% methylation at GATCAG sites, demonstrating that the complementation system mediated efficient methylation in comparison to pBrxXLSty (Fig. 7b). Analysis of the mutant 3 strains revealed methylation at almost 100% of GATMAG motifs (Fig. 7b). This indicated that the mutations of mutant 3, T802A and S838N, had not altered the recognised motif to GATAAG as predicted, but had broadened recognition to include both the original GATCAG motif and also GATAAG. These data collectively demonstrate the successful re-engineering of PglX to target BREX against new phages, and to methylate altered DNA sequence motifs. The experiments also demonstrated that PglX is the sole specificity factor in the BREX phage defence system, providing motif recognition for both phage targeting and host methylation.
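Enumerating motifs such as GATCAG, GATAAG or the degenerate GATMAG (M = A or C) in a phage or host genome requires searching both strands, because these motifs are non-palindromic. The following is a minimal sketch of such a count using IUPAC ambiguity codes. It is not the pipeline used in this study, and the function names are illustrative assumptions.

```python
import re

# IUPAC nucleotide codes expressed as regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "M": "[AC]", "K": "[GT]",
    "S": "[CG]", "W": "[AT]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# Complement table, including ambiguity codes (e.g. M = A/C pairs with K = G/T).
COMPLEMENT = str.maketrans("ACGTRYMKSWBDHVN", "TGCAYRKMSWVHDBN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def count_motif(sequence, motif):
    """Count occurrences of an IUPAC motif on both strands.

    A lookahead is used so that overlapping matches are counted.
    Intended for non-palindromic motifs; a palindromic motif would be
    counted once per strand, i.e. twice per site.
    """
    pattern = "(?=(" + "".join(IUPAC[base] for base in motif.upper()) + "))"
    forward = len(re.findall(pattern, sequence.upper()))
    reverse = len(re.findall(pattern, reverse_complement(sequence)))
    return forward + reverse


if __name__ == "__main__":
    # Toy sequence for illustration only (not from any genome in this study).
    toy = "GATAAGTTGATCAG"
    print(count_motif(toy, "GATCAG"), count_motif(toy, "GATMAG"))
```

Applied to a phage genome, a count of zero for GATCAG alongside a non-zero count for GATAAG would describe a phage that is expected to escape the wild-type motif specificity but remain a candidate target for a variant with broadened GATMAG recognition.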

Discussion

This study provides microbiological, genetic and epigenomic characterisation of the BREX phage defence island within Salmonella D23580. We present the first structures of the putative PglX methyltransferase, bound to SAM and in complex with the phage-derived inhibitor Ocr. Finally, we demonstrate successful rational engineering of BREX, opening up the potential for tailored phage targeting and generation of specific N6mA motifs. This work identifies PglX as the sole specificity factor for methylation and phage defence within BREX.

Clustered phage defence systems can provide additive 22 or even synergistic 54 protection. The Salmonella D23580 BREX phage defence island has an embedded PARIS system (Fig. 1a), suggesting a complementary relationship; PARIS has been shown to cause abortive infection upon encountering the phage-encoded anti-restriction protein, Ocr, which in turn inhibits BREX defence in E. coli 16,27. Using an E. coli model, we saw no activity from the Salmonella BREX phage defence island against Ocr-encoding phage T7 (Fig. 2). BREXSty had no impact because T7 does not encode any GATCAG motifs. PARIS also did not respond to Ocr (Fig. 2). Using an Ocr homologue from a Salmonella phage also did not activate PARIS (Fig. S7), and so we can only conclude that the PARIS system may provide protection, but that a susceptible phage has not yet been tested.

As in previous studies, Salmonella brxB, brxC, pglX and pglZ proved essential for both restriction and methylation (Fig. 3) 17,25. However, brxA was required for phage defence and methylation in Salmonella BREX (Fig. 3) and Acinetobacter BREX 17, but was shown to be dispensable for both activities in E. coli BREX 25. BrxA is a DNA-binding protein 23 with an unknown role in BREX activity, so we are yet to understand the variable requirement for brxA.
Salmonella brxL was demonstrated to be dispensable for host methylation (Fig. 3b) and this matches the observed phenotype in Acinetobacter and E. coli 17,25. Curiously, whilst brxL was essential for phage defence in both E. coli and Acinetobacter BREX systems 17,25, it was not required for Salmonella BREX (Fig. 3a). BrxL was recently shown to form a dimer of hexameric rings, forming a barrel-like structure that binds and translocates along DNA 24. Thus, BrxL had been considered to have an essential role as the “effector” for BREX phage defence. Clearly this is not the case in the Salmonella BREX system, which is made more apparent by EOP results for E. coli DH5α pBrxXLSty-ΔbrxL tested against the Durham phage collection (Fig. S3) 33. Deletion of brxL enhanced protection by several orders of magnitude for certain phages (Fig. S3). It is possible that Salmonella BrxL modulates or regulates BREX activity in some way. RM systems are often associated with restriction alleviation proteins that activate in times of stress, reducing restriction activity and increasing methylation activity; a phenotype characteristic of Type I RM systems 55–57. It is possible that BrxL plays an analogous role to restriction alleviation proteins within BREX and that defence activity increases in the absence of BrxL. However, if that were the case, why is this phenotype not observed for brxL deletions in E. coli or Acinetobacter BREX systems? Overexpression of a C-terminal fragment of BrxL has been shown to upregulate several genes elsewhere in the Salmonella genome, including certain prophage genes 29. It was postulated that because the corresponding Lon-like domain in the C-terminal BrxL fragment has similarity to the Lon-related C-terminal domain of RadA that is required for DNA branch migration in homologous recombination 58, BrxL may inhibit phage DNA replication at DNA forks. This would be somewhat in keeping with the model of BrxL complexes translocating along DNA. The brxL deletion data add to this model, as they suggest that whilst BrxL-dependent BREX defence may interrupt replication forks, other BREX components have another activity sufficient to prevent phage DNA replication.

To better understand the activity of other BREX components we produced the first structure of PglX, demonstrating that the N-terminal domain has a methyltransferase fold, and binds SAM (Fig. 4). In contrast, the fold, conserved residues, and surface properties of the C-terminal domain suggest a role in DNA recognition and binding. Despite repeated efforts we could not crystallize PglX with DNA. We hypothesised that Ocr binding might provide insight into DNA binding by PglX. We showed that Ocr and the Salmonella phage homologue Gp5 both inhibited BREX phage defence (Fig. 5a), and produced stable complexes of PglX:Ocr (Fig. S8a). The resulting structure involved the interaction of an Ocr dimer with two PglX monomers (Figs. 5b and c). The structure of PglX in the Ocr-bound complex varied little in comparison to the PglX-SAM structure, and there was no movement of domains upon Ocr binding. Using these two structures, we developed two models for DNA binding by PglX, via (i) alignment with a 20 bp DNA molecule represented by Ocr and (ii) alignment via DNA bound to MmeI (Figs. 6a and b; Fig. S10).
As the Ocr-bound structure only allows placement of a short, 20 bp, DNA molecule, this DNA interacts with the C-terminal domain but does not enter the hinge region between the N-terminal and C-terminal domains. The MmeI-bound DNA is positioned to interact with the hinge and TRD. Our data should aid the design of oligos for future structural studies of PglX bound to DNA, and supported our efforts to engineer BREX activity (Fig. 6c).

Rational engineering of PglX broadened motif recognition, allowing the Salmonella BREX to target new phages and methylate new BREX motifs (Fig. 7). We were able to switch recognition for position -1 (relative to the point of methylation). MmeI recognises guanine at this position using R810 to form a hydrogen bond with guanine in the major groove, and an A774L mutant was shown to prevent binding of an A-T base pair at position -1 through steric interference, switching specificity from R:Y to G:C 40,59. The T802A and S838N mutations in PglX mutant 3 correspond to the positions of the A774 and R810 residues in MmeI, respectively, and are within the TRD. As rapid adaptability and evolution are vital factors in the phage-bacteria arms race that increase survivability of the local population 60, it follows that PglX would be the target of variability as a means to alter BREX defence specificity. Indeed, phase variation is common in pglX genes, but not other BREX components 10,61.

The inability of PglX to perform methylation, either during our in vitro reaction or when recombinantly expressed in the absence of other BREX genes in vivo 25, implies higher order BREX complexes might be required. Such complexes could induce domain movements that would provide agreement with both proposed models of DNA binding. The arrangement of PglX monomers in the Ocr-bound structure is also potentially interesting, as a larger BREX complex might scan both sides of a dsDNA for the non-palindromic BREX motif by employing two PglX monomers, akin to the strategy used by Type III and some dimeric Type II RM systems. Clearly, further work is needed on BREX components and complexes to uncover mechanistic details. The current study demonstrates that PglX is the sole BREX specificity factor, responsible for both the recognition and targeting of individual BREX motifs for host methylation and the resulting prevention of phage replication.

Materials and methods

Bacterial strains

Strains used in this study are shown in Supplementary Table 2. We have described the Salmonella D23580Δφ strain previously 62. The Salmonella D23580ΔφΔBREX strain was generated as described previously 37, using scarless lambda red recombination (Fig. S1). Unless stated otherwise, E. coli strains DH5α (Invitrogen), BL21 (λDE3, Invitrogen) and ER2796 (NEB) were grown at 37 °C, either on agar plates or shaking at 220 rpm for liquid cultures. Luria broth (LB) was used as the standard growth media for liquid cultures, and was supplemented with 0.35% w/v or 1.5% w/v agar for semi-solid and solid agar plates, respectively. Growth was monitored using a spectrophotometer (WPA Biowave C08000) measuring optical density at 600 nm (OD600). When necessary, growth media was supplemented with ampicillin (Ap, 100 µg/ml) or chloramphenicol (Cm, 25 µg/ml). Protein was expressed from pSAT1 or pBAD30 plasmid backbones by addition of 0.5 mM isopropyl-β-D-thiogalactopyranoside (IPTG) or 0.1% L-arabinose, respectively.

Use of environmental phages

Phages used in this study are shown in Supplementary Table 2. Coliphages in the Durham phage collection have been described previously 33. For Salmonella phages, sewage effluent was collected from a sampling site in Durham, courtesy of Northumbrian Water Ltd. Filtrates were supplemented with 10 ml of LB, and inoculated with 10 ml of D23580ΔφΔBREX. Cultures were grown for 3 days before 1 ml aliquots were transferred to sterile microcentrifuge tubes and centrifuged at 12,000 x g for 5 min at 4 °C. The supernatants were transferred to new microcentrifuge tubes and 100 μl of chloroform was added to kill any remaining bacteria. Phage isolation was then carried out as previously described 33.
Plasmid constructs and cloning

Primers used in this study are shown in Supplementary Table 3, and plasmids used in this study are shown in Supplementary Table 4. Ligation independent cloning (LIC) was utilized to create protein overexpression plasmids from pSAT1-LIC and pBAD30-LIC, as described previously 63. This allowed the expression of fusion proteins with cleavable tags for efficient purification of recombinant proteins. The pBrxXLSty plasmid was created previously 33 and contains the entire Salmonella D23580 BREX coding region, including the region 508 bp directly upstream of the brxA start codon to ensure that any promoters and transcriptional regulatory sites required for BREX expression and function were included. The creation of individual gene knockouts utilized Gibson Assembly (GA) 64. Individual gene knockouts were designed within the context of the pBrxXLSty vector to allow direct comparison on the same plasmid backbone. PCR primers were designed to amplify the pBrxXLSty plasmid sequence either side of the gene to be removed (Supplementary Table 3). Primers were designed with overlapping regions to allow ligation of the amplicons via GA. GA designs consisted of 2-3 fragments of pBrxXLSty produced by PCR with primers containing 20 bp homologous overlaps from upstream and downstream of the gene to be removed. Knockouts were designed for each of the six BREX genes and each of the two PARIS system genes, ariA and ariB, alongside an additional double knockout of both PARIS system genes.
PCR-amplified and gel-purified fragments were pooled in an equimolar ratio to a final volume of 5 μl and added to 15 μl of assembly master mix. Reaction mixtures were incubated at 50 °C for 1 hr, then visualized on, and gel purified from, agarose gels. Resulting products of the correct size were used to transform E. coli DH5α and cells were plated on Cm agar plates and incubated at 37 °C overnight. Plasmids from resulting colonies were extracted and sequenced (DBS Genomics) to confirm correct assembly. Gene knockouts for which GA was not successful were instead synthesised by Genscript. Primers for GA protocols were synthesised by IDT and were designed using the Benchling cloning design software, available online (benchling.com).

DNA sequencing

All genomic DNA extraction steps in this study were carried out using either a Zymo Miniprep Plus kit (Cambridge Biosciences) or a Monarch gDNA extraction kit (NEB). Bacterial genomic sequencing was performed by either MinION Mk1C nanopore sequencing or PacBio sequencing.

For MinION sequencing, DNA repair and end prep, barcode ligation and adapter ligation steps were carried out according to Oxford Nanopore protocols (available at: community.nanopore.com) using the NEBNext Companion Module (New England Biolabs), Native Barcoding Expansions (EXP-NBD104 and EXP-NBD114) and ligation sequencing kit (SQK-LSK109), respectively. Sequencing was carried out using a MinION Flow cell (R9.4.1) on a MinION Mk1C. Following generation of raw sequencing data, basecalling was performed by the Guppy basecalling package (github.com/nanoporetech/pyguppyclient) either during sequencing or post sequencing and data were deconvoluted using the ont_fast5_api package (github.com/nanoporetech/ont_fast5_api).
Megalodon was used for the detection of modified bases and the estimation of genomic methylation levels, with a 0.75 probability threshold for both modified and canonical bases for read selection and average percentage methylation calculations.

Libraries for PacBio sequencing were prepared using the SMRTbell Template Prep kit 3.0 (Pacific Biosciences). Bacterial gDNA was sheared using gTubes (Covaris) to produce DNA fragments with a mean size of 5–10 kb. The DNA was damage repaired and end repaired. SMRTbell adapters were then ligated. Exonuclease treatment removed non-SMRTbell DNA. Sequencing was performed on a PacBio Sequel IIe (Pacific Biosciences). Data were analysed using PacBio SMRTAnalysis on SMRTLink_9.0 software Base Modification Analysis for Sequel data, to identify DNA modifications and their corresponding target motifs.

Growth and infection curves

Phage growth and infection curves were carried out to monitor phage resistance conferred by pBrxXLSty WT and pBrxXLSty mutants in liquid culture. Growth was carried out in 200 μl culture volumes at 37 °C with shaking in a 96-well plate format, with OD600 measurements taken every 5 min. Initial screening of inoculation and infection conditions produced optimal results with initial inoculation from overnight culture to OD600 0.1 and phage multiplicity of infection (MOI) of 10⁻⁶. As well as infection with phage TB34, a negative control (phage T7) and a positive control (uninfected culture) were also run for each strain. All strains other than E. coli DH5α WT were grown with 25 μg/ml Cm.
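To give a sense of scale for the infection conditions above, the number of phage particles per well follows from MOI = phage / cells. The following is a minimal sketch of that arithmetic. The conversion between OD600 and cell density is not stated in the text; the value of 8 x 10^8 cells/ml per OD600 unit used here is a commonly quoted approximation for E. coli and is an assumption, as are the function names.

```python
def cells_in_well(od600, culture_ul, cells_per_ml_per_od=8e8):
    """Approximate number of cells in a well.

    cells_per_ml_per_od is an assumed conversion factor (not from the text)
    and varies with strain, growth phase and spectrophotometer.
    """
    return od600 * cells_per_ml_per_od * culture_ul / 1000.0


def phage_per_well(od600, culture_ul, moi, cells_per_ml_per_od=8e8):
    """Number of plaque forming units needed to reach a given MOI."""
    return cells_in_well(od600, culture_ul, cells_per_ml_per_od) * moi


if __name__ == "__main__":
    # Conditions from the text: OD600 0.1, 200 µl culture, MOI of 1e-6.
    print(cells_in_well(0.1, 200))
    print(phage_per_well(0.1, 200, 1e-6))
```

Under the assumed conversion factor, an MOI of 10⁻⁶ corresponds to only a few tens of phage particles per well, so curves at this MOI report on multiple rounds of phage replication rather than a single synchronous infection.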

Efficiency of plating assays

Efficiency of plating (EOP) assays were carried out to assess the plaquing ability of phages in the Durham Phage Collection against E. coli DH5α pBrxXLSty and BREX knockout strains relative to control strains. Serial dilutions of high-titre lysates were prepared in phage buffer, mixed with overnight culture and molten 0.3% w/v agar, poured onto a 1% agar plate, dried and incubated overnight at 37 °C. For strains containing pBAD30 vectors, overnight cultures were induced with 0.2% w/v L-arabinose and incubated at 37 °C for 30 min prior to plating, and both top and bottom agar layers included 0.2% w/v L-arabinose to induce continuous expression over the course of lawn growth. The EOP was calculated by dividing the pfu (plaque forming units) of the test strain by the pfu of the control strain. Data shown are the mean and the standard deviation of at least 3 biological and technical replicates.

Protein expression and purification

All large-scale protein expression was performed in 1 L volumes of 2x YT broth in 2 L flasks with shaking at 180 rpm. In all cases, colonies from fresh transformation plates were used to inoculate 5 ml of 2x YT broth and grown overnight at 37 °C. This culture was then used to seed a 65 ml volume of 2x YT broth at 1 : 100 v/v and grown overnight at 37 °C to produce a second overnight culture. This second culture was then used to seed 1 L of 2x YT at a 1 : 200 ratio; cultures were grown at 37 °C until exponential growth phase (OD600 0.3–0.7), induced, and protein was expressed at 18 °C overnight.

All purification steps were performed either on ice or at 4 °C. Fast protein liquid chromatography (FPLC) steps were carried out at 4 °C using an Akta Pure protein chromatography system (Cytiva). Protein purity was assessed using SDS-PAGE.
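The EOP calculation described in the efficiency of plating section above can be sketched as follows. This is a minimal illustration, assuming plaque counts have already been converted to titres (pfu/ml) and that the reported standard deviation is the sample standard deviation; the function names and example titres are hypothetical and are not data from this study.

```python
from statistics import mean, stdev


def eop(test_pfu: float, control_pfu: float) -> float:
    """Efficiency of plating: pfu on the test strain divided by pfu on the control."""
    if control_pfu <= 0:
        raise ValueError("control strain titre must be positive")
    return test_pfu / control_pfu


def summarise_eop(replicates: list[tuple[float, float]]) -> tuple[float, float]:
    """Return (mean, sample standard deviation) of EOP over paired replicates."""
    if len(replicates) < 2:
        raise ValueError("need at least two replicates for a standard deviation")
    values = [eop(test, control) for test, control in replicates]
    return mean(values), stdev(values)


if __name__ == "__main__":
    # Hypothetical (test, control) titres in pfu/ml for three replicates.
    example = [(2.0e5, 1.0e9), (4.0e5, 1.0e9), (3.0e5, 1.0e9)]
    m, s = summarise_eop(example)
    print(f"EOP = {m:.2e} +/- {s:.2e}")
```

An EOP well below 1 indicates that the test strain restricts plaque formation relative to the control strain.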
Cells were harvested by centrifugation at 4000 rpm for 15 min at 4 °C and then resuspended in ice-cold A500 buffer (20 mM Tris HCl pH 7.9, 500 mM NaCl, 30 mM imidazole, 10% glycerol). Cells were lysed by sonication using a Vibracell VCX500 ultrasonicator, the soluble fraction was separated from insoluble cell material by centrifugation at 20000 x g for 45 minutes at 4 °C and the supernatant was removed to a fresh, chilled tube for purification. Soluble cell lysate was applied to a 5 ml pre-packed Ni-NTA His-Trap HP column (Cytiva) using a benchtop peristaltic pump at around 1.5 ml/min to allow binding of the 6xHis tag to the nickel resin. Columns were then washed with 5–10 column volumes (CVs) of A500 to remove residual unbound protein, and isocratic elution steps were performed using A500 buffer with imidazole concentrations adjusted to 30 mM, 50 mM, 90 mM, 150 mM and 250 mM. Clean samples were pooled, dialysed into low-salt A100 buffer (20 mM Tris HCl pH 7.9, 100 mM NaCl, 10 mM imidazole, 10% glycerol) and applied to a 5 ml HiTrap Heparin HP column (Cytiva), allowing separation of proteins with affinity for DNA. Bound protein was then washed with 5–10 CVs of A100 and eluted using a salt gradient with C1000 buffer (20 mM Tris HCl pH 7.9, 1 M NaCl, 10% glycerol). Clean fractions were then pooled and digested with human sentrin/SUMO-specific protease 2 (hSENP2) overnight at 4 °C to remove purification tags.
Samples were then applied to a second Ni-NTA His-Trap HP column, this time allowing the now untagged protein of interest to flow through and removing remaining nickel binding contaminants. Successful tag cleavage and subsequent protein purity was assessed by SDS-PAGE, with tag cleavage visible as a noticeable reduction in protein molecular weight relative to tagged protein. Finally, size exclusion chromatography (SEC) was used to separate proteins by size, using a HiPrep 16/60 Sephacryl S-200 SEC column (Cytiva) connected to the FPLC system. Protein samples were dialysed overnight at 4 °C into S500 buffer (50 mM Tris HCl pH 7.9, 500 mM KCl, 10% glycerol) and concentrated to a 500 μl volume. The column was pre-equilibrated in S500, and the sample was loaded through a 500 μl volume capillary loop at 0.5 ml/min. Sample was eluted over 1.2 CVs at 0.5 ml/min and fractionated into 2 ml volumes for analysis by SDS-PAGE. Purified protein from SEC was concentrated to around 6 mg/ml and diluted in storage buffer (50 mM Tris HCl pH 7.9, 500 mM KCl, 70% glycerol) at a 1 : 2 ratio of protein to buffer, respectively, giving a final concentration of around 2 mg/ml. Samples were split into appropriately sized aliquots, snap frozen in liquid nitrogen and stored at -80 °C for future use.

Protein crystallization and structure determination

Highly pure protein samples were used for crystallisation screening. Samples were either used immediately following purification or thawed on ice from -80 °C storage. Samples were dialysed into crystal buffer (20 mM Tris HCl pH 7.9, 150 mM NaCl, 2.5 mM DTT) and concentrated to 12 mg/ml. Protein concentration determination was performed using Nanodrop One (Thermofisher). Crystal screens were set using the sitting drop vapour diffusion method either by hand or using a Mosquito Xtal3 liquid handling robot (SPT Labtech). Crystal screens were incubated at 18 °C.
All commercially available crystal screens were produced by Molecular Dimensions. For PglX and SAM samples, PglX was incubated with 1 mM SAM (Sigma) for 30 minutes on ice prior to addition to screens. For PglX-SAM:Ocr samples, PglX underwent the SAM incubation as above plus an additional 30 minute incubation on ice with 2.74 mg/ml of Ocr. Ocr was recombinantly expressed and purified as previously described [26,47]. PglX-SAM crystallized in 0.2 M potassium bromide, 0.1 M Tris pH 7.5, 8% w/v PEG 20000, 5% w/v PEG 500. PglX-SAM:Ocr crystallized in 0.1 M sodium/potassium phosphate pH 6.2, 14% w/v PEG 4000, 6% MPD. Crystallization was confirmed by microscopy, with larger crystals extracted for X-ray diffraction. To harvest, 20 μl of screen condition was mixed with 20 μl of cryo buffer (25 mM Tris HCl pH 7.9, 187.5 mM NaCl, 3.125 mM DTT, 80% glycerol) and the solution was mixed thoroughly by vortexing. This solution was then added directly to the crystal drop at a 1 : 1 ratio. Crystals were extracted using nylon cryo loops and stored in liquid nitrogen until shipment. Data collection was carried out remotely at Diamond Light Source, Oxford, UK on beamlines I04 and I24, using their “Generic Data Acquisition” software (opengda.org).

Initial data processing was performed by automated processes on iSpyB (Diamond Light Source) using the Xia2-DIALS X-ray data processing and integration tool [65]. The same program was used to merge multiple datasets and provide initial data on the space groups and unit cell sizes.
Further data reduction and production of dataset statistics were carried out using AIMLESS within CCP4i2 [66]. Merged datasets were first processed in CCP4i2 using BUCCANEER and REFMAC [66], and then iteratively built and refined in Coot [67] and Phenix [68], respectively. Quality of the final model was assessed using a combination of CCP4i2, Phenix, Coot and the wwPDB validation server. Visualisation and structural figure generation were performed in PyMol [52]. For PglX, the crystal structure was solved by molecular replacement in Phaser [69] using the PglX predicted model produced by AlphaFold [46]. The SAM molecule was downloaded from the PDB ligand repository, placed manually in Coot and similarly iteratively built and refined. The structure of the PglX-SAM:Ocr heterodimer complex was solved by molecular replacement in Phaser [69] using the PglX structure solved previously and the structure of Ocr (PDB 1S7Z).

Analytical Size Exclusion Chromatography

Analytical SEC was performed on a Superose 6 10/300 GL SEC column (Cytiva, discontinued) connected to an Akta Pure protein chromatography system (Cytiva). The column, system and loading loop were washed between each run and equilibrated with 1.2 CVs of A-SEC buffer (20 mM Tris-HCl pH 7.9, 150 mM NaCl). Protein samples were buffer exchanged into A-SEC buffer and concentrated. Final concentration ranged between 1 μM and 5 μM, as required to give a distinct measurable elution peak. Protein was loaded onto the system via a 100 μl capillary loop loaded using a 100 μl Hamilton syringe. For PglX-SAM:Ocr samples, PglX was incubated with each of SAM and Ocr on ice, following the same process as that used for crystallisation screening. Protein in capillary loops was injected onto the column with 1.5 ml of A-SEC buffer and eluted over 1.2 CVs with A-SEC buffer at 0.5 ml/min.
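Elution volumes from runs like these are converted to a partition coefficient, Kav = (Ve − Vo)/(Vc − Vo), which is then compared against a calibration curve of Kav versus log10(Mr). The sketch below shows that conversion together with a least-squares calibration fit. It is illustrative only: the void volume, column volume and calibration standards in the example are hypothetical placeholders rather than values measured in this study, and the function names are assumptions.

```python
import math


def kav(ve: float, vo: float, vc: float) -> float:
    """Partition coefficient from elution (ve), void (vo) and column (vc) volumes."""
    if vc <= vo:
        raise ValueError("column volume must exceed void volume")
    return (ve - vo) / (vc - vo)


def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two matched points")
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def estimate_mr_kda(sample_kav: float, slope: float, intercept: float) -> float:
    """Invert the calibration Kav = slope * log10(Mr) + intercept."""
    if slope == 0:
        raise ValueError("calibration curve has zero slope")
    return 10 ** ((sample_kav - intercept) / slope)


if __name__ == "__main__":
    VO, VC = 8.0, 24.0  # hypothetical void and column volumes (ml)
    # Hypothetical standards: (Mr in kDa, elution volume in ml).
    standards = [(669.0, 12.0), (158.0, 15.0), (44.0, 17.0), (13.7, 19.0)]
    log_mr = [math.log10(mr) for mr, _ in standards]
    kavs = [kav(ve, VO, VC) for _, ve in standards]
    slope, intercept = fit_line(log_mr, kavs)
    sample = kav(14.0, VO, VC)
    print(f"estimated Mr: {estimate_mr_kda(sample, slope, intercept):.0f} kDa")
```

The same fitting helper could be reused for the Stokes radius calibration by fitting log10(Rst) against Kav instead.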
For estimation of protein molecular weight relative to elution volume (Ve), a calibration curve was produced from commercially available high and low molecular weight protein calibration kits (Cytiva). Peaks were identified using the Unicorn 7 software package (Cytiva).

Ve values were converted into the partitioning coefficient (Kav) for each sample using the equation:

$$K_{av} = \frac{V_e - V_o}{V_c - V_o}$$

where Ve is the elution volume, Vo is the column void volume and Vc is the geometric column volume.

The molecular weight calibration curve was then plotted as Kav against log10(Mr, kDa). The Stokes radius calibration curve was plotted as log10(Rst, Å) against Kav, allowing calculation of sample Stokes radius measurements. Estimated Stokes radius calculations were carried out using the HullRad Stokes radius estimation server [49].

Methyltransferase assay

SAM-dependent N6mA DNA methylation activity of PglX was probed in vitro using an MTase-Glo Methyltransferase Assay kit (Promega). The kit allows indirect measurement of SAM-dependent methyltransferase activity via production of the SAH reaction product. Through a proprietary two-step reaction, SAH is used to produce ADP then ATP, which in turn is used by a luciferase reporter enzyme to generate a measurable luminescence signal. Signal can then be correlated to that produced by a SAH standard curve. The methyltransferase assay was carried out as per manufacturer’s instructions in a 96-well plate format. PglX was buffer exchanged into the methyltransferase assay reaction buffer (80 mM Tris pH 8.8, 200 mM NaCl, 4 mM EDTA, 12 mM MgCl2, 4 mM dithiothreitol (DTT)) and concentrated to 1 μM.
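Relating luminescence to the amount of SAH produced amounts to interpolating against the standard curve. The sketch below builds a two-fold dilution series from a 1 μM SAH stock, as used for this assay, fits a straight line to luminescence versus SAH concentration, and inverts it for a sample reading. It assumes a linear response over the range used and relies on made-up luminescence values; the function names, the number of dilution points and the linear model are assumptions rather than details taken from the kit protocol or from this study.

```python
def twofold_series(stock_um: float, n_points: int) -> list[float]:
    """Concentrations (uM) of a two-fold serial dilution, starting at the stock."""
    if n_points < 1:
        raise ValueError("n_points must be at least 1")
    return [stock_um / (2 ** i) for i in range(n_points)]


def fit_standard_curve(conc_um: list[float], signal: list[float]) -> tuple[float, float]:
    """Least-squares line: signal = slope * conc + intercept."""
    n = len(conc_um)
    if n < 2 or n != len(signal):
        raise ValueError("need at least two matched points")
    mx, my = sum(conc_um) / n, sum(signal) / n
    sxx = sum((x - mx) ** 2 for x in conc_um)
    if sxx == 0:
        raise ValueError("concentrations must not all be equal")
    slope = sum((x - mx) * (y - my) for x, y in zip(conc_um, signal)) / sxx
    return slope, my - slope * mx


def sah_from_signal(sample_signal: float, slope: float, intercept: float) -> float:
    """Invert the standard curve to estimate SAH concentration (uM)."""
    if slope == 0:
        raise ValueError("standard curve has zero slope")
    return (sample_signal - intercept) / slope


if __name__ == "__main__":
    conc = twofold_series(1.0, 6)  # 1, 0.5, 0.25, ... uM
    # Hypothetical relative luminescence units for each standard.
    rlu = [10500.0, 5400.0, 2900.0, 1700.0, 1100.0, 800.0]
    slope, intercept = fit_standard_curve(conc, rlu)
    print(f"sample SAH: {sah_from_signal(3000.0, slope, intercept):.3f} uM")
```

Readings that fall outside the range spanned by the standards would be extrapolations and should be treated with caution.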
As a substrate, 100 ng of E. coli DH5α genomic DNA was used per reaction as this should provide ample Salmonella BREX recognition motifs for methylation. The reaction mix was then combined with the protein samples at a 1 : 1 ratio with 10 μM of SAM and the reaction was incubated at room temperature for 30 minutes. The SAH standard curve was prepared by two-fold serial dilutions of a 1 μM SAH stock in methyltransferase reaction buffer. Luminescence was measured on a Biotek Synergy 2 plate reader.

DATA AVAILABILITY

The crystal structures of PglX-SAM and PglX-SAM:Ocr have been deposited in the Protein Data Bank under accession numbers 8C45 and 8Q56, respectively. All other data needed to evaluate the conclusions in the paper are present in the paper and/or Supplementary Data. MinION and PacBio data that support the findings of this study have been deposited in the European Nucleotide Archive (ENA) at EMBL-EBI under accession number PRJEB71369.

FUNDING

This work was supported by an Engineering and Physical Sciences Research Council Molecular Sciences for Medicine Centre for Doctoral Training studentship [grant number EP/S022791/1] to S.C.W., a Biotechnology and Biological Sciences Research Council Newcastle-Liverpool-Durham Doctoral Training Partnership studentship [grant number BB/M011186/1] to D.M.P., and a Lister Institute Prize Fellowship to T.R.B. This work was supported in part by a Wellcome Trust Senior Investigator award [grant number 106914/Z/15/Z] to J.C.D.H. For the purpose of open access, the authors have applied a CC BY public copyright licence to any Author Accepted Manuscript version arising from this submission.

Acknowledgements

We gratefully acknowledge Diamond Light Source for time on beamlines I04 and I24 under proposal MX24948.

COMPETING INTERESTS

The authors declare no competing interests.

CONTRIBUTIONS

Analysed data: S.C.W., D.M.P., R.D.M., A.N., N.W. and T.R.B. Designed research: S.C.W., D.M.P., R.D.M., A.N., D.T.F.D., D.L.S., N.W., J.C.D.H. and T.R.B. Performed research: S.C.W., D.M.P., R.D.M., A.N., N.W. and T.R.B. Wrote the paper: S.C.W., D.M.P., A.N., D.T.F.D., J.C.D.H. and T.R.B. Funding acquisition: J.C.D.H. and T.R.B. Supervised the study: D.L.S., J.C.D.H. and T.R.B.

References

1. Hampton, H. G., Watson, B. N. J. & Fineran, P. C. The arms race between bacteria and their phage foes. Nature 577, 327–336 (2020).
2. Stern, A. & Sorek, R. The phage-host arms race: shaping the evolution of microbes. Bioessays 33, 43–51 (2011).
3. Tock, M. R. & Dryden, D. T. F. The biology of restriction and anti-restriction. Curr. Opin. Microbiol. 8, 466–472 (2005).
4. Blower, T. R. et al. Mutagenesis and functional characterisation of the RNA and protein components of the toxIN abortive infection and toxin-antitoxin locus of Erwinia. J. Bacteriol. 191, 6029–6039 (2009).
5. Fineran, P. C. et al. The phage abortive infection system, ToxIN, functions as a protein-RNA toxin-antitoxin pair. Proc. Natl. Acad. Sci. U. S. A. 106, 894–899 (2009).
6. Barrangou, R. et al. CRISPR provides acquired resistance against viruses in prokaryotes. Science 315, 1709–12 (2007).
7. Makarova, K. S., Wolf, Y. I., Snir, S. & Koonin, E. V. Defense islands in bacterial and archaeal genomes and prediction of novel defense systems. J. Bacteriol. 193, 6039–6056 (2011).
8. Doron, S. et al. Systematic discovery of antiphage defense systems in the microbial pangenome. Science 359, eaar4120 (2018).
9. Vassallo, C. N., Doering, C. R., Littlehale, M. L., Teodoro, G. I. C. & Laub, M. T. A functional selection reveals previously undetected anti-phage defence systems in the E. coli pangenome. Nat. Microbiol. 7, 1568–1579 (2022).
10. Goldfarb, T. et al. BREX is a novel phage resistance system widespread in microbial genomes. EMBO J. 34, 169–83 (2015).
11. Lau, R. K. et al. Structure and Mechanism of a Cyclic Trinucleotide-Activated Bacterial Endonuclease Mediating Bacteriophage Immunity. Mol. Cell 77, 723-733.e6 (2020).
12. Owen, S. V. et al. Prophages encode phage-defense systems with cognate self-immunity. Cell Host Microbe 29, 1620-1633.e8 (2021).
13. Millman, A. et al. Bacterial Retrons Function In Anti-Phage Defense. Cell 183, 1551-1561.e12 (2020).
14. Bernheim, A. et al. Prokaryotic viperins produce diverse antiviral molecules. Nature 589, 120–124 (2021).
15. Tal, N. et al. Cyclic CMP and cyclic UMP mediate bacterial immunity against phages. Cell 184, 5728-5739.e16 (2021).
16. Rousset, F. et al. Phages and their satellites encode hotspots of antiviral systems. Cell Host Microbe 30, 740-753.e5 (2022).
17. Luyten, Y. A. et al. Identification and characterization of the WYL BrxR protein and its gene as separable regulatory elements of a BREX phage restriction system. Nucleic Acids Res. 50, 5171–5190 (2022).
18. Picton, D. M. et al. A widespread family of WYL-domain transcriptional regulators co-localizes with diverse phage defence systems and islands. Nucleic Acids Res. 50, 5191–5207 (2022).
19. Blankenchip, C. L. et al. Control of bacterial immune signaling by a WYL domain transcription factor. Nucleic Acids Res. 50, 5239–5250 (2022).
20. Hoskisson, P. A., Sumby, P. & Smith, M. C. M. The phage growth limitation system in Streptomyces coelicolor A(3)2 is a toxin/antitoxin system, comprising enzymes with DNA methyltransferase, protein kinase and ATPase activity. Virology 477, 100–109 (2015).
21. Makarova, K. S., Wolf, Y. I. & Koonin, E. V. Comparative genomics of defense systems in archaea and bacteria. Nucleic Acids Res. 41, 4360–77 (2013).
22. Picton, D. M. et al. The phage defence island of a multidrug resistant plasmid uses both BREX and type IV restriction for complementary protection from viruses. Nucleic Acids Res. 49, 11257–11273 (2021).
23. Beck, I. N., Picton, D. M. & Blower, T. R. Crystal structure of the BREX phage defence protein BrxA. Curr. Res. Struct. Biol. 4, 211–219 (2022).
24. Shen, B. W. et al. Structure, substrate binding and activity of a unique AAA+ protein: the BrxL phage restriction factor. Nucleic Acids Res. (2023) doi:10.1093/nar/gkad083.
25. Gordeeva, J. et al. BREX system of Escherichia coli distinguishes self from non-self by methylation of a specific DNA site. Nucleic Acids Res. 47, 253–265 (2019).
26. Walkinshaw, M. D. et al. Structure of Ocr from bacteriophage T7, a protein that mimics B-form DNA. Mol. Cell 9, 187–194 (2002).
27. Isaev, A. et al. Phage T7 DNA mimic protein Ocr is a potent inhibitor of BREX defence. Nucleic Acids Res. 48, 5397–5406 (2020).
28. Hattman, S., Schlagman, S., Goldstein, L. & Frohlich, M. Salmonella typhimurium SA host specificity system is based on deoxyribonucleic acid-adenine methylation. J. Bacteriol. 127, 211–7 (1976).
29. Zaworski, J. et al. Reassembling a cannon in the DNA defense arsenal: Genetics of StySA, a BREX phage exclusion system in Salmonella lab strains. PLOS Genet. 18, e1009943 (2022).
30. Stanaway, J. D. et al. The global burden of non-typhoidal salmonella invasive disease: a systematic analysis for the Global Burden of Disease Study 2017. Lancet Infect. Dis. 19, 1312–1324 (2019).
31. Kingsley, R. A. et al. Epidemic multiple drug resistant Salmonella Typhimurium causing invasive disease in sub-Saharan Africa have a distinct genotype. Genome Res. 19, 2279–2287 (2009).
32. Okoro, C. K. et al. Intracontinental spread of human invasive Salmonella Typhimurium pathovariants in sub-Saharan Africa. Nat. Genet. 44, 1215–1221 (2012).
33. Kelly, A. et al. Diverse Durham collection phages demonstrate complex BREX defense responses. Appl. Environ. Microbiol. 89, (2023).
34. Canals, R. et al. Adding function to the genome of African Salmonella Typhimurium ST313 strain D23580. PLOS Biol. 17, e3000059 (2019).
35. Zaworski, J. et al. Genome archaeology of two laboratory Salmonella enterica enterica sv Typhimurium. G3 (Bethesda). 11, (2021).
36. Blank, K., Hensel, M. & Gerlach, R. G. Rapid and Highly Efficient Method for Scarless Mutagenesis within the Salmonella enterica Chromosome. PLoS One 6, e15763 (2011).
37. Rodwell, E. V. et al. Isolation and Characterisation of Bacteriophages with Activity against Invasive Non-Typhoidal Salmonella Causing Bloodstream Infection in Malawi. Viruses 13, 478 (2021).
38. Engler, C., Kandzia, R. & Marillonnet, S. A One Pot, One Step, Precision Cloning Method with High Throughput Capability. PLoS One 3, e3647 (2008).
39. Anton, B. P. et al. Complete Genome Sequence of ER2796, a DNA Methyltransferase-Deficient Strain of Escherichia coli K-12. PLoS One 10, e0127446 (2015).
40. Callahan, S. J. et al. Structure of Type IIL Restriction-Modification Enzyme MmeI in Complex with DNA Has Implications for Engineering New Specificities. PLOS Biol. 14, e1002442 (2016).
41. Holm, L. & Sander, C. Protein structure comparison by alignment of distance matrices. J. Mol. Biol. 233, 123–138 (1993).
42. Malone, T., Blumenthal, R. M. & Cheng, X. Structure-guided analysis reveals nine sequence motifs conserved among DNA amino-methyltransferases, and suggests a catalytic mechanism for these enzymes. J. Mol. Biol. 253, 618–32 (1995).
43. Ashkenazy, H. et al. ConSurf 2016: an improved methodology to estimate and visualize evolutionary conservation in macromolecules. Nucleic Acids Res. 44, W344-50 (2016).
44. Kennaway, C. K. et al. The structure of M.EcoKI Type I DNA methyltransferase with a DNA mimic antirestriction protein. Nucleic Acids Res. 37, 762–770 (2009).
45. Dobbins, A. T. et al. Complete Genomic Sequence of the Virulent Salmonella Bacteriophage SP6. J. Bacteriol. 186, 1933–1944 (2004).
46. Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
47. Ye, F. et al. Structural basis of transcription inhibition by the DNA mimic protein Ocr of bacteriophage T7. Elife 9, (2020).
48. La Verde, V., Dominici, P. & Astegno, A. Determination of Hydrodynamic Radius of Proteins by Size Exclusion Chromatography. Bio-protocol 7, e2230 (2017).
49. Fleming, P. J. & Fleming, K. G. HullRad: Fast Calculations of Folded and Disordered Protein and Nucleic Acid Hydrodynamic Properties. Biophys. J. 114, 856–869 (2018).
50. Krissinel, E. & Henrick, K. Inference of Macromolecular Assemblies from Crystalline State. J. Mol. Biol. 372, 774–797 (2007).
51. Baker, N. A., Sept, D., Joseph, S., Holst, M. J. & McCammon, J. A. Electrostatics of nanosystems: Application to microtubules and the ribosome. Proc. Natl. Acad. Sci. 98, 10037–10041 (2001).
52. Schrödinger, L. The PyMOL Molecular Graphics System, Version 1.3r1. (2010).
53. Roberts, R. J., Vincze, T., Posfai, J. & Macelis, D. REBASE: a database for DNA restriction and modification: enzymes, genes and genomes. Nucleic Acids Res. 51, D629–D630 (2023).
54. Wu, Y. et al. Synergistic anti-phage activity of bacterial defence systems. bioRxiv 2022.08.21.504612 (2023) doi:10.1101/2022.08.21.504612.
55. Makovets, S., Powell, L. M., Titheradge, A. J. B., Blakely, G. W. & Murray, N. E. Is modification sufficient to protect a bacterial chromosome from a resident restriction endonuclease? Mol. Microbiol. 51, 135–147 (2003).
56. Thoms, B. & Wackernagel, W. Expression of ultraviolet-induced restriction alleviation in Escherichia coli K-12. Detection of a lambda phage fraction with a retarded mode of DNA injection. Biochim. Biophys. Acta 739, 42–7 (1983).
57. Thoms, B. & Wackernagel, W. Genetic control of damage-inducible restriction alleviation in Escherichia coli K12: an SOS function not repressed by lexA. Mol. Gen. Genet. 197, 297–303 (1984).
58. Cooper, D. L. & Lovett, S. T. Recombinational branch migration by the RadA/Sms paralog of RecA in Escherichia coli. Elife 5, (2016).
59. Morgan, R. D. & Luyten, Y. A. Rational engineering of type II restriction endonuclease DNA binding and cleavage specificity. Nucleic Acids Res. 37, 5222–5233 (2009).
60. Bikard, D. & Marraffini, L. A. Innate and adaptive immunity in bacteria: mechanisms of programmed genetic variation to fight bacteriophages. Curr. Opin. Immunol. 24, 15–20 (2012).
61. Sumby, P. & Smith, M. C. M. Genetics of the phage growth limitation (Pgl) system of Streptomyces coelicolor A3(2). Mol. Microbiol. 44, 489–500 (2002).
62. Owen, S. V. et al. Characterization of the Prophage Repertoire of African Salmonella Typhimurium ST313 Reveals High Levels of Spontaneous Induction of Novel Phage BTP1. Front. Microbiol. 8, 235 (2017).
63. Cai, Y. et al. A nucleotidyltransferase toxin inhibits growth of Mycobacterium tuberculosis through inactivation of tRNA acceptor stems. Sci. Adv. 6, eabb6651 (2020).
64. Gibson, D. G. et al. Enzymatic assembly of DNA molecules up to several hundred kilobases. Nat. Methods 6, 343–345 (2009).
65. Winter, G. xia2: an expert system for macromolecular crystallography data reduction. J. Appl. Crystallogr. 43, 186–190 (2010).
66. Winn, M. D. et al. Overview of the CCP4 suite and current developments. Acta Crystallogr. D. Biol. Crystallogr. 67, 235–42 (2011).
67. Emsley, P. & Cowtan, K. Coot: model-building tools for molecular graphics. Acta Crystallogr. D. Biol. Crystallogr. 60, 2126–2132 (2004).
68. Adams, P. D. et al. PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr. D. Biol. Crystallogr. 66, 213–21 (2010).
69. McCoy, A. J. et al. Phaser crystallographic software. J. Appl. Crystallogr. 40, 658–674 (2007).

TABLE

Table 1. X-Ray data collection and refinement statistics

| Structure | PglX-SAM | PglX-SAM:Ocr |
|---|---|---|
| PDB Code | 8C45 | 8Q56 |
| Wavelength | 0.9795 | 0.9795 |
| Resolution range | 48.98 - 3.402 (3.523 - 3.402) | 59.61 - 3.5 (3.625 - 3.5) |
| Space group | P 41 21 2 | C 1 2 1 |
| Unit cell, a b c (Å), α β γ (°) | 138.539 138.539 407.956, 90 90 90 | 238.458 60.786 146.637, 90 114.889 90 |
| Total reflections | 104405 | 47094 (8532) |
| Unique reflections | 55611 (5460) | 24556 (2426) |
| Multiplicity | 1.9 | 1.9 |
| Completeness (%) | 87.15 (15.55) | 97.84 (80.53) |
| Mean I/sigma(I) | 8 (0.1) | 3.8 (0.3) |
| Rmerge | 0.047 | 0.028 |
| Rmeas | 0.067 (2.142) | 0.092 (0.756) |
| CC1/2 | 0.999 (0.214) | 0.995 (0.378) |
| Reflections used in refinement | 48492 (849) | 24038 (1957) |
| Reflections used for Rfree | 2444 (43) | 1922 (144) |
| Rwork | 0.2745 (0.4253) | 0.2462 (0.4074) |
| Rfree | 0.2992 (0.4026) | 0.2917 (0.4202) |
| Number of non-hydrogen atoms | 19848 | 10776 |
| - macromolecules | 19848 | 10747 |
| - ligands | 98 | 49 |
| - solvents | 0 | 2 |
| Protein residues | 2432 | 1318 |
| RMS (bonds, Å) | 0.005 | 0.004 |
| RMS (angles, °) | 0.91 | 0.78 |
| Ramachandran favored (%) | 90.36 | 91.6 |
| Ramachandran allowed (%) | 9.64 | 8.4 |
| Ramachandran outliers (%) | 0 | 0 |
| Average B-factor | 169.33 | 138.5 |
| - macromolecules | 169.33 | 138.54 |
| - ligands | 104 | 139 |
| - solvent | N/A | 113.43 |

Values in parentheses are for the highest resolution shell.

License: CC-BY-4.0