Application of a Novel Fusion Tag System for Enhanced Soluble Expression of Recombinant Proteins in Escherichia coli

Li-Zhen Luo, Jian-Tao Cai, Zi-Ying Tan, Yu-Qing Chen, Zhe Hu, and 3 more

Research Article. This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-8180750/v1

This work is licensed under a CC BY 4.0 License. Status: Published. Journal publication published 12 Feb, 2026 in BMC Biotechnology. Version 1 posted 15

Abstract

Background

The Escherichia coli expression system is widely used for recombinant protein production, but its utility is often limited by the formation of insoluble inclusion bodies. Although fusion tags can enhance solubility, their effectiveness varies unpredictably across different target proteins, and the optimal tag must typically be determined empirically.

Results

Here, we developed a novel fusion tag system for the high-throughput screening of soluble protein expression in E. coli. This system consists of eight medium-sized, TEV-cleavable fusion tags (ArsC, Crr, DsbA, Ecotin, MsyB, SlyD, Snut, and YjgD) cloned into a standardized pET-28b(+) backbone. We systematically evaluated the impact of these tags on the solubility and function of three model proteins (eGFP, EcFabG, and Mals) and six challenging proteins (PulA, NodE, FabF1XL, FabF2XL, FabZXL, and FabGXL). Our results demonstrated that the efficacy of each tag was highly protein-dependent. Notably, tags such as MsyB and Snut dramatically increased the soluble proportion of eGFP from 15% to over 85%, while the SlyD tag significantly enhanced both the solubility and activity of Mals. For several difficult-to-express proteins, soluble expression was only achieved with specific tags, highlighting the critical importance of tag selection.

Conclusions

Our study presents a versatile and efficient platform for the rapid production of soluble recombinant proteins. By enabling parallel screening of multiple fusion partners, this system facilitates the identification of optimal conditions for enhancing protein solubility and function, thereby addressing a key bottleneck in recombinant protein applications.

Keywords: Recombinant protein expression; Solubility enhancement; Fusion tag screening; Escherichia coli; TEV protease; Inclusion bodies; pX vector system

Introduction

The Escherichia coli expression system is a cornerstone in biotechnology for producing recombinant proteins, prized for its well-characterized genetics, rapid growth, high yield, and cost-effectiveness [1]. Despite these advantages, the heterologous expression of many proteins in E. coli is often hampered by several challenges, including codon bias, proteolytic degradation, and, most notably, the aggregation of target proteins into insoluble inclusion bodies [2].
The formation of these inactive aggregates represents a major bottleneck, influenced by factors ranging from cultivation conditions (e.g., temperature, pH, ionic strength) to intrinsic properties of the target protein itself (such as its amino acid sequence) [3]. Various strategies have been employed to improve the solubility and yield of biologically active proteins, including optimization of expression conditions, engineering of host strains, co-expression of molecular chaperones, and the use of fusion tags [3, 4]. Among these, fusion tags, typically derived from highly soluble proteins, have become a mainstream solution. When fused to the N- or C-terminus of the target protein, these tags can promote solubility, enhance stability, and facilitate purification. A number of such tags from E. coli and other organisms have been widely adopted, including Maltose Binding Protein (MBP), Thioredoxin (TrxA), N-utilization substance A (NusA), and Small Ubiquitin-like Modifier (SUMO) [5–7].

Beyond these well-established tags, a diverse array of other partners with solubilizing potential has been explored. For instance, arsenate reductase (ArsC) exhibits high cytoplasmic solubility and folding capacity, serving as an effective fusion partner [8]. The glucose-specific phosphotransferase IIA component (Crr) and the spermidine/putrescine-binding periplasmic protein (PotD) have also been shown to significantly increase the solubility of heterologous proteins [9]. Disulfide bond formation protein A (DsbA) is particularly useful for promoting the soluble expression of proteins requiring disulfide bond formation [10], while the periplasmic trypsin inhibitor Ecotin has facilitated the expression of challenging proteins like human pepsinogen [11]. Furthermore, highly acidic polypeptides such as MsyB and YjgD are believed to improve solubility by increasing the net negative charge and hydrophilicity of the fusion protein [12, 13].
The anti-aggregation protein SlyD has proven effective in solubilizing aggregation-prone proteins in the cytoplasm [14], and the Solubility 'eNhancing' Ubiquitous Tag (SNUT), derived from a portion of the transpeptidase sortase found in Staphylococcus aureus, confers favorable solubility characteristics on target proteins [3].

However, no single fusion tag is universally effective, as their success is highly dependent on the specific target protein [12, 15]. Moreover, the tag itself can sometimes interfere with the structure, function, or immunogenicity of the protein of interest. For many applications in therapeutics and functional studies, the removal of the fusion tag is therefore essential [15]. This is typically achieved by incorporating a specific protease cleavage site between the tag and the target protein. Proteases like Factor Xa, thrombin, and enterokinase have been used, but they often suffer from inefficiency, high cost, or non-specific cleavage [2, 16]. In contrast, Tobacco Etch Virus (TEV) protease offers high specificity and efficiency, recognizing the ENLYFQ/G sequence and cleaving between Q and G, making it an excellent tool for tag removal [17, 18].

To address the need for a systematic and efficient method to identify the optimal fusion tag for a given protein, we designed and constructed a comprehensive recombinant protein expression system for E. coli. This system allows for the parallel cloning of a target gene into eight different vectors, each equipped with a medium-sized fusion tag (ArsC, Crr, DsbA, Ecotin, MsyB, SlyD, Snut, or YjgD) (Table 1), a flexible linker, a TEV protease site, and an N-terminal 6×His affinity tag. We systematically investigated the effects of these tags on the solubility and function of three model proteins: enhanced Green Fluorescent Protein (eGFP) [18, 19], 3-ketoacyl-ACP reductase (EcFabG) [20], and maltogenic amylase (Mals).
Furthermore, we evaluated the system's efficacy on six difficult-to-express proteins (PulA, NodE, FabF1XL, FabF2XL, FabZXL and FabGXL) to demonstrate its broad utility. Our findings establish this system as a robust and versatile platform for enhancing the soluble expression of diverse recombinant proteins.

Table 1 Solubility enhancer tags used in this study

| Tag | Source | Full name | Size (kDa) | Reference |
|---|---|---|---|---|
| ArsC | Escherichia coli | Arsenate reductase | 15.7 | [8] |
| Crr | Escherichia coli | Glucose-specific phosphotransferase (PTS) enzyme IIA | 18.1 | [9] |
| DsbA | Escherichia coli | Disulfide bond formation protein A | 23.0 | [10] |
| Ecotin | Escherichia coli | E. coli trypsin inhibitor | 18.1 | [11] |
| MsyB | Escherichia coli | An acidic E. coli protein | 14.1 | [13] |
| SlyD | Escherichia coli | An aggregation-resistant protein | 20.7 | [14] |
| Snut | Staphylococcus aureus | Solubility 'eNhancing' Ubiquitous Tag | 16.8 | [3] |
| YjgD | Escherichia coli | The hypothetical E. coli ORF | 15.5 | [12] |

1 Materials and Methods

1.1 Bacterial Strains, Plasmids, and Growth Conditions

The bacterial strains and plasmids used in this study are listed in Table S1. E. coli strains were cultivated in Luria-Bertani (LB) medium at 37°C. Antibiotics and inducers were supplemented as needed at the following final concentrations: ampicillin, 100 µg/mL; kanamycin, 30 µg/mL; and isopropyl-β-D-thiogalactoside (IPTG), 240 µg/mL. Bacterial growth was monitored by measuring the optical density at 600 nm (OD₆₀₀). Restriction enzymes, high-fidelity DNA polymerase, T4 DNA ligase, and other molecular biology reagents were obtained from Takara Biotechnology (Dalian, China). Primers were synthesized, and DNA sequencing was performed, by Sangon Biotech (Shanghai, China). All other chemicals were of molecular biology grade.

1.2 Vector Construction

All expression vectors were derived from the pET-28b(+) backbone. The general structure of the fusion constructs is depicted in Fig. 1. Briefly, the gene encoding each fusion tag (e.g., ArsC) was amplified from E.
coli MG1655 genomic DNA using the primer pair Arsc-P1 and Arsc-P2 listed in Table S2. Sequential overlap extension PCR was used to assemble the final cassette containing the tag, a synthetic linker (GGGGS)₂, the TEV protease recognition site (ENLYFQ/G), and a 6×His tag. The resulting fragment was digested with NcoI and BamHI and ligated into similarly digested pET-28b(+) to generate the initial fusion tag vector, pArsC. The other seven tag fragments (Crr, DsbA, Ecotin, MsyB, SlyD, Snut, YjgD) were amplified, digested with NcoI and KpnI, and subsequently cloned into the equivalently digested pArsC backbone, thereby replacing the original ArsC tag to generate the final pX vector series (pCrr, pDsbA, pEcotin, pMsyB, pSlyD, pSnut, and pYjgD). All constructs were verified by DNA sequencing (Sangon Biotech, Guangzhou).

The genes encoding Mals and PulA were synthesized with GenSmart™ codon optimization for E. coli, based on sequences from Bacillus sp. WPD616 and Thermotoga neapolitana, respectively. The eGFP gene was amplified from pET28b-egfp (a kind gift from Associate Prof. Wang) with the primer pairs listed in Table S2. The EcFabG gene was amplified from E. coli MG1655 genomic DNA. The genes for NodE, FabF1XL, FabF2XL, FabZXL and FabGXL were amplified from Sinorhizobium meliloti Rm1021 genomic DNA. All target gene sequences were then cloned in parallel into the pX vectors using the same NdeI and HindIII digestion. All final expression constructs were confirmed by DNA sequencing.

1.3 Protein Expression and Solubility Analysis

Recombinant vectors were transformed into E. coli BL21(DE3) competent cells. Single transformants were inoculated into 5 mL of LB medium and grown overnight at 37°C. The overnight cultures were diluted 1:20 into fresh LB medium and grown at 37°C until the OD₆₀₀ reached approximately 0.6–0.8. Protein expression was induced by adding IPTG to a final concentration of 240 µg/mL, followed by incubation for 4 hours at 37°C or 16 hours at 20°C.
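
As a quick unit check on the induction step, the IPTG concentration of 240 µg/mL can be converted to molarity. The sketch below assumes a molar mass of 238.30 g/mol for IPTG (C9H18O5S), which is not stated in the text; the function name is illustrative only.

```python
# Assumed molar mass of IPTG (C9H18O5S); this value is not given in the text.
IPTG_MOLAR_MASS_G_PER_MOL = 238.30


def ug_per_ml_to_mM(conc_ug_per_ml: float, molar_mass_g_per_mol: float) -> float:
    """Convert a mass concentration in µg/mL to millimolar (mmol/L)."""
    # 1 µg/mL equals 1 mg/L, and (mg/L) / (g/mol) gives mmol/L directly.
    return conc_ug_per_ml / molar_mass_g_per_mol


iptg_mM = ug_per_ml_to_mM(240.0, IPTG_MOLAR_MASS_G_PER_MOL)
print(f"240 µg/mL IPTG is about {iptg_mM:.2f} mM")
```

Under that assumed molar mass, the induction concentration corresponds to roughly 1 mM IPTG.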
Cells were harvested by centrifugation, resuspended in Lysis Buffer (50 mM NaH₂PO₄, 300 mM NaCl, 10 mM imidazole, pH 8.0), and disrupted by sonication on ice. The soluble (supernatant) and insoluble (pellet) fractions were separated by centrifugation and analyzed by 12.5% SDS-PAGE. Protein solubility was assessed based on the distribution of the target protein in the supernatant versus the pellet. For proteins found primarily in the insoluble fraction, induction conditions were optimized by lowering the temperature and extending the induction time. Mals and EcFabG proteins were purified using nickel-nitrilotriacetic acid (Ni-NTA) affinity chromatography as previously described [20]. Protein concentrations were quantified by densitometric analysis of SDS-PAGE gels using ImageJ software.

1.4 Fluorescence Analysis of eGFP Fusion Tags

The purified eGFP fusion proteins were diluted to a final concentration of 1 µM in 50 mM sodium phosphate buffer (50 mM NaH₂PO₄, 300 mM NaCl, pH 8.0). eGFP fluorescence measurements were carried out using a SpectraMax i3x Multi-Mode Microplate Reader (Molecular Devices) [19]. The excitation wavelength was set to 460 nm, and emission spectra were recorded to analyze the effect of the eight fusion tags on eGFP fluorescence.

1.5 Enzymatic Activity Assay of EcFabG Fusions

The activity of EcFabG and its fusion variants was assessed in vitro by reconstituting the fatty acid synthesis pathway as described previously [20]. The assay mixture contained 0.1 M sodium phosphate (pH 7.0), 0.1 µg each of EcFabD, EcFabH, EcFabG, EcFabZ, and EcFabI, 50 µM NADH, 50 µM NADPH, 1 mM β-mercaptoethanol, 100 µM malonyl-CoA, 50 µM holo-ACP, and 100 µM acetyl-CoA in a final volume of 40 µL. The assay mixtures were incubated at 37°C for 1 h and resolved by conformation-sensitive gel electrophoresis on 20% polyacrylamide gels containing 2.5 M urea.
The gels were stained with Coomassie brilliant blue R250 to visualize the acyl-ACP products.

1.6 TEV Protease Cleavage and Mals Activity Assay

Purified Tag-Mals fusion proteins were treated with TEV protease to remove the fusion tags [21]. The enzymatic activity of both tagged and tag-free Mals was determined using the 3,5-dinitrosalicylic acid (DNS) method, which measures the reducing sugars released from starch hydrolysis [22]. Briefly, the reaction mixture containing 1% (w/v) soluble starch in 50 mM sodium citrate buffer (pH 5.5) and an appropriate amount of enzyme was incubated at 60°C for 10 min. The reaction was stopped by adding DNS reagent, followed by boiling for 10 min. The amount of reducing sugars released was measured spectrophotometrically at 540 nm. One unit (U) of enzyme activity was defined as the amount of enzyme required to produce 1 µmol of reducing sugar (expressed as glucose equivalents) per minute under the assay conditions. A standard curve was prepared using glucose solutions of known concentrations (Figure S1).

2 Results

2.1 Construction of a Versatile Fusion Tag Expression System

We successfully constructed a series of pET-28b(+)-derived expression vectors, designated as the pX series. As illustrated in Fig. 1, each vector is designed to fuse one of eight different tags (ArsC, Crr, DsbA, Ecotin, MsyB, SlyD, Snut, or YjgD) to the N-terminus of a target protein via a synthetic linker. The construct also includes a TEV protease cleavage site and a standardized multiple cloning site (MCS) flanked by dual 6×His affinity tags. In total, 81 recombinant expression vectors were constructed in batches to accommodate the various tag-target protein (eGFP, EcFabG, Mals, PulA, NodE, FabF1XL, FabF2XL, FabZXL or FabGXL) combinations (Table S1).
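
The total of 81 vectors is consistent with a 9 × 9 grid, which would arise if each of the nine target genes was cloned into the eight pX vectors plus one untagged control vector. That breakdown is an assumption, since the text does not itemize the count, and the construct names generated below are hypothetical. A minimal sketch of the enumeration:

```python
from itertools import product

TAGS = ["ArsC", "Crr", "DsbA", "Ecotin", "MsyB", "SlyD", "Snut", "YjgD"]
TARGETS = ["eGFP", "EcFabG", "Mals", "PulA", "NodE",
           "FabF1XL", "FabF2XL", "FabZXL", "FabGXL"]

# Assumption: one untagged control backbone per target, in addition to the 8 tags.
BACKBONES = ["untagged"] + TAGS


def construct_name(backbone: str, target: str) -> str:
    """Build a label for one tag/target combination (hypothetical naming scheme)."""
    prefix = "pET28b" if backbone == "untagged" else f"p{backbone}"
    return f"{prefix}-{target}"


# Every backbone paired with every target gene.
constructs = [construct_name(b, t) for b, t in product(BACKBONES, TARGETS)]
print(len(constructs), "constructs, e.g.", constructs[:3])
```

Laying the screen out as a full grid in this way makes it easy to check that no tag-target pair is missing before the cultures are set up.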

2.2 Fusion Tags Differentially Modulate eGFP Solubility and Fluorescence

Green fluorescent protein (GFP) is a widely used reporter for in vivo expression and is known to be only partially soluble when expressed in E. coli [14]. We first employed enhanced Green Fluorescent Protein (eGFP) as a model protein to evaluate the solubility-enhancing effects of the eight fusion tags. Solubility analysis by SDS-PAGE demonstrated that all eGFP fusion proteins, except those carrying the Crr and DsbA tags, achieved excellent soluble expression (Fig. 2A). Densitometric quantification of the gels revealed that eGFP fused with ArsC, Ecotin, MsyB, SlyD, Snut, and YjgD was predominantly soluble. Notably, the MsyB and Snut tags had particularly strong effects, increasing the soluble proportion of eGFP from 15% (wild-type, WT) to 87% and 92%, respectively. Furthermore, the ArsC tag not only enhanced the solubility but also boosted the overall expression level of eGFP (Fig. 2B). The results indicated that MsyB, ArsC, Ecotin, SlyD, YjgD and Snut effectively promoted the soluble expression of eGFP.

We next assessed whether the tags interfered with eGFP function by measuring fluorescence intensity. As summarized in Fig. 2C, the fluorescence intensity of the DsbA-tagged eGFP was significantly reduced, suggesting that this tag may interfere with the proper folding of eGFP. In contrast, the other seven tagged proteins exhibited fluorescence intensities comparable to that of wild-type eGFP, indicating that these tags promoted soluble expression without compromising functional integrity.

2.3 Impact of Fusion Tags on EcFabG Solubility and Enzyme Activity

We next examined the effect of the fusion tags on 3-ketoacyl-ACP reductase (EcFabG), a soluble enzyme involved in the bacterial fatty acid synthesis pathway. Solubility analysis by SDS-PAGE demonstrated that EcFabG fusions with YjgD, MsyB, DsbA, and ArsC were largely soluble (Fig. 3A).
Densitometric quantification of the gels indicated that the solubility of EcFabG fused with Crr and SlyD decreased to 38% and 35%, respectively, compared to 93% for the wild-type enzyme (Fig. 3B). This demonstrates that fusion with Crr or SlyD led to predominant inclusion body formation, significantly compromising EcFabG solubility. In contrast, tags such as Ecotin and Snut substantially reduced the overall expression level of EcFabG. Together, these results indicate that the Crr, SlyD, Ecotin, and Snut tags had the strongest adverse effects on the soluble yield of EcFabG.

To evaluate the functional integrity of the tagged EcFabG proteins, we reconstituted the fatty acid synthesis pathway in vitro. EcFabG catalyzes the reduction of 3-ketoacyl-ACP to 3-hydroxyacyl-ACP. The enzymatic activity was assessed via conformation-sensitive gel electrophoresis (Fig. 3C). EcFabG fused with YjgD, MsyB, or ArsC retained significant reductase activity. In contrast, the activity of EcFabG carrying the Crr, Ecotin, DsbA or SlyD tags was markedly impaired, indicating that these tags interfered with the enzyme's catalytic function.

2.4 TEV Protease Cleavage and Its Effect on Mals Activity

Given that some tags enhanced solubility but compromised the activity of EcFabG and the fluorescence intensity of eGFP, we further investigated the effect of tag removal on maltogenic amylase (Mals). As shown in Fig. 4A, SDS-PAGE analysis of soluble (S) and insoluble (P) fractions revealed distinct solubility profiles for Mals fused with different tags. Notably, Mals fused with tags such as ArsC, SlyD, DsbA, MsyB, Crr, and Snut showed prominent bands in the soluble fractions, indicating significantly improved soluble expression of Mals. In contrast, the Ecotin-tagged Mals was predominantly detected in the insoluble pellets, suggesting poor solubility under the tested conditions. Quantification of solubility via grayscale density analysis further supported these observations (Fig. 4B).
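
The solubility percentages reported in these sections can be read as the share of target-band intensity found in the supernatant lane. The sketch below shows that calculation, assuming background-corrected band intensities exported from ImageJ and equal loading of the soluble and insoluble fractions. The intensity values and sample labels are invented for illustration and are not data from this study.

```python
def soluble_fraction(supernatant_intensity: float, pellet_intensity: float) -> float:
    """Fraction of the target band found in the soluble (supernatant) lane."""
    total = supernatant_intensity + pellet_intensity
    if total <= 0:
        raise ValueError("combined band intensity must be positive")
    return supernatant_intensity / total


# Hypothetical band intensities (arbitrary units): (supernatant, pellet).
bands = {
    "no tag": (1500.0, 8500.0),
    "example tag": (9000.0, 1000.0),
}

for label, (sup, pel) in bands.items():
    print(f"{label}: {100 * soluble_fraction(sup, pel):.0f}% soluble")
```

Because the result is a ratio of two lanes from the same sample, it is insensitive to overall expression level, which is why expression yield is reported separately in the text.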
Most fusion tags significantly improved the solubility of Mals compared to the no-tag control, with Crr, MsyB, SlyD and Snut exhibiting the most pronounced effects.

Enzymatic activity assays conducted before and after TEV protease cleavage provided insights into the functional influence of the fusion tags (Fig. 4C). Prior to cleavage, several tagged Mals, particularly those with SlyD, MsyB and Crr, displayed high maltogenic amylase activity. Following tag removal by TEV protease, a general reduction in activity was observed across most constructs, implying that the presence of the fusion tag substantially enhanced Mals activity compared to the tag-free protein. Notably, the SlyD-tagged Mals retained the highest activity both before and after cleavage, highlighting the dual benefit of SlyD as a solubility enhancer that also positively influences the enzymatic function of Mals. The results indicated that the fusion tags differentially influenced the solubility and activity of Mals.

2.5 Enhancement of Soluble Expression for Challenging Target Proteins

Previous studies have found that some of the enzymes involved in fatty acid synthesis in S. meliloti Rm1021 are difficult to express and purify, which has precluded their biochemical characterization [23]. To further validate the broad applicability of our system, we tested its efficacy on five difficult-to-express proteins from Sinorhizobium meliloti (NodE, FabGXL, FabF1XL, FabF2XL, and FabZXL), which are typically insoluble or poorly expressed in E. coli. As shown in Fig. 5, the fusion tags dramatically improved their solubility profiles: NodE showed prominent soluble expression when fused with MsyB or Snut (Fig. 5a). FabGXL, which was largely insoluble without a tag, achieved soluble expression only with the YjgD tag (Fig. 5b). For FabF1XL and FabF2XL, the YjgD and MsyB tags demonstrated the most effective solubilization, while other tags were less effective (Fig. 5c, d).
In contrast, all eight fusion tags successfully promoted the soluble expression of FabZXL (Fig. 5e). These results underscore the target protein-dependent nature of tag efficacy and highlight the utility of our multi-tag system for identifying optimal solubility conditions for a wide range of recalcitrant proteins.

3 Discussion

The formation of insoluble inclusion bodies remains a major obstacle to the widespread application of prokaryotic expression systems for recombinant protein production [2, 24]. To address this challenge, we developed a versatile fusion tag system that enables rapid screening for optimal solubility enhancers.

In this study, the pullulanase PulA could not be expressed in E. coli BL21(DE3) (Figure S2), likely due to its origin from the hyperthermophilic anaerobe Thermotoga neapolitana. Although the PulA gene was codon-optimized for E. coli, it still failed to express, suggesting that factors beyond codon usage, such as protein folding kinetics or compatibility with the host proteostasis network, may have contributed to the lack of expression [2, 25]. Similarly, proteins such as Mals and NodE, which may contain rare codons or have complex folding requirements, achieved soluble expression only when fused with specific tags. For instance, MsyB markedly enhanced the solubility of NodE (Fig. 5a), while ArsC, SlyD, DsbA, MsyB, Crr, and Snut all significantly improved the solubility of Mals (Fig. 4).

Other target proteins, including eGFP, FabGXL, FabF1XL, FabF2XL, and FabZXL, were predominantly expressed as inclusion bodies in the absence of fusion tags. While soluble expression was achievable with tag assistance, the efficacy of each tag varied considerably depending on the target protein. For example, all eight tags improved FabZXL solubility (Fig. 5e), whereas only YjgD enabled soluble expression of FabGXL (Fig. 5b). Similarly, MsyB and YjgD were the most effective for FabF1XL and FabF2XL (Fig. 5c–d).
The ArsC, Ecotin, MsyB, SlyD, Snut, and YjgD fusion tags promoted the solubilization of eGFP (Fig. 2). These observations highlight the target-specific nature of fusion tag efficacy and underscore the importance of employing a multi-tag screening approach [15, 26].

The solubilization mechanism of fusion tags remains incompletely understood, though it is often suggested to relate to their own folding properties and biophysical characteristics, such as surface charge or hydrophilicity [12, 27]. To gain preliminary insight into the mechanisms underlying solubility enhancement, we analyzed the core biophysical properties of our eight fusion tags (Supplementary Table S3). We observed a general trend wherein tags characterized by a high negative net charge and hydrophilic nature, as indicated by a negative GRAVY index, and particularly those with highly acidic properties (pI < 5.0) such as MsyB and YjgD, tended to be the most effective and versatile solubility enhancers. This finding is consistent with prior studies on acidic fusion partners [12, 13]. However, notable exceptions highlight the complexity of the mechanism. For instance, the superior performance of SlyD with Mals likely stems from its intrinsic chaperone activity rather than its charge properties alone [14]. Interestingly, the less acidic Snut (pI 6.324, GRAVY −1.106, net charge −1.79) still demonstrated considerable solubilization efficacy, performing well with target proteins such as eGFP, Mals, NodE, and FabZXL. Furthermore, the ability of certain tags (e.g., YjgD) to solubilize particularly recalcitrant proteins (FabGXL) for which others, including MsyB, failed underscores that optimal tag selection results from a complex, individualized match between the tag's biophysical properties and the target protein's specific folding pathway and structural needs.
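
For readers who wish to compute descriptors like those in Supplementary Table S3 for their own tags, the sketch below calculates a GRAVY index as the mean Kyte-Doolittle hydropathy and estimates net charge at pH 7 with the Henderson-Hasselbalch relation. The text does not say which tool or pKa set the authors used, so the pKa values here are assumptions and the results need not match Table S3. The example peptide is hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Assumed pKa values; published scales differ slightly from one another.
PKA_POS = {"K": 10.5, "R": 12.5, "H": 6.0, "N_term": 9.0}
PKA_NEG = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1, "C_term": 2.0}


def gravy(seq: str) -> float:
    """Grand average of hydropathy: mean Kyte-Doolittle value over the sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(KD[aa] for aa in seq) / len(seq)


def net_charge(seq: str, ph: float = 7.0) -> float:
    """Estimated net charge at the given pH from side chains and both termini."""
    seq = seq.upper()
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_POS["N_term"]))   # free N-terminus
    charge -= 1.0 / (1.0 + 10 ** (PKA_NEG["C_term"] - ph))  # free C-terminus
    for aa in seq:
        if aa in PKA_POS:
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_POS[aa]))
        elif aa in PKA_NEG:
            charge -= 1.0 / (1.0 + 10 ** (PKA_NEG[aa] - ph))
    return charge


example_peptide = "MDEEDLSEEDAE"  # hypothetical acidic sequence, not a real tag
print(f"GRAVY = {gravy(example_peptide):.3f}")
print(f"net charge at pH 7 = {net_charge(example_peptide):.2f}")
```

A negative GRAVY value indicates an overall hydrophilic sequence, and a strongly negative net charge corresponds to the acidic character discussed for MsyB and YjgD.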

However, the molecular dimensions and structural properties of fusion tags can also interfere with the folding and functionality of target proteins, as evidenced by our experimental data. A notable example is the DsbA tag, which substantially quenched eGFP fluorescence despite maintaining reasonable solubility (Fig. 2C). In this experimental design, the soluble protein EcFabG was intentionally selected to analyze the impact of fusion tags on its solubility and function. As anticipated, the addition of tags reduced its solubility ratio to varying degrees, with the largest decrease reaching 58 percentage points (from 93% for the wild-type enzyme to 35% with the SlyD tag). Furthermore, tags including Crr, Ecotin, DsbA, and SlyD were found to impair EcFabG's enzymatic activity (Fig. 3C). These results emphasize that while fusion tags can enhance solubility, they may also interfere with protein function. It is therefore common practice to remove fusion tags after purification [28]. Interestingly, however, several tags in this study, including SlyD, MsyB, and Crr, enhanced the enzymatic activity of Mals even before cleavage (Fig. 4C). This suggests that, in some cases, fusion partners may do more than improve solubility; they may also assist in folding or stabilize the active conformation of certain target proteins [14, 15]. The underlying mechanisms warrant further investigation.

In summary, our results confirm that no single fusion tag is universally effective for all target proteins. The optimal tag must be empirically determined through parallel screening [15]. The pX vector system developed in this study provides a convenient and efficient platform for such screening, enabling rapid identification of the most suitable fusion tag to enhance both the solubility and functional yield of diverse recombinant proteins.

Conclusions

In this study, we developed a novel and versatile pX vector series for enhancing the soluble expression of recombinant proteins in E. coli.
This system integrates eight medium-sized, TEV-cleavable fusion tags into a standardized backbone, enabling high-throughput parallel cloning and screening. Our comprehensive evaluation using diverse model and challenging target proteins demonstrates that fusion tags exert protein-specific effects on solubility, expression yield, and biological activity. No single tag was universally optimal, underscoring the necessity of empirical screening. The pX platform effectively addresses this need by facilitating the rapid identification of the most suitable fusion partner for a given protein. With its proven efficacy in promoting the soluble and functional expression of even recalcitrant proteins, this system represents a valuable and streamlined tool for both academic and industrial protein research.

Abbreviations

E. coli: Escherichia coli
LB: Luria Bertani
IPTG: Isopropyl-β-D-1-thiogalactopyranoside
MCS: multi-cloning site
TEV: Tobacco Etch Virus
eGFP: enhanced green fluorescent protein
EcFabG: 3-ketoacyl-ACP reductase
Mals: maltogenic amylase
ArsC: Arsenate reductase
Crr: Glucose-specific phosphotransferase (PTS) enzyme IIA
DsbA: Disulfide bond formation protein A
Ecotin: E. coli trypsin inhibitor
MsyB: An acidic E. coli protein
SlyD: An aggregation-resistant protein
Snut: Solubility 'eNhancing' Ubiquitous Tag
YjgD: The hypothetical E. coli ORF

Declarations

Acknowledgements

Not applicable.

Author contributions

LLZ and MJC: Writing–original draft, Writing–review & editing, Data curation, Visualization. LLZ, CJT, TZY and CYQ: carried out all experiments. ZWB and MJC: planning, design, and coordination of the research. HZ and WHH: Project administration. All authors reviewed the manuscript.

Funding

This study was supported by the following projects: Industry-University Cooperation Project (H20230317) and National Natural Science Foundation of China (32570032).

Availability of data and materials

All of the data generated and used in this work are included in the manuscript and are available as supplementary material.

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

References

1. Hayat SMG, Farahani N, Golichenari B, Sahebkar A. Recombinant Protein Expression in Escherichia coli (E. coli): What We Need to Know. Curr Pharm Des. 2018;24(6):718–725. https://doi.org/10.2174/1381612824666180131121940.
2. Rosano GL, Ceccarelli EA. Recombinant protein expression in Escherichia coli: advances and challenges. Frontiers in microbiology. 2014;5:172. https://doi.org/10.3389/fmicb.2014.00172.
3. Caswell J, Snoddy P, McMeel D, Buick RJ, Scott CJ. Production of recombinant proteins in Escherichia coli using an N-terminal tag derived from sortase. Protein Expr Purif. 2010;70(2):143–150. https://doi.org/10.1016/j.pep.2009.10.012.
4. Paraskevopoulou V, Falcone FH. Polyionic Tags as Enhancers of Protein Solubility in Recombinant Protein Expression. Microorganisms. 2018;6(2):47. https://doi.org/10.3390/microorganisms6020047.
5. di Guan C, Li P, Riggs PD, Inouye H. Vectors that facilitate the expression and purification of foreign peptides in Escherichia coli by fusion to maltose-binding protein. Gene. 1988;67(1):21–30. https://doi.org/10.1016/0378-1119(88)90004-2.
6. Malakhov MP, Mattern MR, Malakhova OA, Drinker M, Weeks SD, Butt TR. SUMO fusions and SUMO-specific protease for efficient expression and purification of proteins. J Struct Funct Genomics. 2004;5(1-2):75–86. https://doi.org/10.1023/b:Jsfg.0000029237.70316.52.
7. LaVallie ER, DiBlasio EA, Kovacic S, Grant KL, Schendel PF, McCoy JM. A Thioredoxin Gene Fusion Expression System That Circumvents Inclusion Body Formation in the E. coli Cytoplasm. Nat Biotechnol. 1993;11(2):187–193. https://doi.org/10.1038/nbt0293-187.
8. Song JA, Lee DS, Park JS, Han KY, Lee J. A novel Escherichia coli solubility enhancer protein for fusion expression of aggregation-prone heterologous proteins. Enzyme Microb Technol. 2011;49(2):124–130. https://doi.org/10.1016/j.enzmictec.2011.04.013.
9. Han K, Seo H, Song J, Ahn K, Park J, Lee J. Transport proteins PotD and Crr of Escherichia coli, novel fusion partners for heterologous protein expression. Biochim Biophys Acta. 2007;1774(12):1536–1543. https://doi.org/10.1016/j.bbapap.2007.09.012.
10. Zhang Y, Olsen DR, Nguyen KB, Olson PS, Rhodes ET, Mascarenhas D. Expression of eukaryotic proteins in soluble form in Escherichia coli. Protein Expr Purif. 1998;12(2):159–165. https://doi.org/10.1006/prep.1997.0834.
11. Malik A, Rudolph R, Söhling B. A novel fusion protein system for the production of native human pepsinogen in the bacterial periplasm. Protein Expr Purif. 2006;47(2):662–671. https://doi.org/10.1016/j.pep.2006.02.018.
12. Zou Z, Cao L, Zhou P, Su Y, Sun Y, Li W. Hyper-acidic protein fusion partners improve solubility and assist correct folding of recombinant proteins expressed in Escherichia coli. J Biotechnol. 2008;135(4):333–339. https://doi.org/10.1016/j.jbiotec.2008.05.007.
13. Su Y, Zou Z, Feng S, Zhou P, Cao L. The acidity of protein fusion partners predominantly determines the efficacy to improve the solubility of the target proteins expressed in Escherichia coli. J Biotechnol. 2007;129(3):373–382. https://doi.org/10.1016/j.jbiotec.2007.01.015.
14. Han KY, Song JA, Ahn KY, Park JS, Seo HS, Lee J. Solubilization of aggregation-prone heterologous proteins by covalent fusion of stress-responsive Escherichia coli protein, SlyD. Protein Eng Des Sel. 2007;20(11):543–549. https://doi.org/10.1093/protein/gzm055.
15. Köppl C, Lingg N, Fischer A, Kröß C, Loibl J, Buchinger W, Schneider R, Jungbauer A, Striedner G, Cserjan-Puschmann M. Fusion Tag Design Influences Soluble Recombinant Protein Production in Escherichia coli. International Journal of Molecular Sciences. 2022;23(14):7678. https://doi.org/10.3390/ijms23147678.
16. Cserjan-Puschmann M, Lingg N, Engele P, Kröß C, Loibl J, Fischer A, Bacher F, Frank A-C, Öhlknecht C, Brocard C et al. Production of Circularly Permuted Caspase-2 for Affinity Fusion-Tag Removal: Cloning, Expression in Escherichia coli, Purification, and Characterization. Biomolecules. 2020;10(12):1592. https://doi.org/10.3390/biom10121592.
17. Kapust RB, Tözsér J, Copeland TD, Waugh DS. The P1′ specificity of tobacco etch virus protease. Biochem Biophys Res Commun. 2002;294(5):949–955. https://doi.org/10.1016/S0006-291X(02)00574-0.
18. Wang HZ, Chu ZZ, Chen CC, Cao AC, Tong X, Ouyang CB, Yuan QH, Wang MN, Wu ZK, Wang HH et al. Recombinant Passenger Proteins Can Be Conveniently Purified by One-Step Affinity Chromatography. PLoS One. 2015;10(12):e0143598. https://doi.org/10.1371/journal.pone.0143598.
19. Cha HJ, Wu CF, Valdes JJ, Rao G, Bentley WE. Observations of green fluorescent protein as a fusion partner in genetically engineered Escherichia coli: monitoring protein expression and solubility. Biotechnol Bioeng. 2000;67(5):565–574. https://doi.org/10.1002/(SICI)1097-0290(20000305)67:53.0.CO;2-P.
20. Hu Z, Ma J, Chen Y, Tong W, Zhu L, Wang H, Cronan JE. Escherichia coli FabG 3-ketoacyl-ACP reductase proteins lacking the assigned catalytic triad residues are active enzymes. J Biol Chem. 2021;296:100365. https://doi.org/10.1016/j.jbc.2021.100365.
21. Raran-Kurussi S, Cherry S, Zhang D, Waugh DS. Removal of Affinity Tags with TEV Protease. Methods Mol Biol. 2017;1586:221–230. https://doi.org/10.1007/978-1-4939-6887-9_14.
22. Li Y, Su L, Wu J, Wu D. Recombinant Expression and Fermentation Optimization of B. stearothermophilu Maltogenic Amylases in Bacillus subtilis. Journal of Food Science and Biotechnology. 2020;39(02):1–9. https://doi.org/10.3969/j.issn.1673-1689.2020.02.001.
23. Haag AF, Wehmeier S, Muszyński A, Kerscher B, Fletcher V, Berry SH, Hold GL, Carlson RW, Ferguson GP. Biochemical characterization of Sinorhizobium meliloti mutants reveals gene products involved in the biosynthesis of the unusual lipid A very long-chain fatty acid. J Biol Chem. 2011;286(20):17455–17466. https://doi.org/10.1074/jbc.M111.236356.
24. Zhao L, Cao J, Liu X, Li Y, Wu J, Su L. Optimizing protein folding in prokaryotes: Strategies to enhance soluble expression of recombinant proteins. Bioresour Technol. 2026;439:133266. https://doi.org/10.1016/j.biortech.2025.133266.
25. Fang J, Zou L, Zhou X, Cheng B, Fan J. Synonymous rare arginine codons and tRNA abundance affect protein production and quality of TEV protease variant. PLoS One. 2014;9(11):e112254. https://doi.org/10.1371/journal.pone.0112254.
26. Park J-S, Han K-Y, Lee J-H, Song J-A, Ahn K-Y, Seo H-S, Sim S-JJ, Kim S-W, Lee J. Solubility enhancement of aggregation-prone heterologous proteins by fusion expression using stress-responsive Escherichia coli protein, RpoS. BMC Biotechnol. 2008;8:15–15. https://doi.org/10.1186/1472-6750-8-15.
27. Chen J-P, Gong J-S, Su C, Li H, Xu Z-H, Shi J-S. Improving the soluble expression of difficult-to-express proteins in prokaryotic expression system via protein engineering and synthetic biology strategies. Metab Eng. 2023;78:99–114. https://doi.org/10.1016/j.ymben.2023.05.007.
28. Raran-Kurussi S, Waugh DS. Expression and Purification of Recombinant Proteins in Escherichia coli with a His(6) or Dual His(6)-MBP Tag. Methods Mol Biol. 2017;1607:1–15. https://doi.org/10.1007/978-1-4939-7000-1_1.
Supplementary Files Supplementarymaterial251122.docx Status: Published Journal Publication published 12 Feb, 2026 Read the published version in BMC Biotechnology → Version 1 posted Editorial decision: Revision requested 06 Jan, 2026 Reviews received at journal 05 Jan, 2026 Reviews received at journal 05 Jan, 2026 Reviews received at journal 28 Dec, 2025 Reviews received at journal 21 Dec, 2025 Reviewers agreed at journal 17 Dec, 2025 Reviewers agreed at journal 15 Dec, 2025 Reviewers agreed at journal 15 Dec, 2025 Reviewers agreed at journal 15 Dec, 2025 Reviewers agreed at journal 04 Dec, 2025 Reviewers invited by journal 03 Dec, 2025 Editor invited by journal 28 Nov, 2025 Editor assigned by journal 24 Nov, 2025 Submission checks completed at journal 24 Nov, 2025 First submitted to journal 22 Nov, 2025
{"props":{"pageProps":{"initialData":{"identity":"rs-8180750","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":555380938,"identity":"6243d976-5a87-4e25-981c-23bcbd183d84","order_by":0,"name":"Li-Zhen Luo","email":"","orcid":"","institution":"South China Agricultural University","correspondingAuthor":false,"prefix":"","firstName":"Li-Zhen","middleName":"","lastName":"Luo","suffix":""},{"id":555380939,"identity":"817816fa-2a6f-40d2-91d5-be78ae53c629","order_by":1,"name":"Jian-Tao Cai","email":"","orcid":"","institution":"South China Agricultural University","correspondingAuthor":false,"prefix":"","firstName":"Jian-Tao","middleName":"","lastName":"Cai","suffix":""},{"id":555380940,"identity":"9bc958b8-6edd-49c6-965d-3c06f1441172","order_by":2,"name":"Zi-Ying Tan","email":"","orcid":"","institution":"South China Agricultural University","correspondingAuthor":false,"prefix":"","firstName":"Zi-Ying","middleName":"","lastName":"Tan","suffix":""},{"id":555380941,"identity":"0a885453-f387-4c69-a356-cf8efe5059fe","order_by":3,"name":"Yu-Qing Chen","email":"","orcid":"","institution":"South China Agricultural University","correspondingAuthor":false,"prefix":"","firstName":"Yu-Qing","middleName":"","lastName":"Chen","suffix":""},{"id":555380942,"identity":"1bd68b98-3283-4c98-911d-b315bc2dd026","order_by":4,"name":"Zhe Hu","email":"","orcid":"","institution":"South China Agricultural 
University","correspondingAuthor":false,"prefix":"","firstName":"Zhe","middleName":"","lastName":"Hu","suffix":""},{"id":555380943,"identity":"799fd77f-46d5-48a2-ae77-73aaa50bdf7b","order_by":5,"name":"Hai-Hong Wang","email":"","orcid":"","institution":"South China Agricultural University","correspondingAuthor":false,"prefix":"","firstName":"Hai-Hong","middleName":"","lastName":"Wang","suffix":""},{"id":555380944,"identity":"72d7d25c-596a-4a8b-bdc9-9c68842e3944","order_by":6,"name":"Wen-Bin Zhang","email":"","orcid":"","institution":"South China Agricultural University","correspondingAuthor":false,"prefix":"","firstName":"Wen-Bin","middleName":"","lastName":"Zhang","suffix":""},{"id":555380945,"identity":"6b96e9b8-63a9-43ff-935e-e210985bee9f","order_by":7,"name":"Jin-Cheng Ma","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAA8UlEQVRIiWNgGAWjYHCChAMMDEDEwHwAJkC0FjaYUsJaQACkhceAOC267QceHi6ouSNnzr/m84cfNYcZ+NlzDBh+7sCtxexMQsLhGceeGVvOeLtNsufYYQbJnjcGjL1n8Gg5ANTC23A4ccONs9uYGdgOMxjcyDFgZmzDo+X8A7CW+g03zjz+zPDvMIM9QS03ILYkGJzvYZBmbAPaIkFQC9AWnmOHDTfcYDOT7O1L55E486zgYC9eh+Ukf+apOSxvcP7w4w8/vlnL8bcnb3zwE48WYHQkQGgJCM0DIg7g08DAwA6V5yegbhSMglEwCkYuAACQIF2lwk2N6AAAAABJRU5ErkJggg==","orcid":"","institution":"South China Agricultural University","correspondingAuthor":true,"prefix":"","firstName":"Jin-Cheng","middleName":"","lastName":"Ma","suffix":""}],"badges":[],"createdAt":"2025-11-22 13:38:13","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-8180750/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-8180750/v1","draftVersion":[],"editorialEvents":[{"content":"https://doi.org/10.1186/s12896-026-01109-1","type":"published","date":"2026-02-12T15:57:32+00:00"}],"editorialNote":"","failedWorkflow":false,"files":[{"id":97693441,"identity":"3376e3d0-ab63-4674-ba38-86d66b0709bd","added_by":"auto","created_at":"2025-12-08 
11:17:08","extension":"docx","order_by":0,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":19276924,"visible":true,"origin":"","legend":"","description":"","filename":"ApplicationofaNovelFusionTagSystemforEnhancedSolubleExpressionofRecombinantProteinsinEscherichiacoli251122.docx","url":"https://assets-eu.researchsquare.com/files/rs-8180750/v1/b1115fe5efc8db7a75003e9c.docx"},{"id":97893800,"identity":"4941f31f-f1bd-4ff7-8f34-791c7b2827a9","added_by":"auto","created_at":"2025-12-10 15:31:14","extension":"json","order_by":1,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":8841,"visible":true,"origin":"","legend":"","description":"","filename":"39814dd3fe474ffd8d284dca14cf8c9d.json","url":"https://assets-eu.researchsquare.com/files/rs-8180750/v1/436f5f05fa178ae6686623fa.json"},{"id":97894238,"identity":"c21f9eb3-273a-4c3d-9d0e-4b3598bd434e","added_by":"auto","created_at":"2025-12-10 15:32:05","extension":"docx","order_by":2,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":7345607,"visible":true,"origin":"","legend":"","description":"","filename":"Supplementarymaterial251122.docx","url":"https://assets-eu.researchsquare.com/files/rs-8180750/v1/c1cfbc8ef7afd330a3853b65.docx"},{"id":97693426,"identity":"35a54280-00ed-4bc1-9d62-73a64d670d77","added_by":"auto","created_at":"2025-12-08 11:17:08","extension":"xml","order_by":3,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":68888,"visible":true,"origin":"","legend":"","description":"","filename":"39814dd3fe474ffd8d284dca14cf8c9d1enriched.xml","url":"https://assets-eu.researchsquare.com/files/rs-8180750/v1/4bd486609d4058146d382aac.xml"},{"id":97693429,"identity":"062e43d0-a2ff-4ab6-8579-d70f869184c9","added_by":"auto","created_at":"2025-12-08 
11:17:08","extension":"jpeg","order_by":4,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":1813144,"visible":true,"origin":"","legend":"","description":"","filename":"floatimage1.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-8180750/v1/9604edbd407f4b8a47035810.jpeg"},{"id":97894436,"identity":"fa5d501e-308c-4f52-ba02-a2e7f2d77227","added_by":"auto","created_at":"2025-12-10 15:32:29","extension":"jpeg","order_by":5,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":2293392,"visible":true,"origin":"","legend":"","description":"","filename":"floatimage2.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-8180750/v1/cb6c24c9ebe7833db6c3d571.jpeg"},{"id":97892817,"identity":"74962ef0-eefb-44f9-83d4-d07cd59386db","added_by":"auto","created_at":"2025-12-10 15:22:41","extension":"jpeg","order_by":6,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":5424172,"visible":true,"origin":"","legend":"","description":"","filename":"floatimage3.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-8180750/v1/4581f94c2cf48f9eee4e5436.jpeg"},{"id":97693438,"identity":"543a5338-5479-40fc-bd96-3881cff1734b","added_by":"auto","created_at":"2025-12-08 11:17:08","extension":"jpeg","order_by":7,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":4950878,"visible":true,"origin":"","legend":"","description":"","filename":"floatimage4.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-8180750/v1/b792fc636af61a7c97d945e1.jpeg"},{"id":97693440,"identity":"283b31ca-34d3-4b36-bca2-9d5f4ca7ecb9","added_by":"auto","created_at":"2025-12-08 
11:17:08","extension":"jpeg","order_by":8,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":4667986,"visible":true,"origin":"","legend":"","description":"","filename":"floatimage5.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-8180750/v1/abdbbad48ec357e7f8d4acbb.jpeg"},{"id":97893609,"identity":"49aa32f6-bda0-4057-8103-b4dcd1d96119","added_by":"auto","created_at":"2025-12-10 15:30:50","extension":"png","order_by":9,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":36114,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage1.png","url":"https://assets-eu.researchsquare.com/files/rs-8180750/v1/01190f634136fd259c74e4cd.png"},{"id":97693427,"identity":"7636720c-a973-4fde-b02d-b92dc932e229","added_by":"auto","created_at":"2025-12-08 11:17:08","extension":"png","order_by":10,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":46717,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage2.png","url":"https://assets-eu.researchsquare.com/files/rs-8180750/v1/ab10965f3fa1089c699b6df6.png"},{"id":97894027,"identity":"0a205c6b-2419-4943-8e2e-da7aebb0d14f","added_by":"auto","created_at":"2025-12-10 15:31:49","extension":"png","order_by":11,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":48543,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage3.png","url":"https://assets-eu.researchsquare.com/files/rs-8180750/v1/b7a62b990637ad2079d95140.png"},{"id":97894133,"identity":"38dbc729-3514-46a0-ba87-340a3314a99c","added_by":"auto","created_at":"2025-12-10 
15:31:57","extension":"png","order_by":12,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":46100,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage4.png","url":"https://assets-eu.researchsquare.com/files/rs-8180750/v1/6ac89d77efe91810354eff25.png"},{"id":97693434,"identity":"b95a1de1-85fd-4213-983c-e58b04822732","added_by":"auto","created_at":"2025-12-08 11:17:08","extension":"png","order_by":13,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":59924,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage5.png","url":"https://assets-eu.researchsquare.com/files/rs-8180750/v1/57f3160241ce873e453b2766.png"},{"id":97693433,"identity":"81fc5d22-36e5-4a94-99df-73e893570bda","added_by":"auto","created_at":"2025-12-08 11:17:08","extension":"xml","order_by":14,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":68019,"visible":true,"origin":"","legend":"","description":"","filename":"39814dd3fe474ffd8d284dca14cf8c9d1structuring.xml","url":"https://assets-eu.researchsquare.com/files/rs-8180750/v1/9d7de8d3d02e63a37533fba8.xml"},{"id":97693435,"identity":"1fb661eb-8850-4671-ba41-ceb6663eb560","added_by":"auto","created_at":"2025-12-08 11:17:08","extension":"html","order_by":15,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":76176,"visible":true,"origin":"","legend":"","description":"","filename":"earlyproof.html","url":"https://assets-eu.researchsquare.com/files/rs-8180750/v1/cd8da866d66844004f54e9f8.html"},{"id":97693419,"identity":"0ae83557-7e96-4eab-a040-046d323b4bff","added_by":"auto","created_at":"2025-12-08 11:17:08","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":36114,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eSchematic diagram of the pX vector series with various N-terminal fusion 
tags.\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eAll vectors share a conserved structure featuring a standardized multiple cloning site (MCS) flanked by dual 6×His tags, a synthetic linker peptide, and a TEV protease cleavage site. A panel of fusion tags (ArsC, Crr, DsbA, Ecotin, MsyB, SlyD, Snut, and YjgD) are incorporated upstream of the MCS to facilitate diverse protein expression and purification strategies.\u003c/p\u003e","description":"","filename":"Onlinefloatimage1.png","url":"https://assets-eu.researchsquare.com/files/rs-8180750/v1/a7f694b97f3d625cc85cfd9e.png"},{"id":97693420,"identity":"ebe8334d-1080-426b-8be1-649e5ffff03a","added_by":"auto","created_at":"2025-12-08 11:17:08","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":46717,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eAnalysis of solubility and fluorescence of eGFP fused with different tags.\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e(A) Recombinant protein expression in \u003cem\u003eE. coli\u003c/em\u003e BL21 (DE3) harboring different fusion tag constructs. Cultures were grown in LB medium and induced with 240 μg/mL IPTG at 37°C for 4 h. SDS-PAGE analysis of the soluble fraction (supernatants, S) and insoluble fraction (pellets, P) of eGFP fused with various tags. M, protein molecular weight marker. (B) Quantification of protein solubility ratio based on grayscale density analysis for eGFP. (C) Comparison of fluorescence intensity (RLU) between tagged and No-tag eGFP. Fluorescence of eGFP was measured at excitation 489 nm/emission 511 nm. 
Data represent mean ± SD (n=3); ns, not significant (P \u0026gt; 0.05, Student's t-test).\u003c/p\u003e","description":"","filename":"Onlinefloatimage2.png","url":"https://assets-eu.researchsquare.com/files/rs-8180750/v1/4ecb17a5e108b298aa6906ac.png"},{"id":97894240,"identity":"d618eb4b-67de-4ded-87e4-1604ac99ff1a","added_by":"auto","created_at":"2025-12-10 15:32:05","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":48543,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eEvaluation of solubility and enzyme activity for EcFabG fusion proteins.\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e(A) SDS-PAGE analysis of soluble (S) and insoluble (P) fractions of EcFabG fused with different tags expressed in \u003cem\u003eE. coli\u003c/em\u003e BL21 (DE3). Cultures were induced at 37°C for 4 h. M, protein molecular weight marker. (B) Quantification of solubility ratio based on grayscale density for EcFabG. (C) \u003cem\u003eIn vitro \u003c/em\u003eenzymatic activity assay of EcFabG fusions using conformationally sensitive gel electrophoresis. The reaction shows the conversion of malonyl-ACP (Mal-ACP) and octanoyl-ACP (C\u003csub\u003e8\u003c/sub\u003e-ACP) to decanoyl-ACP (C\u003csub\u003e10\u003c/sub\u003e-ACP) and dodecanoyl-ACP (C\u003csub\u003e12\u003c/sub\u003e-ACP). 
Control reaction showing enzymatic activity of No-tag EcFabG.\u003c/p\u003e","description":"","filename":"Onlinefloatimage3.png","url":"https://assets-eu.researchsquare.com/files/rs-8180750/v1/8f5df03418a598e4fa27ba93.png"},{"id":97693424,"identity":"94b30d1a-eb50-4800-b001-9b7f717b08db","added_by":"auto","created_at":"2025-12-08 11:17:08","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":46100,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eFunctional analysis of tagged and tag-free Mals proteins.\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e(A) SDS-PAGE analysis of soluble (S) and insoluble (P) fractions of Mals with different fusion tags expressed in \u003cem\u003eE. coli\u003c/em\u003e BL21 (DE3). M, protein molecular weight marker. (B) Quantification of solubility ratio based on grayscale density for Mals. (C) Quantitative analysis of maltogenic amylase activity before and after TEV protease cleavage (μmol/mL/min).\u003c/p\u003e","description":"","filename":"Onlinefloatimage4.png","url":"https://assets-eu.researchsquare.com/files/rs-8180750/v1/18fbb74ff5baa642a3cafadd.png"},{"id":97693421,"identity":"51713bb5-a795-4a5a-bc13-d50a34519717","added_by":"auto","created_at":"2025-12-08 11:17:08","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":59924,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eEnhancement of expression and solubility of NodE, FabGXL, FabF1XL, FabF2XL and FabZXL using the pX vector series.\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eSDS-PAGE analysis of NodE (A), FabGXL (B), FabF1XL (C), FabF2XL (D) and FabZXL (E) expression from the soluble fraction (supernatants, S) and insoluble fraction (pellets, P) in \u003cem\u003eE. coli\u003c/em\u003e BL21 (DE3) using the pX medium-sized fusion tag vector series. 
M, molecular weight marker.\u003c/p\u003e","description":"","filename":"Onlinefloatimage5.png","url":"https://assets-eu.researchsquare.com/files/rs-8180750/v1/af20c314cbfe9daa7c205038.png"},{"id":102786635,"identity":"0c549971-1ad5-4694-be23-9557356e0dbd","added_by":"auto","created_at":"2026-02-16 16:14:26","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1268539,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-8180750/v1/80abb35d-4b92-4361-bf7e-1bc7859a3a42.pdf"},{"id":97892804,"identity":"30616a87-6ef0-490a-9108-b770dbc2ae3d","added_by":"auto","created_at":"2025-12-10 15:22:18","extension":"docx","order_by":0,"title":"","display":"","copyAsset":false,"role":"supplement","size":7345607,"visible":true,"origin":"","legend":"","description":"","filename":"Supplementarymaterial251122.docx","url":"https://assets-eu.researchsquare.com/files/rs-8180750/v1/95bf96b52bc0e8e7f1d16676.docx"}],"financialInterests":"No competing interests reported.","formattedTitle":"Application of a Novel Fusion Tag System for Enhanced Soluble Expression of Recombinant Proteins in Escherichia coli","fulltext":[{"header":"Introduction","content":"\u003cp\u003eThe \u003cem\u003eEscherichia coli\u003c/em\u003e expression system is a cornerstone in biotechnology for producing recombinant proteins, prized for its well-characterized genetics, rapid growth, high yield, and cost-effectiveness [1]. Despite these advantages, the heterologous expression of many proteins in \u003cem\u003eE. coli\u003c/em\u003e is often hampered by several challenges, including codon bias, proteolytic degradation, and\u0026mdash;most notably\u0026mdash;the aggregation of target proteins into insoluble inclusion bodies [2]. 
The formation of these inactive aggregates represents a major bottleneck, influenced by factors ranging from cultivation conditions (e.g., temperature, pH, ionic strength) to intrinsic properties of the target protein itself (such as its amino acid sequence) [3].\u003c/p\u003e\u003cp\u003eVarious strategies have been employed to improve the solubility and yield of biologically active proteins, including optimization of expression conditions, engineering of host strains, co-expression of molecular chaperones, and the use of fusion tags[3, 4]. Among these, fusion tags\u0026mdash;typically derived from highly soluble proteins\u0026mdash;have become a mainstream solution. By being fused to the N- or C-terminus of the target protein, these tags can promote solubility, enhance stability, and facilitate purification. A number of such tags from \u003cem\u003eE. coli\u003c/em\u003e and other organisms have been widely adopted, including Maltose Binding Protein (MBP), Thioredoxin (TrxA), N-utilization substance A (NusA), and Small Ubiquitin-like Modifier (SUMO) [5\u0026ndash;7].\u003c/p\u003e\u003cp\u003eBeyond these well-established tags, a diverse array of other partners with solubilizing potential has been explored. For instance, arsenate reductase (ArsC) exhibits high cytoplasmic solubility and folding capacity, serving as an effective fusion partner [8]. The glucose-specific phosphotransferase IIA component (Crr) and the spermidine/putrescine-binding periplasmic protein (PotD) have also been shown to significantly increase the solubility of heterologous proteins [9]. Disulfide bond formation protein A (DsbA) is particularly useful for promoting the soluble expression of proteins requiring disulfide bond formation [10], while the periplasmic trypsin inhibitor Ecotin has facilitated the expression of challenging proteins like human pepsinogen [11]. 
Furthermore, highly acidic polypeptides such as MsyB and YjgD are believed to improve solubility by increasing the net negative charge and hydrophilicity of the fusion protein [12, 13]. The anti-aggregation protein SlyD has proven effective in solubilizing aggregation-prone proteins in the cytoplasm [14], and the Solubility 'eNhancing' Ubiquitous Tag (SNUT), derived from a portion of the trans-peptidase sortase found in \u003cem\u003eStaphylococcus aureus\u003c/em\u003e, confers favorable solubility characteristics toward target proteins [3].\u003c/p\u003e\u003cp\u003eHowever, no single fusion tag is universally effective, as their success is highly dependent on the specific target protein [12, 15]. Moreover, the tag itself can sometimes interfere with the structure, function, or immunogenicity of the protein of interest. For many applications in therapeutics and functional studies, the removal of the fusion tag is therefore essential [15]. This is typically achieved by incorporating a specific protease cleavage site between the tag and the target protein. Proteases like Factor Xa, thrombin, and enterokinase have been used, but they often suffer from inefficiency, high cost, or non-specific cleavage[2, 16]. In contrast, Tobacco Etch Virus (TEV) protease offers high specificity and efficiency, recognizing the ENLYFQ/G sequence and cleaving between Q and G, making it an excellent tool for tag removal [17, 18].\u003c/p\u003e\u003cp\u003eTo address the need for a systematic and efficient method to identify the optimal fusion tag for a given protein, we designed and constructed a comprehensive recombinant protein expression system for \u003cem\u003eE. coli\u003c/em\u003e. 
This system allows for the parallel cloning of a target gene into eight different vectors, each equipped with a medium-sized fusion tag (ArsC, Crr, DsbA, Ecotin, MsyB, SlyD, Snut, or YjgD) (Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e), a flexible linker, a TEV protease site, and an N-terminal 6\u0026times;His affinity tag. We systematically investigated the effects of these tags on the solubility and function of three model proteins: enhanced Green Fluorescent Protein (eGFP) [18, 19], 3-ketoacyl-ACP reductase (EcFabG) [20] and maltogenic amylase (Mals). Furthermore, we evaluated the system's efficacy on six difficult-to-express proteins (PulA, NodE, FabF1XL, FabF2XL, FabZXL and FabGXL) to demonstrate its broad utility. Our findings establish this system as a robust and versatile platform for enhancing the soluble expression of diverse recombinant proteins.\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eSolubility enhancer tags used in this study\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"5\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eTags\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eSource\u003c/p\u003e\u003c/th\u003e\u003cth 
align=\"left\" colname=\"c3\"\u003e\u003cp\u003eFull Name\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003eSize (kDa)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003eReference\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eArsC\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cem\u003eEscherichia coli\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eArsenate reductase\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e15.7\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e[8]\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eCrr\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cem\u003eEscherichia coli\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eGlucose-specific phosphotransferase (PTS) enzyme IIA\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e18.1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e[9]\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eDsbA\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cem\u003eEscherichia coli\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eDisulfide bond formation protein A\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e23.0\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c5\"\u003e\u003cp\u003e[10]\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eEcotin\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cem\u003eEscherichia coli\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e\u003cem\u003eE. coli\u003c/em\u003e trypsin inhibitor\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e18.1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e[11]\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eMsyB\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cem\u003eEscherichia coli\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eAn acidic \u003cem\u003eE. coli\u003c/em\u003e protein\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e14.1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e[13]\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eSlyD\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cem\u003eEscherichia coli\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eAn aggregation-resistant protein\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e20.7\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e[14]\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eSnut\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cem\u003eStaphylococcus 
aureus\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eSolubility 'eNhancing' Ubiquitous Tag\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e16.8\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e[3]\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eYjgD\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cem\u003eEscherichia coli\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eThe hypothetical \u003cem\u003eE. coli\u003c/em\u003e ORF\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e15.5\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e[12]\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e"},{"header":"1 Materials and Methods","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e\u003ch2\u003e1.1 Bacterial Strains, Plasmids, and Growth Conditions\u003c/h2\u003e\u003cp\u003eThe bacterial strains and plasmids used in this study are listed in Table \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e. \u003cem\u003eE. coli\u003c/em\u003e strains were cultivated in Luria-Bertani (LB) medium at 37\u0026deg;C. Antibiotics and inducers were supplemented as needed at the following final concentrations: ampicillin, 100 \u0026micro;g/mL; kanamycin, 30 \u0026micro;g/mL; and isopropyl-β-D-thiogalactoside (IPTG), 240 \u0026micro;g/mL. Bacterial growth was monitored by measuring the optical density at 600 nm (OD\u003csub\u003e600\u003c/sub\u003e). Restriction enzymes, high-fidelity DNA polymerase, T4 DNA ligase, and other molecular biology reagents were obtained from Takara Biotechnology (Dalian, China). 
Primers were synthesized, and DNA sequencing was performed by Sangon Biotech (Shanghai, China). All other chemicals were of molecular biology grade.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec4\" class=\"Section2\"\u003e\u003ch2\u003e1.2 Vector Construction\u003c/h2\u003e\u003cp\u003eAll expression vectors were derived from the pET-28b(+) backbone. The general structure of the fusion constructs is depicted in Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e. Briefly, the gene encoding each fusion tag (e.g., ArsC) was amplified from \u003cem\u003eE. coli\u003c/em\u003e MG1655 genomic DNA using primer pairs Arsc-P1 and Arsc-P2 listed in Table S2. Sequential overlap extension PCR was used to assemble the final cassette containing the tag, a synthetic linker (GGGGS)₂, the TEV protease recognition site (ENLYFQ/G), and a 6\u0026times;His tag. The resulting fragment was digested with \u003cem\u003eNco\u003c/em\u003eI and \u003cem\u003eBam\u003c/em\u003eHI and ligated into similarly digested pET-28b(+) to generate the initial fusion tag vector, pArsC. The other seven tag fragments (Crr, DsbA, Ecotin, MsyB, SlyD, Snut, YjgD) were amplified, digested with \u003cem\u003eNco\u003c/em\u003eI and \u003cem\u003eKpn\u003c/em\u003eI, and subsequently cloned into the equivalently digested pArsC backbone, thereby replacing the original ArsC tag to generate the final pX vector series (pCrr, pDsbA, pEcotin, pMsyB, pSlyD, pSnut, and pYjgD). All constructs were verified by DNA sequencing (Sangon Biotech, Guangzhou).\u003c/p\u003e\u003cp\u003eThe genes encoding Mals and PulA were synthesized with GenSmart\u0026trade; codon optimization for \u003cem\u003eE. coli\u003c/em\u003e, based on sequences from \u003cem\u003eBacillus\u003c/em\u003e sp. WPD616 and \u003cem\u003eThermotoga neapolitana\u003c/em\u003e, respectively. The eGFP gene was amplified from pET28b-egfp (a kind gift from associate Prof Wang) with primer pairs listed in Table S2. 
The EcFabG gene was amplified from \u003cem\u003eE. coli\u003c/em\u003e MG1655 genomic DNA. The genes for NodE, FabF1XL, FabF2XL, FabZXL and FabGXL were amplified from \u003cem\u003eSinorhizobium meliloti\u003c/em\u003e Rm1021 genomic DNA. All target gene sequences were then cloned in parallel into the pX vectors using the same \u003cem\u003eNde\u003c/em\u003eI and \u003cem\u003eHin\u003c/em\u003edIII digestion. All final expression constructs were confirmed by DNA sequencing.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec5\" class=\"Section2\"\u003e\u003ch2\u003e1.3 Protein Expression and Solubility Analysis\u003c/h2\u003e\u003cp\u003eRecombinant vectors were transformed into \u003cem\u003eE. coli\u003c/em\u003e BL21(DE3) competent cells. Single transformants were inoculated into 5 mL of LB medium and grown overnight at 37\u0026deg;C. The overnight cultures were diluted 1:20 into fresh LB medium and grown at 37\u0026deg;C until the OD₆₀₀ reached approximately 0.6\u0026ndash;0.8. Protein expression was induced by adding IPTG to a final concentration of 240 \u0026micro;g/mL, followed by incubation for 4 hours at 37\u0026deg;C or 16 hours at 20\u0026deg;C. Cells were harvested by centrifugation, resuspended in Lysis Buffer (50 mM NaH₂PO₄, 300 mM NaCl, 10 mM imidazole, pH 8.0), and disrupted by sonication on ice. The soluble (supernatant) and insoluble (pellet) fractions were separated by centrifugation and analyzed by 12.5% SDS-PAGE. Protein solubility was assessed based on the distribution of the target protein in the supernatant versus the pellet. For proteins found primarily in the insoluble fraction, induction conditions were optimized by lowering the temperature and extending the induction time. Mals and EcFabG proteins were purified using nickel-nitrilotriacetic acid (Ni-NTA) affinity chromatography as previously described [20]. 
Protein concentrations were quantified by densitometric analysis of SDS-PAGE gels using ImageJ software.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec6\" class=\"Section2\"\u003e\u003ch2\u003e1.4 Fluorescence Analysis of eGFP Fusion Tags\u003c/h2\u003e\u003cp\u003eThe purified eGFP fusion proteins were diluted to a final concentration of 1 \u0026micro;M in 50 mM sodium phosphate buffer (50 mM NaH\u003csub\u003e2\u003c/sub\u003ePO\u003csub\u003e4\u003c/sub\u003e, 300 mM NaCl, pH 8.0). eGFP fluorescence measurements were carried out using a SpectraMax i3x Multi-Mode Microplate Reader (Molecular Devices) [19]. The excitation wavelength was set to 460 nm, and emission spectra were recorded to assess the effect of the eight fusion tags on eGFP fluorescence.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec7\" class=\"Section2\"\u003e\u003ch2\u003e1.5 Enzymatic Activity Assay of EcFabG Fusions\u003c/h2\u003e\u003cp\u003eThe activity of EcFabG and its fusion variants was assessed \u003cem\u003ein vitro\u003c/em\u003e by reconstituting the fatty acid synthesis pathway as described previously [20]. The assay mixture contained 0.1 M sodium phosphate (pH 7.0), 0.1 \u0026micro;g each of EcFabD, EcFabH, EcFabG, EcFabZ and EcFabI, 50 \u0026micro;M NADH, 50 \u0026micro;M NADPH, 1 mM β-mercaptoethanol, 100 \u0026micro;M malonyl‐CoA, 50 \u0026micro;M holo‐ACP and 100 \u0026micro;M acetyl‐CoA in a final volume of 40 \u0026micro;L. The assay mixtures were incubated at 37\u0026deg;C for 1 h and resolved by conformation‐sensitive gel electrophoresis on 20% polyacrylamide gels containing 2.5 M urea. 
The gels were stained with Coomassie brilliant blue R250 to visualize the acyl-ACP products.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec8\" class=\"Section2\"\u003e\u003ch2\u003e1.6 TEV Protease Cleavage and Mals Activity Assay\u003c/h2\u003e\u003cp\u003ePurified Tag-Mals fusion proteins were treated with TEV protease to remove the fusion tags [21]. The enzymatic activity of both tagged and tag-free Mals was determined using the 3,5-dinitrosalicylic acid (DNS) method, which measures the reducing sugars released from starch hydrolysis [22]. Briefly, the reaction mixture containing 1% (w/v) soluble starch in 50 mM sodium citrate buffer (pH 5.5) and an appropriate amount of enzyme was incubated at 60\u0026deg;C for 10 min. The reaction was stopped by adding DNS reagent, followed by boiling for 10 min. The amount of reducing sugars released was measured spectrophotometrically at 540 nm. One unit (U) of enzyme activity was defined as the amount of enzyme required to produce 1 \u0026micro;mol of reducing sugar (expressed as glucose equivalents) per minute under the assay conditions. A standard curve was prepared using glucose solutions of known concentrations (Figure \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e).\u003c/p\u003e\u003c/div\u003e"},{"header":"2 Results","content":"\u003cdiv id=\"Sec10\" class=\"Section2\"\u003e\u003ch2\u003e2.1 Construction of a Versatile Fusion Tag Expression System\u003c/h2\u003e\u003cp\u003eWe successfully constructed a series of pET-28b(+)-derived expression vectors, designated as the pX series. As illustrated in Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e, each vector is designed to fuse one of eight different tags (ArsC, Crr, DsbA, Ecotin, MsyB, SlyD, Snut, or YjgD) to the N-terminus of a target protein via a synthetic linker. 
The construct also includes a TEV protease cleavage site and a standardized multiple cloning site (MCS) flanked by dual 6\u0026times;His affinity tags. In total, 81 recombinant expression vectors were constructed in batches to accommodate the various tag-target protein (eGFP, EcFabG, Mals, PulA, NodE, FabF1XL, FabF2XL, FabZXL or FabGXL) combinations (Table \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e).\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec11\" class=\"Section2\"\u003e\u003ch2\u003e2.2 Fusion Tags Differentially Modulate eGFP Solubility and Fluorescence\u003c/h2\u003e\u003cp\u003eThe green fluorescent protein GFP is a widely used reporter gene for \u003cem\u003ein vivo\u003c/em\u003e expression and is recognized to be partially soluble when expressed in \u003cem\u003eE. coli\u003c/em\u003e [14]. We first employed enhanced Green Fluorescent Protein (eGFP) as a model protein to evaluate the solubility-enhancing effects of the eight fusion tags. Solubility analysis by SDS-PAGE demonstrated that for eGFP, all tested fusion proteins achieved excellent soluble expression except those with Crr and DsbA tags (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eA). Densitometric quantification of the gels revealed that eGFP fused with ArsC, Ecotin, MsyB, SlyD, Snut, and YjgD was predominantly soluble. Notably, the MsyB and Snut tags exhibited particularly remarkable effects, increasing the soluble proportion of eGFP from 15% (wild-type, WT) to 87% and 92%, respectively. Furthermore, the ArsC tag not only enhanced the solubility but also boosted the overall expression level of eGFP (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eB). 
The results indicated that MsyB, ArsC, Ecotin, SlyD, YjgD and Snut effectively promoted the soluble expression of eGFP.\u003c/p\u003e\u003cp\u003eWe next assessed whether the tags interfered with eGFP function by measuring fluorescence intensity. As summarized in Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eC, the fluorescence intensity of the DsbA-tagged eGFP was significantly reduced, suggesting that this tag may interfere with the proper folding of eGFP. In contrast, the other seven tagged proteins exhibited fluorescence intensities comparable to that of wild-type eGFP, indicating that these tags promoted soluble expression without compromising functional integrity.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec12\" class=\"Section2\"\u003e\u003ch2\u003e2.3 Impact of Fusion Tags on EcFabG Solubility and Enzyme Activity\u003c/h2\u003e\u003cp\u003eWe next examined the effect of the fusion tags on 3-ketoacyl-ACP reductase (EcFabG), a soluble enzyme involved in the bacterial fatty acid synthesis pathway. Solubility analysis by SDS-PAGE demonstrated that EcFabG fusions with YjgD, MsyB, DsbA, and ArsC were largely soluble (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eA). Densitometric quantification of the gels indicated that the solubility of EcFabG fused with Crr and SlyD decreased to 38% and 35%, respectively, compared to 93% for the wild-type enzyme (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eB). This demonstrates that fusion with Crr or SlyD led to predominant inclusion body formation, significantly compromising EcFabG solubility. In contrast, tags such as Ecotin and Snut substantially reduced the overall expression level of EcFabG. 
Thus, the Crr, SlyD, Ecotin and Snut tags affected EcFabG most strongly: Crr and SlyD reduced its solubility, whereas Ecotin and Snut reduced its overall expression level.\u003c/p\u003e\u003cp\u003eTo evaluate the functional integrity of the tagged EcFabG proteins, we reconstituted the fatty acid synthesis pathway \u003cem\u003ein vitro\u003c/em\u003e. EcFabG catalyzes the reduction of 3-ketoacyl-ACP to 3-hydroxyacyl-ACP. The enzymatic activity was assessed via conformation-sensitive gel electrophoresis (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eC). EcFabG fused with YjgD, MsyB or ArsC retained significant reductase activity. In contrast, the activity of EcFabG carrying the Crr, Ecotin, DsbA or SlyD tags was markedly impaired, indicating that these tags interfered with the enzyme's catalytic function.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec13\" class=\"Section2\"\u003e\u003ch2\u003e2.4 TEV Protease Cleavage and Its Effect on Mals Activity\u003c/h2\u003e\u003cp\u003eGiven that some tags enhanced solubility but compromised the activity of EcFabG and the fluorescence intensity of eGFP, we further investigated the effect of tag removal on maltogenic amylase (Mals). As shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eA, SDS-PAGE analysis of soluble (S) and insoluble (P) fractions revealed distinct solubility profiles for Mals fused with different tags. Notably, Mals fused with tags such as ArsC, SlyD, DsbA, MsyB, Crr, and Snut showed prominent bands in the soluble fractions, indicating significantly improved soluble expression of Mals. In contrast, the Ecotin-tagged Mals was predominantly detected in the insoluble pellets, suggesting poor solubility under the tested conditions. Quantification of solubility via grayscale density analysis further supported these observations (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eB). 
Most fusion tags significantly improved the solubility of Mals compared to the no-tag control, with Crr, MsyB, SlyD and Snut exhibiting the most pronounced effects.\u003c/p\u003e\u003cp\u003eEnzymatic activity assays conducted before and after TEV protease cleavage provided insights into the functional influence of the fusion tags (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eC). Prior to cleavage, several tagged Mals variants, particularly those with SlyD, MsyB and Crr, displayed high maltogenic amylase activity. Following tag removal by TEV protease, a general reduction in activity was observed across most constructs, suggesting that the presence of the fusion tag enhanced Mals activity relative to the tag-free protein. Notably, the SlyD-tagged Mals retained the highest activity both before and after cleavage, highlighting the dual benefit of SlyD as a solubility enhancer and a positive influence on the enzymatic function of Mals. The results indicated that the fusion tags differentially influenced the solubility and activity of Mals.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec14\" class=\"Section2\"\u003e\u003ch2\u003e2.5 Enhancement of Soluble Expression for Challenging Target Proteins\u003c/h2\u003e\u003cp\u003ePrevious studies have found that some of the enzymes involved in fatty acid synthesis in \u003cem\u003eS. meliloti\u003c/em\u003e Rm1021 are difficult to express and purify, which has prevented their biochemical characterization [23]. To further validate the broad applicability of our system, we tested its efficacy on five difficult-to-express proteins from \u003cem\u003eSinorhizobium meliloti\u003c/em\u003e (NodE, FabGXL, FabF1XL, FabF2XL, and FabZXL), which are typically insoluble or poorly expressed in \u003cem\u003eE. 
As shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e, the fusion tags dramatically improved their solubility profiles: NodE showed prominent soluble expression when fused with MsyB or Snut (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003ea). FabGXL, which was largely insoluble without a tag, achieved soluble expression only with the YjgD tag (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eb). For FabF1XL and FabF2XL, the YjgD and MsyB tags demonstrated the most effective solubilization, while other tags were less effective (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003ec, d). In contrast, all eight fusion tags successfully promoted the soluble expression of FabZXL (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003ee). These results underscore the target protein-dependent nature of tag efficacy and highlight the utility of our multi-tag system for identifying optimal solubility conditions for a wide range of recalcitrant proteins.\u003c/p\u003e\u003c/div\u003e"},{"header":"3 Discussion","content":"\u003cp\u003eThe formation of insoluble inclusion bodies remains a major obstacle to the widespread application of prokaryotic expression systems for recombinant protein production [2, 24]. To address this challenge, we developed a versatile fusion tag system that enables rapid screening for optimal solubility enhancers. In this study, the pullulanase PulA could not be expressed in \u003cem\u003eE. coli\u003c/em\u003e BL21(DE3) (Figure S2), likely due to its origin from the hyperthermophilic anaerobe \u003cem\u003eThermotoga neapolitana\u003c/em\u003e. Although the PulA gene was codon-optimized for \u003cem\u003eE. 
coli\u003c/em\u003e, it still failed to express, suggesting that factors beyond codon usage\u0026mdash;such as protein folding kinetics or compatibility with the host proteostasis network\u0026mdash;may have contributed to its insolubility [2, 25]. Similarly, proteins such as Mals and NodE, which may contain rare codons or complex folding requirements, achieved soluble expression only when fused with specific tags. For instance, MsyB markedly enhanced the solubility of NodE (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003ea), while ArsC, SlyD, DsbA, MsyB, Crr, and Snut all significantly improved the solubility of Mals (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eOther target proteins, including eGFP, FabGXL, FabF1XL, FabF2XL, and FabZXL, were predominantly expressed as inclusion bodies in the absence of fusion tags. While soluble expression was achievable with tag assistance, the efficacy of each tag varied considerably depending on the target protein. For example, all eight tags improved FabZXL solubility (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003ee), whereas only YjgD enabled soluble expression of FabGXL (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eb). Similarly, MsyB and YjgD were the most effective for FabF1XL and FabF2XL (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003ec\u0026ndash;d). ArsC, Ecotin, MsyB, SlyD, Snut, and YjgD fusion tags can promote the solubilization of eGFP (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e). 
These observations highlight the target-specific nature of fusion tag efficacy and underscore the importance of employing a multi-tag screening approach [15, 26].\u003c/p\u003e\u003cp\u003eThe solubilization mechanism of fusion tags remains incompletely understood, though it is often suggested to relate to their own folding properties and biophysical characteristics, such as surface charge or hydrophilicity [12, 27]. To gain preliminary insight into the mechanisms underlying solubility enhancement, we analyzed the core biophysical properties of our eight fusion tags (Supplementary Table S3). We observed a general trend wherein tags characterized by a high negative net charge and hydrophilic nature, as indicated by a negative GRAVY index, and particularly those with highly acidic properties (pI\u0026thinsp;\u0026lt;\u0026thinsp;5.0) such as MsyB and YjgD, tended to be the most effective and versatile solubility enhancers. This finding is consistent with prior studies on acidic fusion partners [12, 13]. However, notable exceptions highlight the complexity of the mechanism. For instance, the superior performance of SlyD with Mals likely stems from its intrinsic chaperone activity rather than its charge properties alone [14]. Interestingly, the less acidic Snut (pI 6.324, GRAVY \u0026minus;\u0026thinsp;1.106, net charge \u0026minus;\u0026thinsp;1.79) still demonstrated considerable solubilization efficacy, performing well with target proteins such as eGFP, Mals, NodE, and FabZXL. 
Furthermore, the ability of certain tags (e.g., YjgD) to solubilize a particularly recalcitrant protein (FabGXL) where the other tags, including MsyB, failed underscores that optimal tag selection results from a complex, individualized match between the tag's biophysical properties and the target protein's specific folding pathway and structural needs.\u003c/p\u003e\u003cp\u003eHowever, the molecular dimensions and structural properties of fusion tags can also interfere with the folding and functionality of target proteins, as evidenced by our experimental data. A notable example is the DsbA tag, which substantially quenched eGFP fluorescence despite maintaining reasonable solubility (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eC). In this experimental design, the soluble protein EcFabG was intentionally selected to analyze the impact of fusion tags on its solubility and function. As anticipated, the addition of tags reduced its soluble fraction to varying degrees, with the largest decrease being 58 percentage points (from 93% for the wild-type enzyme to 35% with the SlyD tag). Furthermore, tags including Crr, Ecotin, DsbA, and SlyD were found to impair EcFabG's enzymatic activity (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eC). These results emphasize that while fusion tags can enhance solubility, they may also interfere with protein function. It is therefore common practice to remove fusion tags after purification [28]. Interestingly, however, several tags in this study, including SlyD, MsyB, and Crr, enhanced the enzymatic activity of Mals even before cleavage (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eC). This suggests that, in some cases, fusion partners may do more than improve solubility; they may also assist in folding or stabilize the active conformation of certain target proteins [14, 15]. 
The underlying mechanisms warrant further investigation.\u003c/p\u003e\u003cp\u003eIn summary, our results confirm that no single fusion tag is universally effective for all target proteins. The optimal tag must be empirically determined through parallel screening [15]. The pX vector system developed in this study provides a convenient and efficient platform for such screening, enabling rapid identification of the most suitable fusion tag to enhance both the solubility and functional yield of diverse recombinant proteins.\u003c/p\u003e"},{"header":"Conclusions","content":"\u003cp\u003eIn this study, we developed a novel and versatile pX vector series for enhancing the soluble expression of recombinant proteins in E. coli. This system integrates eight medium-sized, TEV-cleavable fusion tags into a standardized backbone, enabling high-throughput parallel cloning and screening. Our comprehensive evaluation using diverse model and challenging target proteins demonstrates that fusion tags exert protein-specific effects on solubility, expression yield, and biological activity. No single tag was universally optimal, underscoring the necessity of empirical screening. The pX platform effectively addresses this need by facilitating the rapid identification of the most suitable fusion partner for a given protein. With its proven efficacy in promoting the soluble and functional expression of even recalcitrant proteins, this system represents a valuable and streamlined tool for both academic and industrial protein research.\u003c/p\u003e"},{"header":"Abbreviations","content":"\u003cp\u003e\u003cem\u003eE. 
coli\u003c/em\u003e,\u003cem\u003e\u0026nbsp;Escherichia coli\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eLB, Luria-Bertani\u003c/p\u003e\n\u003cp\u003eIPTG, Isopropyl-\u0026beta;-D-1-thiogalactopyranoside\u003c/p\u003e\n\u003cp\u003eMCS, multiple cloning site\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eTEV, Tobacco Etch Virus\u003c/p\u003e\n\u003cp\u003eeGFP, enhanced green fluorescent protein\u003c/p\u003e\n\u003cp\u003eEcFabG, 3-ketoacyl-ACP reductase\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eMals, maltogenic amylase\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eArsC, Arsenate reductase\u003c/p\u003e\n\u003cp\u003eCrr, Glucose-specific phosphotransferase (PTS) enzyme IIA\u003c/p\u003e\n\u003cp\u003eDsbA, Disulfide bond formation protein A\u003c/p\u003e\n\u003cp\u003eEcotin, \u003cem\u003eE. coli\u003c/em\u003e trypsin inhibitor\u003c/p\u003e\n\u003cp\u003eMsyB, An acidic \u003cem\u003eE. coli\u0026nbsp;\u003c/em\u003eprotein\u003c/p\u003e\n\u003cp\u003eSlyD, An aggregation-resistant protein\u003c/p\u003e\n\u003cp\u003eSnut, Solubility \u0026apos;eNhancing\u0026apos; Ubiquitous Tag\u003c/p\u003e\n\u003cp\u003eYjgD, The hypothetical \u003cem\u003eE. coli\u003c/em\u003e ORF\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eAcknowledgements\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthor contributions\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eLLZ and MJC: Writing\u0026ndash;original draft, Writing\u0026ndash;review \u0026amp; editing, Data curation, Visualization. LLZ, CJT, TZY and CYQ: carried out all experiments. ZWB and MJC: planning, design, and coordination of the research. HZ and WHH: Project administration. 
All authors reviewed the manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis study was supported by the following projects: Industry-University Cooperation Project (H20230317) and National Natural Science Foundation of China (32570032).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAvailability of data and materials\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eAll of the data generated and used in this work are included in the manuscript and are available as supplementary material.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eEthics approval and consent to participate\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConsent for publication\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting interests\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare no competing interests.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eHayat SMG, Farahani N, Golichenari B, Sahebkar A. Recombinant Protein Expression in \u003cem\u003eEscherichia coli\u003c/em\u003e (\u003cem\u003eE. coli\u003c/em\u003e): What We Need to Know. Curr Pharm Des. 2018;24(6):718\u0026ndash;725. https://doi.org/10.2174/1381612824666180131121940.\u003c/li\u003e\n\u003cli\u003eRosano GL, Ceccarelli EA. Recombinant protein expression in \u003cem\u003eEscherichia coli\u003c/em\u003e: advances and challenges. Frontiers in microbiology. 2014;5:172. https://doi.org/10.3389/fmicb.2014.00172.\u003c/li\u003e\n\u003cli\u003eCaswell J, Snoddy P, McMeel D, Buick RJ, Scott CJ. Production of recombinant proteins in\u003cem\u003e Escherichia coli \u003c/em\u003eusing an N-terminal tag derived from sortase. Protein Expr Purif. 2010;70(2):143\u0026ndash;150. https://doi.org/10.1016/j.pep.2009.10.012.\u003c/li\u003e\n\u003cli\u003eParaskevopoulou V, Falcone FH. 
Polyionic Tags as Enhancers of Protein Solubility in Recombinant Protein Expression. Microorganisms. 2018;6(2):47. https://doi.org/10.3390/microorganisms6020047.\u003c/li\u003e\n\u003cli\u003edi Guan C, Li P, Riggs PD, Inouye H. Vectors that facilitate the expression and purification of foreign peptides in \u003cem\u003eEscherichia coli\u003c/em\u003e by fusion to maltose-binding protein. Gene. 1988;67(1):21\u0026ndash;30. https://doi.org/10.1016/0378-1119(88)90004-2.\u003c/li\u003e\n\u003cli\u003eMalakhov MP, Mattern MR, Malakhova OA, Drinker M, Weeks SD, Butt TR. SUMO fusions and SUMO-specific protease for efficient expression and purification of proteins. J Struct Funct Genomics. 2004;5(1-2):75\u0026ndash;86. https://doi.org/10.1023/b:Jsfg.0000029237.70316.52.\u003c/li\u003e\n\u003cli\u003eLaVallie ER, DiBlasio EA, Kovacic S, Grant KL, Schendel PF, McCoy JM. A Thioredoxin Gene Fusion Expression System That Circumvents Inclusion Body Formation in the\u003cem\u003e E. coli \u003c/em\u003eCytoplasm. Nat Biotechnol. 1993;11(2):187\u0026ndash;193. https://doi.org/10.1038/nbt0293-187.\u003c/li\u003e\n\u003cli\u003eSong JA, Lee DS, Park JS, Han KY, Lee J. A novel \u003cem\u003eEscherichia coli \u003c/em\u003esolubility enhancer protein for fusion expression of aggregation-prone heterologous proteins. Enzyme Microb Technol. 2011;49(2):124\u0026ndash;130. https://doi.org/10.1016/j.enzmictec.2011.04.013.\u003c/li\u003e\n\u003cli\u003eHan K, Seo H, Song J, Ahn K, Park J, Lee J. Transport proteins PotD and Crr of \u003cem\u003eEscherichia coli\u003c/em\u003e, novel fusion partners for heterologous protein expression. Biochim Biophys Acta. 2007;1774(12):1536\u0026ndash;1543. https://doi.org/10.1016/j.bbapap.2007.09.012.\u003c/li\u003e\n\u003cli\u003eZhang Y, Olsen DR, Nguyen KB, Olson PS, Rhodes ET, Mascarenhas D. Expression of eukaryotic proteins in soluble form in \u003cem\u003eEscherichia coli\u003c/em\u003e. Protein Expr Purif. 1998;12(2):159\u0026ndash;165. 
https://doi.org/10.1006/prep.1997.0834.\u003c/li\u003e\n\u003cli\u003eMalik A, Rudolph R, S\u0026ouml;hling B. A novel fusion protein system for the production of native human pepsinogen in the bacterial periplasm. Protein Expression Purif. 2006;47(2):662\u0026ndash;671. https://doi.org/10.1016/j.pep.2006.02.018.\u003c/li\u003e\n\u003cli\u003eZou Z, Cao L, Zhou P, Su Y, Sun Y, Li W. Hyper-acidic protein fusion partners improve solubility and assist correct folding of recombinant proteins expressed in \u003cem\u003eEscherichia coli\u003c/em\u003e. J Biotechnol. 2008;135(4):333\u0026ndash;339. https://doi.org/10.1016/j.jbiotec.2008.05.007.\u003c/li\u003e\n\u003cli\u003eSu Y, Zou Z, Feng S, Zhou P, Cao L. The acidity of protein fusion partners predominantly determines the efficacy to improve the solubility of the target proteins expressed \u003cem\u003ein Escherichia coli\u003c/em\u003e. J Biotechnol. 2007;129(3):373\u0026ndash;382. https://doi.org/10.1016/j.jbiotec.2007.01.015.\u003c/li\u003e\n\u003cli\u003eHan KY, Song JA, Ahn KY, Park JS, Seo HS, Lee J. Solubilization of aggregation-prone heterologous proteins by covalent fusion of stress-responsive \u003cem\u003eEscherichia coli \u003c/em\u003eprotein, SlyD. Protein Eng Des Sel. 2007;20(11):543\u0026ndash;549. https://doi.org/10.1093/protein/gzm055.\u003c/li\u003e\n\u003cli\u003eK\u0026ouml;ppl C, Lingg N, Fischer A, Kr\u0026ouml;\u0026szlig; C, Loibl J, Buchinger W, Schneider R, Jungbauer A, Striedner G, Cserjan-Puschmann M. Fusion Tag Design Influences Soluble Recombinant Protein Production in \u003cem\u003eEscherichia coli\u003c/em\u003e. International Journal of Molecular Sciences. 2022;23(14):7678. https://doi.org/10.3390/ijms23147678.\u003c/li\u003e\n\u003cli\u003eCserjan-Puschmann M, Lingg N, Engele P, Kr\u0026ouml;\u0026szlig; C, Loibl J, Fischer A, Bacher F, Frank A-C, \u0026Ouml;hlknecht C, Brocard C\u003cem\u003e et al\u003c/em\u003e. 
Production of Circularly Permuted Caspase-2 for Affinity Fusion-Tag Removal: Cloning, Expression in \u003cem\u003eEscherichia coli\u003c/em\u003e, Purification, and Characterization. Biomolecules. 2020;10(12):1592. https://doi.org/10.3390/biom10121592.\u003c/li\u003e\n\u003cli\u003eKapust RB, T\u0026ouml;zs\u0026eacute;r J, Copeland TD, Waugh DS. The P1\u0026prime; specificity of tobacco etch virus protease. Biochem Biophys Res Commun. 2002;294(5):949\u0026ndash;955. https://doi.org/10.1016/S0006-291X(02)00574-0.\u003c/li\u003e\n\u003cli\u003eWang HZ, Chu ZZ, Chen CC, Cao AC, Tong X, Ouyang CB, Yuan QH, Wang MN, Wu ZK, Wang HH\u003cem\u003e et al\u003c/em\u003e. Recombinant Passenger Proteins Can Be Conveniently Purified by One-Step Affinity Chromatography. PLoS One. 2015;10(12):e0143598. https://doi.org/10.1371/journal.pone.0143598.\u003c/li\u003e\n\u003cli\u003eCha HJ, Wu CF, Valdes JJ, Rao G, Bentley WE. Observations of green fluorescent protein as a fusion partner in genetically engineered\u003cem\u003e Escherichia coli\u003c/em\u003e: monitoring protein expression and solubility. Biotechnol Bioeng. 2000;67(5):565\u0026ndash;574. https://doi.org/10.1002/(SICI)1097-0290(20000305)67:5\u0026lt;565::AID-BIT7\u0026gt;3.0.CO;2-P.\u003c/li\u003e\n\u003cli\u003eHu Z, Ma J, Chen Y, Tong W, Zhu L, Wang H, Cronan JE. \u003cem\u003eEscherichia coli \u003c/em\u003eFabG 3-ketoacyl-ACP reductase proteins lacking the assigned catalytic triad residues are active enzymes. J Biol Chem. 2021;296:100365. https://doi.org/10.1016/j.jbc.2021.100365.\u003c/li\u003e\n\u003cli\u003eRaran-Kurussi S, Cherry S, Zhang D, Waugh DS. Removal of Affinity Tags with TEV Protease. Methods in Molecular Biology. 2017;1586:221\u0026ndash;230. https://doi.org/10.1007/978-1-4939-6887-9_14.\u003c/li\u003e\n\u003cli\u003eLi Y, Su L, Wu J, Wu D. Recombinant Expression and Fermentation Optimization of \u003cem\u003eB. stearothermophilus\u003c/em\u003e Maltogenic Amylases in \u003cem\u003eBacillus subtilis\u003c/em\u003e. 
Journal of Food Science and Biotechnology. 2020;39(02):1\u0026ndash;9. https://doi.org/10.3969/j.issn.1673-1689.2020.02.001.\u003c/li\u003e\n\u003cli\u003eHaag AF, Wehmeier S, Muszyński A, Kerscher B, Fletcher V, Berry SH, Hold GL, Carlson RW, Ferguson GP. Biochemical characterization of \u003cem\u003eSinorhizobium meliloti \u003c/em\u003emutants reveals gene products involved in the biosynthesis of the unusual lipid A very long-chain fatty acid. The Journal of biological chemistry. 2011;286(20):17455\u0026ndash;17466. https://doi.org/10.1074/jbc.M111.236356.\u003c/li\u003e\n\u003cli\u003eZhao L, Cao J, Liu X, Li Y, Wu J, Su L. Optimizing protein folding in prokaryotes: Strategies to enhance soluble expression of recombinant proteins. Bioresour Technol. 2026;439:133266. https://doi.org/10.1016/j.biortech.2025.133266.\u003c/li\u003e\n\u003cli\u003eFang J, Zou L, Zhou X, Cheng B, Fan J. Synonymous rare arginine codons and tRNA abundance affect protein production and quality of TEV protease variant. PLoS One. 2014;9(11):e112254. https://doi.org/10.1371/journal.pone.0112254.\u003c/li\u003e\n\u003cli\u003ePark J-S, Han K-Y, Lee J-H, Song J-A, Ahn K-Y, Seo H-S, Sim S-JJ, Kim S-W, Lee J. Solubility enhancement of aggregation-prone heterologous proteins by fusion expression using stress-responsive \u003cem\u003eEscherichia coli \u003c/em\u003eprotein, RpoS. BMC Biotechnol. 2008;8:15\u0026ndash;15. https://doi.org/10.1186/1472-6750-8-15.\u003c/li\u003e\n\u003cli\u003eChen J-P, Gong J-S, Su C, Li H, Xu Z-H, Shi J-S. Improving the soluble expression of difficult-to-express proteins in prokaryotic expression system via protein engineering and synthetic biology strategies. Metab Eng. 2023;78:99\u0026ndash;114. https://doi.org/10.1016/j.ymben.2023.05.007.\u003c/li\u003e\n\u003cli\u003eRaran-Kurussi S, Waugh DS. Expression and Purification of Recombinant Proteins in \u003cem\u003eEscherichia coli \u003c/em\u003ewith a His(6) or Dual His(6)-MBP Tag. Methods Mol Biol. 
2017;1607:1\u0026ndash;15. https://doi.org/10.1007/978-1-4939-7000-1_1.\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":true,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"bmc-biotechnology","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"bbit","sideBox":"Learn more about [BMC Biotechnology](http://bmcbiotechnol.biomedcentral.com/)","snPcode":"","submissionUrl":"https://www.editorialmanager.com/bbit/default.aspx","title":"BMC Biotechnology","twitterHandle":"BMC_series","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"em","reportingPortfolio":"BMC Series","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"Recombinant protein expression, Solubility enhancement, Fusion tag screening, Escherichia coli, TEV protease, Inclusion bodies, pX vector system","lastPublishedDoi":"10.21203/rs.3.rs-8180750/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-8180750/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003ch2\u003eBackground\u003c/h2\u003e\u003cp\u003eThe \u003cem\u003eEscherichia coli\u003c/em\u003e expression system is widely used for recombinant protein production, but its utility is often limited by the formation of insoluble inclusion bodies. 
Although fusion tags can enhance solubility, their effectiveness varies unpredictably across different target proteins, and the optimal tag must typically be determined empirically.\u003c/p\u003e\u003ch2\u003eResults\u003c/h2\u003e\u003cp\u003eHere, we developed a novel fusion tag system for the high-throughput screening of soluble protein expression in \u003cem\u003eE. coli\u003c/em\u003e. This system consists of eight medium-sized, TEV-cleavable fusion tags (ArsC, Crr, DsbA, Ecotin, MsyB, SlyD, Snut, and YjgD) cloned into a standardized pET-28b(+) backbone. We systematically evaluated the impact of these tags on the solubility and function of three model proteins (eGFP, EcFabG, and Mals) and six challenging proteins (PulA, NodE, FabF1XL, FabF2XL, FabZXL, and FabGXL). Our results demonstrated that the efficacy of each tag was highly protein-dependent. Notably, tags such as MsyB and Snut dramatically increased the soluble proportion of eGFP from 15% to over 85%, while the SlyD tag significantly enhanced both the solubility and activity of Mals. For several difficult-to-express proteins, soluble expression was only achieved with specific tags, highlighting the critical importance of tag selection.\u003c/p\u003e\u003ch2\u003eConclusions\u003c/h2\u003e\u003cp\u003eOur study presents a versatile and efficient platform for the rapid production of soluble recombinant proteins. 
By enabling parallel screening of multiple fusion partners, this system facilitates the identification of optimal conditions for enhancing protein solubility and function, thereby addressing a key bottleneck in recombinant protein applications.\u003c/p\u003e","manuscriptTitle":"Application of a Novel Fusion Tag System for Enhanced Soluble Expression of Recombinant Proteins in Escherichia coli","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-12-08 11:17:03","doi":"10.21203/rs.3.rs-8180750/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"decision","content":"Revision requested","date":"2026-01-06T10:48:21+00:00","index":"","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2026-01-05T17:21:00+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2026-01-05T08:58:59+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-12-29T04:11:07+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-12-21T19:40:37+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"194723930496662961862827779635822920503","date":"2025-12-17T13:14:03+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"276390254118268893651995954698808764570","date":"2025-12-15T18:18:12+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"191825457484681369895365434321630556521","date":"2025-12-15T15:36:09+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"232796793628724832576657822452599116847","date":"2025-12-15T11:42:51+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"239017597017332765731260405145229371790","date":"2025-12-04T08:21:04+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2025-12-04T03:31:13+00:00","index":"","fulltext":""},{"type":"editorInvited","content":"","date":"2025-11-28T06:51:26+00:00
","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2025-11-24T08:24:25+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2025-11-24T08:22:22+00:00","index":"","fulltext":""},{"type":"submitted","content":"BMC Biotechnology","date":"2025-11-22T13:26:35+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"bmc-biotechnology","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"bbit","sideBox":"Learn more about [BMC Biotechnology](http://bmcbiotechnol.biomedcentral.com/)","snPcode":"","submissionUrl":"https://www.editorialmanager.com/bbit/default.aspx","title":"BMC Biotechnology","twitterHandle":"BMC_series","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"em","reportingPortfolio":"BMC Series","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"325f77c1-b8cc-43cf-83eb-916a449064da","owner":[],"postedDate":"December 8th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"published-in-journal","subjectAreas":[],"tags":[],"updatedAt":"2026-02-16T16:13:06+00:00","versionOfRecord":{"articleIdentity":"rs-8180750","link":"https://doi.org/10.1186/s12896-026-01109-1","journal":{"identity":"bmc-biotechnology","isVorOnly":false,"title":"BMC Biotechnology"},"publishedOn":"2026-02-12 15:57:32","publishedOnDateReadable":"February 12th, 2026"},"versionCreatedAt":"2025-12-08 
11:17:03","video":"","vorDoi":"10.1186/s12896-026-01109-1","vorDoiUrl":"https://doi.org/10.1186/s12896-026-01109-1","workflowStages":[]},"version":"v1","identity":"rs-8180750","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-8180750","identity":"rs-8180750","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}


