A standardized set of pNX vectors for enhanced soluble expression of recombinant proteins in E. coli using small fusion tags

Research Article

Li-Zhen Luo, Wen-Bin Zhang, Zhe Hu, Ling-Hua Zhang, Jin-Cheng Ma, and 1 more

This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-7875802/v1 This work is licensed under a CC BY 4.0 License. Status: Published. Journal Publication published 19 Dec, 2025 in Microbial Cell Factories. Version 1 posted 11

Abstract

Background
The production of recombinant proteins in Escherichia coli (E. coli) is often hampered by the formation of inclusion bodies. While fusion tags can enhance solubility, existing systems are hampered by a lack of standardization, with tags scattered across disparate plasmid backbones and inconsistent cloning sites, complicating high-throughput screening.
Results
To address this, we constructed a standardized series of expression vectors, termed pNX, by incorporating nine small fusion tags (SUMO, LD, ACP, BCCP, GB1, Fh8, SmbP, TolA, and TrxA) into a uniform pET-28b backbone. Each pNX vector features an identical configuration: a T7 promoter, an N-terminal fusion tag, a 6×His tag, a linker, a TEV protease cleavage site, a multiple cloning site (MCS), and a C-terminal 6×His tag. We evaluated this system using four model proteins (EcFabG, eGFP, XccXanA2, and XccXanL). Our results showed that specific tags significantly improved both the expression level and solubility of the target proteins without compromising their biological activity. Notably, the lipoyl domain (LD) was identified for the first time as an effective solubility enhancer. The standardized MCS enabled rapid, parallel cloning, facilitating the efficient screening of optimal fusion partners.

Conclusions
The pNX vector series provides a versatile and high-throughput platform for enhancing the soluble expression of challenging recombinant proteins in E. coli, streamlining the empirical identification of ideal fusion tags.

Keywords: Escherichia coli, Fusion tags, pNX vectors, Protein solubility, Lipoyl domain (LD), High-throughput screening

1 Background
The production of recombinant proteins is a cornerstone of modern biotechnology, with applications ranging from therapeutic biologics to industrial enzymes. Since the first approval of a recombinant biologic for diabetes treatment over five decades ago [1], Escherichia coli has remained the predominant host for recombinant protein production due to its well-characterized genetics, rapid growth, low-cost cultivation, and high yield potential [2–4]. Compared to other expression systems such as Bacillus, yeast, insect, or mammalian cells, E. coli is readily manipulated, inexpensive to culture, and fast-growing [5].
Despite these advantages, the expression of heterologous proteins in E. coli often faces significant challenges. A primary issue is the frequent formation of insoluble and inactive protein aggregates, known as inclusion bodies [ 6 ]. This can be attributed to several factors, including codon usage bias that reduces translational efficiency [ 7 ], improper disulfide bond formation or protein misfolding, potential cytotoxicity of the foreign protein, and the distinct intracellular microenvironment of the bacterial host that may not support the correct folding of foreign proteins [ 6 , 8 ]. The formation of inclusion bodies necessitates laborious and often inefficient in vitro refolding procedures, substantially increasing the cost and complexity of downstream processing and hindering large-scale industrial production. Consequently, developing robust strategies to enhance the solubility of recombinant proteins in E. coli is a major focus of ongoing research. Common empirical approaches to improve soluble yield include modulating expression conditions—such as lowering the induction temperature [ 9 ], changing the E. coli expression strain [ 10 ], employing different promoters or induction conditions [ 11 ], and co-expressing molecular chaperones and folding regulators [ 12 ]. Among the most effective and generalizable strategies is the fusion of the target protein to a solubility-enhancing partner tag [ 13 ]. Several such tags have been extensively characterized. Maltose-binding protein (MBP) is a component of the maltose transport system in E. coli . MBP attracts the molecular chaperone GroEL and functions as an intramolecular chaperone, aiding in the proper folding of recombinant proteins. The MBP fusion expression system not only offers advantages such as high expression efficiency and ease of purification but also demonstrates favorable solubilization effects [ 14 , 15 ]. 
Small ubiquitin-related modifier (SUMO) modulates protein structure and function by covalently binding to the lysine side chains of target proteins. Attaching SUMO to the N-terminus of poorly expressed proteins significantly enhances their expression and solubility in E. coli [16]. As a fusion tag, acyl carrier protein (ACP) greatly increased the soluble expression level of glucokinase (GlcK), α-amylase (Amy), and GFP [17]. Highly positively charged recombinant protegrin-1 dimer and LL-37/histatin-5 peptides were expressed in E. coli using the acidic biotin carboxyl carrier protein (BCCP) as a fusion partner [18]. The B1 immunoglobulin binding domain of Streptococcal protein G (GB1) is a relatively small solubility-enhancing tag (56 residues) that utilizes highly acidic sequences to increase the solubility of target proteins. It enhances the expression yield, stability, and solubility of fused target peptides without compromising their structure or function, proving particularly useful in protein preparation for NMR studies [19]. The putative calcium-binding protein Fh8 also serves as an effective solubility enhancer. Compared to larger fusion tags, its low molecular weight and solubility-enhancing properties make Fh8 a favorable choice for soluble protein production in E. coli [20]. A small metal-binding protein (SmbP) of 9.9 kDa was isolated from the periplasm of Nitrosomonas europaea. Compared to fusions with MBP and glutathione S-transferase (GST), green fluorescent protein fused with SmbP at the N-terminus exhibited high solubility and low inclusion body formation, suggesting that SmbP may be a preferred fusion partner for producing small proteins [21]. The third domain of the periplasmic protein TolA (TolAIII) was used as a fusion partner in the expression of various proteins from bacteria and eukaryotes. TolAIII is a small domain, expressed in high yields as a soluble protein in the cytoplasm of E.
coli and proved to be useful in the preparation of other peptides and proteins [22]. Thioredoxin (TrxA) is a widely distributed, thermostable protein that functions as a hydrogen carrier and also appears as a domain in disulfide isomerase. TrxA enhances soluble expression of foreign proteins in E. coli by acting similarly to a molecular chaperone. It achieves this either by physically binding to target proteins to reduce the formation of misfolded intermediates or by covalently binding to prevent inclusion body formation [23, 24].

However, no studies have identified a universally applicable fusion tag suitable for expressing any target protein, as the effectiveness of existing tags varies significantly depending on the protein of interest [8]. This necessitates screening multiple tags to identify the optimal partner for each target. A significant practical bottleneck is that these tags are often dispersed across different plasmid vectors with incompatible cloning sites and backbone architectures, making parallel cloning and comparative screening inefficient and time-consuming [8]. Furthermore, the presence of a fusion tag can sometimes interfere with the structure, function, or downstream applications of the target protein. Therefore, a seamless system for tag removal is highly desirable, particularly for the production of therapeutic proteins where the tag must be eliminated to ensure safety and efficacy [25]. By incorporating specific cleavage sequences recognized by proteases such as factor Xa, thrombin, caspase-2, SUMO protease, and enterokinase, the fusion tag can be efficiently removed during purification [6]. Tobacco Etch Virus (TEV) protease is also widely used for this purpose due to its high specificity and ability to cleave precisely, often leaving no extraneous residues on the target protein [17, 26].

To address the critical need for standardization and high-throughput compatibility, we developed the pNX vector series.
This is a unified set of expression vectors based on the pET-28b backbone, each harboring one of nine small fusion tags (SUMO, LD (Lipoyl domain E2p) [27], ACP, BCCP, GB1, Fh8, SmbP, TolA, and TrxA) in an identical genetic context. All pNX vectors feature a standardized multiple cloning site (MCS) flanked by dual 6×His tags and a TEV protease cleavage site for efficient tag removal. In this study, we systematically evaluated the ability of these tags to enhance the solubility and maintain the biological activity of four model proteins: enhanced green fluorescent protein (eGFP) [17], 3-ketoacyl-ACP reductase (EcFabG) [28], 3-hydroxybenzoate AMP ligase (XccXanA2) [29], and chain length factor (XccXanL) [29]. Our results not only validate the pNX system as a powerful tool for rapid screening but also lead to the novel finding that the Lipoyl domain E2p (LD) serves as an effective solubility enhancer.

2 Materials and methods

2.1 Plasmids, Bacterial Strains, and Growth Conditions
The nine solubility-enhancing fusion tags utilized in this study are listed in Table 1. Genes encoding SUMO, Fh8, GB1, and SmbP were synthesized by Sangon Biotech (Shanghai) Co., Ltd. and delivered in a pUC57-Kan vector. The ACP, BCCP, TolA, and TrxA genes were PCR-amplified from the genomic DNA of E. coli MG1655. The LD gene was amplified from the plasmid pGS331. All primers for tag amplification are listed in Table S1. The bacterial strains and plasmids used are detailed in Table S2. E. coli strains were routinely cultured in Luria-Bertani (LB) medium at 37°C. Where appropriate, kanamycin and isopropyl β-D-1-thiogalactopyranoside (IPTG) were added to final concentrations of 30 µg/mL and 240 µg/mL, respectively.
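Because inducer concentrations are more often quoted in molar units, it can help to convert the IPTG mass concentration given above. The following is a minimal Python sketch of that arithmetic. The molar mass of IPTG (238.30 g/mol) is not stated in the text and is an assumption taken from its formula C9H18O5S; the function name is hypothetical.

```python
# Convert a mass concentration (µg/mL) into a molar concentration (mM).
# Assumption: IPTG molar mass of 238.30 g/mol (C9H18O5S); not given in the text.
IPTG_MW_G_PER_MOL = 238.30


def ug_per_ml_to_mM(conc_ug_per_ml: float, mw_g_per_mol: float) -> float:
    """µg/mL is numerically equal to mg/L; mg/L divided by g/mol gives mmol/L."""
    if mw_g_per_mol <= 0:
        raise ValueError("molar mass must be positive")
    return conc_ug_per_ml / mw_g_per_mol


# Final IPTG concentration used for induction in the protocol: 240 µg/mL.
iptg_mM = ug_per_ml_to_mM(240.0, IPTG_MW_G_PER_MOL)
print(f"IPTG: {iptg_mM:.2f} mM")
```

Under this assumption, 240 µg/mL corresponds to roughly 1 mM IPTG.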
Table 1 Solubility enhancer tags used in this study

Tag | Source | Full name | Size (kDa) | Reference
SUMO | Saccharomyces cerevisiae | Small ubiquitin-related modifier | 11.1 | [16]
LD | Escherichia coli | Lipoyl domain E2p | 8.9 | [27]
ACP | Escherichia coli | Acyl carrier protein | 8.7 | [17]
BCCP | Escherichia coli | Biotin carboxyl carrier protein | 9.2 | [18]
GB1 | Streptococcus sp. | The B1 immunoglobulin binding domain of protein G | 6.2 | [19]
Fh8 | Fasciola hepatica | Putative calcium-binding protein | 8 | [20]
SmbP | Nitrosomonas europaea | Small metal-binding protein | 9.9 | [21]
TolA | Escherichia coli | Tol-Pal system protein TolA | 9.9 | [22]
TrxA | Escherichia coli | Thioredoxin | 11.7 | [24]

2.2 Construction of the pNX Vector Series
The pNX vector series was constructed sequentially, starting with the pNSUMO plasmid. The SUMO gene was initially amplified using primers SUMO P1 and P2. The resulting amplicon was then used as a template in four subsequent rounds of PCR with primer SUMO P1 paired separately with SUMO P3, P4, P5, and P6. This strategy sequentially appended the following elements to the 3'-end of the SUMO gene: a KpnI site, synthetic linker peptides, a TEV protease recognition site, a sequence encoding a C-terminal 6×His tag, and finally NdeI and BamHI sites. The final, extended SUMO fragment was digested with NcoI and BamHI and ligated into the corresponding sites of the pET-28b vector, yielding the pNSUMO backbone plasmid. To generate the other vectors in the series, the genes for LD, ACP, BCCP, GB1, Fh8, SmbP, TolA, and TrxA were amplified using their respective primers (Table S1). Each PCR product was digested with NcoI and KpnI and then cloned into the pNSUMO backbone plasmid digested with the same enzymes, thereby replacing the SUMO tag. This resulted in the construction of plasmids pNLD, pNACP, pNBCCP, pNGB1, pNFh8, pNSmbP, pNTolA, and pNTrxA, collectively designated as the pNX series.
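The naming scheme above (pN plus the tag name, with the target gene appended for expression constructs) lends itself to programmatic labeling when a parallel screen is planned. The following is a minimal Python sketch, not a script from the authors. The helper names `vector_name` and `construct_name` are hypothetical, the cassette order is copied from the configuration described in the text, and pairing every target gene with every vector is an assumption about how the full screen is laid out.

```python
from itertools import product

# Tags and target genes as named in the text.
TAGS = ("SUMO", "LD", "ACP", "BCCP", "GB1", "Fh8", "SmbP", "TolA", "TrxA")
TARGET_GENES = ("egfp", "EcfabG", "XccxanA2", "XccxanL")

# Shared element order of every pNX vector, as described in the text.
CASSETTE = (
    "T7 promoter",
    "N-terminal fusion tag",
    "6xHis tag",
    "linker",
    "TEV protease site",
    "MCS",
    "C-terminal 6xHis tag",
)


def vector_name(tag: str) -> str:
    """Hypothetical helper: pNX naming, e.g. 'LD' -> 'pNLD'."""
    return f"pN{tag}"


def construct_name(tag: str, gene: str) -> str:
    """Hypothetical helper: expression construct label, e.g. 'pNLD-egfp'."""
    return f"{vector_name(tag)}-{gene}"


vectors = [vector_name(tag) for tag in TAGS]
# Assumption: each target gene is cloned into each of the nine vectors.
constructs = [construct_name(t, g) for t, g in product(TAGS, TARGET_GENES)]

print(" -> ".join(CASSETTE))
print(len(vectors), "vectors;", len(constructs), "tag/target combinations")
```

Because only the tag segment differs between vectors, a single digested insert per target gene can be distributed across all nine backbones, which is what makes the enumeration a simple Cartesian product.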
2.3 Cloning of Target Genes
The egfp gene was amplified from the plasmid pET-28b-egfp (a gift from Associate Professor Wang). The EcfabG gene was amplified from the genomic DNA of E. coli MG1655. The XccxanA2 and XccxanL genes were amplified from the genomic DNA of Xanthomonas campestris pv. campestris 8004. All target genes were amplified using primer pairs (Table S1) designed to incorporate NdeI and HindIII restriction sites. The purified PCR fragments and the pNX vectors were digested with NdeI and HindIII, ligated, and transformed into E. coli DH5α. The resulting expression constructs were designated as pNX-egfp, pNX-EcfabG, pNX-XccxanA2, and pNX-XccxanL (Table S2).

2.4 Protein Expression, Purification, and Tag Cleavage
Recombinant pNX plasmids were transformed into E. coli BL21(DE3) for protein expression. Single colonies were used to inoculate 5 mL of LB medium with kanamycin, and the cultures were grown overnight at 37°C with shaking at 180 rpm. These cultures were diluted 1:100 into 100 mL of fresh LB medium and grown at 37°C to an OD₆₀₀ of 0.6. Protein expression was induced by adding IPTG to a final concentration of 240 µg/mL, followed by incubation for 4 hours at 37°C. Cells were harvested by centrifugation (4,000 rpm, 20 min, 4°C).

For solubility analysis, cultures were normalized to an OD₆₀₀ of 1.0. Cells from 5 mL of normalized culture were pelleted (12,000 × g, 5 min), resuspended in 1 mL of lysis buffer, and disrupted by sonication on ice (3 cycles of 5 min each). The lysate was centrifuged (12,000 rpm, 20 min, 4°C) to separate the soluble (supernatant) and insoluble (pellet) fractions. The insoluble pellet was resuspended in 1 mL of denaturing lysis buffer (50 mM phosphate buffer, pH 8.0, 0.3 M NaCl, 20 mM imidazole, 8 M urea). Both fractions were analyzed by SDS-PAGE, and gels were stained with Coomassie Blue. Band intensities were quantified using ImageJ software.
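The text does not spell out how band intensities were converted into a solubility percentage. A common convention, assumed here, is soluble / (soluble + insoluble), which is meaningful when both fractions are brought to the same volume (1 mL each in the protocol above) and loaded equally. The following Python sketch illustrates that calculation; the function name is hypothetical and the intensities are placeholders, not data from the study.

```python
def percent_soluble(soluble_intensity: float, insoluble_intensity: float) -> float:
    """Share of the target band found in the soluble fraction, in percent.

    Assumes both fractions were prepared at equal volume and loaded equally,
    so that band intensities are directly comparable.
    """
    if soluble_intensity < 0 or insoluble_intensity < 0:
        raise ValueError("band intensities must be non-negative")
    total = soluble_intensity + insoluble_intensity
    if total == 0:
        raise ValueError("no signal in either fraction")
    return 100.0 * soluble_intensity / total


# Placeholder densitometry values in arbitrary units (NOT measured data).
example = percent_soluble(soluble_intensity=7200.0, insoluble_intensity=2800.0)
print(f"soluble fraction: {example:.1f}%")
```

Keeping the ratio separate from the absolute soluble band intensity is useful, since a tag can raise the amount of soluble protein without raising the soluble share.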
For protein purification, fusion proteins were purified from the soluble fraction under native conditions using Ni-NTA affinity chromatography (Qiagen). Protein concentrations were determined by the Bradford assay (Bio-Rad). Purified proteins were stored at -80°C after the addition of glycerol to a final concentration of 20% (v/v). For tag removal, purified fusion proteins were incubated with TEV protease (a gift from Associate Professor Wang) at a 1:10 (w/w) protease-to-protein ratio in reaction buffer (50 mM Tris-HCl, pH 8.0, 0.5 mM EDTA, 1 mM DTT) for 2 hours at 30°C. The cleavage reaction was centrifuged briefly (5,000 rpm, 1 min), and the supernatant was analyzed by SDS-PAGE to assess cleavage efficiency.

2.5 Fluorescence Spectroscopy for GFP
The fluorescence of purified eGFP proteins was assessed to evaluate functional integrity. Proteins were diluted to 1 µM in 50 mM sodium phosphate buffer (pH 8.0, 300 mM NaCl). Fluorescence measurements were performed using a SpectraMax i3x Multi-Mode Microplate Reader (Molecular Devices). The excitation wavelength was set to 460 nm, and the emission spectrum was recorded.

2.6 EcFabG Enzyme Activity Assays
The oxidoreductase (OAR) activity of EcFabG fusion proteins was assayed in vitro as previously described [28]. Briefly, the 40 µL reaction mixture contained 0.1 M sodium phosphate buffer (pH 7.0), 0.1 µg each of EcFabB, fused EcFabG, EcFabA, and EcFabI, 50 µM NADH, 50 µM NADPH, 1 mM β-mercaptoethanol, 100 µM malonyl-ACP, and 100 µM octanoyl-ACP. Reaction products were analyzed by conformationally sensitive gel electrophoresis on 20% polyacrylamide gels containing 0.5–1 M urea, followed by staining with Coomassie Brilliant Blue R-250. For quantitative analysis, EcFabG activity was determined by monitoring the oxidation rate of NADPH at 340 nm.
The 300 µL assay mixture contained 0.1 M sodium phosphate buffer (pH 8.0), 100 µM malonyl-ACP, 100 µM octanoyl-ACP, 5 mM β-mercaptoethanol, 0.2 mM NADPH, and 0.5 µg of purified EcFabB. After pre-incubation at 37°C for 1 h, the reaction was initiated by adding 20 µM of fused EcFabG. Activity, calculated based on NADPH consumption, was expressed in µmol/kg/sec and analyzed using GraphPad Prism software (version 6).

3 Results

3.1 Construction of a Standardized pNX Vector Series
To establish a high-throughput platform for screening solubility-enhancing tags, we engineered a series of expression vectors, designated pNX, by inserting nine different fusion tags (SUMO, LD, ACP, BCCP, GB1, Fh8, SmbP, TolA, and TrxA) into a uniform pET-28b backbone. All pNX vectors share an identical architecture of functional elements to ensure consistency and comparability (Fig. 1). This standardized configuration comprises, in sequential order: a T7 promoter, an N-terminal fusion tag, a 6×His tag, a synthetic linker, a TEV protease recognition site, a multiple cloning site (MCS), and a C-terminal 6×His tag.

3.2 High-Throughput Cloning of Target Genes into the pNX Vectors
To assess the functionality of the expression cassette, four target genes (egfp, EcfabG, XccxanA2, and XccxanL) were cloned into the pNX vector system. Each gene was amplified with primers introducing NdeI and HindIII restriction sites, digested, and ligated into the corresponding sites of all pNX vectors. This process generated a comprehensive set of expression constructs (listed in Table S2). The uniform MCS across the pNX series enabled the efficient, parallel cloning of each target gene into every vector, validating the system's utility for high-throughput screening.

3.3 Fusion Tags Enhance Heterologous Protein Expression
We first investigated whether the fusion tags could increase the total expression level of the target proteins. The expression vectors pNX-EcfabG and pNX-egfp were transformed into E.
coli BL21(DE3), and proteins were expressed in small-scale cultures. Analysis of whole-cell lysates by SDS-PAGE revealed that the impact of fusion tags was target-dependent (Fig. 2). For EcFabG, all tags except BCCP and TrxA increased the total protein yield compared to the untagged control. In the case of eGFP, the LD, GB1, TolA, and TrxA tags led to notably improved expression levels. These results demonstrate that while fusion tags can significantly enhance the production of heterologous proteins, their effectiveness varies with the target protein.

3.4 Specific Tags Markedly Improve Protein Solubility
We next evaluated the ability of these tags to enhance the solubility of EcFabG and eGFP. The soluble (supernatant) and insoluble (pellet) fractions of cell lysates were analyzed separately by SDS-PAGE (Fig. 3A, B). Densitometric quantification of the gels showed that for EcFabG, all tested fusions resulted in excellent soluble expression. The LD and Fh8 tags were particularly effective, increasing the soluble yield of EcFabG by 10.98% and 4.86%, respectively (Fig. 3C). In the case of eGFP, the LD and SUMO fusions substantially increased the amount of soluble protein, achieving solubility rates of 71.85% and 50.17%, respectively, compared to 57.08% for wild-type eGFP, whereas ACP, BCCP, and SmbP fusions improved the ratio of soluble eGFP protein (Fig. 3D). These findings confirm that fusion tags can be strategically selected to either increase total soluble yield or improve the folding efficiency of the target protein in E. coli.

3.5 Purification and Tag Removal via TEV Protease Cleavage
To obtain tag-free target proteins, a TEV protease recognition site was incorporated between each fusion tag and the MCS. Following initial purification of the fusion proteins by Ni-NTA affinity chromatography, incubation with TEV protease efficiently cleaved the tags.
Subsequent SDS-PAGE analysis confirmed the release of untagged target proteins at their expected molecular weights (25.6 kDa for EcFabG and 29.5 kDa for eGFP), which remained in the soluble fraction (Fig. 4). This two-step purification and cleavage process reliably yielded both tagged and tag-free versions of the proteins.

3.6 Small-Size Tags Preserve Biological Activity
A critical consideration is whether fusion tags impair the native function of the target protein. We therefore assessed the biological activity of the tagged proteins. The enzymatic activity of EcFabG fusions was evaluated in a reconstituted fatty acid synthesis system. All tagged versions of EcFabG successfully utilized octanoyl-ACP (C8-ACP) as a substrate to produce decanoyl-ACP (C10-ACP), with no visible impairment compared to the untagged enzyme (Fig. 5A, B). Quantitative spectrophotometric assays further confirmed that there was no significant difference (P > 0.05) in NADPH oxidation rates between tagged and untagged EcFabG (Fig. 5C). Similarly, the functional integrity of eGFP fusions was determined by measuring fluorescence intensity. No significant difference (P > 0.05) was observed between any of the tagged eGFP proteins and the tag-free control (Fig. 5D). These results collectively demonstrate that the small-size tags used in the pNX system do not compromise the biological activity of the model proteins.

3.7 The pNX System Rescues Expression of Challenging Proteins
To validate the practical utility of the pNX vector system, we applied it to two difficult-to-express proteins: XccXanA2, which is typically produced at low levels, and XccXanL, which is predominantly insoluble in E. coli. As shown in Fig. 6, the parental pET-28b vector (no tag) yielded negligible soluble XccXanA2 and mostly insoluble XccXanL. However, screening with the pNX vectors identified specific tags that dramatically improved outcomes.
The SUMO and Fh8 tags significantly enhanced both the expression and solubility of XccXanA2. For the highly insoluble XccXanL, fusion with the LD or SUMO tag resulted in a substantial increase in the soluble fraction. This demonstrates the power of the pNX system as an empirical tool for identifying optimal solubility partners for recalcitrant proteins.

Discussion
The E. coli expression system remains a workhorse for recombinant protein production, yet the persistent challenge of insoluble aggregation necessitates robust strategies to enhance solubility [5]. Fusion tags represent an effective strategy to mitigate these issues by enhancing protein solubility, improving yields, and facilitating purification in E. coli [30], but their utility has been hampered by the lack of standardized, comparable vector systems. In this study, we addressed this critical bottleneck by developing the pNX series, a unified set of vectors hosting nine distinct small-size solubility tags within an identical genetic backbone. Our systematic evaluation demonstrates that this system not only facilitates high-throughput screening for optimal solubility partners but also led to the identification of the Lipoyl domain (LD) as a novel and effective solubility enhancer.

A primary finding of our work is that the effectiveness of fusion tags is highly dependent on the target protein, reinforcing the consensus in the field that no single tag is universally superior [31]. For instance, although LD and Fh8 significantly increased the soluble yield of EcFabG, their effect on the total expression level of eGFP was more moderate, especially for Fh8. LD and SUMO improved eGFP solubility primarily by increasing its overall expression level, whereas tags such as ACP, BCCP, and SmbP significantly boosted its soluble fraction.
This dichotomy suggests that different tags may operate through distinct mechanisms: some may act as genuine folding helpers, increasing the proportion of soluble protein, while others may simply boost overall translation, thereby increasing the amount of soluble protein. The pNX system is uniquely positioned to empirically distinguish between these modes of action for any given protein.

Critically, we confirmed that the small-size tags in the pNX system, once cleaved, do not compromise the biological activity of the target proteins. The preserved enzymatic function of EcFabG and the unaltered fluorescence of eGFP across all tags are significant advantages over larger tags, which can sometimes cause steric hindrance or require refolding after cleavage [32]. It is noteworthy that in our hands the SmbP tag did not diminish eGFP fluorescence, in contrast to previous reports in which the SmbP tag led to significantly reduced fluorescence, possibly due to misfolding in the cytoplasmic environment rather than the periplasm for which it is optimized [21, 33]. Furthermore, when fused to EcFabG, the BCCP tag resulted in an approximately 5% reduction in solubility. This suggests that fusion to BCCP may not have correctly engaged with its specific lysine attachment sites, thereby failing to enhance solubility and potentially impairing proper folding or stability. Similarly, the TrxA tag appears to weaken functional coupling, likely due to incomplete or unstable covalent linkage with EcFabG. Thus, both BCCP and TrxA may negatively influence the expression and solubility of EcFabG. This discrepancy highlights the value of empirical screening and suggests that tag performance can vary based on the target protein and expression conditions.

The practical utility of the pNX platform was unequivocally demonstrated through its application to the difficult-to-express proteins XccXanA2 and XccXanL.
The system rapidly identified SUMO and Fh8 as effective tags for XccXanA2, and LD and SUMO as powerful solubilizing agents for XccXanL. Based on Protein-Sol solubility predictions, XccXanA2 and XccXanL were classified as low-solubility proteins, with scores of 0.309 and 0.406, respectively, falling below the common solubility threshold of 0.45. Although Protein-Sol software suggested that some tags might improve solubility, experimental results did not always align with these predictions. For instance, the fusion pNSmbP-XccXanL received a prediction score of 0.512, yet SmbP failed to enhance solubility in practice. This discrepancy may be attributed to excessively rapid recombinant protein expression in E. coli, leading to misfolding and inclusion body formation. These results underscore that for proteins with low predicted solubility, a priori prediction of the best fusion partner remains challenging. The empirical, high-throughput screening enabled by the pNX vectors thus provides a decisive strategy to overcome this limitation, transforming a traditionally slow and fragmented process into a rapid and systematic one.

Among the tags tested, the lipoyl domain (LD) emerges as a particularly promising novel solubility partner. While LD is known for its role in the pyruvate dehydrogenase complex and its lipoylation [27], its function as a solubility-enhancing fusion tag has not been previously reported. Our hypothesis that its structural stability and potential chaperone-like interaction could aid folding appears to be validated by its strong performance with both EcFabG and the challenging XccXanL. This discovery expands the toolbox of available solubility tags and warrants further investigation into its mechanism of action.

Conclusions
The pNX vector series successfully bridges a significant technological gap in recombinant protein expression.
By integrating nine small-size tags into a standardized, TEV-cleavable backbone, we have created a versatile and efficient platform for high-throughput solubility screening. The system's ability to enhance expression and solubility while preserving biological function, coupled with its proven efficacy on hard-to-express proteins, makes it a valuable resource for both academic research and industrial biotechnology.

Abbreviations
E. coli, Escherichia coli
MBP, Maltose-binding protein
SUMO, Small ubiquitin-related modifier
LD, Lipoyl domain E2p
ACP, Acyl carrier protein
BCCP, Biotin carboxyl carrier protein
GB1, B1 immunoglobulin binding domain of Streptococcal protein G
Fh8, Putative calcium-binding protein
SmbP, Small metal-binding protein
TolA, The third domain of the periplasmic protein TolA
TrxA, Thioredoxin
MCS, Multiple cloning site
TEV, Tobacco Etch Virus
eGFP, Enhanced green fluorescent protein
EcFabG, 3-ketoacyl-ACP reductase
XccXanA2, 3-hydroxybenzoate AMP ligase
XccXanL, Chain length factor
LB, Luria-Bertani
IPTG, Isopropyl β-D-1-thiogalactopyranoside

Declarations

Acknowledgements
Not applicable.

Author contributions
LLZ and MJC: Writing–original draft, Writing–review & editing, Data curation, Visualization. ZWB: Formal analysis. ZLH: Resources. HZ and JXB: Project administration. All authors reviewed the manuscript.

Funding
This study was supported by the following projects: Industry-University Cooperation Project (H20230317), National Natural Science Foundation of China (32570032) and Double First-class Discipline Promotion Project (2021B10564001).

Availability of data and materials
All of the data generated and used in this work are included in the manuscript and are available as supplementary material.

Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.

Competing interests
The authors declare no competing interests.

References
1. Itakura K, Hirose T, Crea R, Riggs AD, Heyneker HL, Bolivar F, Boyer HW. Expression in Escherichia coli of a chemically synthesized gene for the hormone somatostatin. Science. 1977;198(4321):1056–1063. https://doi.org/10.1126/science.412251.
2. Hayat SMG, Farahani N, Golichenari B, Sahebkar A. Recombinant Protein Expression in Escherichia coli (E. coli): What We Need to Know. Curr Pharm Des. 2018;24(6):718–725. https://doi.org/10.2174/1381612824666180131121940.
3. Bi J, Tiong E, Koo YS, Zhou W, Wong FT. Further characterization and engineering of an 11-amino acid motif for enhancing recombinant soluble protein expression. Microbial Cell Factories. 2025;24(1):122. https://doi.org/10.1186/s12934-025-02738-5.
4. Wang J, Guo H, Hou N, Xie Y, Zhang K, Li D. Research on enhancing the expression and immobilization of oxalate decarboxylase via the bicistronic translation coupling strategy. Journal of Cleaner Production. 2025;527:146650. https://doi.org/10.1016/j.jclepro.2025.146650.
5. Zhong C, Wei P, Zhang YP. Enhancing functional expression of codon-optimized heterologous enzymes in Escherichia coli BL21(DE3) by selective introduction of synonymous rare codons. Biotechnol Bioeng. 2017;114(5):1054–1064. https://doi.org/10.1002/bit.26238.
6. Rosano GL, Ceccarelli EA. Recombinant protein expression in Escherichia coli: advances and challenges. Frontiers in Microbiology. 2014;5:172. https://doi.org/10.3389/fmicb.2014.00172.
7. Yan Y, Liu X, Li Q, Chu X, Tian J, Wu N. Effect of rare codons in C-terminal of green fluorescent protein on protein production in Escherichia coli. Protein Expression Purif. 2018;149:23–30. https://doi.org/10.1016/j.pep.2018.04.011.
8. Zhao L, Cao J, Liu X, Li Y, Wu J, Su L. Optimizing protein folding in prokaryotes: Strategies to enhance soluble expression of recombinant proteins. Bioresour Technol. 2026;439:133266. https://doi.org/10.1016/j.biortech.2025.133266.
9. Hammarström M, Hellgren N, van Den Berg S, Berglund H, Härd T. Rapid screening for improved solubility of small human proteins produced as fusion proteins in Escherichia coli. Protein Sci. 2002;11(2):313–321. https://doi.org/10.1110/ps.22102.
10. Miroux B, Walker JE. Over-production of Proteins in Escherichia coli: Mutant Hosts that Allow Synthesis of some Membrane Proteins and Globular Proteins at High Levels. J Mol Biol. 1996;260(3):289–298. https://doi.org/10.1006/jmbi.1996.0399.
11. Qing G, Ma L-C, Khorchid A, Swapna GVT, Mal TK, Takayama MM, Xia B, Phadtare S, Ke H, Acton T et al. Cold-shock induced high-yield protein production in Escherichia coli. Nat Biotechnol. 2004;22(7):877–882. https://doi.org/10.1038/nbt984.
12. de Marco A, De Marco V. Bacteria co-transformed with recombinant proteins and chaperones cloned in independent plasmids are suitable for expression tuning. J Biotechnol. 2004;109(1-2):45–52. https://doi.org/10.1016/j.jbiotec.2003.10.025.
13. Tang NC, Su JC, Shmidov Y, Kelly G, Deshpande S, Sirohi P, Peterson N, Chilkoti A. Synthetic intrinsically disordered protein fusion tags that enhance protein solubility. Nat Commun. 2024;15(1):3727. https://doi.org/10.1038/s41467-024-47519-7.
14. di Guan C, Li P, Riggs PD, Inouye H. Vectors that facilitate the expression and purification of foreign peptides in Escherichia coli by fusion to maltose-binding protein. Gene. 1988;67(1):21–30. https://doi.org/10.1016/0378-1119(88)90004-2.
15. Dyson MR, Shadbolt SP, Vincent KJ, Perera RL, McCafferty J. Production of soluble mammalian proteins in Escherichia coli: identification of protein features that correlate with successful expression. BMC Biotechnol. 2004;4:32. https://doi.org/10.1186/1472-6750-4-32.
16. Malakhov MP, Mattern MR, Malakhova OA, Drinker M, Weeks SD, Butt TR. SUMO fusions and SUMO-specific protease for efficient expression and purification of proteins. J Struct Funct Genomics. 2004;5(1-2):75–86. https://doi.org/10.1023/b:Jsfg.0000029237.70316.52.
17. Wang HZ, Chu ZZ, Chen CC, Cao AC, Tong X, Ouyang CB, Yuan QH, Wang MN, Wu ZK, Wang HH et al. Recombinant Passenger Proteins Can Be Conveniently Purified by One-Step Affinity Chromatography. PLoS One. 2015;10(12):e0143598. https://doi.org/10.1371/journal.pone.0143598.
18. Orrapin S, Intorasoot S. Recombinant expression of novel protegrin-1 dimer and LL-37-linker–histatin-5 hybrid peptide mediated biotin carboxyl carrier protein fusion partner. Protein Expression Purif. 2014;93:46–53. https://doi.org/10.1016/j.pep.2013.10.010.
19. Bao WJ, Gao YG, Chang YG, Zhang TY, Lin XJ, Yan XZ, Hu HY. Highly efficient expression and purification system of small-size protein domains in Escherichia coli for biochemical characterization. Protein Expression Purif. 2006;47(2):599–606. https://doi.org/10.1016/j.pep.2005.11.021.
20. Costa SJ, Almeida A, Castro A, Domingues L, Besir H. The novel Fh8 and H fusion partners for soluble protein expression in Escherichia coli: a comparison with the traditional gene fusion technology. Appl Microbiol Biotechnol. 2013;97(15):6779–6791. https://doi.org/10.1007/s00253-012-4559-1.
21. Vargas-Cortez T, Morones-Ramirez JR, Balderas-Renteria I, Zarate X. Expression and purification of recombinant proteins in Escherichia coli tagged with a small metal-binding protein from Nitrosomonas europaea. Protein Expression Purif. 2016;118:49–54. https://doi.org/10.1016/j.pep.2015.10.009.
22. Anderluh G, Gökçe I, Lakey JH. Expression of proteins using the third domain of the Escherichia coli periplasmic-protein TolA as a fusion partner. Protein Expression Purif. 2003;28(1):173–181. https://doi.org/10.1016/S1046-5928(02)00681-2.
23. Song J, Chen W, Lu Z, Hu X, Ding Y. Soluble expression, purification, and characterization of recombinant human flotillin-2 (reggie-1) in Escherichia coli. Mol Biol Rep. 2011;38(3):2091–2098. https://doi.org/10.1007/s11033-010-0335-4.
24. LaVallie ER, DiBlasio EA, Kovacic S, Grant KL, Schendel PF, McCoy JM. A Thioredoxin Gene Fusion Expression System That Circumvents Inclusion Body Formation in the E. coli Cytoplasm. Nat Biotechnol. 1993;11(2):187–193. https://doi.org/10.1038/nbt0293-187.
Köppl C, Lingg N, Fischer A, Kröß C, Loibl J, Buchinger W, Schneider R, Jungbauer A, Striedner G, Cserjan-Puschmann M. Fusion Tag Design Influences Soluble Recombinant Protein Production in Escherichia coli . Int J Mol Sci. 2022;23(14). https://doi.org/10.3390/ijms23147678. Kapust RB, Tözsér J, Copeland TD, Waugh DS. The P1′ specificity of tobacco etch virus protease. Biochem Biophys Res Commun. 2002;294(5):949–955. https://doi.org/10.1016/S0006-291X(02)00574-0. Ali ST, Guest JR. Isolation and characterization of lipoylated and unlipoylated domains of the E2p subunit of the pyruvate dehydrogenase complex of Escherichia coli . Biochem J. 1990;271(1):139–145. https://doi.org/10.1042/bj2710139. Hu Z, Ma J, Chen Y, Tong W, Zhu L, Wang H, Cronan JE. Escherichia coli FabG 3-ketoacyl-ACP reductase proteins lacking the assigned catalytic triad residues are active enzymes. J Biol Chem. 2021;296:100365. https://doi.org/10.1016/j.jbc.2021.100365. Cao XQ, Wang JY, Zhou L, Chen B, Jin Y, He YW. Biosynthesis of the yellow xanthomonadin pigments involves an ATP-dependent 3-hydroxybenzoic acid: acyl carrier protein ligase and an unusual type II polyketide synthase pathway. Mol Microbiol. 2018;110(1):16–32. https://doi.org/10.1111/mmi.14064. Malhotra A. Tagging for protein expression. Methods Enzymol. 2009;463:239–258. 10.1016/S0076-6879(09)63016-0. Bernier SC, Cantin L, Salesse C. Systematic analysis of the expression, solubility and purification of a passenger protein in fusion with different tags. Protein Expression Purif. 2018;152:92–106. https://doi.org/10.1016/j.pep.2018.07.007. Jo BH. An Intrinsically Disordered Peptide Tag that Confers an Unusual Solubility to Aggregation-Prone Proteins. Appl Environ Microbiol. 2022;88(7):e0009722. https://doi.org/10.1128/aem.00097-22. Feilmeier BJ, Iseminger G, Schroeder D, Webber H, Phillips GJ. Green fluorescent protein functions as a reporter for protein localization in Escherichia coli . J Bacteriol. 2000;182(14):4068–4076. 
https://doi.org/10.1128/jb.182.14.4068-4076.2000. Additional Declarations No competing interests reported. Supplementary Files Supplementarymaterial.docx Cite Share Download PDF Status: Published Journal Publication published 19 Dec, 2025 Read the published version in Microbial Cell Factories → Version 1 posted Editorial decision: Revision requested 17 Nov, 2025 Reviews received at journal 17 Nov, 2025 Reviews received at journal 09 Nov, 2025 Reviews received at journal 29 Oct, 2025 Reviewers agreed at journal 29 Oct, 2025 Reviewers agreed at journal 27 Oct, 2025 Reviewers agreed at journal 22 Oct, 2025 Reviewers invited by journal 21 Oct, 2025 Editor assigned by journal 21 Oct, 2025 Submission checks completed at journal 21 Oct, 2025 First submitted to journal 16 Oct, 2025 You are reading this latest preprint version Research Square lets you share your work early, gain feedback from the community, and start making changes to your manuscript prior to peer review in a journal. As a division of Research Square Company, we’re committed to making research communication faster, fairer, and more useful. We do this by developing innovative software and high quality services for the global research community. Our growing team is made up of researchers and industry professionals working together to solve the most critical problems facing scientific publishing. 
Figure legends

Figure 1. Schematic representation of the pNX vector series.
All vectors share a conserved structure featuring a T7 promoter, a variable N-terminal fusion tag (SUMO, LD, ACP, BCCP, GB1, Fh8, SmbP, TolA, or TrxA), followed by a 6×His tag, a synthetic linker, a TEV protease cleavage site, a standardized multiple cloning site (MCS), and a C-terminal 6×His tag. The uniform architecture enables direct comparison of tag performance.

Figure 2. Analysis of soluble yields of EcFabG and eGFP fusions.
Expression levels of recombinant EcFabG (A, C) and eGFP (B, D) fused with different tags in E. coli BL21(DE3). (A, B) SDS-PAGE gels showing the expression of fusion proteins. (C, D) Quantification of total soluble protein yield based on grayscale density analysis of SDS-PAGE gels. Cultures were grown in LB medium and induced with 240 μg/mL IPTG at 37°C for 4 h. M, protein molecular weight marker.

Figure 3. Evaluation of soluble expression for EcFabG and eGFP fusions.
Recombinant protein expression in E. coli BL21(DE3) harboring different fusion tag constructs. (A, B) SDS-PAGE analysis of the soluble fraction (supernatants, S) and insoluble fraction (pellets, P) for EcFabG (A) and eGFP (B). (C, D) Quantification of protein solubility ratio based on grayscale density for EcFabG (C) and eGFP (D). Data represent mean ± SD from three independent experiments. Cultures were grown in LB medium and induced with 240 μg/mL IPTG at 37°C for 4 h. M, protein molecular weight marker.

Figure 4. Purification and TEV protease cleavage of fusion proteins.
(A) SDS-PAGE analysis of EcFabG fusion proteins before (B) and after (A) TEV protease cleavage. (B) SDS-PAGE analysis of eGFP fusion proteins before (B) and after (A) TEV protease cleavage. Cleavage yielded untagged target proteins at expected sizes: 25.6 kDa for EcFabG and 29.5 kDa for eGFP. M, molecular weight marker; BSA, bovine serum albumin (loading control).

Figure 5. Functional analysis of tagged and tag-free proteins.
(A) In vitro enzymatic activity of EcFabG fusions analyzed by conformationally sensitive gel, showing conversion of malonyl-ACP (Mal-ACP) and octanoyl-ACP (C8-ACP) to decanoyl-ACP (C10-ACP) and dodecanoyl-ACP (C12-ACP). (B) Control reaction with tag-free EcFabG. (C) Quantitative analysis of EcFabG enzymatic activity (μmol/kg/sec). EcFabG activity was measured by monitoring NADPH oxidation at 340 nm. (D) Comparison of fluorescence intensity (RLU) between tagged and tag-free eGFP. Fluorescence of eGFP was measured at excitation 489 nm/emission 511 nm. Data represent mean ± SD (n=3); ns, not significant (P > 0.05, Student's t-test).

Figure 6. Application of pNX vectors to difficult-to-express proteins.
(A) SDS-PAGE analysis of XccXanA2 expression from the soluble fraction (supernatants, S) and insoluble fraction (pellets, P) using different fusion tags. (B) SDS-PAGE analysis of XccXanL expression from supernatants (S) and pellets (P) fractions using different fusion tags. Arrows indicate the positions of the target proteins. M, protein molecular weight marker.
09:23:51","extension":"docx","order_by":0,"title":"","display":"","copyAsset":false,"role":"supplement","size":26222,"visible":true,"origin":"","legend":"","description":"","filename":"Supplementarymaterial.docx","url":"https://assets-eu.researchsquare.com/files/rs-7875802/v1/b640a0f63e75e07f85adcfb2.docx"}],"financialInterests":"No competing interests reported.","formattedTitle":"A standardized set of pNX vectors for enhanced soluble expression of recombinant proteins in E. coli using small fusion tags","fulltext":[{"header":"1 Background","content":"\u003cp\u003eThe production of recombinant proteins is a cornerstone of modern biotechnology, with applications ranging from therapeutic biologics to industrial enzymes. Since the first approval of a recombinant biologic for diabetes treatment over five decades ago [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e], \u003cem\u003eEscherichia coli\u003c/em\u003e has remained the predominant host for recombinant protein production due to its well-characterized genetics, rapid growth, low-cost cultivation, and high yield potential [\u003cspan additionalcitationids=\"CR3\" citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e]. Compared to other expression systems such as \u003cem\u003eBacillus\u003c/em\u003e, yeast, insect, or mammalian cells, \u003cem\u003eE. coli\u003c/em\u003e can be readily manipulated, are cultured inexpensively and grow rapidly [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e]. Despite these advantages, the expression of heterologous proteins in \u003cem\u003eE. coli\u003c/em\u003e often faces significant challenges. A primary issue is the frequent formation of insoluble and inactive protein aggregates, known as inclusion bodies [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]. 
This can be attributed to several factors, including codon usage bias that reduces translational efficiency [\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e], improper disulfide bond formation or protein misfolding, potential cytotoxicity of the foreign protein, and the distinct intracellular microenvironment of the bacterial host that may not support the correct folding of foreign proteins [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e, \u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e]. The formation of inclusion bodies necessitates laborious and often inefficient \u003cem\u003ein vitro\u003c/em\u003e refolding procedures, substantially increasing the cost and complexity of downstream processing and hindering large-scale industrial production. Consequently, developing robust strategies to enhance the solubility of recombinant proteins in \u003cem\u003eE. coli\u003c/em\u003e is a major focus of ongoing research.\u003c/p\u003e\u003cp\u003eCommon empirical approaches to improve soluble yield include modulating expression conditions\u0026mdash;such as lowering the induction temperature [\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e], changing the \u003cem\u003eE. coli\u003c/em\u003e expression strain [\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e], employing different promoters or induction conditions [\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e], and co-expressing molecular chaperones and folding regulators [\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e]. Among the most effective and generalizable strategies is the fusion of the target protein to a solubility-enhancing partner tag [\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e]. Several such tags have been extensively characterized. 
Maltose-binding protein (MBP) is a component of the maltose transport system in \u003cem\u003eE. coli\u003c/em\u003e. MBP attracts the molecular chaperone GroEL and functions as an intramolecular chaperone, aiding in the proper folding of recombinant proteins. The MBP fusion expression system not only offers advantages such as high expression efficiency and ease of purification but also demonstrates favorable solubilization effects [\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e, \u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e]. Small ubiquitin-related modifier (SUMO) modulates protein structure and function by covalently binding to the lysine side chains of the target proteins. SUMO attachment to the N-terminus of underexpressed proteins significantly enhances their expression and solubility in \u003cem\u003eE. coli\u003c/em\u003e [\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e]. As a fusion tag, acyl carrier protein (ACP) could greatly increase the soluble expression level of Glucokinase (GlcK), α-Amylase (Amy) and GFP [\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e]. Highly positively charged recombinant protegrin-1 dimer and LL-37/histatin-5 peptides were expressed in \u003cem\u003eE. coli\u003c/em\u003e, utilizing biotin carboxyl carrier protein (BCCP) mediated acidic amino acids as fusion tags [\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e]. The B1 immunoglobulin binding domain of \u003cem\u003eStreptococcal\u003c/em\u003e protein G (GB1) is a relatively small solubility-enhancing tag (56 residues) that utilizes highly acidic sequences to increase the solubility of target proteins. 
It enhances the expression yield, stability, and solubility of fusion target peptides without compromising their structure or function, proving particularly useful in protein preparation for NMR studies [\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e]. Putative calcium-binding protein (Fh8) fusion tags also serve as effective solubility enhancers. Compared to larger fusion tags, their low molecular weight and solubility-enhancing properties make Fh8 a favorable choice for soluble protein production in \u003cem\u003eE. coli\u003c/em\u003e [\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e]. A small metal-binding protein (SmbP) of 9.9 kDa was isolated from the periplasm of \u003cem\u003eNitrosomonas europaea\u003c/em\u003e. Compared to MBP and glutathione S-transferase (GST), Green Fluorescent Protein fused with Smbp at the N-terminus exhibited high solubility and low inclusion body formation rates, suggesting Smbp may be a preferred fusion protein for producing small-sized proteins [\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e]. The third domain of the periplasmic protein TolA (TolAIII) was used as a fusion partner in the expression of various proteins from bacteria and eukaryotes. TolAIII is small domain, expressed in high yields as a soluble protein in the cytoplasm of \u003cem\u003eE. coli\u003c/em\u003e and proved to be useful in the preparation of other peptides and proteins [\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e]. Thioredoxin (TrxA) is a widely distributed, thermostable protein that functions as a hydrogen carrier and also appears as a domain in disulfide isomerase. TrxA enhances soluble expression of foreign proteins in \u003cem\u003eE. coli\u003c/em\u003e by acting similarly to a molecular chaperone. 
It achieves this either by physically binding to target proteins to reduce the formation of misfolded intermediates or by covalently binding to prevent inclusion body formation [\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e, \u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e].\u003c/p\u003e\u003cp\u003eHowever, no studies have identified a universally applicable fusion tag suitable for expressing any target protein, as the effectiveness of existing tags varies significantly depending on the protein of interest [\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e]. This necessitates screening multiple tags to identify the optimal partner for each target. A significant practical bottleneck is that these tags are often dispersed across different plasmid vectors with incompatible cloning sites and backbone architectures, making parallel cloning and comparative screening inefficient and time-consuming [\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e].\u003c/p\u003e\u003cp\u003eFurthermore, the presence of a fusion tag can sometimes interfere with the structure, function, or downstream applications of the target protein. Therefore, a seamless system for tag removal is highly desirable, particularly for the production of therapeutic proteins where the tag must be eliminated to ensure safety and efficacy [\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e]. By incorporating specific cleavage sequences recognized by proteases such as factor Xa, thrombin, caspase-2, SUMO protease, and enterokinase, the fusion tag can be efficiently removed during purification [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]. 
Tobacco Etch Virus (TEV) protease is also widely used for this purpose due to its high specificity and ability to cleave precisely, often leaving no extraneous residues on the target protein [\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e, \u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e].\u003c/p\u003e\u003cp\u003eTo address the critical need for standardization and high-throughput compatibility, we developed the pNX vector series. This is a unified set of expression vectors based on the pET-28b backbone, each harboring one of nine small fusion tags (SUMO, LD (Lipoyl domain E2p) [\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e], ACP, BCCP, GB1, Fh8, SmbP, TolA, and TrxA) in an identical genetic context. All pNX vectors feature a standardized multiple cloning site (MCS) flanked by dual 6\u0026times;His tags and a TEV protease cleavage site for efficient tag removal. In this study, we systematically evaluated the ability of these tags to enhance the solubility and maintain the biological activity of four model proteins: enhanced Green Fluorescent Protein (eGFP) [\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e], 3-ketoacyl-ACP reductase (EcFabG) [\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e], 3-hydroxybenzoate AMP ligase (XccXanA2) [\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e], and Chain length factor (XccXanL) [\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e]. 
Our results not only validate the pNX system as a powerful tool for rapid screening but also lead to the novel finding that the Lipoyl domain E2p (LD) serves as an effective solubility enhancer.\u003c/p\u003e"},{"header":"2 Materials and methods","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e\u003ch2\u003e2.1 Plasmids, Bacterial Strains, and Growth Conditions\u003c/h2\u003e\u003cp\u003eThe nine solubility-enhancing fusion tags utilized in this study are listed in Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e. Genes encoding SUMO, Fh8, GB1, and SmbP were synthesized by Sangon Biotech (Shanghai) Co., Ltd. and delivered in a PUC57-KAN vector. The ACP, BCCP, TolA, and TrxA genes were PCR-amplified from the genomic DNA of \u003cem\u003eE. coli\u003c/em\u003e MG1655. The LD gene was amplified from the plasmid pGS331. All primers for tag amplification are listed in Table \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e. The bacterial strains and plasmids used are detailed in Table S2. \u003cem\u003eE. coli\u003c/em\u003e strains were routinely cultured in Luria-Bertani (LB) medium at 37\u0026deg;C. 
Where appropriate, kanamycin and isopropyl β-D-1-thiogalactopyranoside (IPTG) were added to final concentrations of 30 \u0026micro;g/mL and 240 \u0026micro;g/mL, respectively.\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eSolubility enhancer tags used in this study\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"5\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eTags\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eSource\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003eFull Name\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003eSize (kDa)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003eReference\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eSUMO\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cem\u003eSaccharomyces cerevisiae\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eSmall ubiquitin-related modifier\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c4\"\u003e\u003cp\u003e11.1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e[\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e]\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eLD\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cem\u003eEscherichia coli\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eLipoyl domain E2p\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e8.9\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e[\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e]\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eACP\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cem\u003eEscherichia coli\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eAcyl carrier protein\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e8.7\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e[\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e]\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eBCCP\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cem\u003eEscherichia coli\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eBiotin carboxyl carrier protein\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e9.2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e[\u003cspan citationid=\"CR18\" 
class=\"CitationRef\"\u003e18\u003c/span\u003e]\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eGB1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cem\u003eStreptococcus\u003c/em\u003e sp.\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eThe B1 immunoglobulin binding domain of protein G\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e6.2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e[\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e]\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eFh8\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cem\u003eFasciola hepatica\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003ePutative calcium-binding protein\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e8\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e[\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e]\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eSmbP\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cem\u003eNitrosomonas europaea\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eSmall metal-binding protein\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e9.9\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e[\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e]\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd 
align=\"left\" colname=\"c1\"\u003e\u003cp\u003eTolA\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cem\u003eEscherichia coli\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eTol-Pal system protein TolA\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e9.9\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e[\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e]\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eTrxA\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cem\u003eEscherichia coli\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eThioredoxin\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e11.7\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e[\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e]\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec4\" class=\"Section2\"\u003e\u003ch2\u003e2.2 Construction of the pNX Vector Series\u003c/h2\u003e\u003cp\u003eThe pNX vector series was constructed sequentially, starting with the pNSUMO plasmid. The SUMO gene was initially amplified using primers SUMO P1 and P2. The resulting amplicon was then used as a template in four subsequent rounds of PCR with primer SUMO P1 paired separately with SUMO P3, P4, P5, and P6. 
This strategy sequentially appended the following elements to the 3'-end of the SUMO gene: a \u003cem\u003eKpn\u003c/em\u003eI site, synthetic linker peptides, a TEV protease recognition site, a sequence encoding a C-terminal 6\u0026times;His tag, and finally \u003cem\u003eNde\u003c/em\u003eI and \u003cem\u003eBam\u003c/em\u003eHI sites. The final, extended SUMO fragment was digested with \u003cem\u003eNco\u003c/em\u003eI and \u003cem\u003eBam\u003c/em\u003eHI and ligated into the corresponding sites of the pET-28b vector, yielding the pNSUMO backbone plasmid.\u003c/p\u003e\u003cp\u003eTo generate the other vectors in the series, the genes for LD, ACP, BCCP, GB1, Fh8, SmbP, TolA, and TrxA were amplified using their respective primers (Table \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e). Each PCR product was digested with \u003cem\u003eNco\u003c/em\u003eI and \u003cem\u003eKpn\u003c/em\u003eI and then cloned into the pNSUMO backbone plasmid digested with the same enzymes, thereby replacing the SUMO tag. This resulted in the construction of plasmids pNLD, pNACP, pNBCCP, pNGB1, pNFh8, pNSmbP, pNTolA, and pNTrxA, collectively designated as the pNX series.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec5\" class=\"Section2\"\u003e\u003ch2\u003e2.3 Cloning of Target Genes\u003c/h2\u003e\u003cp\u003eThe \u003cem\u003eegfp\u003c/em\u003e gene was amplified from the plasmid pET-28b-\u003cem\u003eegfp\u003c/em\u003e (a gift from Associate Professor Wang). The \u003cem\u003eEcfabG\u003c/em\u003e gene was amplified from the genomic DNA of \u003cem\u003eE. coli\u003c/em\u003e MG1655. The \u003cem\u003eXccxanA2\u003c/em\u003e and \u003cem\u003eXccxanL\u003c/em\u003e genes were amplified from the genomic DNA of \u003cem\u003eXanthomonas campestris\u003c/em\u003e pv. \u003cem\u003ecampestris\u003c/em\u003e 8004. 
All target genes were amplified using primer pairs (Table \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e) designed to incorporate \u003cem\u003eNde\u003c/em\u003eI and \u003cem\u003eHin\u003c/em\u003edIII restriction sites. The purified PCR fragments and the pNX vectors were digested with \u003cem\u003eNde\u003c/em\u003eI and \u003cem\u003eHin\u003c/em\u003edIII, ligated, and transformed into \u003cem\u003eE. coli\u003c/em\u003e DH5α. The resulting expression constructs were designated as pNX-\u003cem\u003eegfp\u003c/em\u003e, pNX-\u003cem\u003eEcfabG\u003c/em\u003e, pNX-\u003cem\u003eXccxanA2\u003c/em\u003e, and pNX-\u003cem\u003eXccxanL\u003c/em\u003e (Table S2).\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec6\" class=\"Section2\"\u003e\u003ch2\u003e2.4 Protein Expression, Purification, and Tag Cleavage\u003c/h2\u003e\u003cp\u003eRecombinant pNX plasmids were transformed into \u003cem\u003eE. coli\u003c/em\u003e BL21(DE3) for protein expression. Single colonies were used to inoculate 5 mL of LB medium with kanamycin, which were grown overnight at 37\u0026deg;C with shaking at 180 rpm. These cultures were diluted 1:100 into 100 mL of fresh LB medium and grown at 37\u0026deg;C to an OD₆₀₀ of 0.6. Protein expression was induced by adding IPTG to a final concentration of 240 \u0026micro;g/mL, followed by incubation for 4 hours at 37\u0026deg;C. Cells were harvested by centrifugation (4,000 rpm, 20 min, 4\u0026deg;C).\u003c/p\u003e\u003cp\u003eFor solubility analysis, cultures were normalized to an OD₆₀₀ of 1.0. Cells from 5 mL of normalized culture were pelleted (12,000 \u0026times; g, 5 min), resuspended in 1 mL of lysis buffer, and disrupted by sonication on ice (3 cycles of 5 min each). The lysate was centrifuged (12,000 rpm, 20 min, 4\u0026deg;C) to separate the soluble (supernatant) and insoluble (pellet) fractions. 
The insoluble pellet was resuspended in 1 mL of denaturing lysis buffer (50 mM phosphate buffer, pH 8.0, 0.3 M NaCl, 20 mM imidazole, 8 M urea). Both fractions were analyzed by SDS-PAGE, and gels were stained with Coomassie Blue. Band intensities were quantified using ImageJ software.\u003c/p\u003e\u003cp\u003eFor protein purification, fusion proteins were purified from the soluble fraction under native conditions using Ni-NTA affinity chromatography (Qiagen). Protein concentrations were determined by the Bradford assay (Bio-Rad). Purified proteins were stored at -80\u0026deg;C after the addition of glycerol to a final concentration of 20% (v/v).\u003c/p\u003e\u003cp\u003eFor tag removal, purified fusion proteins were incubated with TEV protease (a gift from Associate Professor Wang) at a 1:10 (w/w) protease-to-protein ratio in reaction buffer (50 mM Tris-HCl, pH 8.0, 0.5 mM EDTA, 1 mM DTT) for 2 hours at 30\u0026deg;C. The cleavage reaction was centrifuged briefly (5,000 rpm, 1 min), and the supernatant was analyzed by SDS-PAGE to assess cleavage efficiency.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec7\" class=\"Section2\"\u003e\u003ch2\u003e2.5 Fluorescence spectroscopy for GFP\u003c/h2\u003e\u003cp\u003eThe fluorescence of purified eGFP proteins was assessed to evaluate functional integrity. Proteins were diluted to 1 \u0026micro;M in 50 mM sodium phosphate buffer (pH 8.0, 300 mM NaCl). Fluorescence measurements were performed using a SpectraMax i3x Multi-Mode Microplate Reader (Molecular Devices). The excitation wavelength was set to 460 nm, and the emission spectrum was recorded.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec8\" class=\"Section2\"\u003e\u003ch2\u003e2.6 EcFabG Enzyme Activity Assays\u003c/h2\u003e\u003cp\u003eThe oxidoreductase (OAR) activity of EcFabG fusion proteins was assayed \u003cem\u003ein vitro\u003c/em\u003e as previously described [\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e]. 
Briefly, the 40 \u0026micro;L reaction mixture contained 0.1 M sodium phosphate buffer (pH 7.0), 0.1 \u0026micro;g each of EcFabB, fused EcFabG, EcFabA, and EcFabI, 50 \u0026micro;M NADH, 50 \u0026micro;M NADPH, 1 mM β-mercaptoethanol, 100 \u0026micro;M malonyl-ACP, and 100 \u0026micro;M octanoyl-ACP. Reaction products were analyzed by conformationally sensitive gel electrophoresis on 20% polyacrylamide gels containing 0.5-1 M urea, followed by staining with Coomassie Brilliant Blue R-250.\u003c/p\u003e\u003cp\u003eFor quantitative analysis, EcFabG activity was determined by monitoring the oxidation rate of NADPH at 340 nm. The 300 \u0026micro;L assay mixture contained 0.1 M sodium phosphate buffer (pH 8.0), 100 \u0026micro;M malonyl-ACP, 100 \u0026micro;M octanoyl-ACP, 5 mM β-mercaptoethanol, 0.2 mM NADPH, and 0.5 \u0026micro;g of purified EcFabB. After pre-incubation at 37\u0026deg;C for 1 h, the reaction was initiated by adding 20 \u0026micro;M of fused EcFabG. Activity, calculated based on NADPH consumption, was expressed in \u0026micro;mol/kg/sec and analyzed using GraphPad Prism software (version 6).\u003c/p\u003e\u003c/div\u003e"},{"header":"3 Results","content":"\u003cdiv id=\"Sec10\" class=\"Section2\"\u003e\u003ch2\u003e3.1 Construction of a Standardized pNX Vector Series\u003c/h2\u003e\u003cp\u003eTo establish a high-throughput platform for screening solubility-enhancing tags, we engineered a series of expression vectors, designated pNX, by inserting nine different fusion tags (SUMO, LD, ACP, BCCP, GB1, Fh8, SmbP, TolA, and TrxA) into a uniform pET-28b backbone. All pNX vectors share an identical architecture of functional elements to ensure consistency and comparability (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e). 
This standardized configuration comprises, in sequential order: a T7 promoter, an N-terminal fusion tag, a 6\u0026times;His tag, a synthetic linker, a TEV protease recognition site, a multiple cloning site (MCS), and a C-terminal 6\u0026times;His tag.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec11\" class=\"Section2\"\u003e\u003ch2\u003e3.2 High-Throughput Cloning of Target Genes into the pNX vectors\u003c/h2\u003e\u003cp\u003eTo assess the functionality of the expression cassette, four target genes\u0026mdash;\u003cem\u003eegfp\u003c/em\u003e, \u003cem\u003eEcfabG\u003c/em\u003e, \u003cem\u003eXccxanA2\u003c/em\u003e, and \u003cem\u003eXccxanL\u003c/em\u003e\u0026mdash;were cloned into the pNX vector system. Each gene was amplified with primers introducing \u003cem\u003eNde\u003c/em\u003eI and \u003cem\u003eHin\u003c/em\u003edIII restriction sites, digested, and ligated into the corresponding sites of all pNX vectors. This process generated a comprehensive set of expression constructs (listed in Table S2). The uniform MCS across the pNX series enabled the efficient, parallel cloning of each target gene into every vector, validating the system's utility for high-throughput screening.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec12\" class=\"Section2\"\u003e\u003ch2\u003e3.3 Fusion Tags Enhance Heterologous Protein Expression\u003c/h2\u003e\u003cp\u003eWe first investigated whether the fusion tags could increase the total expression level of the target proteins. The expression vectors pNX-\u003cem\u003eEcfabG\u003c/em\u003e and pNX-\u003cem\u003eegfp\u003c/em\u003e were transformed into \u003cem\u003eE. coli\u003c/em\u003e BL21(DE3), and proteins were expressed in small-scale cultures. Analysis of whole-cell lysates by SDS-PAGE revealed that the impact of fusion tags was target-dependent (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e). 
For EcFabG, all tags except BCCP and TrxA increased the total protein yield compared to the untagged control. In the case of eGFP, the LD, GB1, TolA, and TrxA tags led to notably improved expression levels. These results demonstrate that while fusion tags can significantly enhance the production of heterologous proteins, their effectiveness varies with the target protein.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec13\" class=\"Section2\"\u003e\u003ch2\u003e3.4 Specific Tags Markedly Improve Protein Solubility\u003c/h2\u003e\u003cp\u003eWe next evaluated the ability of these tags to enhance the solubility of EcFabG and eGFP. The soluble (supernatant) and insoluble (pellet) fractions of cell lysates were analyzed separately by SDS-PAGE (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eA, B). Densitometric quantification of the gels showed that for EcFabG, all tested fusions resulted in excellent soluble expression. The LD and Fh8 tags were particularly effective, increasing the soluble yield of EcFabG by 10.98% and 4.86%, respectively (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eC). In the case of eGFP, the LD and SUMO fusions substantially increased the amount of soluble protein, achieving solubility rates of 71.85% and 50.17%, respectively, compared to 57.08% for wild-type eGFP, whereas ACP, BCCP, and SmbP fusions improved the ratio of soluble eGFP protein (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eD). These findings confirm that fusion tags can be strategically selected to either increase total soluble yield or improve the folding efficiency of the target protein in \u003cem\u003eE. 
coli\u003c/em\u003e.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec14\" class=\"Section2\"\u003e\u003ch2\u003e3.5 Purification and Tag Removal via TEV Protease Cleavage\u003c/h2\u003e\u003cp\u003eTo obtain tag-free target proteins, a TEV protease recognition site was incorporated between each fusion tag and the MCS. Following initial purification of the fusion proteins by Ni-NTA affinity chromatography, incubation with TEV protease efficiently cleaved the tags. Subsequent SDS-PAGE analysis confirmed the release of untagged target proteins at their expected molecular weights\u0026mdash;25.6 kDa for EcFabG and 29.5 kDa for eGFP\u0026mdash;which remained in the soluble fraction (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e). This two-step purification and cleavage process reliably yielded both tagged and tag-free versions of the proteins.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec15\" class=\"Section2\"\u003e\u003ch2\u003e3.6 Small Size Tags Preserve Biological Activity\u003c/h2\u003e\u003cp\u003eA critical consideration is whether fusion tags impair the native function of the target protein. We therefore assessed the biological activity of the tagged proteins. The enzymatic activity of EcFabG fusions was evaluated in a reconstituted fatty acid synthesis system. All tagged versions of EcFabG successfully utilized octanoyl-ACP (C\u003csub\u003e8\u003c/sub\u003e-ACP) as a substrate to produce decanoyl-ACP (C\u003csub\u003e10\u003c/sub\u003e-ACP), with no visible impairment compared to the untagged enzyme (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eA, B). Quantitative spectrophotometric assays further confirmed that there was no significant difference (P\u0026thinsp;\u0026gt;\u0026thinsp;0.05) in NADPH oxidation rates between tagged and untagged EcFabG (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eC). 
Similarly, the functional integrity of eGFP fusions was determined by measuring fluorescence intensity. No significant difference (P\u0026thinsp;\u0026gt;\u0026thinsp;0.05) was observed between any of the tagged eGFP proteins and the tag-free control (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eD). These results collectively demonstrate that the small tags used in the pNX system do not compromise the biological activity of the model proteins.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec16\" class=\"Section2\"\u003e\u003ch2\u003e3.7 The pNX System Rescues Expression of Challenging Proteins\u003c/h2\u003e\u003cp\u003eTo validate the practical utility of the pNX vector system, we applied it to two difficult-to-express proteins: XccXanA2, which is typically produced at low levels, and XccXanL, which is predominantly insoluble in \u003cem\u003eE. coli\u003c/em\u003e. As shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e, the parental pET-28b vector (no tag) yielded negligible soluble XccXanA2 and mostly insoluble XccXanL. However, screening with the pNX vectors identified specific tags that dramatically improved outcomes. The SUMO and Fh8 tags significantly enhanced both the expression and solubility of XccXanA2. For the highly insoluble XccXanL, fusion with the LD or SUMO tag resulted in a substantial increase in the soluble fraction. This demonstrates the power of the pNX system as an empirical tool for identifying optimal solubility partners for recalcitrant proteins.\u003c/p\u003e\u003c/div\u003e"},{"header":"Discussion","content":"\u003cp\u003eThe \u003cem\u003eE. coli\u003c/em\u003e expression system remains a workhorse for recombinant protein production, yet the persistent challenge of insoluble aggregation necessitates robust strategies to enhance solubility [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e]. 
Fusion tags represent an effective strategy to mitigate these issues by enhancing protein solubility, improving yields, and facilitating purification in \u003cem\u003eE. coli\u003c/em\u003e [\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e], but their utility has been hampered by the lack of standardized, comparable vector systems. In this study, we addressed this critical bottleneck by developing the pNX series\u0026mdash;a unified set of vectors hosting nine distinct small solubility tags within an identical genetic backbone. Our systematic evaluation demonstrates that this system not only facilitates high-throughput screening for optimal solubility partners but also led to the identification of the Lipoyl domain (LD) as a novel and effective solubility enhancer.\u003c/p\u003e\u003cp\u003eA primary finding of our work is that the effectiveness of fusion tags is highly dependent on the target protein, reinforcing the consensus in the field that no single tag is universally superior [\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e]. For instance, although LD and Fh8 significantly increased the soluble yield of EcFabG, their effect on the total expression level of eGFP was more moderate, especially for Fh8. LD and SUMO improved eGFP solubility primarily by increasing its overall expression level, whereas tags such as ACP, BCCP, and SmbP significantly boosted its soluble fraction. This dichotomy suggests that different tags may operate through distinct mechanisms\u0026mdash;some may act as genuine folding helpers, increasing the \u003cem\u003eproportion\u003c/em\u003e of soluble protein, while others may simply boost overall translation, thereby increasing the \u003cem\u003eamount\u003c/em\u003e of soluble protein. 
The pNX system is uniquely positioned to empirically distinguish between these modes of action for any given protein.\u003c/p\u003e\u003cp\u003eCritically, we confirmed that the small tags in the pNX system, once cleaved, do not compromise the biological activity of the target proteins. The preserved enzymatic function of EcFabG and the unaltered fluorescence of eGFP across all tags are significant advantages over larger tags, which can sometimes cause steric hindrance or require refolding after cleavage [\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e]. Notably, in our hands the SmbP tag did not diminish eGFP fluorescence, in contrast to previous reports in which SmbP fusion led to significantly reduced fluorescence, possibly due to misfolding in the cytoplasmic environment rather than the periplasm for which it is optimized [\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e, \u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e]. Furthermore, when fused to EcFabG, the BCCP tag resulted in an approximately 5% reduction in solubility. This suggests that fusion to BCCP may not have correctly engaged with its specific lysine attachment sites, thereby failing to enhance solubility and potentially impairing proper folding or stability. Similarly, the TrxA tag appears to weaken functional coupling, likely due to incomplete or unstable covalent linkage with EcFabG. Thus, both BCCP and TrxA may negatively influence the expression and solubility of EcFabG. This discrepancy highlights the value of empirical screening and suggests that tag performance can vary based on the target protein and expression conditions.\u003c/p\u003e\u003cp\u003eThe practical utility of the pNX platform was unequivocally demonstrated through its application to the difficult-to-express proteins XccXanA2 and XccXanL. 
The system rapidly identified SUMO and Fh8 as effective tags for XccXanA2, and LD and SUMO as powerful solubilizing agents for XccXanL. Based on Protein-Sol solubility predictions, XccXanA2 and XccXanL were classified as low-solubility proteins, with scores of 0.309 and 0.406, respectively, falling below the common solubility threshold of 0.45. Although Protein-Sol software suggested that some tags might improve solubility, experimental results did not always align with these predictions. For instance, the fusion pNSmbP-XccXanL received a prediction score of 0.512, yet SmbP failed to enhance solubility in practice. This discrepancy may be attributed to excessively rapid recombinant protein expression in \u003cem\u003eE. coli\u003c/em\u003e, leading to misfolding and inclusion body formation. These results underscore that for proteins with low predicted solubility, a priori prediction of the best fusion partner remains challenging. The empirical, high-throughput screening enabled by the pNX vectors thus provides a decisive strategy to overcome this limitation, transforming a traditionally slow and fragmented process into a rapid and systematic one.\u003c/p\u003e\u003cp\u003eAmong the tags tested, the lipoyl domain (LD) emerges as a particularly promising novel solubility partner. While LD is known for its role in the pyruvate dehydrogenase complex and its lipoylation [\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e], its function as a solubility-enhancing fusion tag has not been previously reported. Our hypothesis that its structural stability and potential chaperone-like interaction could aid folding appears to be validated by its strong performance with both EcFabG and the challenging XccXanL. 
This discovery expands the toolbox of available solubility tags and warrants further investigation into its mechanism of action.\u003c/p\u003e"},{"header":"Conclusions","content":"\u003cp\u003eThe pNX vector series successfully bridges a significant technological gap in recombinant protein expression. By integrating nine small-size tags into a standardized, TEV-cleavable backbone, we have created a versatile and efficient platform for high-throughput solubility screening. The system's ability to enhance expression and solubility while preserving biological function, coupled with its proven efficacy on hard-to-express proteins, makes it a valuable resource for both academic research and industrial biotechnology.\u003c/p\u003e"},{"header":"Abbreviations","content":"\u003cp\u003e\u003cem\u003eE. coli\u003c/em\u003e,\u003cem\u003e\u0026nbsp;Escherichia coli\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eMBP, Maltose-binding protein\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eSUMO, Small ubiquitin-related modifier\u003c/p\u003e\n\u003cp\u003eLD,\u0026nbsp;Lipoyl domain E2p\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eACP,\u0026nbsp;acyl carrier protein\u003c/p\u003e\n\u003cp\u003eBCCP,\u0026nbsp;biotin carboxyl carrier protein\u003c/p\u003e\n\u003cp\u003eGB1,\u0026nbsp;B1 immunoglobulin binding domain of \u003cem\u003eStreptococcal\u0026nbsp;\u003c/em\u003eprotein G\u003c/p\u003e\n\u003cp\u003eFh8, Putative calcium-binding protein\u003c/p\u003e\n\u003cp\u003eSmbP,\u0026nbsp;small metal-binding protein\u003c/p\u003e\n\u003cp\u003eTolA, The third domain of the periplasmic protein TolA\u003c/p\u003e\n\u003cp\u003eTrxA, Thioredoxin\u003c/p\u003e\n\u003cp\u003eMCS, multiple cloning site\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eTEV, Tobacco Etch Virus\u003c/p\u003e\n\u003cp\u003eeGFP, enhanced green fluorescent protein\u003c/p\u003e\n\u003cp\u003eEcFabG, 3-ketoacyl-ACP reductase\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eXccXanA2, 3-hydroxybenzoate AMP 
ligase\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eXccXanL, Chain length factor\u003c/p\u003e\n\u003cp\u003eLB, Luria-Bertani\u003c/p\u003e\n\u003cp\u003eIPTG, Isopropyl-β-D-1-thiogalactopyranoside\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eAcknowledgements\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthor contributions\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eLLZ and MJC: Writing\u0026ndash;original draft, Writing\u0026ndash;review \u0026amp; editing, Data curation, Visualization. ZWB: Formal analysis. ZLH: Resources. HZ and JXB: Project administration. All authors reviewed the manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis study was supported by the following projects: Industry-University Cooperation Project (H20230317), National Natural Science Foundation of China (32570032) and Double First-class Discipline Promotion Project (2021B10564001).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAvailability of data and materials\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eAll of the data generated and used in this work are included in the manuscript and are available as supplementary material.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eEthics approval and consent to participate\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConsent for publication\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting interests\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare no competing interests.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eItakura K, Hirose T, Crea R, Riggs AD, Heyneker HL, Bolivar F, Boyer HW. Expression in \u003cem\u003eEscherichia coli\u003c/em\u003e of a chemically synthesized gene for the hormone somatostatin. Science. 
1977;198(4321):1056\u0026ndash;1063. https://doi.org/10.1126/science.412251.\u003c/li\u003e\n\u003cli\u003eHayat SMG, Farahani N, Golichenari B, Sahebkar A. Recombinant Protein Expression in \u003cem\u003eEscherichia coli\u003c/em\u003e (\u003cem\u003eE. coli\u003c/em\u003e): What We Need to Know. Curr Pharm Des. 2018;24(6):718\u0026ndash;725. https://doi.org/10.2174/1381612824666180131121940.\u003c/li\u003e\n\u003cli\u003eBi J, Tiong E, Koo YS, Zhou W, Wong FT. Further characterization and engineering of an 11-amino acid motif for enhancing recombinant soluble protein expression. Microbial Cell Factories. 2025;24(1):122. https://doi.org/10.1186/s12934-025-02738-5.\u003c/li\u003e\n\u003cli\u003eWang J, Guo H, Hou N, Xie Y, Zhang K, Li D. Research on enhancing the expression and immobilization of oxalate decarboxylase via the bicistronic translation coupling strategy. Journal of Cleaner Production. 2025;527:146650. https://doi.org/10.1016/j.jclepro.2025.146650.\u003c/li\u003e\n\u003cli\u003eZhong C, Wei P, Zhang YP. Enhancing functional expression of codon-optimized heterologous enzymes in \u003cem\u003eEscherichia coli\u003c/em\u003e BL21(DE3) by selective introduction of synonymous rare codons. Biotechnol Bioeng. 2017;114(5):1054\u0026ndash;1064. https://doi.org/10.1002/bit.26238.\u003c/li\u003e\n\u003cli\u003eRosano GL, Ceccarelli EA. Recombinant protein expression in \u003cem\u003eEscherichia coli\u003c/em\u003e: advances and challenges. Frontiers in microbiology. 2014;5:172. https://doi.org/10.3389/fmicb.2014.00172.\u003c/li\u003e\n\u003cli\u003eYan Y, Liu X, Li Q, Chu X, Tian J, Wu N. Effect of rare codons in C-terminal of green fluorescent protein on protein production in \u003cem\u003eEscherichia coli\u003c/em\u003e. Protein Expression Purif. 2018;149:23\u0026ndash;30. https://doi.org/10.1016/j.pep.2018.04.011.\u003c/li\u003e\n\u003cli\u003eZhao L, Cao J, Liu X, Li Y, Wu J, Su L. 
Optimizing protein folding in prokaryotes: Strategies to enhance soluble expression of recombinant proteins. Bioresour Technol. 2026;439:133266. https://doi.org/10.1016/j.biortech.2025.133266.\u003c/li\u003e\n\u003cli\u003eHammarstr\u0026ouml;m M, Hellgren N, van Den Berg S, Berglund H, H\u0026auml;rd T. Rapid screening for improved solubility of small human proteins produced as fusion proteins in \u003cem\u003eEscherichia coli\u003c/em\u003e. Protein Sci. 2002;11(2):313\u0026ndash;321. https://doi.org/10.1110/ps.22102.\u003c/li\u003e\n\u003cli\u003eMiroux B, Walker JE. Over-production of Proteins in \u003cem\u003eEscherichia coli\u003c/em\u003e: Mutant Hosts that Allow Synthesis of some Membrane Proteins and Globular Proteins at High Levels. J Mol Biol. 1996;260(3):289\u0026ndash;298. https://doi.org/10.1006/jmbi.1996.0399.\u003c/li\u003e\n\u003cli\u003eQing G, Ma L-C, Khorchid A, Swapna GVT, Mal TK, Takayama MM, Xia B, Phadtare S, Ke H, Acton T\u003cem\u003e et al\u003c/em\u003e. Cold-shock induced high-yield protein production in \u003cem\u003eEscherichia coli\u003c/em\u003e. Nat Biotechnol. 2004;22(7):877\u0026ndash;882. https://doi.org/10.1038/nbt984.\u003c/li\u003e\n\u003cli\u003ede Marco A, De Marco V. Bacteria co-transformed with recombinant proteins and chaperones cloned in independent plasmids are suitable for expression tuning. J Biotechnol. 2004;109(1-2):45\u0026ndash;52. https://doi.org/10.1016/j.jbiotec.2003.10.025.\u003c/li\u003e\n\u003cli\u003eTang NC, Su JC, Shmidov Y, Kelly G, Deshpande S, Sirohi P, Peterson N, Chilkoti A. Synthetic intrinsically disordered protein fusion tags that enhance protein solubility. Nat Commun. 2024;15(1):3727. https://doi.org/10.1038/s41467-024-47519-7.\u003c/li\u003e\n\u003cli\u003edi Guan C, Li P, Riggs PD, Inouye H. Vectors that facilitate the expression and purification of foreign peptides in Escherichia coli by fusion to maltose-binding protein. Gene. 1988;67(1):21\u0026ndash;30. 
https://doi.org/10.1016/0378-1119(88)90004-2.\u003c/li\u003e\n\u003cli\u003eDyson MR, Shadbolt SP, Vincent KJ, Perera RL, McCafferty J. Production of soluble mammalian proteins in Escherichia coli: identification of protein features that correlate with successful expression. BMC Biotechnol. 2004;4:32. https://doi.org/10.1186/1472-6750-4-32.\u003c/li\u003e\n\u003cli\u003eMalakhov MP, Mattern MR, Malakhova OA, Drinker M, Weeks SD, Butt TR. SUMO fusions and SUMO-specific protease for efficient expression and purification of proteins. J Struct Funct Genomics. 2004;5(1-2):75\u0026ndash;86. https://doi.org/10.1023/b:Jsfg.0000029237.70316.52.\u003c/li\u003e\n\u003cli\u003eWang HZ, Chu ZZ, Chen CC, Cao AC, Tong X, Ouyang CB, Yuan QH, Wang MN, Wu ZK, Wang HH\u003cem\u003e et al\u003c/em\u003e. Recombinant Passenger Proteins Can Be Conveniently Purified by One-Step Affinity Chromatography. PLoS One. 2015;10(12):e0143598. https://doi.org/10.1371/journal.pone.0143598.\u003c/li\u003e\n\u003cli\u003eOrrapin S, Intorasoot S. Recombinant expression of novel protegrin-1 dimer and LL-37-linker\u0026ndash;histatin-5 hybrid peptide mediated biotin carboxyl carrier protein fusion partner. Protein Expression Purif. 2014;93:46\u0026ndash;53. https://doi.org/10.1016/j.pep.2013.10.010.\u003c/li\u003e\n\u003cli\u003eBao WJ, Gao YG, Chang YG, Zhang TY, Lin XJ, Yan XZ, Hu HY. Highly efficient expression and purification system of small-size protein domains in \u003cem\u003eEscherichia coli\u003c/em\u003e for biochemical characterization. Protein Expression Purif. 2006;47(2):599\u0026ndash;606. https://doi.org/10.1016/j.pep.2005.11.021.\u003c/li\u003e\n\u003cli\u003eCosta SJ, Almeida A, Castro A, Domingues L, Besir H. The novel Fh8 and H fusion partners for soluble protein expression in Escherichia coli: a comparison with the traditional gene fusion technology. Appl Microbiol Biotechnol. 2013;97(15):6779\u0026ndash;6791. 
https://doi.org/10.1007/s00253-012-4559-1.\u003c/li\u003e\n\u003cli\u003eVargas-Cortez T, Morones-Ramirez JR, Balderas-Renteria I, Zarate X. Expression and purification of recombinant proteins in\u003cem\u003e Escherichia coli \u003c/em\u003etagged with a small metal-binding protein from Nitrosomonas europaea. Protein Expression Purif. 2016;118:49\u0026ndash;54. https://doi.org/10.1016/j.pep.2015.10.009.\u003c/li\u003e\n\u003cli\u003eAnderluh G, G\u0026ouml;k\u0026ccedil;e I, Lakey JH. Expression of proteins using the third domain of the \u003cem\u003eEscherichia coli \u003c/em\u003eperiplasmic-protein TolA as a fusion partner. Protein Expression Purif. 2003;28(1):173\u0026ndash;181. https://doi.org/10.1016/S1046-5928(02)00681-2.\u003c/li\u003e\n\u003cli\u003eSong J, Chen W, Lu Z, Hu X, Ding Y. Soluble expression, purification, and characterization of recombinant human flotillin-2 (reggie-1) in \u003cem\u003eEscherichia coli\u003c/em\u003e. Mol Biol Rep. 2011;38(3):2091\u0026ndash;2098. https://doi.org/10.1007/s11033-010-0335-4.\u003c/li\u003e\n\u003cli\u003eLaVallie ER, DiBlasio EA, Kovacic S, Grant KL, Schendel PF, McCoy JM. A Thioredoxin Gene Fusion Expression System That Circumvents Inclusion Body Formation in the\u003cem\u003e E. coli \u003c/em\u003eCytoplasm. Nat Biotechnol. 1993;11(2):187\u0026ndash;193. https://doi.org/10.1038/nbt0293-187.\u003c/li\u003e\n\u003cli\u003eK\u0026ouml;ppl C, Lingg N, Fischer A, Kr\u0026ouml;\u0026szlig; C, Loibl J, Buchinger W, Schneider R, Jungbauer A, Striedner G, Cserjan-Puschmann M. Fusion Tag Design Influences Soluble Recombinant Protein Production in \u003cem\u003eEscherichia coli\u003c/em\u003e. Int J Mol Sci. 2022;23(14). https://doi.org/10.3390/ijms23147678.\u003c/li\u003e\n\u003cli\u003eKapust RB, T\u0026ouml;zs\u0026eacute;r J, Copeland TD, Waugh DS. The P1\u0026prime; specificity of tobacco etch virus protease. Biochem Biophys Res Commun. 2002;294(5):949\u0026ndash;955. 
https://doi.org/10.1016/S0006-291X(02)00574-0.\u003c/li\u003e\n\u003cli\u003eAli ST, Guest JR. Isolation and characterization of lipoylated and unlipoylated domains of the E2p subunit of the pyruvate dehydrogenase complex of \u003cem\u003eEscherichia coli\u003c/em\u003e. Biochem J. 1990;271(1):139\u0026ndash;145. https://doi.org/10.1042/bj2710139.\u003c/li\u003e\n\u003cli\u003eHu Z, Ma J, Chen Y, Tong W, Zhu L, Wang H, Cronan JE. \u003cem\u003eEscherichia coli \u003c/em\u003eFabG 3-ketoacyl-ACP reductase proteins lacking the assigned catalytic triad residues are active enzymes. J Biol Chem. 2021;296:100365. https://doi.org/10.1016/j.jbc.2021.100365.\u003c/li\u003e\n\u003cli\u003eCao XQ, Wang JY, Zhou L, Chen B, Jin Y, He YW. Biosynthesis of the yellow xanthomonadin pigments involves an ATP-dependent 3-hydroxybenzoic acid: acyl carrier protein ligase and an unusual type II polyketide synthase pathway. Mol Microbiol. 2018;110(1):16\u0026ndash;32. https://doi.org/10.1111/mmi.14064.\u003c/li\u003e\n\u003cli\u003eMalhotra A. Tagging for protein expression. Methods Enzymol. 2009;463:239\u0026ndash;258. https://doi.org/10.1016/S0076-6879(09)63016-0.\u003c/li\u003e\n\u003cli\u003eBernier SC, Cantin L, Salesse C. Systematic analysis of the expression, solubility and purification of a passenger protein in fusion with different tags. Protein Expression Purif. 2018;152:92\u0026ndash;106. https://doi.org/10.1016/j.pep.2018.07.007.\u003c/li\u003e\n\u003cli\u003eJo BH. An Intrinsically Disordered Peptide Tag that Confers an Unusual Solubility to Aggregation-Prone Proteins. Appl Environ Microbiol. 2022;88(7):e0009722. https://doi.org/10.1128/aem.00097-22.\u003c/li\u003e\n\u003cli\u003eFeilmeier BJ, Iseminger G, Schroeder D, Webber H, Phillips GJ. Green fluorescent protein functions as a reporter for protein localization in \u003cem\u003eEscherichia coli\u003c/em\u003e. J Bacteriol. 2000;182(14):4068\u0026ndash;4076. 
https://doi.org/10.1128/jb.182.14.4068-4076.2000.\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":true,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"microbial-cell-factories","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"micf","sideBox":"Learn more about [Microbial Cell Factories](http://microbialcellfactories.biomedcentral.com/)","snPcode":"12934","submissionUrl":"https://submission.nature.com/new-submission/12934/3","title":"Microbial Cell Factories","twitterHandle":"@BioMedCentral","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"em","reportingPortfolio":"BMC/SO AJ","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"Escherichia coli, Fusion tags, pNX vectors, Protein solubility, Lipoyl domain (LD), High-throughput screening","lastPublishedDoi":"10.21203/rs.3.rs-7875802/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-7875802/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003ch2\u003eBackground\u003c/h2\u003e\u003cp\u003eThe production of recombinant proteins in \u003cem\u003eEscherichia coli\u003c/em\u003e (\u003cem\u003eE. coli\u003c/em\u003e) is often hampered by the formation of inclusion bodies. While fusion tags can enhance solubility, existing systems are hampered by a lack of standardization, with tags scattered across disparate plasmid backbones and inconsistent cloning sites, complicating high-throughput screening.\u003c/p\u003e\u003ch2\u003eResults\u003c/h2\u003e\u003cp\u003eTo address this, we constructed a standardized series of expression vectors, termed pNX, by incorporating nine small fusion tags (SUMO, LD, ACP, BCCP, GB1, Fh8, SmbP, TolA, and TrxA) into a uniform pET-28b backbone. Each pNX vector features an identical configuration: a T7 promoter, an N-terminal fusion tag, a 6\u0026times;His tag, a linker, a TEV protease cleavage site, a multiple cloning site (MCS), and a C-terminal 6\u0026times;His tag. We evaluated this system using four model proteins (EcFabG, eGFP, XccXanA2, and XccXanL). 
Our results showed that specific tags significantly improved both the expression level and solubility of the target proteins without compromising their biological activity. Notably, the lipoyl domain (LD) was identified for the first time as an effective solubility enhancer. The standardized MCS enabled rapid, parallel cloning, facilitating the efficient screening of optimal fusion partners.\u003c/p\u003e\u003ch2\u003eConclusions\u003c/h2\u003e\u003cp\u003eThe pNX vector series provides a versatile and high-throughput platform for enhancing the soluble expression of challenging recombinant proteins in \u003cem\u003eE. coli\u003c/em\u003e, streamlining the empirical identification of ideal fusion tags.\u003c/p\u003e","manuscriptTitle":"A standardized set of pNX vectors for enhanced soluble expression of recombinant proteins in E. coli using small fusion tags","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-11-02 09:23:46","doi":"10.21203/rs.3.rs-7875802/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"decision","content":"Revision 
requested","date":"2025-11-17T13:53:56+00:00","index":"","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-11-17T13:47:45+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-11-10T04:09:13+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-10-29T21:07:17+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"102934922839220596260362177035669332000","date":"2025-10-29T20:09:37+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"50522917866444442678173453978053931541","date":"2025-10-27T08:44:54+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"196231068971387673027620267046694327810","date":"2025-10-23T03:27:02+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2025-10-22T02:51:31+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2025-10-21T04:51:41+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2025-10-21T04:50:16+00:00","index":"","fulltext":""},{"type":"submitted","content":"Microbial Cell Factories","date":"2025-10-16T09:34:16+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"microbial-cell-factories","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"micf","sideBox":"Learn more about [Microbial Cell Factories](http://microbialcellfactories.biomedcentral.com/)","snPcode":"12934","submissionUrl":"https://submission.nature.com/new-submission/12934/3","title":"Microbial Cell Factories","twitterHandle":"@BioMedCentral","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"em","reportingPortfolio":"BMC/SO AJ","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"325f77c1-b8cc-43cf-83eb-916a449064da","owner":[],"postedDate":"November 2nd, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"published-in-journal","subjectAreas":[],"tags":[],"updatedAt":"2025-12-22T16:00:42+00:00","versionOfRecord":{"articleIdentity":"rs-7875802","link":"https://doi.org/10.1186/s12934-025-02903-w","journal":{"identity":"microbial-cell-factories","isVorOnly":false,"title":"Microbial Cell Factories"},"publishedOn":"2025-12-19 15:57:32","publishedOnDateReadable":"December 19th, 2025"},"versionCreatedAt":"2025-11-02 09:23:46","video":"","vorDoi":"10.1186/s12934-025-02903-w","vorDoiUrl":"https://doi.org/10.1186/s12934-025-02903-w","workflowStages":[]},"version":"v1","identity":"rs-7875802","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-7875802","identity":"rs-7875802","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}