Constraint-based metabolic reconstruction and analysis of Synechococcus elongatus PCC 11801 and PCC 11802 for bioengineering

doi:10.22541/au.174831125.54586289/v1


2025 · doi:10.22541/au.174831125.54586289/v1

preprint OA: closed


Lokesh V, Pramod P. Wangikar*

Department of Chemical Engineering, Indian Institute of Technology Bombay, Mumbai, India

*Corresponding author: Pramod P. Wangikar, Department of Chemical Engineering, Indian Institute of Technology Bombay, Powai, Mumbai – 400076, India. Phone/Fax: +91 22 2576 7232; Email: [email protected]

Author mail: Lokesh V ([email protected])

Funding: The work was supported by grants from Wadhwani Research Center for Bioengineering at IIT Bombay and Department of Biotechnology, Government of India (DBT Pan IIT center for Bioenergy Phase II, BT/PR41982/PBD/26/822/2021).

Author contributions: PPW conceptualized the research. LV performed the research and modeling, and analyzed the data. LV wrote the manuscript. PPW acquired the funding, supervised the research, and edited the manuscript.

Conflict of interest: The authors declare no conflict of interest.

Data availability statement: The data and models generated in this study are provided in the article and its Supporting Information.

Supporting Information statement:
S1. Supplementary tables and figures, and codes used for analyses.
S2. Summary of reciprocal BLAST hit analysis (RBBH) and gap filling for PCC 11801 and PCC 11802 models.
S3. Results of gene essentiality analysis.
S4. Results of FROG test of model reproducibility for PCC 11801 and PCC 11802 CBMs.
S5. Constraint-based models of S. elongatus PCC 11801 and PCC 11802 in Excel format.
S6. Constraint-based model of S. elongatus PCC 11801 in SBML format.
S7. Constraint-based model of S. elongatus PCC 11802 in SBML format.

Abstract

Constraint-based reconstruction and analysis (COBRA) is a powerful systems biology approach for computational bioengineering. Synechococcus elongatus PCC 11801 and PCC 11802 are fast-growing, stress-tolerant cyanobacteria that are promising platforms for photosynthetic biomanufacturing. Here, we present constraint-based models (CBMs) iLV1052 and iLV1087 of PCC 11801 and PCC 11802, respectively, to facilitate and streamline strain engineering efforts. Following draft reconstruction using a template model, the models underwent extensive manual curation to reduce redundancy, and verification using the BiGG, KEGG and BRENDA databases. We added 281 and 69 new reactions for PCC 11801 and PCC 11802, respectively, associated with stress tolerance, growth stability, antioxidant defense, energy regulation, and sulfur acquisition. The models were refined through iterative debugging and validation using flux balance analysis, flux variability analysis, and single gene/reaction deletion analysis. Gene essentiality predictions gave 69% accuracy for PCC 11801 and 83% for PCC 11802. The flux maps captured key features of cyanobacterial metabolism, including an incomplete TCA cycle. The final PCC 11801 CBM contained 1130 reactions, 1052 genes, and 930 metabolites, while the PCC 11802 CBM included 1199 reactions, 1087 genes, and 951 metabolites. Using the OptKnock framework, phosphoenolpyruvate carboxylase (PEPC) was identified as a metabolic hotspot for bioengineering of valuable products such as ethanol, butanol, succinic acid and butanediol.

Keywords

Constraint based metabolic model, Flux Balance Analysis, COBRA toolbox, Cyanobacteria, Computational bioengineering

Introduction

Constraint-based models (CBMs) are mathematical representations of the complete metabolic network of an organism. These models incorporate the biochemical reactions along with the gene-protein-reaction relationships and reaction directionalities to provide a systems-level blueprint of cellular metabolism (Mardinoglu & Palsson, 2024). CBMs enable tracing the direction and flux of carbon flow through different pathways and are instrumental in elucidating genotype-phenotype relationships (Schilling & Palsson, 2000). Through constraint-based simulations such as flux balance analysis (FBA), CBMs can predict metabolic capabilities, evaluate the effect of genetic perturbations, and guide strain engineering strategies, thereby reducing reliance on extensive experimental screening (Orth et al., 2010; Sahu et al., 2021).

Cyanobacteria are promising chassis for sustainable photosynthetic biomanufacturing because of their ability to fix CO2 from the environment and produce a diverse array of chemicals (Mescouto et al., 2024). In recent years, several strains including Synechocystis sp. PCC 6803, Synechococcus elongatus PCC 7942, UTEX 2973, Synechococcus sp. PCC 7002, and PCC 11901 have been developed as hosts to produce platform chemicals and biofuels (Agarwal et al., 2022). While good product yields have been demonstrated from engineered cyanobacteria at lab scale (Gong et al., 2024), they are still significantly lower than those achieved using heterotrophs like E. coli and yeast. Guided strain engineering efforts applied to fast-growing and environmentally robust cyanobacteria are critical to increasing productivity and developing commercially viable processes (Kim et al., 2024).

We recently isolated eight Synechococcus elongatus strains (PCC 11801, PCC 11802, and IITB3–8) from a freshwater habitat (Jaiswal et al., 2018).
These strains exhibit desirable characteristics including short doubling time (<3 h), tolerance to high light intensity (up to 1000 µmol m⁻² s⁻¹), and growth under 1% CO2, along with amenability to genetic engineering (Jaiswal et al., 2018, 2020), making them attractive candidates for photosynthetic biomanufacturing. We have employed multi-omics analyses (Jain et al., 2024; Jaiswal et al., 2023; Jaiswal & Wangikar, 2020; Mehta et al., 2019) to understand their physiology and metabolism under different cultivation conditions, and developed synthetic biology toolboxes to facilitate metabolic engineering (Madhu et al., 2023, 2024). Applying these tools, we have demonstrated high titers of succinate, ethylene (Sengupta et al., 2020), and mannitol (Pritam et al., 2023) at lab scale from engineered PCC 11801. Likewise, PCC 11802 and IITB6 have yielded high titers of ethanol, 2,3-butanediol, and alkanes (Srivastava et al., 2025; Srivastava & Wangikar, 2024).

To facilitate further strain engineering for scale-up and commercial viability, a deeper understanding of the metabolism and capabilities of these cyanobacterial strains is essential. CBMs, by enabling predictive analyses of metabolic fluxes and genetic interventions, serve as valuable tools for this purpose. CBMs have been constructed for several cyanobacterial strains such as PCC 6803 (iJN678) (Nogales et al., 2012), PCC 7942 (iJB785) (Triana et al., 2014), UTEX 2973 (Mueller et al., 2017), and PCC 11901 (Ravindran et al., 2024), and utilized to understand their growth and photosynthetic metabolism and to guide strain engineering for the production of different target compounds (Mueller et al., 2017; Santos-Merino et al., 2023).

In this study, we reconstructed high-quality CBMs for S. elongatus PCC 11801 and PCC 11802 using a previously defined workflow (Thiele & Palsson, 2010) with the genome-scale metabolic model (GEM) of PCC 7942 as the reference model.
The models were subjected to extensive manual curation and validated using gene essentiality analysis to ensure biological accuracy and predictive utility. Flux maps were generated to visualize and understand the cyanobacterial metabolism.

2. Materials and methods

2.1. Metabolic reconstruction of S. elongatus PCC 11801 and PCC 11802

The high-quality metabolic model iJB785 of Synechococcus elongatus PCC 7942 (Broddrick et al., 2016) was used as the template model to generate the CBM of PCC 11801, as these strains share 81% phylogenetic identity. The CBM of PCC 11801 after curation and validation (as described in Results) served as the template to generate the PCC 11802 GEM (97% phylogenetic identity). Draft CBMs were generated through bidirectional best hit analysis of the genomes as described earlier (Mueller et al., 2013). Genes were identified as homologous based on stringent thresholds: query coverage > 60%, e-value ≤ 10⁻⁵ and score density ≥ 0.6, following established criteria (Tiruveedula & Wangikar, 2017). Non-homologous genes, as identified from the bidirectional hits, were systematically removed from the template model. The homologous gene pairs were used to map reviewed gene-protein-reaction (GPR) associations from PCC 7942 to PCC 11801, and subsequently from PCC 11801 to PCC 11802.

The draft models then underwent extensive manual curation. GPR associations were reviewed and updated based on a detailed literature survey and cross-referencing with multiple databases, including BiGG (King et al., 2016), KEGG (Kanehisa & Goto, 2000) and BRENDA (Schomburg et al., 2000). Additional reactions supported by genome annotation were manually incorporated in the models. These reactions were assigned appropriate GPR associations and the corresponding genes were annotated according to the PCC 11801 and PCC 11802 genomes.
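The homology filter and bidirectional best hit step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Hit` record, the gene names, and the toy hit tables are hypothetical, and `score_density` is assumed here to be an alignment score normalised to lie between 0 and 1 (the text does not define it). Only the three thresholds come from the text.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One BLAST hit (hypothetical record layout for illustration)."""
    query: str
    subject: str
    evalue: float
    query_coverage: float  # fraction of the query covered by the alignment
    score_density: float   # assumed: normalised alignment score in [0, 1]
    bitscore: float


def passes(hit, min_cov=0.60, max_evalue=1e-5, min_density=0.6):
    """Apply the thresholds quoted in the text."""
    return (hit.query_coverage > min_cov
            and hit.evalue <= max_evalue
            and hit.score_density >= min_density)


def best_hits(hits):
    """Map each query to its highest-scoring subject among passing hits."""
    best = {}
    for h in hits:
        if passes(h) and (h.query not in best or h.bitscore > best[h.query].bitscore):
            best[h.query] = h
    return {q: h.subject for q, h in best.items()}


def bidirectional_best_hits(forward, reverse):
    """Keep pairs (a, b) where b is a's best hit and a is b's best hit."""
    fwd, rev = best_hits(forward), best_hits(reverse)
    return sorted((a, b) for a, b in fwd.items() if rev.get(b) == a)


# Toy data: template genes (tmpl*) searched against the new genome (new*) and back.
forward = [
    Hit("tmplA", "newA", 1e-80, 0.95, 0.90, 310.0),
    Hit("tmplA", "newX", 1e-20, 0.70, 0.65, 95.0),
    Hit("tmplB", "newB", 1e-3, 0.90, 0.80, 40.0),   # fails the e-value cut-off
    Hit("tmplC", "newA", 1e-30, 0.80, 0.70, 120.0),  # newA prefers tmplA
]
reverse = [
    Hit("newA", "tmplA", 1e-78, 0.94, 0.90, 305.0),
    Hit("newB", "tmplB", 1e-3, 0.90, 0.80, 40.0),
    Hit("newX", "tmplA", 1e-20, 0.70, 0.65, 95.0),
]
pairs = bidirectional_best_hits(forward, reverse)
print(pairs)
```

Template genes without a partner in `pairs` (here `tmplB` and `tmplC`) correspond to the non-homologous genes that would be removed from the template model.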
Model consistency checks were performed to identify and rectify mass-imbalanced or gap-associated reactions. All model manipulations and analyses were conducted using the COBRA toolbox (v3.0) within MATLAB R2022a.

2.2. Flux balance analysis

Flux balance analysis (FBA) was performed as described by Orth et al. (2010) with the constructed CBMs to predict the intracellular reaction fluxes under steady state conditions. All FBA simulations in the study were performed using the MATLAB-based COBRA toolbox with the glpk solver (Becker et al., 2007). FBA identifies a flux distribution that optimizes a specific objective function, typically biomass production, subject to steady state mass balance constraints and thermodynamic constraints (equation 1).

$$\max_{v}\; z = v_{\mathrm{biomass}} \quad \text{subject to} \quad S\,v = 0, \quad \alpha_i \le v_i \le \beta_i \tag{1}$$

where $v_{\mathrm{biomass}}$ is the predicted growth rate of the cell, $S$ is the stoichiometric matrix in which $S_{ij}$ is the stoichiometric coefficient of the $i$th metabolite in the $j$th reaction of the network, $v_i$ is the $i$th component of the flux vector $v$, and $\alpha_i$ and $\beta_i$ denote the lower and upper bounds of the fluxes, respectively. The resulting underdetermined system of equations defines a solution space, which is narrowed by the constraints imposed. The optimal solution is identified by maximizing the objective function within this feasible space. Flux variability analysis (FVA) was used to determine the range of feasible fluxes for every reaction through iterative maximization and minimization of each flux subject to the FBA constraints (Becker et al., 2007).

2.3. Gene essentiality analysis

Gene essentiality analysis was conducted to identify essential and non-essential genes by simulating gene deletions and comparing the results with experimental gene essentiality datasets (Broddrick et al., 2016). For each gene, FBA was performed after its removal from the model, and the biomass flux was estimated.
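As a minimal sketch of equation 1 and of the deletion loop just described, the FBA problem can be posed as a linear program. The example below is not the authors' MATLAB/COBRA workflow: it uses SciPy's `linprog` on a hypothetical five-reaction toy network, and it deletes reactions directly as a stand-in for gene deletions, which in the real models act through the GPR associations.

```python
import numpy as np
from scipy.optimize import linprog

# Toy network (hypothetical, not taken from iLV1052 or iLV1087):
#   EX_A: -> A (uptake, capped at 10)
#   R1  : A -> B
#   R2  : A -> B  (alternative route to B)
#   R3  : B -> C
#   BIO : C ->    (biomass drain, the objective)
reactions = ["EX_A", "R1", "R2", "R3", "BIO"]
S = np.array([
    [1, -1, -1,  0,  0],   # metabolite A
    [0,  1,  1, -1,  0],   # metabolite B
    [0,  0,  0,  1, -1],   # metabolite C
], dtype=float)
lower = np.zeros(len(reactions))                       # alpha_i
upper = np.array([10.0, 1000.0, 1000.0, 1000.0, 1000.0])  # beta_i


def max_biomass(lb, ub):
    """Solve max v_biomass subject to S v = 0 and lb <= v <= ub."""
    c = np.zeros(len(reactions))
    c[reactions.index("BIO")] = -1.0   # linprog minimises, so negate
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=list(zip(lb, ub)), method="highs")
    return -res.fun if res.success else 0.0


wild_type = max_biomass(lower, upper)

# Single-reaction deletions: force the flux to zero and re-optimise.
growth_ratio = {}
for j, rxn in enumerate(reactions[:-1]):
    lb, ub = lower.copy(), upper.copy()
    lb[j] = ub[j] = 0.0
    growth_ratio[rxn] = max_biomass(lb, ub) / wild_type

for rxn, ratio in growth_ratio.items():
    print(f"{rxn}: {ratio:.2f} of wild-type growth")
```

In this toy case R1 and R2 back each other up, so deleting either leaves growth unchanged, whereas deleting EX_A or R3 abolishes it.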
A gene was classified as essential if the biomass flux fell below 10% of that of the wild type. The in silico classification of essential and non-essential genes was compared with the available in vivo gene essentiality data reported for the closely related model strain S. elongatus PCC 7942 (Broddrick et al., 2016; Rubin et al., 2015). The predictive performance of the GEM was evaluated by determining the specificity and sensitivity as follows (Saha et al., 2012):

$$\text{Specificity} = \frac{GG}{GG + NGG} \tag{2}$$

$$\text{Sensitivity} = \frac{NGNG}{NGNG + GNG} \tag{3}$$

where the in silico and in vivo results for growth (G) and no growth (NG) either agree (GG or NGNG) or conflict (GNG or NGG).

2.4. Reciprocal BLAST

Reciprocal BLAST analysis was conducted to identify orthologous genes between S. elongatus PCC 11801 and related cyanobacterial strains. Protein sequences from PCC 11801 were used as queries in blastp searches against the proteome of the reference strains (e.g., S. elongatus PCC 7942), and vice versa, using the NCBI BLAST+ suite. Gene pairs were considered orthologous if they were reciprocal best BLAST hits (RBHs), i.e., each protein was the top hit of the other in both search directions. The BLAST algorithm is based on a heuristic method that approximates Smith-Waterman local alignments (Smith & Waterman, 1981) by identifying high-scoring segment pairs (HSPs) through a seed-and-extend strategy. The core scoring system uses substitution matrices such as BLOSUM62 (Henikoff & Henikoff, 1992), with each alignment assigned a bit score (S') and an expectation value (E). The E-value reflects the number of alignments with a score ≥ S that are expected to occur by chance in a database search, and is defined as

$$E = K\,m\,n\,e^{-\lambda S} \tag{4}$$

where $m$ and $n$ are the lengths of the query and database sequences, respectively, $K$ and $\lambda$ are statistical parameters specific to the scoring system used, and $S$ is the raw alignment score (Karlin & Altschul, 1990). Lower E-values indicate more significant alignments.
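A short worked example may help connect the raw score S, the bit score S' and the E-value in equation 4. The numbers below are illustrative assumptions, not values from this study: K = 0.041 and λ = 0.267 are the parameters commonly reported for gapped BLOSUM62 searches, and the query length, database length and raw score are made up. The bit-score conversion S' = (λS − ln K)/ln 2 is the standard one, under which E = m·n·2^(−S').

```python
import math


def evalue_from_raw(raw_score, m, n, K, lam):
    """Equation 4: E = K * m * n * exp(-lambda * S)."""
    return K * m * n * math.exp(-lam * raw_score)


def bit_score(raw_score, K, lam):
    """Normalised score S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(K)) / math.log(2)


def evalue_from_bits(bits, m, n):
    """Equivalent form of equation 4 in terms of the bit score."""
    return m * n * 2.0 ** (-bits)


# Assumed, illustrative parameters (not taken from the study).
K, lam = 0.041, 0.267
m, n = 350, 1_000_000   # query length, total database length
S = 120                 # raw alignment score

e_raw = evalue_from_raw(S, m, n, K, lam)
bits = bit_score(S, K, lam)
e_bits = evalue_from_bits(bits, m, n)

print(f"bit score = {bits:.1f}, E = {e_raw:.2e}")
print("passes the 1e-5 cut-off:", e_raw <= 1e-5)
```

Both routes give the same E-value, which is why BLAST reports can quote either score without ambiguity.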
Only gene pairs satisfying all thresholds and statistical significance were retained for mapping gene-protein-reaction (GPR) relationships during genome-scale metabolic model reconstruction.

2.5. Strain cultivation and formulation of biomass equation

Synechococcus elongatus PCC 11801 and PCC 11802 were cultivated under optimized growth conditions as reported earlier (Jaiswal et al., 2018, 2020). Briefly, 5 mL BG-11 medium was inoculated using a single colony from an agar plate and cultivated for 24 h at 38 ºC with 120 rpm shaking (Innova 44 shaker, New Brunswick) under 50 µmol m⁻² s⁻¹ light and ambient (0.04%) CO2. This culture was adapted to 300 µmol m⁻² s⁻¹ light and 1% CO2, and subsequently inoculated (OD730 = 0.05) in 20 mL BG-11 in a 100 mL shake flask for cultivation under the same conditions. Cultivation was performed until mid-exponential growth (OD730 = 0.6-0.7).

Experimental measurements of biomass components were used to derive the biomass equation. Cells were harvested from exponentially growing cultures, dried at 65 ºC overnight to constant weight, and analyzed for protein and lipid content according to standard spectrophotometric assays. Proteins were extracted using a lysis buffer containing 1.1 mM Na2EDTA, 0.2 mM PMSF, and 0.5% Triton X-100, and quantified using the Bradford assay (Bradford, 1976; Figure S1, Table S1). Lipids were extracted using 1:2 methanol:chloroform, reacted with sulphuric acid and vanillin-phosphoric acid reagent, and quantified from absorbance at 530 nm (Byreddy et al., 2016; Park et al., 2016). Concentrations of all other biomass components were adapted from PCC 7942 (Broddrick et al., 2016). The resulting biomass equation represents the stoichiometric coefficients (mmol of precursors) needed to form 1 g biomass.

2.6. Implementation of CBMs for strain design

The OptKnock algorithm (Burgard et al., 2003) was implemented with the curated CBMs of PCC 11801 and PCC 11802 to identify potential reaction knockouts that could enhance the production of heterologous, industrially relevant compounds such as ethanol, butanol, succinic acid and butanediol. OptKnock formulates a bilevel optimization problem, where the outer problem maximizes the production flux of the target compound, and the inner problem ensures optimal cellular growth.

$$\max\; v_{\mathrm{ex\_product}} \quad \text{subject to} \quad \left[\; \max_{v}\; v_{\mathrm{biomass}} \quad \text{subject to} \quad S\,v = 0, \quad \alpha_i \le v_i \le \beta_i \;\right] \tag{5}$$

where $v_{\mathrm{ex\_product}}$ is the flux through the exchange reaction of the product of interest, $v_{\mathrm{biomass}}$ is the predicted growth rate of the cell, $S$ is the stoichiometric matrix in which $S_{ij}$ represents the stoichiometric coefficient of the $i$th metabolite in the $j$th reaction of the network, $v_i$ is the $i$th component of the flux vector $v$, and $\alpha_i$ and $\beta_i$ denote the lower and upper bounds of the fluxes, respectively. Heterologous biosynthetic pathways and the corresponding product exchange reactions were incorporated into the model for simulating the production of ethanol (Deng & Coleman, 1999), butanol (Lan & Liao, 2011), and succinate (Sengupta et al., 2020).

3. Results and Discussion

We reconstructed CBMs iLV1052 and iLV1087 for accelerating engineering efforts of S. elongatus PCC 11801 and PCC 11802, respectively, which are fast-growing, robust cyanobacterial strains that have significant potential to be developed for photosynthetic biomanufacturing (Jaiswal, Sahasrabuddhe, et al., 2022).

3.1. CBM reconstruction and attributes

PCC 11801 CBM: The PCC 11801 CBM was generated using its annotated genome (Jaiswal et al., 2018) and the available GEM for S. elongatus PCC 7942 as the reference model. Reciprocal BLAST was employed to remove 29 non-homologous genes of PCC 7942, leading to an initial draft containing 849 reactions, 756 genes, and 768 metabolites.
The COBRA toolbox was used to refine the model through systematic removal of the identified non-homologous genes. A Memote score of 74% was obtained for the draft model (Figure S1). Reactions were curated for elemental and charge balance, with 12 mass-imbalanced reactions corrected manually. The reversibility of all reactions was checked against the metabolic databases and updated by assigning the appropriate directionality. In the model, 91 reactions were identified as blocked (i.e., incapable of carrying flux under any condition), and 124 as orphan reactions that lacked gene annotations. These gaps were classified, using a systematic gap analysis pipeline, into thermodynamic inconsistencies, dead-end metabolites, and missing transport reactions. Importantly, 344 low-confidence reactions were identified, particularly among transport and photosynthetic reactions and in murein synthesis pathways. Manual curation and gap filling led to the inclusion of 281 reactions and 267 genes based on genome annotation data, biochemical pathways (from KEGG, BiGG, and PubChem), and literature (Supporting Information S2). This knowledge-based refinement is critical for ensuring predictive fidelity of the GEM, especially when exploring model-guided strain design strategies.

The final curated CBM of S. elongatus PCC 11801 (iLV1052) contains 1130 reactions, 1052 genes, and 930 metabolites (Table 1). The total number of genes included in the model corresponded to 37% of the characterized open reading frames (ORFs). The model included 78 transport and 47 exchange reactions. The model covers metabolic pathways associated with functions including photosynthesis, respiration, carbohydrate metabolism, amino acid metabolism, and fatty acid biosynthesis. Three compartments (cytosol, periplasm, and thylakoid) were incorporated in the model.
There were 928 reactions localized to the cytosol, 56 reactions in the periplasm, and 35 reactions in the thylakoid, with 42 reactions involved in metabolite transport from the thylakoid to the cell membrane. To formulate the biomass equation, we experimentally determined the major components of PCC 11801 biomass (Figure S2). The remaining constituents and their coefficients were directly adopted from the previously characterized close phylogenetic neighbor S. elongatus PCC 7942. The final biomass equation for PCC 11801 added to the CBM incorporated 13 metabolic components.

PCC 11802 CBM: The draft CBM for PCC 11802 was generated using its annotated genome (Jaiswal et al., 2020) and the refined and curated model of PCC 11801 as the template model. Homology mapping through reciprocal BLAST analysis enabled removal of 46 non-homologous genes that were not present in PCC 11802 (Supporting Information S2). The draft model was further refined by manually adding and curating 69 novel reactions that were unique to PCC 11802, as identified from genome annotation and biological databases. The final curated CBM of PCC 11802 comprises 1087 genes, 1199 reactions, and 952 metabolites (Table 1). The reconstructed model contains 147 reactions that have no gene associations and are not present in the PCC 7942 model, indicating putative pathways that warrant further experimental validation. Five compartments (cytosol, periplasm, thylakoid, carboxysome, and extracellular) were incorporated in the model.

3.2. Model validation

The predictive capability of the iLV1052 (PCC 11801) and iLV1087 (PCC 11802) CBMs was assessed through in silico gene essentiality analysis, a standard benchmark for evaluating model fidelity. The analysis identifies genes whose deletion would render the cell non-viable, offering a functional test of the accuracy of the model (Hirose et al., 2024).
Here, we compared the gene essentiality predicted by the PCC 11801 and PCC 11802 CBMs against experimental datasets available for the closely related model strain S. elongatus PCC 7942 (Supporting Information S3). From a gene essentiality dataset of 371 genes from PCC 7942 (Broddrick et al., 2016), 253 genes were predicted correctly by the PCC 11801 model, including 141 essential and 112 non-essential genes (Figure 1a). The model incorrectly predicted 106 genes as non-essential (false positives for growth), while only 12 genes were incorrectly predicted as essential (false negatives for growth). From these counts and equations 2 and 3, the specificity (i.e., the fraction of in vivo non-essential genes that were predicted as non-essential, 112/124) and the sensitivity (i.e., the fraction of in vivo essential genes that were predicted as essential, 141/247) of the model were calculated to be 0.903 and 0.571, respectively, indicating that the model recovers non-essential genes more reliably than essential genes. In a second round of analysis with a larger experimental dataset from PCC 7942 (Rubin et al., 2015), the model correctly predicted 328 of 475 genes with a similar number of mispredictions. Overall, the gene essentiality predictions by iLV1052 showed an accuracy of 69%, supporting the utility of the model in simulating genotype-phenotype relationships in cyanobacteria.

The PCC 11802 model iLV1087 was similarly validated by comparison with the PCC 7942 gene essentiality datasets (Figure 1b). Against a dataset of 229 genes (Broddrick et al., 2016), the model accurately predicted the essentiality of 185 genes, including 134 essential and 51 non-essential genes, resulting in a specificity of 0.792 and sensitivity of 0.886. With the second dataset (Rubin et al., 2015), the model correctly predicted 132 out of 143 essential genes and 42 out of 59 non-essential genes, corresponding to a specificity of 0.823 and sensitivity of 0.802. The overall accuracy of the model was estimated to be 83%.
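The metrics in equations 2 and 3 can be recomputed from the counts reported for the first PCC 11801 comparison (371 genes). The sketch below assumes the first letter of each label refers to the in silico result and the second to the in vivo result, so that GG = 112, NGNG = 141, GNG = 106 and NGG = 12. This assignment is an inference from the text, chosen because it reproduces the reported values.

```python
def essentiality_metrics(gg, ngng, gng, ngg):
    """Return (specificity, sensitivity, accuracy) from agreement counts.

    gg   : growth predicted and observed (non-essential, correct)
    ngng : no growth predicted and observed (essential, correct)
    gng  : growth predicted, no growth observed
    ngg  : no growth predicted, growth observed
    """
    specificity = gg / (gg + ngg)        # equation 2
    sensitivity = ngng / (ngng + gng)    # equation 3
    accuracy = (gg + ngng) / (gg + ngng + gng + ngg)
    return specificity, sensitivity, accuracy


# Counts inferred from the first PCC 11801 comparison (assumed mapping).
spec, sens, acc = essentiality_metrics(gg=112, ngng=141, gng=106, ngg=12)
print(f"specificity = {spec:.3f}, sensitivity = {sens:.3f}, accuracy = {acc:.3f}")
```

With these counts the function returns a specificity of 0.903 and a sensitivity of 0.571, matching the values quoted above, and an accuracy of 0.682 for this dataset alone.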
In addition, the metabolic robustness was estimated to be 38% for the PCC 11801 and 34% for the PCC 11802 CBMs, which was comparable to that of Synechocystis sp. PCC 6803 (Nogales et al., 2012). This indicates a balanced distribution of essential and non-essential genes and affirms the potential of these strains for metabolic engineering.

A subsystem-level analysis revealed that the maximum number of incorrect predictions by both GEMs was associated with amino acid metabolism, amounting to 55% of false positives and 60% of false negatives (Figure 2). The models may have failed to capture the energetic costs or kinetic constraints that render certain reactions conditionally essential. Furthermore, the assumption of steady state mass balance and the absence of regulatory constraints in the simulation framework limit the ability of the model to reflect genes that are indispensable only under a specific environmental or cellular context. False negatives were most frequently present in nucleotide metabolism and cofactor/vitamin biosynthesis. These pathways often rely on complex, multi-step reactions which may not be fully captured in stoichiometric reconstructions. The inability of the model to accurately represent salvage pathways or regulatory bottlenecks in nucleotide synthesis likely contributed to the underestimation of gene essentiality in this subsystem. Moreover, incomplete representation of essential vitamins such as biotin or thiamine may have allowed the model to erroneously bypass certain biosynthetic genes under simulated conditions. Transport reactions were also a major source of false predictions. Many transporters are poorly annotated in cyanobacteria, and their absence from the model may create a false impression of metabolite availability, masking autotrophic dependencies. This was particularly evident in subsystems involving cofactor uptake and exchange reactions, where missing transporters led to underprediction of essentiality.
Growth of PCC 11801 and PCC 11802 under autotrophic conditions on BG-11 medium was simulated using FBA. The iLV1052 model showed a biomass formation rate of 0.053 h⁻¹, utilizing 0.04% CO2 along with a nitrogen source and essential metal ions in the presence of light. These predictions align with experimental growth rates of 0.05 h⁻¹ reported for PCC 11801 and PCC 11802 under ambient conditions (Jaiswal et al., 2018, 2020). Both models were able to synthesize all the necessary biomass precursors under autotrophic conditions. The photosynthetic quotient (PQ) was estimated to be 1.5, which lies within the PQ range of 1.1 to 2.1 reported for cyanobacterial species (Allen, 2003; Zavrel et al., 2019). Additionally, a Photosystem I (PSI)/Photosystem II (PSII) ratio of 4 was predicted for PCC 11801, which is within the range (>1.2) expected for cyanobacteria.

3.3. Model comparison and unique reactions

Manual curation of the PCC 11801 and PCC 11802 CBMs involved rigorous assessment of GPR associations, subsystem annotation, and reaction stoichiometry. Post curation, the models demonstrated a more accurate distribution of reactions across subsystems and key metabolic pathways (Table 2). There were differences in the number of reactions at the subsystem level between the models, reflecting strain-specific variations in metabolism. The curated models also contained a larger or similar coverage of metabolites, genes, and reactions than CBMs available for other known cyanobacterial strains (Table S1). We identified 281 unique reactions in PCC 11801 compared to the PCC 7942 reference model and 69 unique reactions in PCC 11802 compared to PCC 11801. These reactions were predominantly distributed among porphyrin metabolism, purine metabolism, and cysteine and methionine metabolism.
Porphyrin metabolism (i) contributes to the biosynthesis of vitamin B12 intermediates, which are rare in cyanobacteria, and (ii) enhances the production of chlorophyll and phycobilins, improving the photosynthetic efficiency. Purine metabolism is central to energy regulation, governing the biosynthesis of nucleotides such as ATP and GTP, and is essential for growth under fluctuating environmental conditions. Cysteine and methionine metabolism generates products such as glutathione and S-adenosylmethionine that are involved in protecting against oxidative damage and heat stress.

3.4. Flux distribution of S. elongatus strains under photoautotrophic conditions

We simulated the intracellular flux distribution of PCC 11801 and PCC 11802 for growth under photoautotrophic conditions to gain insights into the central metabolic pathways (Figure 3). As is common in cyanobacteria (Hendry et al., 2016), CO2 is fixed through RuBisCO during photosynthesis, and the carbon enters the central metabolism through 3-phosphoglycerate. A high flux was observed through the phosphoglycerate kinase reaction, reflecting active carbon flow through the Calvin-Benson-Bassham (CBB) cycle. The TCA cycle was predicted to be incomplete, consistent with cyanobacterial physiology, where partial TCA activity supports biosynthesis rather than full respiration (Steinhauser et al., 2012). The oxidative pentose phosphate pathway (oxPPP) showed elevated flux, which is required to replenish the ribulose-1,5-bisphosphate and NADPH for biosynthetic reactions. In contrast, the non-oxidative branch of the PPP exhibited moderate flux, while glycolysis displayed moderate to high flux, indicating flexibility in routing carbon in response to cellular demand. There was minimal flux through the pyruvate flavodoxin oxidoreductase reaction, which produces acetyl-CoA from pyruvate.
Instead, L-alanine biosynthesis emerged as a major carbon sink, suggesting an adaptive redirection of carbon flux towards amino acid biosynthesis (Figure 3). Similar patterns of flux distribution have been reported in other fast-growing cyanobacterial strains such as Synechococcus sp. PCC 11901 (Ravindran et al., 2024) and UTEX 2973 (Mueller et al., 2017). Flux through the photorespiration pathway accounted for 30-40% of the oxygenation activity of RuBisCO, indicating significant diversion of fixed carbon. This suggests that photorespiration may play a significant role in shaping metabolic efficiency, particularly under ambient CO2 when oxygenation activity increases. In nitrogen metabolism, maximum flux was directed through glutamate synthesis, due to its central role as a nitrogen donor across multiple biosynthetic processes. The models predicted high fluxes through glutamine and glutamate synthesis, aligning with experimental observations that these amino acids are primary nitrogen assimilation products in PCC 11801 (Jaiswal, Nenwani, et al., 2022).

3.5. Optimization of pathways for maximization of target product yield

CBMs are mainly applied as a computational framework to provide predictions of metabolic states and explore the capabilities of the organism as a cell factory. Importantly, simulations using CBMs can help identify potential metabolic hotspots and guide the rational design of metabolic engineering strategies aimed at overproducing industrially relevant metabolites, thereby reducing reliance on extensive experimental screening. Following curation and validation, we applied the PCC 11801 and PCC 11802 CBMs to investigate potential metabolic hotspots for bioengineering. Reaction deletions that lead to growth-coupled product formation are desirable in metabolic engineering, as this leads to a stable phenotype.
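Growth coupling can be illustrated on a toy network with a brute-force knockout scan. The sketch below is not the OptKnock mixed-integer formulation and not the authors' workflow: it enumerates single knockouts with SciPy's `linprog`, maximizes growth for each, and then reports the minimum product flux compatible with that optimal growth (a conservative "guaranteed" production, whereas OptKnock's outer problem is optimistic). The network, reaction names and bounds are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

# Toy network (hypothetical):
#   EX_A: -> A (uptake, capped at 10)
#   R1  : A -> B          (product-free route)
#   R2  : A -> B + P      (route that co-produces P)
#   BIO : B ->            (growth)
#   EX_P: P ->            (product secretion)
reactions = ["EX_A", "R1", "R2", "BIO", "EX_P"]
S = np.array([
    [1, -1, -1,  0,  0],   # metabolite A
    [0,  1,  1, -1,  0],   # metabolite B
    [0,  0,  1,  0, -1],   # metabolite P
], dtype=float)
LOWER = np.zeros(len(reactions))
UPPER = np.array([10.0, 1000.0, 1000.0, 1000.0, 1000.0])
BIO, PROD = reactions.index("BIO"), reactions.index("EX_P")


def solve(c, lb, ub):
    """Minimise c.v subject to S v = 0 and lb <= v <= ub."""
    return linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                   bounds=list(zip(lb, ub)), method="highs")


def growth_and_guaranteed_product(knockout=None, tol=1e-6):
    lb, ub = LOWER.copy(), UPPER.copy()
    if knockout is not None:
        j = reactions.index(knockout)
        lb[j] = ub[j] = 0.0            # delete the reaction
    # Inner problem: maximise growth.
    c = np.zeros(len(reactions))
    c[BIO] = -1.0
    growth = -solve(c, lb, ub).fun
    # Hold growth at its optimum and find the lowest possible product flux.
    lb[BIO] = max(growth - tol, 0.0)
    c = np.zeros(len(reactions))
    c[PROD] = 1.0
    product = solve(c, lb, ub).fun
    return growth, product


results = {ko: growth_and_guaranteed_product(ko) for ko in (None, "R1", "R2")}
for ko, (growth, product) in results.items():
    print(f"knockout={ko}: growth={growth:.2f}, guaranteed product={product:.2f}")
```

In the wild-type toy network the cell can grow optimally without making any product, but once R1 is deleted every unit of growth must pass through R2, so product secretion becomes obligatory. This is the growth-coupled behaviour that knockout-design algorithms search for at genome scale.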
Here, we applied the OptKnock algorithm (Burgard et al., 2003) to the iLV1052 and iLV1087 models to identify reaction deletion strategies that couple product formation with optimal growth. Heterologous pathways for ethanol, butanol, succinic acid, and butanediol production were integrated into the models, and FBA simulations were performed to determine the maximum production rate. Among these products, both models predicted the highest production flux for succinic acid. Both CBMs predicted phosphoenolpyruvate carboxylase (PEPC) to be a promising hotspot for bioengineering of the different products. The CBM simulations also indicated that PCC 11801 and PCC 11802 possess a greater potential for succinic acid production than S. elongatus PCC 7942, underscoring their potential as robust chassis.

4. Conclusion

We reconstructed CBMs iLV1052 and iLV1087 for the fast-growing cyanobacterial strains S. elongatus PCC 11801 and PCC 11802, respectively, to support their development as potential cell factories. The models were manually curated and validated using gene essentiality analysis, achieving an average prediction accuracy of 76%. The reproducibility of the models can be checked via the FROG test (Flux variability analysis, Reaction deletion analysis, Objective function evaluation, Gene deletion analysis) (Supporting Information S4), and the draft model is supported by a 74% Memote card. The unique reactions added to the models are associated with protection against oxidative stress, growth stability, energy regulation, and environmental stress tolerance. The flux maps indicated an incomplete TCA cycle, which is a typical characteristic of cyanobacterial metabolism. Amino acid biosynthesis-specific curation, incorporation of multi-omics data, accurate estimation of biomass composition, machine learning-based gap filling, and incorporation of condition-specific constraints can help further enhance the predictive power of the models.
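As a worked illustration of how an accuracy figure of this kind is typically computed, the sketch below assumes that accuracy is defined as (TP + TN) / total, with "positive" meaning a gene predicted to be essential. The confusion-matrix counts are hypothetical and are not taken from the paper; only the 69% and 83% per-strain values and their 76% average come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Gene essentiality confusion matrix (assumed convention)."""
    tp: int  # predicted essential, essential in vivo
    tn: int  # predicted non-essential, non-essential in vivo
    fp: int  # predicted essential, non-essential in vivo
    fn: int  # predicted non-essential, essential in vivo

    def accuracy(self) -> float:
        total = self.tp + self.tn + self.fp + self.fn
        return (self.tp + self.tn) / total


# Hypothetical counts for illustration only; NOT the paper's data.
example = Confusion(tp=30, tn=45, fp=10, fn=15)
print(f"example accuracy: {example.accuracy():.0%}")

# Reported per-strain accuracies (percent) and their arithmetic mean.
reported = {"PCC 11801": 69, "PCC 11802": 83}
mean_reported = sum(reported.values()) / len(reported)
print(f"mean reported accuracy: {mean_reported:.0f}%")
```

The arithmetic mean of the two reported values reproduces the 76% average quoted in the conclusion.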
References:

Agarwal, P., Soni, R., Kaur, P., Madan, A., Mishra, R., Pandey, J., Singh, S., & Singh, G. (2022). Cyanobacteria as a Promising Alternative for Sustainable Environment: Synthesis of Biofuel and Biodegradable Plastics. In Frontiers in Microbiology (Vol. 13). Frontiers Media S.A. https://doi.org/10.3389/fmicb.2022.939347
Allen, J. F. (2003). Cyclic, pseudocyclic and noncyclic photophosphorylation: new links in the chain. http://plants.trends.com
Becker, S. A., Feist, A. M., Mo, M. L., Hannum, G., Palsson, B., & Herrgard, M. J. (2007). Quantitative prediction of cellular metabolism with constraint-based models: The COBRA Toolbox. Nature Protocols, 2(3), 727–738. https://doi.org/10.1038/nprot.2007.99
Broddrick, J. T., Rubin, B. E., Welkie, D. G., Du, N., Mih, N., Diamond, S., Lee, J. J., Golden, S. S., & Palsson, B. O. (2016). Unique attributes of cyanobacterial metabolism revealed by improved genome-scale metabolic modeling and essential gene analysis. Proceedings of the National Academy of Sciences of the United States of America, 113(51), E8344–E8353. https://doi.org/10.1073/pnas.1613446113
Burgard, A. P., Pharkya, P., & Maranas, C. D. (2003). OptKnock: A Bilevel Programming Framework for Identifying Gene Knockout Strategies for Microbial Strain Optimization. Biotechnology and Bioengineering, 84(6), 647–657. https://doi.org/10.1002/bit.10803
Byreddy, A. R., Gupta, A., Barrow, C. J., & Puri, M. (2016). A quick colorimetric method for total lipid quantification in microalgae. Journal of Microbiological Methods, 125, 28–32. https://doi.org/10.1016/j.mimet.2016.04.002
Deng, M.-D., & Coleman, J. R. (1999). Ethanol Synthesis by Genetic Engineering in Cyanobacteria. In Applied and Environmental Microbiology (Vol. 65, Issue 2). https://journals.asm.org/journal/aem
Gong, Z., Chen, J., Jiao, X., Gong, H., Pan, D., Liu, L., Zhang, Y., & Tan, T. (2024). Genome-scale metabolic network models for industrial microorganisms metabolic engineering: Current advances and future prospects. In Biotechnology Advances (Vol. 72). Elsevier Inc. https://doi.org/10.1016/j.biotechadv.2024.108319
Hendry, J. I., Prasannan, C. B., Joshi, A., Dasgupta, S., & Wangikar, P. P. (2016). Metabolic model of Synechococcus sp. PCC 7002: Prediction of flux distribution and network modification for enhanced biofuel production. Bioresource Technology, 213, 190–197. https://doi.org/10.1016/j.biortech.2016.02.128
Henikoff, S., & Henikoff, J. G. (1992). Amino acid substitution matrices from protein blocks (amino acid sequence/alignment algorithms/data base searching) (Vol. 89). https://www.pnas.org
Hirose, Y., Zielinski, D. C., Poudel, S., Rychel, K., Baker, J. L., Toya, Y., Yamaguchi, M., Heinken, A., Thiele, I., Kawabata, S., Palsson, B. O., & Nizet, V. (2024). A genome-scale metabolic model of a globally disseminated hyperinvasive M1 strain of Streptococcus pyogenes. mSystems. https://doi.org/10.1128/msystems.00736-24
Jain, V. S., Schubert, M. G., Sarnaik, A. P., Pritam, P., Jaiswal, D., Church, G. M., & Wangikar, P. P. (2024). De novo genome assembly and pan-genome analysis of the fast-growing Indian isolates of Synechococcus elongatus: Potential chassis for bioproduction. The Microbe, 2, 100048. https://doi.org/10.1016/j.microb.2024.100048
Jaiswal, D., Nenwani, M., Mishra, V., & Wangikar, P. P. (2022). Probing the metabolism of γ-glutamyl peptides in cyanobacteria via metabolite profiling and 13C labeling. Plant Journal, 109(3), 708–726. https://doi.org/10.1111/tpj.15564
Jaiswal, D., Nenwani, M., & Wangikar, P. P. (2023). Isotopically non-stationary 13C metabolic flux analysis of two closely related fast-growing cyanobacteria, Synechococcus elongatus PCC 11801 and 11802. Plant Journal, 116(2), 558–573. https://doi.org/10.1111/tpj.16316
Jaiswal, D., Sahasrabuddhe, D., & Wangikar, P. P. (2022). Cyanobacteria as cell factories: the roles of host and pathway engineering and translational research. In Current Opinion in Biotechnology (Vol. 73, pp. 314–322). Elsevier Ltd. https://doi.org/10.1016/j.copbio.2021.09.010
Jaiswal, D., Sengupta, A., Sengupta, S., Madhu, S., Pakrasi, H. B., & Wangikar, P. P. (2020). A Novel Cyanobacterium Synechococcus elongatus PCC 11802 has Distinct Genomic and Metabolomic Characteristics Compared to its Neighbor PCC 11801. Scientific Reports, 10(1). https://doi.org/10.1038/s41598-019-57051-0
Jaiswal, D., Sengupta, A., Sohoni, S., Sengupta, S., Phadnavis, A. G., Pakrasi, H. B., & Wangikar, P. P. (2018). Genome Features and Biochemical Characteristics of a Robust, Fast Growing and Naturally Transformable Cyanobacterium Synechococcus elongatus PCC 11801 Isolated from India. Scientific Reports, 8(1). https://doi.org/10.1038/s41598-018-34872-z
Jaiswal, D., & Wangikar, P. P. (2020). Dynamic Inventory of Intermediate Metabolites of Cyanobacteria in a Diurnal Cycle. iScience, 23(11). https://doi.org/10.1016/j.isci.2020.101704
Kanehisa, M., & Goto, S. (2000). KEGG: Kyoto Encyclopedia of Genes and Genomes. In Nucleic Acids Research (Vol. 28, Issue 1). http://www.genome.ad.jp/kegg/
Karlin, S., & Altschul, S. F. (1990). Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes (sequence alignment/protein sequence features). In Proc. Natl. Acad. Sci. USA (Vol. 87). https://www.pnas.org
Kim, D. S., Moreno-Cabezuelo, J. Á., Schulz, E. N., Lea-Smith, D. J., & Sagaram, U. S. (2024). Recent advances in engineering fast-growing cyanobacterial species for enhanced CO2 fixation. In Frontiers in Climate (Vol. 6). Frontiers Media SA. https://doi.org/10.3389/fclim.2024.1412232
King, Z. A., Lu, J., Dräger, A., Miller, P., Federowicz, S., Lerman, J. A., Ebrahim, A., Palsson, B. O., & Lewis, N. E. (2016). BiGG Models: A platform for integrating, standardizing and sharing genome-scale models. Nucleic Acids Research, 44(D1), D515–D522. https://doi.org/10.1093/nar/gkv1049
Lan, E. I., & Liao, J. C. (2011). Metabolic engineering of cyanobacteria for 1-butanol production from carbon dioxide. Metabolic Engineering, 13(4), 353–363. https://doi.org/10.1016/j.ymben.2011.04.004
Madhu, S., Sengupta, A., Sarnaik, A. P., & Wangikar, P. P. (2024). Expanding the synthetic biology repertoire of a fast-growing cyanobacterium Synechococcus elongatus PCC 11801. Biotechnology and Bioengineering, 121(9), 2974–2980. https://doi.org/10.1002/bit.28740
Madhu, S., Sengupta, A., Sarnaik, A., Sahasrabuddhe, D., & Wangikar, P. (2023). Global Transcriptome-Guided Identification of Neutral Sites for Engineering Synechococcus elongatus PCC 11801. ACS Synthetic Biology, 12, 1677–1685.
Mardinoglu, A., & Palsson, B. (2024). Genome-scale models in human metabologenomics. In Nature Reviews Genetics. Nature Research. https://doi.org/10.1038/s41576-024-00768-0
Mehta, K., Jaiswal, D., Nayak, M., Prasannan, C. B., Wangikar, P. P., & Srivastava, S. (2019). Elevated carbon dioxide levels lead to proteome-wide alterations for optimal growth of a fast-growing cyanobacterium, Synechococcus elongatus PCC 11801. Scientific Reports, 9(1). https://doi.org/10.1038/s41598-019-42576-1
Mescouto, V. A. de, Ferreira, L. da C., Paiva, R. de J., Oliveira, D. T. de, Santana, M. O., Rocha Filho, G. N. da, Luque, R., Noronha, R. C. R., & Nascimento, L. A. S. do. (2024). Review of recent advances in improvement strategies for biofuels production from cyanobacteria. Heliyon, 10(22). https://doi.org/10.1016/j.heliyon.2024.e40293
Mueller, T. J., Berla, B. M., Pakrasi, H. B., & Maranas, C. D. (2013). Rapid construction of metabolic models for a family of Cyanobacteria using a multiple source annotation workflow. http://www.biomedcentral.com/1752-0509/7/142
Mueller, T. J., Ungerer, J. L., Pakrasi, H. B., & Maranas, C. D. (2017). Identifying the Metabolic Differences of a Fast-Growth Phenotype in Synechococcus UTEX 2973. Scientific Reports, 7. https://doi.org/10.1038/srep41569
Nogales, J., Gudmundsson, S., Knight, E. M., Palsson, B. O., & Thiele, I. (2012). Detailing the optimality of photosynthesis in cyanobacteria through systems biology analysis. Proceedings of the National Academy of Sciences of the United States of America, 109(7), 2678–2683. https://doi.org/10.1073/pnas.1117907109
Orth, J. D., Thiele, I., & Palsson, B. O. (2010). What is flux balance analysis? In Nature Biotechnology (Vol. 28, Issue 3, pp. 245–248). https://doi.org/10.1038/nbt.1614
Park, J., Jeong, H. J., Yoon, E. Y., & Moon, S. J. (2016). Easy and rapid quantification of lipid contents of marine dinoflagellates using the sulpho-phospho-vanillin method. Algae, 31(4), 391–401. https://doi.org/10.4490/algae.2016.31.12.7
Pritam, P., Sarnaik, A. P., & Wangikar, P. P. (2023). Metabolic engineering of Synechococcus elongatus for photoautotrophic production of mannitol. Biotechnology and Bioengineering, 120(8), 2363–2370. https://doi.org/10.1002/bit.28479
Ravindran, S., Hajinajaf, N., Kundu, P., Comes, J., Nielsen, D. R., Varman, A. M., & Ghosh, A. (2024). Genome-Scale Metabolic Model Reconstruction and Investigation into the Fluxome of the Fast-Growing Cyanobacterium Synechococcus sp. PCC 11901. ACS Synthetic Biology. https://doi.org/10.1021/acssynbio.4c00379
Rubin, B. E., Wetmore, K. M., Price, M. N., Diamond, S., Shultzaberger, R. K., Lowe, L. C., Curtin, G., Arkin, A. P., Deutschbauer, A., & Golden, S. S. (2015). The essential gene set of a photosynthetic organism. Proceedings of the National Academy of Sciences of the United States of America, 112(48), E6634–E6643. https://doi.org/10.1073/pnas.1519220112
Saha, R., Verseput, A. T., Berla, B. M., Mueller, T. J., Pakrasi, H. B., & Maranas, C. D. (2012). Reconstruction and Comparison of the Metabolic Potential of Cyanobacteria Cyanothece sp. ATCC 51142 and Synechocystis sp. PCC 6803. PLoS ONE, 7(10). https://doi.org/10.1371/journal.pone.0048285
Sahu, A., Blätke, M. A., Szymański, J. J., & Töpfer, N. (2021). Advances in flux balance analysis by integrating machine learning and mechanism-based models. In Computational and Structural Biotechnology Journal (Vol. 19, pp. 4626–4640). Elsevier B.V. https://doi.org/10.1016/j.csbj.2021.08.004
Santos-Merino, M., Gargantilla-Becerra, Á., de la Cruz, F., & Nogales, J. (2023). Highlighting the potential of Synechococcus elongatus PCC 7942 as platform to produce α-linolenic acid through an updated genome-scale metabolic modeling. Frontiers in Microbiology, 14. https://doi.org/10.3389/fmicb.2023.1126030
Schilling, C. H., & Palsson, B. (2000). Assessment of the metabolic capabilities of Haemophilus influenzae Rd through a genome-scale pathway analysis. Journal of Theoretical Biology, 203(3), 249–283. https://doi.org/10.1006/jtbi.2000.1088
Schomburg, I., Hofmann, O., Baensch, C., Chang, A., & Schomburg, D. (2000). Enzyme data and metabolic information: BRENDA, a resource for research in biology, biochemistry, and medicine. Gene Function & Disease, 1(3–4), 109–118.
Sengupta, S., Jaiswal, D., Sengupta, A., Shah, S., Gadagkar, S., & Wangikar, P. P. (2020). Metabolic engineering of a fast-growing cyanobacterium Synechococcus elongatus PCC 11801 for photoautotrophic production of succinic acid. Biotechnology for Biofuels, 13(1). https://doi.org/10.1186/s13068-020-01727-7
Srivastava, V., Sarnaik, A. P., & Wangikar, P. P. (2025). Metabolic engineering of rapidly growing Synechococcus elongatus strains for phototrophic production of alkanes. Biotechnology Progress, 41(1). https://doi.org/10.1002/btpr.3509
Srivastava, V., & Wangikar, P. P. (2024). Metabolic engineering of fast growing cyanobacteria for phototrophic production of 2,3-butanediol. Biochemical Engineering Journal, 210. https://doi.org/10.1016/j.bej.2024.109439
Steinhauser, D., Fernie, A. R., & Araújo, W. L. (2012). Unusual cyanobacterial TCA cycles: Not broken just different. In Trends in Plant Science (Vol. 17, Issue 9, pp. 503–509). https://doi.org/10.1016/j.tplants.2012.05.005
Thiele, I., & Palsson, B. (2010). A protocol for generating a high-quality genome-scale metabolic reconstruction. Nature Protocols, 5(1), 93–121. https://doi.org/10.1038/nprot.2009.203
Tiruveedula, G. S. S., & Wangikar, P. P. (2017). Gene essentiality, conservation index and coevolution of genes in cyanobacteria. PLoS ONE, 12(6). https://doi.org/10.1371/journal.pone.0178565
Triana, J., Montagud, A., Siurana, M., Fuente, D., Urchueguía, A., Gamermann, D., Torres, J., Tena, J., De Córdoba, P., & Urchueguía, J. (2014). Generation and Evaluation of a Genome-Scale Metabolic Network Model of Synechococcus elongatus PCC7942. Metabolites, 4(3), 680–698. https://doi.org/10.3390/metabo4030680
Zavrel, T., Faizi, M., Loureiro, C., Poschmann, G., Stuhler, K., Sinetova, M., Zorina, A., Steuer, R., & Cerveny, J. (2019). Quantitative insights into the cyanobacterial cell economy. eLife, 8. https://doi.org/10.7554/eLife.42508.001

Figure captions:

Figure 1. Gene essentiality analysis for CBMs of (A) S. elongatus PCC 11801 and (B) S. elongatus PCC 11802. In silico predictions of essential and non-essential genes by the models were compared with in vivo derived gene essentiality datasets from Broddrick et al. (2016) and Rubin et al. (2015). Parameters used for evaluating the gene essentiality predictions are shown.

Figure 2. Subsystem-wise analysis of false gene essentiality predictions by CBMs of S. elongatus PCC 11801 and PCC 11802. (A) False positive predictions. (B) False negative predictions.

Figure 3. Predicted flux distribution of S. elongatus PCC 11801 under autotrophic conditions. The thickness of the arrows is proportional to the flux through the corresponding reactions. Refer to Supporting Information S4 for absolute flux values.
Abbreviations: 3PGA, 3-Phosphoglycerate; ACA, Acetyl-CoA; ALA, Alanine; AKG, α-Ketoglutarate; E4P, Erythrose-4-phosphate; F6P, Fructose-6-phosphate; FUM, Fumarate; G1P, Glucose-1-phosphate; G6P, Glucose-6-phosphate; GAP, Glyceraldehyde-3-phosphate; GLY, Glycogen; ICI, Isocitrate; MAL, Malate; OAA, Oxaloacetate; PEP, Phosphoenolpyruvate; PYR, Pyruvate; R5P, Ribose-5-phosphate; RU5P, Ribulose-5-phosphate; RUBP, Ribulose-1,5-bisphosphate; S7P, Sedoheptulose-7-phosphate; SUCC, Succinate.

Supplementary Material: File (tables.docx).

This work is licensed under a Non Exclusive No Reuse License.


Citation: Lokesh V, Pramod Wangikar. Constraint-based metabolic reconstruction and analysis of Synechococcus elongatus PCC 11801 and PCC 11802 for bioengineering. Authorea. 27 May 2025. DOI: https://doi.org/10.22541/au.174831125.54586289/v1
