Applying prior knowledge of regulatory signaling to investigate macrophage cAMP dynamics during Mycobacterium tuberculosis infection

Chris Chen, Pranta Saha, Joyce Reimer, Shaun Wachter, Jeff Chen, and 2 more

Research Square. This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-7265502/v1 This work is licensed under a CC BY 4.0 License. Status: Under Review. Version 1 posted 18

Abstract

Mycobacterium tuberculosis (Mtb), the causative agent of Tuberculosis, resides in host lung macrophages and has evolved unique processes to hijack host signaling pathways to facilitate its survival and propagation within macrophages. Notably, Mtb exports cyclic AMP (cAMP), a key regulatory signaling molecule, during infection.
As is often the case, experimental data exploring immune modulation by cAMP during Mtb infection are sparse, largely cross-sectional and offer only very partial coverage. Data-poor conditions such as these significantly challenge conventional data-driven analyses. Accordingly, we apply a hypothesis-driven approach to construct a mechanistically informed network model from prior knowledge of pathway signaling recovered from manually curated pathway schema and extracted from literature. Undocumented pathway elements are hypothesized under strict confidence measures using generative artificial intelligence to ensure a closed-loop architecture consistent with homeostatic stability. Simulated perturbations using the most plausible network models highlight the impact of IL-6 on cAMP response. Subsequent experimental validation using human THP-1 monocytes differentiated to macrophages supported this effect. These results suggest that the de novo creation of a mechanistically informed network model from prior knowledge may support early explorations of complex pathway dynamics, such as intracellular cAMP signaling during Mtb infection, when experimental data are sparse or unavailable.

Biological sciences/Computational biology and bioinformatics
Biological sciences/Immunology
Biological sciences/Microbiology

Mycobacterium tuberculosis cAMP signaling dynamics regulatory network simulation pathway logic

1. Introduction

Tuberculosis (TB) is the leading cause of death amongst infectious diseases, impacting millions of people every year [WHO, 2025]. Its causative agent, the bacterium Mycobacterium tuberculosis (Mtb), has co-evolved with humans for thousands of years and has developed unique ways of evading host immune defenses [Cambier et al., 2014].
As a facultative intracellular pathogen, Mtb enters host macrophages and interferes with host cell signaling, modulates the formation of the phagolysosome and establishes an environment for its own growth and survival. As Mtb replicates and the bacterial burden increases, multiple infected macrophages aggregate and form a granuloma, from which the pathogen eventually escapes and spreads, causing active disease. In addition to the active form of the disease, containment of Mtb by the host immune system can result in a latent TB infection, in which the host is neither infectious nor symptomatic but Mtb persists and can later convert back to an active infection [Chandra et al., 2022]. Furthermore, the granuloma also protects Mtb from innate immune responses and chemotherapy, and a rigorous regimen of multiple antibiotics over many months is required for full Mtb clearance. This is often further complicated by the emergence of multidrug resistant (MDR) and extensively drug resistant (XDR) strains of the pathogen [Keshavjee and Rich, 2015]. To establish this favourable environment for its growth and survival, Mtb uses multiple mechanisms to redirect host signaling and halt immune system clearance. One such mechanism acts on the cyclic AMP (cAMP) signaling pathway. Found in both humans and bacteria, the signaling molecule cAMP is crucial in numerous physiological functions, including environmental responses, control of gene expression and, notably, regulation of immune responses [Raker et al., 2016; McDonough and Rodriguez, 2012]. While many pathogenic bacteria encode only one or a few adenylyl cyclase (AC) genes for cAMP synthesis, Mtb harbours 16 such genes. This overabundance of AC genes implicates cAMP as a central Mtb virulence factor [Lin et al., 2013; Joseph et al., 1982; Smith et al., 2004; Shenoy and Visweswariah, 2006; Kurpad and Dhar, 2025].
Earlier studies have demonstrated that Mtb-produced cAMP intoxication of macrophages impacts immune function, further suggesting that perturbation of host cAMP pathways may constitute a potential strategy for disrupting Mtb’s infection cycle [Agarwal et al., 2009 ]. However, the vast interconnectedness of cAMP with numerous signaling pathways makes the isolation and study of relevant molecules and pathways challenging. In this work we attempt to explore the mechanisms by which Mtb exploits cAMP signalling to its advantage by constructing an in silico model of cAMP signaling and validating the latter using perturbation experiments to cAMP levels in a THP-1 monocyte cell line. While the use of in silico models of pathway dynamics has a rich history, standard methods typically rely on rate equations formulated as large sets of coupled ordinary differential equations [Gilbert et al., 2006 ]. While offering a high degree of mechanistic fidelity, such models are highly nonlinear in their parameters making them very challenging computationally to estimate. Accordingly, substantial amounts of data are required to support model identification [Rachel et al., 2024 ] even when model reduction techniques are used to decrease the model size, with the latter leading to loss of mechanistic interpretability and bias in model uncertainty. Discrete logic models offer an attractive alternative for qualitatively capturing the characteristic dynamic behaviors of regulatory networks [Thomas et al., 1995 ; Thomas and Kaufman, 2001 ] using a much-reduced parameter space and correspondingly small amount of supporting data [Chaouiya and Remy, 2013 ; Abou-Jaoudé et al., 2016 ]. 
Indeed, this aspect has been extended further to address instances of very sparse and partially observed data by leveraging model checking techniques [Sedghamiz et al., 2017; Sedghamiz et al., 2019], where what little experimental data may be available are used to constrain the selection of competing mechanistically informed models instead of constructing an empirical model de novo. In a departure from conventional data-driven empirical modeling, we demonstrate in this work that a hypothesis-driven approach rooted in prior mechanistic knowledge may be applied to explore host immune pathway dynamics in the extreme case where no numerical data are available at the outset. Key regulatory mediators relevant to Mtb infection and transmission were extracted from the literature and their known regulatory interactions recovered from multiple manually curated pathway databases. To satisfy the closed-loop network architecture required to support homeostatic regulatory stability, undocumented candidate interactions were inferred using a generative artificial intelligence (AI) model, with their credibility assessed using information theoretic metrics of trustworthiness. Information flow through the resulting network and the active recruitment of pathway elements were simulated using sets of discrete regulatory logic rules, where the predicted network dynamic behaviors were required to align with partial data reported in the literature. A set of competing regulatory logic models was then used to assess the impact of perturbations in each of the network molecular mediators on cAMP response. Using these models, we validate the response of cAMP to a lipopolysaccharide (LPS)-induced pulse in IL-6 and demonstrate the potential of a knowledge-based approach to explore regulatory network dynamics in situations where experimental data are sparse, partially observed or initially absent altogether.

2. Materials and Methods

2.1.
Identifying molecular mediators of interest

To create a foundation for the regulatory network, molecular mediators of interest were first identified from existing literature. In this proof-of-concept exercise, we selected a recent review of host immune pathways involved in the persistence of Mtb infection [Dey and Bishai, 2014] as well as a review of mechanisms driving Mtb transmission through cough [Naqvi et al., 2023]. Proteins and other signaling molecules were extracted autonomously from these two papers using a custom domain-specific named entity recognition (NER) engine. The detailed workflow used to extract molecular mediators from the raw PDF files in this small example expert-curated corpus is shown in Fig. 1 (a), (b). Raw PDF files were first translated into text using Google's optical character recognition (OCR) tool Tesseract [Smith, 2007] concurrently with Mindee DocTR [Batra et al., 2024]. Character sequences matching regular-expression patterns typical of sentences were then identified in the resulting text using the Python library Regex [Onyenwe et al., 2021]. Sentence terms were then recognized as proteins, transcripts or other biological entities by iteratively applying pretrained biomedical language models, specifically the en_ner_bionlp13cg_md, en_ner_bc5cdr_md, and en_ner_jnlpba_md models, available as part of the scispaCy Python library [Neumann et al., 2019]. Candidate molecular species predicted by the scispaCy models were verified against the HUGO and UniProt ontologies. Additional potential molecules of interest were recovered by proposing corrections to discrepancies in spelling, case, etc., using the difflib.SequenceMatcher class included in the difflib module of the Python Standard Library [Hellman, 2011]. Recovered candidates were then manually verified.
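The spelling and case reconciliation step can be illustrated with a minimal sketch built on difflib.SequenceMatcher. The helper name best_match, the 0.8 similarity cutoff and the short reference list are assumptions made for illustration only; they are not taken from the authors' pipeline, which verified candidates against the full HUGO and UniProt ontologies.

```python
from difflib import SequenceMatcher


def best_match(candidate, reference_symbols, cutoff=0.8):
    """Return (symbol, ratio) for the closest reference symbol, or None.

    Comparison is case-insensitive. The cutoff is a hypothetical value,
    not one reported in the paper.
    """
    best_symbol, best_ratio = None, 0.0
    for symbol in reference_symbols:
        ratio = SequenceMatcher(None, candidate.lower(), symbol.lower()).ratio()
        if ratio > best_ratio:
            best_symbol, best_ratio = symbol, ratio
    if best_ratio >= cutoff:
        return best_symbol, best_ratio
    return None


# Hypothetical reference list standing in for an ontology lookup table.
reference = ["MYD88", "TBK1", "IRF3", "NLRP3", "TREX1"]

for raw in ["Myd-88", "Trex-1", "albumin"]:
    print(raw, "->", best_match(raw, reference))
```

Any match proposed this way is only a candidate correction; as in the workflow described above, it would still need manual verification.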
To investigate the functional relevance of these NER-recovered entities, a gene set enrichment analysis (GSEA) [Mubeen et al., 2019] was conducted to identify their known involvement in pathways documented in the Pathway Commons database [Rodchenkov et al., 2020]. Pathways were considered significantly enriched if a given subset of genes exhibited a q value < 0.05 using a one-sided Fisher's exact test [Fisher, 1992] subjected to a multiple hypothesis testing correction with the Benjamini–Yekutieli method under dependency [Benjamini and Yekutieli, 2001].

2.2. Assembling a regulatory network de novo

Relationships coordinating changes in these molecular mediators of interest were extracted from manually curated pathway databases using the suite of tools available within the Integrated Network and Dynamical Reasoning Assembler (INDRA v1.23.0) [Bachman et al., 2023] (Fig. 1 (c)). Specifically, known pathway relationships were extracted using BEL processor and BioPAX queries from the BEL Selventa Large Corpus (https://github.com/cthoyt/selventa-knowledge) [Hoyt et al., 2019] and the Pathway Commons database [Rodchenkov et al., 2020], respectively. The latter is itself an aggregate of 23 databases including KEGG Pathway, PANTHER Pathway, Reactome and others, for a total catalogue of over 3.5 million curated interactions drawn from close to 7,000 pathways. Relationships from these databases are reconciled and translated into INDRA statements. For example, the statement “Activation(ERK(kinase), PRKAC(), kinase)” would describe the activation of extracellular signal-regulated kinase (ERK) by the alpha catalytic subunit of protein kinase A (PKA).
To achieve this reconciliation across sources, INDRA combines a number of algorithms to address a variety of data fusion challenges that include, but are not limited to, the inconsistent use of identifiers, fully or partially redundant representations of the same mechanism, detecting and resolving hierarchical relationships between pathway elements, reconciling contradictory statements and ensuring consistency and coherence in causal logic chains across multiple statements. Though INDRA will extract and assign multiple types of relationships, our focus in this work was on regulatory control actions, and so we used only those relationships labelled as “increased amount”, “decreased amount”, “activation”, and “inhibition” of a target molecular species when acted upon by a source species.

2.3. Closing regulatory loops using hypothesis generation

To support homeostatic stability, a regulatory network model must incorporate closed-loop feedback and feedforward control motifs [Thomas et al., 1995; Karin et al., 2016]. To favour a minimal network model that makes maximal use of known documented relationships, the broader network was pruned by removing nodes with zero input (sources) or output (sinks). In doing so, cAMP was also removed as it did not exercise any documented downstream regulatory actions onto the broader network. It was reintroduced by restoring known upstream relationships to cAMP previously extracted using INDRA (Section 2.2) and predicting candidate downstream relationships using a generative artificial intelligence (AI) large language model (LLM) for hypothesis generation. Given the tendency for LLMs to produce spurious results or hallucinate, we selected the Trustworthy Language Model (TLM) from Cleanlab (San Francisco, CA) as a means of generating and quantifying the reliability of the hypothesized relationships [Sardana, 2025].
The latter does not draw on a custom-trained model to assess uncertainty but rather uses an LLM-as-a-judge architecture as an overarching environment to assess the answers provided by any user-specified LLM, in this case OpenAI's GPT-4 (San Francisco, CA) [Sanderson, 2023 ]. Relationships between cAMP and each molecule found in the existing closed-loop model were queried to the TLM with the following sentence structures: Does protein [protein name] regulate messenger molecule cAMP (1) positively, (2) negatively or (3) not at all? Respond with only one choice And conversely, Does messenger molecule cAMP regulate protein [protein name] (1) positively, (2) negatively or (3) not at all? Respond with only one choice. Confidence of the suggested regulatory relationships was assessed based on two basic premises, namely (i) observed consistency, and (ii) self-reflection as extrinsic and intrinsic evaluations of LLM confidence [Chen and Mueller, 2023 ]. The former, observed consistency, consists of assessing the dissimilarity between answers obtained from the LLM using multiple Chain-of-Thought (CoT) [Wei et al., 2020] variants of the initial query using different levels of output randomness or sampling temperature. The latter, self-reflection certainty, consists of simply asking the LLM to intrinsically reflect on how confident it is about whether its own previously generated answer is correct or not. These two measures are combined to produce an aggregate trustworthiness score. The distribution of trustworthiness scores obtained for all TLM queries regarding upstream and downstream regulatory actions was then analyzed using Otsu’s method [Otsu, 1975 ] to generate a threshold value separating background and foreground values. Predicted interactions above the generated threshold (foreground) were retained as candidate regulatory actions in the network. 
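The thresholding step can be sketched as follows. This is a minimal, exact variant of Otsu's criterion applied directly to a sorted list of scores rather than to a binned histogram, and it is not necessarily the implementation used by the authors. The function name and the example scores are hypothetical. The returned value is the smallest score assigned to the foreground, so that scores greater than or equal to it are retained.

```python
def otsu_threshold(scores):
    """Return the cut value that maximizes between-class variance.

    Every split of the sorted scores into a lower (background) and an
    upper (foreground) group is evaluated; the smallest foreground value
    of the best split is returned.
    """
    values = sorted(scores)
    n = len(values)
    total = sum(values)
    best_cut, best_variance = None, -1.0
    running = 0.0
    for i in range(1, n):
        running += values[i - 1]          # sum of the background group
        if values[i] == values[i - 1]:
            continue                      # cannot split identical values
        w0, w1 = i / n, (n - i) / n       # class weights
        mu0 = running / i                 # background mean
        mu1 = (total - running) / (n - i) # foreground mean
        variance = w0 * w1 * (mu0 - mu1) ** 2
        if variance > best_variance:
            best_variance, best_cut = variance, values[i]
    return best_cut


# Hypothetical trustworthiness scores, not values from the study.
scores = [0.13, 0.21, 0.25, 0.30, 0.35, 0.76, 0.79, 0.95]
cut = otsu_threshold(scores)
retained = [s for s in scores if s >= cut]
print(cut, retained)
```

Because the cut is derived from the score distribution itself, no fixed trustworthiness cutoff has to be chosen in advance.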
In isolated cases where the necessary balance of upstream regulatory actions was not predicted by the LLM, we conducted a focused search of the Elsevier Biology Knowledge Graph (Elsevier, Amsterdam, NL) [Kamdar et al., 2020], which is populated by relationships extracted from peer-reviewed scientific literature by the natural language processing (NLP) engine MedScan [Novichkova et al., 2003; Daraselia et al., 2004].

2.4. Decisional logic of the regulatory network

Given a candidate network structure, a decisional logic program is required to direct each molecular mediator node's response to incoming control actions. Extending foundational work by others using Boolean logic [Albert and Thakar, 2014], we apply a more granular multi-level state transition logic whereby the actions of upstream nodes expressed above a perception threshold are combined, weighing the competing actions of weak inactivators against strong activators (and vice versa) to compute a change in the state of the target regulated node [Sedghamiz et al., 2018] (Fig. 2). Sets of integer values for the decisional logic parameters, as well as the values of unobserved states, are estimated by solving a constraint satisfaction (SAT) problem whereby allowable values are those that support predicted network behaviors that do not contradict available observations, with these often being sparse and incomplete [Sedghamiz et al., 2019]. In this work we explore the scenario where no prior experimental results are available.
Instead, we constrain network behaviors to explain three steady state conditions that might be expected of macrophages, described qualitatively and incompletely by the relative activation in a subset of markers in: (1) an uninfected baseline state, where levels of each node are at a low basal level of zero, (2) an unmeasured condition resulting in the persistent upregulation of cAMP to a maximum level of 2 and (3) a partially observed state reported piecemeal across multiple literature sources and describing activation levels in a subset of markers corresponding to a state of persistent Mtb infection [Russell et al., 2009; Wang et al., 2012; Rothchild et al., 2014; Silvério et al., 2021; Haile et al., 2021; Queval et al., 2016; Ayalew et al., 2024]. As this parameter estimation may also result in changes to the network architecture, additional constraints were applied to enforce structural integrity and to ensure overall network stability, for example disallowing the formation of new source or sink nodes and nodes subject to purely inhibitory regulation [Reimer et al., 2025]. Of the competing models that complied with these constraints, a subset of 100 network solutions was chosen for further study of signaling dynamics.

2.5. Simulation of cAMP signaling responses

Using the state transition logic defined by the parameter sets described above, an increase or decrease in the activation level of each node is computed and this change is applied in an upcoming iteration using a synchronous, asynchronous or priority-based scheduling scheme [Lyman et al., 2021]. In this work, a synchronous update scheme was applied whereby all incremental changes proposed by the state transition logic are enacted at all nodes simultaneously to update the network state from one iteration to the next.
This update strategy delivers improved computational efficiency when the focus is the achievability of migration from one steady state to another rather than the exact sequence of events along the transition path. On this basis, a simulation-based sensitivity analysis was applied to a subset of 48 candidate network models which exactly supported the upregulation of cAMP as an achievable persistent state. Specifically, for each model the predicted response to exogenous upregulation of each node was simulated one node at a time, and results were surveyed across all models to identify perturbations most likely to produce a long-lasting upregulation of cAMP. Perturbations were introduced as continuous perturbations, lasting throughout the simulation time course of 100 iterations, as well as limited pulses where exogenous upregulation was maintained for 50 iterations then discontinued.

2.6. THP-1 human monocytic cell culture

Human THP-1 monocytes were obtained from ATCC (TIB-202) and regularly maintained in RPMI-1640 media supplemented with 10% fetal bovine serum (FBS), 1 mM sodium pyruvate, 2.5 mM N-2-hydroxyethylpiperazine-N-2-ethane sulfonic acid (HEPES), and 1x non-essential amino acids (Thermo Scientific #11140050) at 37 °C in 5% CO2. Cells were regularly counted via trypan blue staining in a hemocytometer and passages were performed as necessary to maintain cell density below 1 x 10^6 cells/mL. To differentiate THP-1 monocytes to macrophages, cells were seeded in a 12 well plate at a density of 1–3 x 10^5 cells/mL. Cells were exposed to 80 ng/mL of phorbol 12-myristate 13-acetate (PMA) for 24 hours and allowed to rest in PMA free media for an additional 24 hours before further experiments.

2.7. Measurement of cAMP levels in THP-1 cells

Non-differentiated THP-1 cells were seeded in a 24 well plate at a density of 2.0 x 10^5 cells/mL and PMA differentiated as described above.
Following a 24-hour rest period in PMA free media, 100 ng/mL LPS or 100 µM forskolin (FSK), an adenylyl cyclase activator [Seamon and Daly, 1981], was added to each treatment well. Cyclic AMP levels were measured using a commercial cAMP ELISA kit (Cayman Chemical Cat. # 581001). Cell lysate was collected via acid lysis as recommended by the kit protocol. Each sample was diluted 1:10 and processed in accordance with the manufacturer's instructions. LPS samples were collected at 1 and 2 hour timepoints; FSK samples were collected at 10 and 30 minute timepoints (Supplementary Figure S1).

3. Results

3.1. A cAMP regulatory network produced from prior knowledge

In this proof-of-concept analysis, NER was applied to two expert-curated review papers describing host immune response to Mtb infection [Dey and Bishai, 2014] and transmission through cough [Naqvi et al., 2023], resulting in the recovery of 22 molecular species related to persistence of infection and 9 molecular species associated with cough (Table 1, Fig. 1 (a), (b)). Between 2 and 8 (on average 5) of these 31 molecular entities were associated with any one of 44 pathway gene sets with enrichment scores significant at an adjusted p-value q < 0.05. These 44 annotated reference pathway gene sets consisted of between 5 and 182 genes as documented in either the Gene Ontology or Reactome component databases available in the Pathway Commons environment (Supplementary Table S1 (a)). Pathways enriched by as many as 7 to 8 of the 31 extracted entities involved detection by cytosolic sensors of pathogen-associated DNA (REAC:R-HSA-1834949), cytoplasmic pattern recognition receptor signaling (GO:0002753), and Type I interferon production and regulation (GO:0032606, GO:0032479, GO:0032481), followed by response to Gram-positive bacterium (GO:0050830). Also well represented were pathways overseeing the production and regulation of IL-1b and IL-8, enriched by 6 of the 31 text-mined entities.
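For readers less familiar with the enrichment statistics reported here, the following minimal sketch shows how a one-sided Fisher's exact test (computed as a hypergeometric tail probability) and the Benjamini–Yekutieli adjustment can be calculated. The function names, the universe size of 20,000 genes and the example p-values are assumptions for illustration only; they are not the authors' code or results.

```python
from math import comb


def fisher_one_sided(k, n_query, n_pathway, n_universe):
    """P(overlap >= k) when n_query genes are drawn from n_universe genes,
    of which n_pathway belong to the pathway gene set."""
    denom = comb(n_universe, n_query)
    upper = min(n_query, n_pathway)
    tail = sum(
        comb(n_pathway, x) * comb(n_universe - n_pathway, n_query - x)
        for x in range(k, upper + 1)
    )
    return tail / denom


def benjamini_yekutieli(p_values):
    """Return Benjamini-Yekutieli adjusted p-values (q-values)."""
    m = len(p_values)
    c_m = sum(1.0 / i for i in range(1, m + 1))  # harmonic correction term
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):                 # enforce monotonicity
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m * c_m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical example: 8 of 31 query entities fall in a 182-gene set,
# assuming a background universe of 20,000 genes.
p = fisher_one_sided(8, 31, 182, 20000)
q = benjamini_yekutieli([p, 0.03, 0.2])          # other p-values are made up
print(p, q)
```

The harmonic term c_m is what distinguishes the Benjamini–Yekutieli procedure from Benjamini–Hochberg and makes it valid under arbitrary dependency between tests, at the cost of being more conservative.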
Table 1. An initial set of literature-informed regulatory network nodes. List of proteins/signaling molecules of interest extracted using NER from Dey and Bishai (2014) and Naqvi et al. (2023) along with the number of pathways documented in Pathway Commons with which they are associated.

| Source | UniProt Name | Full Protein Name | Documented pathways |
|---|---|---|---|
| Dey and Bishai (2014) | VPS33B | Vacuolar protein sorting-associated protein 33B | 0 |
| Dey and Bishai (2014) | CA2 | Carbonic anhydrase 2 | 0 |
| Dey and Bishai (2014) | HAMP | Hepcidin | 0 |
| Dey and Bishai (2014) | TNF | Tumor necrosis factor | 16 |
| Dey and Bishai (2014) | MYD88 | Myeloid differentiation primary response protein | 27 |
| Dey and Bishai (2014) | CGAS | Cyclic GMP-AMP synthase | 7 |
| Dey and Bishai (2014) | TLR9 | Toll-like receptor 9 | 12 |
| Dey and Bishai (2014) | ESX1 | Homeobox protein | 0 |
| Dey and Bishai (2014) | PTPA | Serine/threonine-protein phosphatase 2A activator | 0 |
| Dey and Bishai (2014) | IRF3 | Interferon regulatory factor 3 | 23 |
| Dey and Bishai (2014) | SAMHD1 | Deoxynucleoside triphosphate triphosphohydrolase | 0 |
| Dey and Bishai (2014) | CRP | C-reactive protein | 5 |
| Dey and Bishai (2014) | AKAP7 | A-kinase anchor protein 7 | 0 |
| Dey and Bishai (2014) | cAMP | Cyclic AMP | 1 |
| Dey and Bishai (2014) | AIM2 | Interferon-inducible protein | 10 |
| Dey and Bishai (2014) | NLRP3 | NACHT, LRR and PYD domains-containing protein 3 | 8 |
| Dey and Bishai (2014) | DDX41 | Probable ATP-dependent RNA helicase | 4 |
| Dey and Bishai (2014) | TREX1 | Three-prime repair exonuclease 1 | 13 |
| Dey and Bishai (2014) | CD14 | Monocyte differentiation antigen CD14 | 12 |
| Dey and Bishai (2014) | IFI16 | Gamma-interferon-inducible protein 16 | 11 |
| Dey and Bishai (2014) | PI3 | Elafin | 0 |
| Dey and Bishai (2014) | TBK1 | Serine/threonine-protein kinase | 24 |
| Naqvi et al. (2023) | STAT1 | Signal transducer and activator of transcription 1 | 13 |
| Naqvi et al. (2023) | NTM | Neurotrimin | 0 |
| Naqvi et al. (2023) | ASIC2 | Acid sensing ion channel subunit 2 | 2 |
| Naqvi et al. (2023) | TRPV1 | Transient receptor potential cation channel subfamily V member 1 | 4 |
| Naqvi et al. (2023) | TRPM8 | Transient receptor potential cation channel subfamily M member 8 | 3 |
| Naqvi et al. (2023) | TRPA1 | Transient receptor potential cation channel subfamily A member 1 | 3 |
| Naqvi et al. (2023) | ASIC3 | Acid sensing ion channel subunit 3 | 2 |
| Naqvi et al. (2023) | IL6 | Interleukin 6 | 15 |

Extending this analysis to discover relationships that traverse annotated pathway gene sets, we applied tools from the INDRA Python library to extract documented relationships linking these molecules from the Biological Expression Language (BEL) Large Corpus and the BioPAX/Pathway Commons databases.
These relationships were reconciled and expressed as INDRA statements where each statement assigned a source molecular mediator to its known target with a specific standardized relationship type (e.g. IncreasedAmount). For the purposes of this project, we retained only those relationships related to regulatory control of a target molecular species, specifically those INDRA statements labelled Activation, Inhibition, IncreasedAmount and DecreasedAmount of target. The resulting initial network consisted of our original 31 entities as well as their documented first neighbors for a total of 399 molecular mediator nodes linked by 560 INDRA relationship statements (Fig. 3 (a) ). Of these 399 mediator nodes, 132 were source nodes such as cAMP feeding into the network (no upstream mediator or zero indegree) and 203 were reporter or sink nodes regulated by the network (no downstream target or zero outdegree). In order to support regulatory stability, recall that we require all network mediator nodes to regulate and be regulated, that is to be part of a feedback regulatory motif. As such, we removed all source and reporter or sink nodes to produce a pruned fully closed loop network consisting of 62 molecular species, including 11 of our original 31 text-extracted entities. After removal of duplicate entries from the various component databases in the Pathway Commons environment, these 62 nodes were now linked by 149 regulatory relationships (directed edges) (Fig. 3 (b), Supplementary Table S1 (b) ). Recall that cAMP was devoid of known upstream regulatory actions from other nodes in the initial network. As such it was removed during this editing process and subsequently reintroduced by linking cAMP to putative mediators proposed by a LLM as described in the next section. 3.2. 
LLM predictions of network interactions for closed-loop regulation of cAMP

To re-introduce cAMP into the closed-loop regulatory network, a large language model (LLM) was used to predict how cAMP would interact with other nodes in the network, with upstream interactions regulating cAMP being of special interest. Due to the tendency for LLMs to generate highly speculative information, the believability of these predictions was assessed using metrics incorporated into the Trustworthy Language Model (TLM) (Cleanlab, San Francisco, CA) (Section 2.3). Of the 84 queries submitted to the LLM, 61 predicted either a positive or negative regulatory relationship (Supplementary Table S1 (c)) with trustworthiness scores ranging from 0.134 to 0.989. Applying Otsu's method [Otsu, 1975] for separating foreground values from background to these 61 values of trustworthiness, we obtain 13 new putative relationships involving cAMP with a trustworthiness greater than or equal to the threshold value of 0.623. Of these, cAMP is a predicted target of upstream mediated downregulation by CXCL12, ERK and TNF (0.793, 0.763, 0.763 trustworthiness respectively). With all incoming regulatory actions being inhibitory, cAMP activation will inevitably decrease to its floor value with no means of recovering. This violates one of our criteria, according to which all network nodes must be the target of at least one positive regulatory action [Reimer et al., 2025]. This unilateral downregulation was also the case for network elements IL-1b, ERBB2, Insulin (INS), NFKBIA and AGT. For the latter, we conducted an additional interrogation of the LLM with a focus on upstream regulation of these entities, yielding an additional 7 positive regulatory relationships at a high trustworthiness for IL-1b, NFKBIA and AGT. The LLM also predicted 5 relationships positively mediating ERBB2 and INS but with low trustworthiness (Supplementary Table S1 (d)).
These were included as highly speculative and tagged as available for removal should they not prove helpful in explaining the available data (Section 2.4). Finally, since no positive upstream regulators of cAMP were predicted by the LLM, we conducted a focused search of the Elsevier Biology Knowledge Graph (Elsevier, Amsterdam, NL), which is populated by automated text-mining of the peer-reviewed literature (Section 2.3). This yielded positive regulation of cAMP by EGFR and MAPK1 as reported in 4 and 25 journal publications respectively (Supplementary Table S1 (e)). These and other inclusions necessary to satisfy a closed-loop architecture resulted in a final network consisting of 178 regulatory relationships linking 63 molecular species (Fig. 4).

3.3. Reverse engineering and simulation of regulatory behavior

We expect this network model to support pathway-informed explanations of partially observed immune responses to Mtb infection. As we are demonstrating the use of this approach in the extreme case of a potentially novel pathogen where little if any experimental data are available, we turn to fundamental behaviors such as a stable baseline resting state and reports of up-expression or down-expression of measurable illness markers. In this example use case we define three stable persistent states that must be supported by any regulatory logic applied to this immune network circuitry (Section 2.4). These consist of a baseline resting state where all 63 molecular mediators are expressed at their basal state of 0 (constraint C1), a largely unmeasured state supporting a persistent activation of cAMP to 2 (constraint C2), and a partially observed state corresponding to persistent Mtb infection (constraint C3). The latter manifested as maximal activation to level 2 of IL-6, STAT3, IL-17a, SerpinA3, IL-18, TNF, IL-18, CCL2, IFNg, TLR2, CD40, TLR9, CXCL8 concurrent with baseline activation (0) of CXCL12 (Supplementary Table S2).
Formulating a constraint satisfaction optimization problem, we solve for values of decisional weights and threshold values for every regulatory relationship, as well as feasible values of unmeasured mediators at each condition as a steady state, that is, where the next desired state of all nodes is their current state. As this problem is highly underdetermined mathematically, there might exist a large number of solutions that satisfy these conditions equally well. Interestingly, when required to satisfy these 3 conditions as strict steady states, no parameter sets were found for this network configuration where the next predicted state remained the current state for all nodes. Only when allowing a 5% deviation from steady state, or 6 bits across all nodes, was a family of parameter sets recovered. Here we analyze a random subset of 100 logic parameter sets and corresponding predicted states for unmeasured mediators from this broader family of solutions (Supplementary Table S3). Of these 100 models, only 48 fully supported a persistently elevated state of 2 (high with zero deviation from steady state) for cAMP, with 57 to 60, or 92–97%, of the remaining 62 markers also occupying a stable steady state (Supplementary Table S4). The 3 to 5 non-compliant fluctuating nodes typically consisted of TLR9, STAT1, CD40, CD14, and EGFR. For these 48 regulatory models, a stable concurrent activation at a maximal level of 2 was also predicted for 9 other nodes, namely NFKBIA (43/48 models), TNFRSF1B (43/48 models), AKT2 (42/48 models), PTEN (42/48 models), TGFA (42/48 models), IL-18 (42/48 models), TNF (43/48 models), INS (41/48 models), and EGFR (41/48 models) (Supplementary Table S5) (Fig. 5).
During the reverse engineering of the regulatory logic, relationships extracted from documented pathways were forcibly retained in the network, as were text-mined relationships (Elsevier Biology Knowledge Graph) and LLM-posited relationships involving cAMP as a target with a high trustworthiness (> 0.623). Other LLM-posited relationships, where cAMP was not a regulatory target, where trustworthiness was low, or where the relationship was not predicted to exist, were allowed to be discarded during identification of the regulatory logic. Across the 48 network models, the LLM-predicted positive regulation of NOS3 and of ADIPOQ by cAMP was unanimously retained. Interestingly, cAMP regulation of ADIPOQ was retained as being useful in explaining the partially observed conditions despite a trustworthiness score of 0.483, well below the threshold of 0.623. Also retained unanimously were positive regulation of IL-1b by MAPK14, positive regulation of NFKBIA by TNF, and of AGT by IFNg (Supplementary Table S6). None of the relationships predicted by the LLM to be nonexistent were retained unanimously; however, positive regulation of ERBB2 by AKT and of insulin (INS) by TNF were conserved in 46 and 41 of the 48 models respectively.

To distinguish the causal upregulation of cAMP from concurrent increased expression, a series of simulations was conducted in which the network was initialized at its baseline state with all markers at basal activation levels of 0. Each marker was then exogenously upregulated individually in turn to a maximal activation of 2 and held at that level either as a transient pulse over 50 iterations or continuously for the entire duration of 100 iterations. The network state was updated from one iteration to the next using the basic synchronous update scheme, since the focus was the reachability of a persistent maximal activation of cAMP rather than the exact path traversed.
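The perturbation protocol described above can be sketched as follows. This is an illustration only: the two-node network, its update rules, and all function names are hypothetical and stand in for the fitted weight-and-threshold logic, which is not reproduced here. The sketch shows a synchronous update in which one node is clamped at level 2 for either 50 or 100 of 100 iterations, followed by a check of whether a target node remains at level 2 at the end of the run.

```python
from typing import Callable, Dict, List

State = Dict[str, int]
Rules = Dict[str, Callable[[State], int]]


def simulate(rules: Rules, clamp_node: str, clamp_steps: int,
             total_steps: int = 100, level: int = 2) -> List[State]:
    """Synchronous simulation from the all-zero baseline with one node held exogenously."""
    state = {n: 0 for n in rules}
    history = []
    for t in range(total_steps):
        if t < clamp_steps:
            state[clamp_node] = level  # exogenous hold (pulse or continuous)
        history.append(dict(state))
        # Synchronous scheme: every node is updated from the same current state.
        state = {n: rule(state) for n, rule in rules.items()}
    return history


def is_persistent(history: List[State], node: str, level: int = 2, window: int = 10) -> bool:
    """True if the node sits at `level` throughout the final `window` iterations."""
    return all(s[node] == level for s in history[-window:])


# Hypothetical toy logic: the stimulus decays unless held; the target follows the stimulus.
toy_rules: Rules = {
    "stimulus": lambda s: 0,
    "target": lambda s: 2 if s["stimulus"] == 2 else 0,
}

continuous = simulate(toy_rules, "stimulus", clamp_steps=100)
pulse = simulate(toy_rules, "stimulus", clamp_steps=50)
print(is_persistent(continuous, "target"), is_persistent(pulse, "target"))
```

In this toy network the target stays elevated only while the stimulus is held, which mirrors the distinction drawn between continuous and pulse perturbations. A network containing positive feedback could instead remain elevated after the pulse ends.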
The typical responses of cAMP to these perturbations consisted of 5 characteristic trajectories across all 48 models, namely (a) a stable and persistent upregulation, (b) a transient upregulation only, (c) a transient upregulation with a single oscillation, (d) a transient upregulation with sustained slow low-medium oscillation, and (e) a transient upregulation with sustained rapid medium-high oscillation (Supplementary Figure S2). These in silico single-mediator perturbations predict that persistent upregulation of cAMP could result causally from the exogenous addition of only 4 markers, namely IL-6 (48/48 continuous; 4/48 pulse), EGFR (48/48 continuous; 4/48 pulse), MAPK1 (47/48 continuous; 4/48 pulse), and IL-18 (43/48 continuous; 2/48 pulse) (Fig. 6) (Supplementary Table S7). Among the network models where persistent activation of cAMP was predicted, the concurrent response of the broader network varied from largely unresponsive (model 42) (Supplementary Figure S3) to highly oscillatory (model 53) (Supplementary Figure S4).

3.4. Indirect experimental validation in THP-1 macrophage cell line

Prolonged exposure to exogenous IL-6 is predicted by all 48 models to induce self-sustaining elevated IL-6 and a persistent elevation of cAMP levels. It was therefore selected as the perturbation for experimental validation over EGFR, which was also predicted to produce persistent elevation of cAMP but was not predicted to achieve a strictly self-sustaining steady state. As lipopolysaccharide (LPS) is a known inducer of IL-6 and more readily available as a reagent, it was used in this work as an indirect means of experimentally creating an elevated IL-6 medium (Supplementary Figure S1). A one-time transient challenge with LPS was applied to a culture of THP-1 macrophages and cAMP levels were measured (Supplementary Figure S5). Results of the LPS pulse experiments are presented in Fig. 7.
LPS induced a transient increase in cAMP levels 60 min after exposure; however, the levels decreased after 120 min. To validate our cAMP readout, as a control we also exposed THP-1 cells to Forskolin, an activator of adenylate cyclase, which produced an increase in cAMP levels as expected [Seamon and Daly, 1981]. Our experiments with IL-6 induced indirectly by LPS suggest that the cAMP response is transient. Recall that only 4 of 48 models predicted that a transient pulse in IL-6 would be sufficient to produce a lasting elevation of cAMP levels. In contrast, all 48 models predicted that a persistent elevation of cAMP would require a prolonged exposure to IL-6.

4. Discussion

In this work we explore how a hypothesis-driven approach based on prior knowledge alone may inform experimental design in the absence of the often-abundant data typically required by data-driven statistical analyses. This is especially relevant to the design of effective strategies for crafting rapid responses to the emergence of a novel pathogen. Specifically, we explore the regulation of cAMP dynamics within macrophages by assembling and using a minimal regulatory network model to predict shifts in the expression patterns of molecular pathway mediators. Ultimately, experimental validation of these predictions, i.e. within a live cell model, can further inform network behavior and improve the pace of focused experimental studies. As cAMP is a key signaling molecule targeted by Mtb infection, the specific objective of this proof-of-concept use case was to identify key mediators of cAMP signaling in the hope of understanding some of the strategies used by Mtb and how these might be disrupted. In contrast with most pathway analytical approaches, closed loop regulatory feedback and dynamic stability are rigorously represented and enforced in the architecture and programming of the network model.
Moreover, assembling the network model de novo from established prior knowledge of pathway structures makes such models inherently interpretable mechanistically, in addition to enabling their use with very sparse and partially observed data unsuitable for empirical data mining. As a case in point, we explore cAMP regulation in the context of Mtb infection in the absence of experimental data. Results of qualitative simulations conducted in a family of competing mechanistically informed network models have been supported at least in part by experimental validation in cell culture. The network presented here is a minimal model, based on entities extracted from a very small reference corpus. As we continue to explore the mechanisms of Mtb-host interaction, our ongoing work will increase the size of this expert-curated corpus to include peer-reviewed publications from a broader selection of authors and groups to reduce bias.

Though this is a proof-of-concept investigation, the partial validation of in silico experiments nonetheless offers focused avenues for further investigation. For example, these preliminary simulation results suggest that in addition to cAMP produced endogenously by Mtb [Agarwal et al., 2009], the pathogen's recruitment of the host IL-6 response serves only to exacerbate this by further driving increased cAMP production by resident macrophages. Notably, IL-6 is reported to be involved in macrophage polarization, which is a key factor in determining the outcome of an Mtb infection [Peyron et al., 2008; Russell et al., 2019; Martinez et al., 2013; Fernando et al., 2014; Sanmarco et al., 2017]. Interestingly, the predicted regulation of cAMP by IL-6 runs contrary to the current literature, in which some studies establish cAMP as the regulator of IL-6 [Luan et al., 2023; Chio et al., 2004]. However, in the context of an Mtb infection within a macrophage, this relationship has not been studied.
Importantly, these same simulations also suggest that the networked effects of increased IL-6 and cAMP include an upregulation of MAPK3 and STAT3, both apoptosis inhibitors [Seimon et al., 2009; Liu et al., 2003], along with a concurrent downregulation of RIPK1, a promoter of apoptosis. Similar network propagation of elevated IL-6 and cAMP results in downregulation of IL-18, which along with RIPK1 is a known promoter of cellular necrosis [Ju et al., 2022]. By highlighting the role of IL-6 in potentially exacerbating elevated cAMP production, these simulation experiments suggest a self-inflicted compounding role of IL-6 in disrupting apoptotic and necrotic programmed cell death in infected macrophages. Indirect effects such as these, which arise as a result of pathway signal propagation, require a network theoretical approach if they are to be captured and understood.

To produce these results, it was necessary to fill gaps in our existing pathway knowledge by proposing possible regulatory actions that could provide a closed loop regulatory architecture. We used a general-purpose generative large language model (LLM) for this purpose, assessing the credibility of these predictions by means of a quantitative trustworthiness score [Chen and Mueller, 2023]. It was interesting to observe in this admittedly limited analysis that although many of the hypothesized relationships with very high trustworthiness scores were frequently retained as useful in explaining specified network dynamics, this was not a hard and fast rule. Indeed, while 6 of the 9 relationships retained in more than 40 of 48 models were assigned high trustworthiness scores, two highly retained relationships were predicted to be nonexistent, and one unanimously retained relationship scored far below the trustworthiness threshold. Moreover, 6 of the 16 relationships (~38%) predicted with high trustworthiness were unanimously rejected.
While there is undoubtedly room for improvement, it is important to remember that the LLM used was not domain-focused and that reliability was based on an LLM-as-a-judge introspective approach purposely devoid of a Gold Standard. As we do not require the universality of LLMs, we are currently investigating the use of smaller domain-specific and even task-specific language models [Sinha et al., 2024]. As LLMs remain hypothesis generation engines, we propose that a more robust approach might be to combine such model-based credibility measures with the practical contribution of novel regulatory relationships in explaining the observed and expected dynamics of the system.

While the limited experimental validation conducted here offered evidence that a one-time pulse perturbation with IL-6 using LPS did result in an upregulation of cAMP, this upregulation appears to be transient. This would essentially invalidate the 4 candidates in our pool of 48 competing models where a pulse perturbation was predicted to be sufficient to trigger persistent cAMP upregulation, favouring instead the remaining 44 models predicting that a continued stimulation with IL-6 would be needed. This exemplifies how even a single, very partially observed experiment such as this one can be used to incrementally reduce the pool of competing network models. Our group has been exploring the use of numerical techniques to accelerate this process of elimination by using the model pool to predict which experiments would best discriminate between candidate models and so converge most efficiently on the smallest number of alternatives [Videla et al., 2015]. Model-informed methods such as these emphasize that not all data is equally informative and that the conventional focus on generating more data might instead be redirected to generating the right kind of data.
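The elimination step described above amounts to filtering the model pool by agreement with an observation. The sketch below is a minimal illustration under stated assumptions: the model identifiers are placeholders rather than the actual model numbers, and each model is reduced to a single hypothetical yes/no prediction of whether a pulse perturbation yields persistent cAMP elevation.

```python
from typing import Dict, List


def prune_models(pulse_persistent: Dict[str, bool], observed_persistent: bool) -> List[str]:
    """Keep only the models whose prediction matches the experimental observation."""
    return [m for m, predicted in pulse_persistent.items()
            if predicted == observed_persistent]


# 48 placeholder model IDs; 4 of them stand for the models predicting that a
# pulse alone is sufficient for persistent cAMP elevation.
pool = {f"model_{i:02d}": i < 4 for i in range(48)}

# The LPS pulse experiment showed a transient, not persistent, cAMP response.
remaining = prune_models(pool, observed_persistent=False)
print(len(remaining))
```

Each additional measured mediator or time point would add a further filter of the same kind, which is why multiplexed readouts are expected to shrink the pool more quickly.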
Similarly, measuring multiple network mediators in addition to the cAMP measured here would introduce much more stringent requirements on model selection, and multiplex analyses will be a growing component of our continuing work. As we continue to refine and broaden the capabilities of our live cell visualization of cAMP dynamics, we argue that the methods and tools presented here constitute a practical framework for harnessing prior knowledge and sparse, partially observed data to deliver rational pathway-informed hypotheses that in turn serve to guide and improve the efficiency of experimental laboratory work aimed at unraveling the complexities of the host-pathogen interaction between Mtb and the macrophage.

Declarations

Acknowledgements and Funding Information

This work was supported by the University of Saskatchewan’s Vaccine and Infectious Disease Organization (VIDO). VIDO receives operational funding from the Canada Foundation for Innovation (CFI) through the Major Science Initiatives Fund and from the Government of Saskatchewan through Innovation Saskatchewan and the Ministry of Agriculture. This article is submitted with the permission of the Director of VIDO. Research in N.D.’s lab was supported by funding from the Canadian Institutes of Health Research (ARB-185715 and ARB-192058), the Natural Sciences and Engineering Research Council of Canada (RGPIN-2023-05746), and the Saskatchewan Health Research Foundation (6239). The authors also wish to thank Kim Chiok, Michelle Gerber, Neil Rawlyk, Nandini, and Shamsuddeen Ma’aruf from the Dhar lab.

Competing Interests Statement

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Author contributions.
Chris Chen, Joyce Reimer, Pranta Saha: Methodology, Software, Formal analysis, Data Curation, Writing - Original Draft.
Shaun Wachter: Validation, Investigation, Resources, Data Curation.
Jeff Chen, Neeraj Dhar: Conceptualization, Methodology, Investigation, Resources, Writing - Review & Editing, Funding acquisition, Supervision, Project administration.
Gordon Broderick: Conceptualization, Methodology, Formal analysis, Investigation, Resources, Writing - Original Draft, Writing - Review & Editing, Funding acquisition, Supervision, Project administration.

Data availability statement. All data used to conduct the analyses may be found as Supporting Information.

Additional Information

Additional File: Supplementary_Tables.xlsx

Supplementary Table S1. (a) Pathway membership of NER-extracted entities as documented in the Pathway Commons database. (b) Interaction relationships extracted by INDRA from the Pathway Commons database (* denotes contradictory database entries). (c) Interaction relationships involving cAMP predicted by the CleanLab Trustworthy LLM (CleanLab, San Francisco, CA). (d) Additional interaction relationships required to balance positive and negative regulation predicted by the CleanLab Trustworthy LLM (CleanLab, San Francisco, CA). (e) Additional interaction relationships required to provide positive regulation of cAMP extracted by text mining of the EmBiology database (Elsevier, Amsterdam, NL).

Supplementary Table S2. Stable persistence behaviors expected of the regulatory network (Low=0, Mid-range=1, High=2, Unknown=-99).

Supplementary Table S3. Activation profiles predicted by a sample of 100 competing models satisfying all 3 steady state conditions within a 5% tolerance (6 bits across 63 state variables).

Supplementary Table S4. Extent of deviation from steady state in a sample of 100 competing models satisfying all 3 steady state conditions within a 5% tolerance (6 bits across 63 state variables).

Supplementary Table S5.
Subset of 48 models that exactly satisfy a strongly upregulated steady state for cAMP only from a sample of 100 competing models. Supplementary Table S6 . Retention of inferred relationships in subset of 48 models that exactly satisfy a strongly upregulated steady state for cAMP only from a sample of 100 competing models. Supplementary Table S7 . Simulated response to persistent and pulse perturbation in 48 of 100 competing models supporting cAMP steady state. References Tuberculosis (TB). https://www.who.int/news-room/fact-sheets/detail/tuberculosis. Accessed January 4, 2025 Cambier CJ, Falkow S, Ramakrishnan L. 2014. Host Evasion and Exploitation Schemes of Mycobacterium tuberculosis. Cell 159(7):1497-1509. Chandra P, Grigsby SJ, Philips JA. 2022. Immune evasion and provocation by Mycobacterium tuberculosis. Nat Rev Microbiol 20:750-766. KJ, Keshavjee S, Rich ML. 2015. Multidrug-Resistant Tuberculosis and Extensively Drug-Resistant Tuberculosis. Cold Spring Harb Perspect Med 5:a017863. Raker VK, Becker C, Steinbrink K. 2016. The cAMP Pathway as Therapeutic Target in Autoimmune and Inflammatory Diseases. Front Immunol 7:123. McDonough KA, Rodriguez A. 2012. The myriad roles of cyclic AMP in microbial pathogens: from signal to sword. Nat Rev Microbiol 10:27-38. Lin CT, Chen YC, Jinn TR, Wu CC, Hong YM, Wu WH. 2013. Role of the cAMP-Dependent Carbon Catabolite Repression in Capsular Polysaccharide Biosynthesis in Klebsiella pneumoniae. PLOS ONE 8:e54430. Joseph E, Bernsley C, Guiso N, Ullmann A. 1982. Multiple regulation of the activity of adenylate cyclase in Escherichia coli. Molec Gen Genet 185:262-268. Smith RS, Wolfgang MC, Lory S. 2004. An Adenylate Cyclase-Controlled Signaling Network Regulates Pseudomonas aeruginosa Virulence in a Mouse Model of Acute Pneumonia. Infection and Immunity 72:1677-1684. Shenoy AR, Visweswariah SS. 2006. Mycobacterial adenylyl cyclases: Biochemical diversity and structural plasticity. FEBS Letters 580:3344-3352 Kurpad SS, Dhar N. 
Playing Telephone: How Secondary Messengers Influence Host–Pathogen Interactions in Tuberculosis. ACS Infectious Diseases. 2025 Jun 20. Agarwal N, Lamichhane G, Gupta R, Nolan S, Bishai WR. 2009. Cyclic AMP intoxication of macrophages by a Mycobacterium tuberculosis adenylate cyclase. Nature 460:98-102. Gilbert D, Fuss H, Gu X, Orton R, Robinson S, Vyshemirsky V, Kurth MJ, Downes CS, Dubitzky W. Computational methodologies for modelling, analysis and simulation of signalling networks. Briefings in Bioinformatics. 2006 Dec 1;7(4):339-53. Rachel T, Brombacher E, Wöhrle S, Groß O, Kreutz C. Dynamic modelling of signalling pathways when ordinary differential equations are not feasible. Bioinformatics. 2024 Dec;40(12):btae683. Thomas R, Thieffry D, Kaufman M. Dynamical behaviour of biological regulatory networks—I. Biological role of feedback loops and practical use of the concept of the loop-characteristic state. Bulletin of mathematical biology. 1995 Mar;57:247-76. Thomas R, Kaufman M. Multistationarity, the basis of cell differentiation and memory. II. Logical analysis of regulatory networks in terms of feedback circuits. Chaos: An Interdisciplinary Journal of Nonlinear Science. 2001 Mar 1;11(1):180-95. Chaouiya C, Remy E. Logical modelling of regulatory networks, methods and applications. Bulletin of mathematical biology. 2013 Jun;75:891-5. Abou-Jaoudé W, Traynard P, Monteiro PT, Saez-Rodriguez J, Helikar T, Thieffry D, Chaouiya C. Logical modeling and dynamical analysis of cellular networks. Frontiers in genetics. 2016 May 31;7:94. Sedghamiz H, Chen W, Rice M, Whitley D, Broderick G. Selecting optimal models based on efficiency and robustness in multi-valued biological networks. In2017 IEEE 17th international conference on bioinformatics and bioengineering (BIBE) 2017 Oct 23 (pp. 200-205). IEEE. Sedghamiz H, Morris M, Craddock TJ, Whitley D, Broderick G. 
Bio-modelchecker: using bounded constraint satisfaction to seamlessly integrate observed behavior with prior knowledge of biological networks. Frontiers in Bioengineering and Biotechnology. 2019 Mar 26;7:48. Dey B, Bishai WR. Crosstalk between Mycobacterium tuberculosis and the host cell. In: Seminars in immunology 2014 Dec 1 (Vol. 26, No. 6, pp. 486-496). Academic Press. Naqvi KF, Mazzone SB, Shiloh MU. Infectious and inflammatory pathways to cough. Annual review of physiology. 2023 Feb 10;85(1):71-91. Smith R. An overview of the Tesseract OCR engine. In: Ninth international conference on document analysis and recognition (ICDAR 2007) 2007 Sep 23 (Vol. 2, pp. 629-633). IEEE. Batra P, Phalnikar N, Kurmi D, Tembhurne J, Sahare P, Diwan T. OCR-MRD: performance analysis of different optical character recognition engines for medical report digitization. International Journal of Information Technology. 2024 Jan;16(1):447-55. Onyenwe I, Ogbonna S, Onyedimma E, Ikechukwu-Onyenwe O, Nwafor C. Developing Smart Web-Search using Regex. arXiv preprint arXiv:2110.04767. 2021 Oct 10. Neumann M, King D, Beltagy I, Ammar W. ScispaCy: fast and robust models for biomedical natural language processing. arXiv preprint arXiv:1902.07669. 2019 Feb 20. Hellmann D. The Python standard library by example. Addison-Wesley Professional; 2011 Jun 1. Mubeen S, Hoyt CT, Gemünd A, Hofmann-Apitius M, Fröhlich H, Domingo-Fernández D. The impact of pathway database choice on statistical enrichment analysis and predictive modeling. Frontiers in genetics. 2019 Nov 22;10:1203. Rodchenkov I, Babur O, Luna A, Aksoy BA, Wong JV, Fong D, Franz M, Siper MC, Cheung M, Wrana M, Mistry H. Pathway Commons 2019 Update: integration, analysis and exploration of pathway data. Nucleic acids research. 2020 Jan 8;48(D1):D489-97. Fisher RA. Statistical methods for research workers. In Breakthroughs in statistics: Methodology and distribution 1970 (pp. 66-70). New York, NY: Springer New York. Benjamini Y, Yekutieli D. 
The control of the false discovery rate in multiple testing under dependency. Annals of statistics. 2001 Aug 1:1165-88. Bachman JA, Gyori BM, Sorger PK. Automated assembly of molecular mechanisms at scale from text mining and curated databases. Molecular Systems Biology. 2023 May 9;19(5):e11325. Hoyt CT, Domingo-Fernández D, Aldisi R, Xu L, Kolpeja K, Spalek S, Wollert E, Bachman J, Gyori BM, Greene P, Hofmann-Apitius M. Re-curation and rational enrichment of knowledge graphs in Biological Expression Language. Database. 2019;2019:baz068. Karin O, Swisa A, Glaser B, Dor Y, Alon U. Dynamical compensation in physiological circuits. Molecular systems biology. 2016 Nov;12(11):886. Sardana A. Real-Time Evaluation Models for RAG: Who Detects Hallucinations Best?. arXiv preprint arXiv:2503.21157. 2025 Mar 27. Chen J, Mueller J. Quantifying uncertainty in answers from any language model and enhancing their trustworthiness. arXiv preprint arXiv:2308.16175. 2023 Aug 30. Sanderson K. GPT-4 is here: what scientists think. Nature. 2023 Mar 30;615(7954):773. Wei J, Wang X, Schuurmans D, Bosma M, Xia F, Chi E, Le QV, Zhou D. Chain-of-thought prompting elicits reasoning in large language models. Advances in neural information processing systems. 2022 Dec 6;35:24824-37. Otsu N. A threshold selection method from gray-level histograms. Automatica. 1975 Jun;11(285-296):23-7. Kamdar MR, Stanley CE, Carroll M, Wogulis L, Dowling W, Deus HF, Samarasinghe M. Text Snippets to Corroborate Medical Relations: An Unsupervised Approach using a Knowledge Graph and Embeddings. AMIA Jt Summits Transl Sci Proc. 2020 May 30;2020:288-297. Novichkova S, Egorov S, Daraselia N. MedScan, a natural language processing engine for MEDLINE abstracts. Bioinformatics. 2003 Sep 1;19(13):1699-706. Daraselia N, Yuryev A, Egorov S, Novichkova S, Nikitin A, Mazo I. Extracting human protein interactions from MEDLINE using a full-sentence parser. Bioinformatics. 2004 Mar 22;20(5):604-11. Albert R, Thakar J. 
Boolean modeling: a logic‐based dynamic approach for understanding signaling and regulatory networks and for making useful predictions. Wiley Interdisciplinary Reviews: Systems Biology and Medicine. 2014 Sep;6(5):353-69. Sedghamiz H, Morris M, Craddock TJ, Whitley D, Broderick G. High-fidelity discrete modeling of the HPA axis: a study of regulatory plasticity in biology. BMC systems biology. 2018 Dec;12:1-6. Russell DG, Cardona PJ, Kim MJ, Allain S, Altare F. 2009. Foamy macrophages and the progression of the human TB granuloma. Nat Immunol 10:943-948. Wang JY, Chang HC, Liu JL, Shu CC, Lee CH, Wang JT, Lee LN. 2012. Expression of toll-like receptor 2 and plasma level of interleukin-10 are associated with outcome in tuberculosis. Eur J Clin Microbiol Infect Dis 31:2327-2333. Rothchild AC, Jayaraman P, Nunes-Alves C, Behar SM. 2014. iNKT Cell Production of GM-CSF Controls Mycobacterium tuberculosis. PLOS Pathogens .;10:e1003805. Silvério D, Gonçalves R, Appelberg R, Saraiva M. 2021. Advances on the Role and Applications of Interleukin-1 in Tuberculosis. mBio 12:e03134-21. HaileMariam M, Yu Y, Singh H, Teklu T, Wondale B, Worku A, Zewude A, Mounaud S, Tsitrin T, Legesse M, Gobena A, Pieper R. 2021. Protein and Microbial Biomarkers in Sputum Discern Acute and Latent Tuberculosis in Investigation of Pastoral Ethiopian Cohort. Front Cell Infect Microbiol 11. Queval CJ, Song OR, Deboosère N, Delorme V, Debrie AS, Iantomasi R, Veyron-Churlet R, Jouny S, Redhage K, Deloison G, Baulard A, Chamaillard M, Locht C, Brodin P. 2016. STAT3 Represses Nitric Oxide Synthesis in Human Macrophages upon Mycobacterium tuberculosis Infection. Sci Rep 6:29297. Ayalew S, Wegayehu T, Wondale B, Tarekegn A, Tessema B, Admasu F, Piantadosi A, Sahi M, Gebresilase TT, Fredolini C, Mihret A. 2024. Candidate serum protein biomarkers for active pulmonary tuberculosis diagnosis in tuberculosis endemic settings. BMC Infect Dis 24:1-15. 
Reimer J, Page J, Saha P, Shen S, Zhu X, Qian S, Mammen M, Qu J, Sethi S, Broderick GJ. Leveraging Dynamic Stability to Infer Regulation in Protein-Protein Interaction Networks: A Study of Infectious Vulnerability in COPD. bioRxiv. 2025:2025-05. Lyman CA, Morris MM, Richman S, Cao H, Scerri A, Cheadle C, Broderick G. High fidelity modeling of pulse dynamics using logic networks. In2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) 2021 Dec 9 (pp. 197-204). IEEE. Seamon K, Daly JW. Activation of adenylate cyclase by the diterpene forskolin does not require the guanine nucleotide regulatory protein. Journal of Biological Chemistry. 1981 Oct 10;256(19):9799-801. Barde I, Laurenti E, Verp S, Wiznerowicz M, Offner S, Viornery A, Galy A, Trumpp A, Trono D. 2011. Lineage- and stage-restricted lentiviral vectors for the gene therapy of chronic granulomatous disease. Gene Ther 18:1087-1097. Dull T, Zufferey R, Kelly M, Mandel RJ, Nguyen M, Trono D, Naldini L.1998. A Third-Generation Lentivirus Vector with a Conditional Packaging System. Journal of Virology 72:8463-8471. Peyron P, Vaubourgeix J, Poquet Y, Levillain F, Botanch C, Bardou F, Daffé M, Emile JF, Marchou B, Cardona PJ, de Chastellier C, Altare F. 2008. Foamy Macrophages from Tuberculous Patients’ Granulomas Constitute a Nutrient-Rich Reservoir for M. tuberculosis Persistence. PLOS Pathogens 4:e1000204. Russell DG, Huang L, VanderVen BC. 2019. Immunometabolism at the interface between macrophages and pathogens. Nat Rev Immunol 19:291-304. Martinez AN, Mehra S, Kaushal D. 2013. Role of Interleukin 6 in Innate Immunity to Mycobacterium tuberculosis Infection. J Infect Dis 207:1253-1261. Fernando MR, Reyes JL, Iannuzzi J, Leung G, McKay DM. 2014. The Pro-Inflammatory Cytokine, Interleukin-6, Enhances the Polarization of Alternatively Activated Macrophages. PLOS ONE 9:e94188. Sanmarco LM, Ponce NE, Visconti LM, Eberhardt N, Theumer MG, Minguez ÁR, Aoki MP. 2017. 
IL-6 promotes M2 macrophage polarization by modulating purinergic signaling and regulates the lethal release of nitric oxide during Trypanosoma cruzi infection. Biochimica et Biophysica Acta (BBA) - Molecular Basis of Disease 1863:857-869. Luan D, Dadpey B, Zaid J, Bridge-Comer PE, DeLuca JH, Xia W, Castle J, Reilly SM. 2023. Adipocyte-Secreted IL-6 Sensitizes Macrophages to IL-4 Signaling. Diabetes 72:367-374. Chio CC, Chang YH, Hsu YW, Chi KH, Lin WW. 2004. PKA-dependent activation of PKC, p38 MAPK and IKK in macrophage: implication in the induction of inducible nitric oxide synthase and interleukin-6 by dibutyryl cAMP. Cellular Signalling 16:565-575. Seimon TA, Wang Y, Han S, Senokuchi T, Schrijvers DM, Kuriakose G, Tall AR, Tabas IA. Macrophage deficiency of p38α MAPK promotes apoptosis and plaque necrosis in advanced atherosclerotic lesions in mice. The Journal of clinical investigation. 2009 Apr 1;119(4):886-98. Liu H, Ma Y, Cole SM, Zander C, Chen KH, Karras J, Pope RM. Serine phosphorylation of STAT3 is essential for Mcl-1 expression and macrophage survival. Blood. 2003 Jul 1;102(1):344-52. Ju E, Park KA, Shen HM, Hur GM. The resurrection of RIP kinase 1 as an early cell death checkpoint regulator—a potential target for therapy in the necroptosis era. Experimental & Molecular Medicine. 2022 Sep;54(9):1401-11. Sinha N, Jain V, Chadha A. Are Small Language Models Ready to Compete with Large Language Models for Practical Applications?. arXiv preprint arXiv:2406.11402. 2024 Jun 17. Videla S, Konokotina I, Alexopoulos LG, Saez-Rodriguez J, Schaub T, Siegel A, Guziolowski C. Designing experiments to discriminate families of logic models. Frontiers in bioengineering and biotechnology. 2015 Sep 4;3:131. Additional Declarations No competing interests reported. 
Supplementary Files: SupplementaryFigures.pdf, SupplementaryTables.xlsx

Editorial decision: Revision requested 19 Feb, 2026 Reviews received at journal 10 Dec, 2025 Reviews received at journal 10 Dec, 2025 Reviews received at journal 08 Dec, 2025 Reviewers agreed at journal 01 Dec, 2025 Reviewers agreed at journal 01 Dec, 2025 Reviewers agreed at journal 29 Nov, 2025 Reviewers agreed at journal 29 Nov, 2025 Reviewers agreed at journal 29 Nov, 2025 Reviewers agreed at journal 29 Nov, 2025 Editor invited by journal 30 Oct, 2025 Reviews received at journal 24 Sep, 2025 Reviewers agreed at journal 04 Sep, 2025 Reviewers agreed at journal 04 Sep, 2025 Reviewers invited by journal 03 Sep, 2025 Editor assigned by journal 09 Aug, 2025 Submission checks completed at journal 07 Aug, 2025 First submitted to journal 31 Jul, 2025
Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-7265502","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Article","associatedPublications":[],"authors":[{"id":512240837,"identity":"521dc4ed-8a0d-483c-8c4c-36ea05bed207","order_by":0,"name":"Chris Chen","email":"","orcid":"","institution":"University of Saskatchewan","correspondingAuthor":false,"prefix":"","firstName":"Chris","middleName":"","lastName":"Chen","suffix":""},{"id":512240838,"identity":"4e7af388-6df8-469c-8a13-4de69393da5e","order_by":1,"name":"Pranta Saha","email":"","orcid":"","institution":"University of Saskatchewan","correspondingAuthor":false,"prefix":"","firstName":"Pranta","middleName":"","lastName":"Saha","suffix":""},{"id":512240839,"identity":"45152a83-4363-4a55-a2cc-037e028fc0b7","order_by":2,"name":"Joyce Reimer","email":"","orcid":"","institution":"University of Saskatchewan","correspondingAuthor":false,"prefix":"","firstName":"Joyce","middleName":"","lastName":"Reimer","suffix":""},{"id":512240840,"identity":"e7d53bb5-66bb-4faa-ad8c-f9b53c95999b","order_by":3,"name":"Shaun Wachter","email":"","orcid":"","institution":"University of Saskatchewan","correspondingAuthor":false,"prefix":"","firstName":"Shaun","middleName":"","lastName":"Wachter","suffix":""},{"id":512240841,"identity":"9c0f6a5d-cf4d-451d-a500-7eca2ee5ea5e","order_by":4,"name":"Jeff Chen","email":"","orcid":"","institution":"University of Saskatchewan","correspondingAuthor":false,"prefix":"","firstName":"Jeff","middleName":"","lastName":"Chen","suffix":""},{"id":512240842,"identity":"43389726-ff3c-44fc-abc5-d60867b23e43","order_by":5,"name":"Neeraj 
Dhar","email":"","orcid":"","institution":"University of Saskatchewan","correspondingAuthor":false,"prefix":"","firstName":"Neeraj","middleName":"","lastName":"Dhar","suffix":""},{"id":512240843,"identity":"1e3db911-ae5e-4b63-88c3-77de8dc06669","order_by":6,"name":"Gordon Broderick","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAA/ElEQVRIiWNgGAWjYJACCSBOAGIDEDLgB4sdIEWLZANpWoCEwQECWuTdex/e+LiDIc+8vXnj44oCO2PjG+kPP/44wyDP34Bdi+GZ48aWM88wFMucOVZseMYg2czsRo6xNM8NBsMZOGwynJHGJs3bxpA4QyLHTLLB4IANUAuDNMMHoFOJ0GL+E6TFeEb6458/gFrkcWiRl0CyhRGoxcxAIsFMAuiwBAMcWgx4jjFbzmyTKJbgOVYMdFiyscSZN2bWPGckDDfisqW9jfHGxzabPAn25o0fG/7YGfa3pz+++eOYjbwcLlsg4hIYEpgicFsacEqNglEwCkbBKIACAGd2WY1rCzz6AAAAAElFTkSuQmCC","orcid":"","institution":"University of Saskatchewan","correspondingAuthor":true,"prefix":"","firstName":"Gordon","middleName":"","lastName":"Broderick","suffix":""}],"badges":[],"createdAt":"2025-07-31 20:38:19","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-7265502/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-7265502/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":91074688,"identity":"22b47f43-db11-4058-aa8f-403a45e64f2d","added_by":"auto","created_at":"2025-09-11 11:04:07","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":365645,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cem\u003eExtraction of biological entities from text\u003c/em\u003e. Recovery of biological entities of interest from two example domain relevant publications [Dey and Bishai, 2014; Naqvi et al., 2023] using the Python REGEX library for PDF conversion to text and tokenization \u003cstrong\u003e(a)\u003c/strong\u003e. Recognition of entities was performed using spaCy and Google docTR to deploy biomedical domain specific LLM \u003cstrong\u003e(b)\u003c/strong\u003e. 
These identified 22 molecular entities relevant to Mtb persistence and 9 entities relevant to Mtb transmission through cough.\u003cstrong\u003e \u003c/strong\u003eDocumented relationships linking these entities were recovered from the Pathway Commons and BEL Large Corpus databases using the INDRA Python library resulting in the inclusion of entity first neighbors leading to a total of 399 entities linked by 560 documented regulatory relationships \u003cstrong\u003e(c).\u003c/strong\u003e\u003c/p\u003e","description":"","filename":"1.png","url":"https://assets-eu.researchsquare.com/files/rs-7265502/v1/34b3eb87eb97edeeda5d4076.png"},{"id":91072965,"identity":"d8a775b5-5c56-4d3e-91c7-4a8882c55b9d","added_by":"auto","created_at":"2025-09-11 10:56:07","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":324709,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cem\u003eReverse engineering immune regulatory logic.\u003c/em\u003e\u003cem\u003e\u003cstrong\u003e \u003c/strong\u003e\u003c/em\u003eA simple state transition logic applies detection thresholds to upstream mediators to mimic differences in receptor affinity as well as decisional weights to capture context specific strength of mediation. Families of decisional logic parameter values supporting network dynamics that include observed behaviors are identified by formulating and solving a constraint satisfaction (SAT) problem.\u003c/p\u003e","description":"","filename":"2.png","url":"https://assets-eu.researchsquare.com/files/rs-7265502/v1/db9c9bc6dc38ce543a899be1.png"},{"id":91072967,"identity":"03de5bc3-6ea8-48e2-b40b-7ad7546bfe43","added_by":"auto","created_at":"2025-09-11 10:56:07","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":562099,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cem\u003eExtraction of documented pathway relationships\u003c/em\u003e. 
Recovery of known regulatory relationships linking 31 text-mined molecular entities to each other and to related mediators/ targets documented in the manually curated pathway databases Pathway Commons and BEL Large Corpus. This produced an initial overarching network of 560 relations linking 399 nodes \u003cstrong\u003e(a)\u003c/strong\u003e. Extracting only those relationships and entities that were included in a feedback loop produced a minimal closed loop circuit of 62 entities linked through 149 relationships \u003cstrong\u003e(b)\u003c/strong\u003e.\u003c/p\u003e","description":"","filename":"3.png","url":"https://assets-eu.researchsquare.com/files/rs-7265502/v1/7b803f26d3045240724c3350.png"},{"id":91074686,"identity":"ff8f3e56-b37f-4aad-8690-ac43ecd79e9a","added_by":"auto","created_at":"2025-09-11 11:04:07","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":845125,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cem\u003eAugmented minimal closed loop regulatory network involving cAMP\u003c/em\u003e. An augmented network reintroducing regulatory interactions with cAMP as well as new regulatory interactions required to make available a balance of positive and negative regulation at each network node. This network put forward for dynamics stability assessment consisted of 178 interactions linking 63 molecular entities.\u003c/p\u003e","description":"","filename":"4.png","url":"https://assets-eu.researchsquare.com/files/rs-7265502/v1/50be08d70a852672b8103ebe.png"},{"id":91074694,"identity":"02c5f0cb-2093-4b60-b546-d8077f9b34e9","added_by":"auto","created_at":"2025-09-11 11:04:07","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":206125,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cem\u003ePredicted co-expression profiles consistent with stable upregulation of cAMP (C2 constraint)\u003c/em\u003e. 
A survey of 48 models that support persistent stable activation of cAMP show similar stable elevated activation of NFKBIA (43/ 48), TNFRSF1B (43/48), AKT2 (42/48), PTEN (42/48), TGFA (42/48), IL18 (42/66), TNF (43/48), INS (41/48), and EGFR (41/48) .\u003c/p\u003e","description":"","filename":"5.png","url":"https://assets-eu.researchsquare.com/files/rs-7265502/v1/7f98a2bcaa15b40d5771f155.png"},{"id":91078425,"identity":"0c640531-2fb1-4752-adb0-8f5868c54472","added_by":"auto","created_at":"2025-09-11 11:20:07","extension":"png","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":182616,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cem\u003ePredicted cAMP response (C2 constraint)\u003c/em\u003e. Simulations (synchronous update) of continuous (CONSTANT) and transient pulse (PULSE) perturbations to individual network nodes/ markers across all 48 models that support persistent stable activation of cAMP. Persistent upregulation of cAMP was predicted to be produced by addition of IL6 (48/48 continuous; 4/48 pulse), EGFR (48/48 continuous; 4/48 pulse), MAPK1 (47/48 continuous; 4/48 pulse), and IL18 (43/48 continuous; 2/48 pulse).\u003c/p\u003e","description":"","filename":"6.png","url":"https://assets-eu.researchsquare.com/files/rs-7265502/v1/9342be13f3ae792d48caaa46.png"},{"id":91072973,"identity":"59b2940c-f991-46bf-8988-3ae3d4827a6e","added_by":"auto","created_at":"2025-09-11 10:56:07","extension":"png","order_by":7,"title":"Figure 7","display":"","copyAsset":false,"role":"figure","size":94202,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cem\u003eMeasurement of cAMP levels\u003c/em\u003e. Cell lysate was collected from PMA differentiated THP-1 cells treated with LPS to induce IL-6 expression. (a): Measurement of cAMP levels with ELISA found that LPS induced a transient pulse of cAMP after 60 minutes, which decreased at 120 minutes. 
(b): Validation of cAMP readout with Forskolin (FSK) activation of adenylyl cyclases in differentiated THP-1 cells, which induced a large increase in cAMP levels within 30 minutes.\u003c/p\u003e","description":"","filename":"7.png","url":"https://assets-eu.researchsquare.com/files/rs-7265502/v1/5a9e4c4aa6f3c810d1461d91.png"},{"id":91079861,"identity":"8c7298cd-ab09-4443-bee9-ff2bf8f06832","added_by":"auto","created_at":"2025-09-11 11:28:09","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":3647208,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7265502/v1/dc7d331c-d95c-4e07-beb9-401279d6c1d4.pdf"},{"id":91072961,"identity":"82419fcd-aca8-4325-84c4-1b0dc510ed70","added_by":"auto","created_at":"2025-09-11 10:56:07","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"supplement","size":1567506,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryFigures.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7265502/v1/2a3a2a1f0cdefe70ac2ae086.pdf"},{"id":91074687,"identity":"98219fc4-fa42-4ddd-a776-e3cc7ffc2395","added_by":"auto","created_at":"2025-09-11 11:04:07","extension":"xlsx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":210406,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryTables.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-7265502/v1/f10d3af59112c6bbea965e5d.xlsx"}],"financialInterests":"No competing interests reported.","formattedTitle":"Applying prior knowledge of regulatory signaling to investigate macrophage cAMP dynamics during Mycobacterium tuberculosis infection","fulltext":[{"header":"1. Introduction","content":"\u003cp\u003eTuberculosis (TB) is the leading cause of death amongst infectious diseases, impacting millions of people every year. [WHO, 2025]. 
The causative bacterium, \u003cem\u003eMycobacterium tuberculosis\u003c/em\u003e (Mtb), has co-evolved with humans for thousands of years and has developed unique ways of evading host immune defenses [Cambier et al., \u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2014\u003c/span\u003e]. As a facultative intracellular pathogen, Mtb enters host macrophages and interferes with host cell signaling, modulates the formation of the phagolysosome and establishes an environment for its own growth and survival. As Mtb replicates and the bacterial burden increases, multiple infected macrophages aggregate and form a granuloma, from which the pathogen eventually escapes and spreads, causing active disease. In addition to the active form of the disease, containment of Mtb by the host immune system can result in a latent TB infection, where Mtb persists in the host and can convert back to an active infection, even though the host is not infectious or symptomatic in this stage [Chandra et al., \u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e2022\u003c/span\u003e]. Furthermore, the granuloma protects Mtb from innate immune responses and chemotherapy, and a rigorous regimen of multiple antibiotics over many months is required for full Mtb clearance. This is often further complicated by the emergence of multidrug resistant (MDR) and extensively drug resistant (XDR) strains of the pathogen [Keshavjee and Rich, \u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e2015\u003c/span\u003e]. To establish this favourable environment for its growth and survival, Mtb uses multiple mechanisms to redirect host signaling and halt immune system clearance. One way Mtb accomplishes this is by acting on the cyclic AMP (cAMP) signaling pathway. 
Found in both humans and bacteria, the signaling molecule cAMP is crucial in numerous physiological functions, including environmental responses, control of gene expression and, notably, regulation of immune responses [Raker et al., \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e2016\u003c/span\u003e; McDonough and Rodriguez, \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e2012\u003c/span\u003e]. While many pathogenic bacteria encode only one or a few adenylyl cyclase (AC) genes for cAMP synthesis, Mtb harbours 16 such genes. This overabundance of AC genes implicates cAMP as a central Mtb virulence factor [Lin et al., \u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e2013\u003c/span\u003e; Joseph et al., \u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e1982\u003c/span\u003e; Smith et al., \u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e2004\u003c/span\u003e; Shenoy and Visweswariah, \u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e2006\u003c/span\u003e; Kurpad and Dhar, \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e2025\u003c/span\u003e]. Earlier studies have demonstrated that intoxication of macrophages by Mtb-produced cAMP impacts immune function, further suggesting that perturbation of host cAMP pathways may constitute a potential strategy for disrupting Mtb\u0026rsquo;s infection cycle [Agarwal et al., \u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e2009\u003c/span\u003e]. However, the vast interconnectedness of cAMP with numerous signaling pathways makes the isolation and study of relevant molecules and pathways challenging.\u003c/p\u003e\u003cp\u003eIn this work we attempt to explore the mechanisms by which Mtb exploits cAMP signaling to its advantage by constructing an \u003cem\u003ein silico\u003c/em\u003e model of cAMP signaling and validating it using experiments that perturb cAMP levels in a THP-1 monocyte cell line. 
Although the use of \u003cem\u003ein silico\u003c/em\u003e models of pathway dynamics has a rich history, standard methods typically rely on rate equations formulated as large sets of coupled ordinary differential equations [Gilbert et al., \u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e2006\u003c/span\u003e]. Such models offer a high degree of mechanistic fidelity but are highly nonlinear in their parameters, which makes these parameters computationally very challenging to estimate. Accordingly, substantial amounts of data are required to support model identification [Rachel et al., \u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e2024\u003c/span\u003e] even when model reduction techniques are used to decrease the model size, with the latter leading to loss of mechanistic interpretability and bias in model uncertainty. Discrete logic models offer an attractive alternative for qualitatively capturing the characteristic dynamic behaviors of regulatory networks [Thomas et al., \u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e1995\u003c/span\u003e; Thomas and Kaufman, \u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e2001\u003c/span\u003e] using a much-reduced parameter space and a correspondingly small amount of supporting data [Chaouiya and Remy, \u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e2013\u003c/span\u003e; Abou-Jaoud\u0026eacute; et al., \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e2016\u003c/span\u003e]. Indeed, this approach has been extended further to address instances of very sparse and partially observed data by leveraging model checking techniques [Sedghamiz et al., 2017; Sedghamiz et al., \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e2019\u003c/span\u003e] where what little experimental data may be available are used to constrain the selection of competing mechanistically informed models instead of constructing an empirical model de novo. 
In a departure from conventional data-driven empirical modeling, we demonstrate in this work that a hypothesis-driven approach rooted in prior mechanistic knowledge may be applied to explore host immune pathway dynamics in the extreme case where no numerical data is available at the outset. Key regulatory mediators relevant to Mtb infection and transmission were extracted from the literature and their known regulatory interactions recovered from multiple manually curated pathway databases. To satisfy the closed loop network architecture required to support homeostatic regulatory stability, undocumented candidate interactions were inferred using a generative artificial intelligence (AI) model with their credibility assessed using information theoretic metrics of trustworthiness. Information flow through the resulting network and the active recruitment of pathway elements was simulated using sets of discrete regulatory logic rules where the predicted network dynamic behaviors were required to align with partial data reported in the literature. A set of competing regulatory logic models was then used to assess the impact of perturbations in each of the network molecular mediators on cAMP response. Using these models, we validate the response of cAMP to a lipopolysaccharide (LPS)-induced pulse in IL-6 and demonstrate the potential of a knowledge-based approach to explore regulatory network dynamics in situations where experimental data is sparse, partially observed or initially absent altogether.\u003c/p\u003e"},{"header":"2. Materials and Methods","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e\u003ch2\u003e2.1. Identifying molecular mediators of interest\u003c/h2\u003e\u003cp\u003eIn order to create a foundation for the regulatory network, molecular mediators of interest were first identified from existing literature. 
In this proof-of-concept exercise, we selected a recent review of host immune pathways involved in the persistence of Mtb infection [Dey and Bishai, 2014] as well as a review of mechanisms driving Mtb transmission through cough [Naqvi et al., \u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e2023\u003c/span\u003e]. Proteins and other signaling molecules were extracted automatically from these two papers using a custom domain-specific named entity recognition (NER) engine. The detailed workflow used to extract molecular mediators from the raw PDF files in this small example expert-curated corpus is shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e \u003cb\u003e(a), (b)\u003c/b\u003e. Raw PDF files were first translated into text using Google\u0026rsquo;s optical character recognition (OCR) tool Tesseract [Smith, 2007] concurrently with Mindee DocTR [Batra et al., \u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e2024\u003c/span\u003e]. Character sequences matching regular expressions for patterns typically found in a sentence were then identified in the resulting text using the Python library Regex [Onyenwe et al., \u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e2021\u003c/span\u003e]. Sentence terms were then recognized as proteins, transcripts or other biological entities by iteratively applying pretrained biomedical ontology models, specifically the en_ner_bionlp13cg_md, en_ner_bc5cdr_md, and en_ner_jnlpba_md language models, available as part of the scispaCy Python library [Neumann et al., \u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e2019\u003c/span\u003e]. Candidate molecular species predicted by the scispaCy models were verified against the HUGO and UniProt ontologies. 
Additional potential molecules of interest were recovered by proposing corrections to discrepancies in spelling, case, etc. using the \u003cem\u003edifflib.SequenceMatcher\u003c/em\u003e class included in the \u003cem\u003edifflib\u003c/em\u003e module of the Python Standard Library [Hellman, 2011]. Recovered candidates were then manually verified.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eTo investigate the functional relevance of these NER-recovered entities, a gene set enrichment analysis (GSEA) [Mubeen et al., \u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e2019\u003c/span\u003e] was conducted to identify their known involvement in pathways documented in the Pathway Commons database [Rodchenkov et al., \u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e2020\u003c/span\u003e]. Pathways were considered significantly enriched if a given subset of genes exhibited a \u003cem\u003eq\u003c/em\u003e value\u0026thinsp;\u0026lt;\u0026thinsp;0.05 using a one-sided Fisher\u0026rsquo;s exact test [Fisher, 1992] subjected to a multiple hypothesis testing correction with the Benjamini\u0026ndash;Yekutieli method under dependency [Benjamini and Yekutieli, 2001].\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec4\" class=\"Section2\"\u003e\u003ch2\u003e2.2. Assembling a regulatory network de novo\u003c/h2\u003e\u003cp\u003e\u003cdiv class=\"BlockQuote\"\u003e\u003cp\u003eRelationships coordinating changes in these molecular mediators of interest were extracted from manually curated pathway databases using the suite of tools available within the Integrated Network and Dynamical Reasoning Assembler (INDRA v1.23.0) [Bachman et al., \u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e2023\u003c/span\u003e] (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e\u003cb\u003e(c)\u003c/b\u003e). 
Specifically, known pathway relationships were extracted using BEL processor and BioPax queries from the BEL Selventa Large Corpus (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://github.com/cthoyt/selventa-knowledge\u003c/span\u003e\u003cspan address=\"https://github.com/cthoyt/selventa-knowledge\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e) [Hoyt et al. \u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e2019\u003c/span\u003e] and the Pathway Commons database [Rodchenkov et al., \u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e2020\u003c/span\u003e] respectively. The latter is itself an aggregate of 23 databases including KEGG Pathway, PANTHER Pathway, Reactome and others, for a total catalogue of over 3.5\u0026nbsp;million curated interactions drawn across close to 7,000 pathways. Relationships from these databases are reconciled and translated into INDRA statements. For example, the statement \u0026ldquo;Activation(ERK(kinase), PRKAC(), kinase)\u0026rdquo; would describe the activation of extracellular signal-regulated kinase (ERK) by the alpha catalytic subunit of protein kinase A (PKA). To achieve this reconciliation across sources, INDRA combines a number of innovative algorithms to address a variety of data fusion challenges that include but are not limited to the inconsistent use of identifiers, fully or partially redundant representations of the same mechanism, detecting and resolving hierarchical relationships between pathway elements, reconciling contradictory statements and ensuring consistency and coherence in causal logic chains across multiple statements. 
Though INDRA extracts and assigns multiple types of relationships, our focus in this work was on regulatory control actions, so we used only those relationships labelled as \u0026ldquo;increased amount\u0026rdquo;, \u0026ldquo;decreased amount\u0026rdquo;, \u0026ldquo;activation\u0026rdquo;, and \u0026ldquo;inhibition\u0026rdquo; of a target molecular species when acted upon by a source species.\u003c/p\u003e\u003c/div\u003e\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec5\" class=\"Section2\"\u003e\u003ch2\u003e2.3. Closing regulatory loops using hypothesis generation\u003c/h2\u003e\u003cp\u003eTo support homeostatic stability, a regulatory network model must incorporate closed-loop feedback and feedforward control motifs [Thomas et al., 1995; Karin et al., \u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e2016\u003c/span\u003e]. To favour a minimal network model that makes maximal use of known documented relationships, the broader network was pruned by removing nodes with zero input (sources) or output (sinks). In doing so cAMP was also removed as it did not exercise any documented downstream regulatory actions onto the broader network. It was reintroduced by restoring known upstream relationships to cAMP previously extracted using INDRA (\u003cb\u003eSection 2.2\u003c/b\u003e) and predicting candidate downstream relationships using a generative artificial intelligence (AI) large language model (LLM) for hypothesis generation. Given the tendency for LLMs to produce spurious results or hallucinate, we selected the Trustworthy Language Model (TLM) from Cleanlab (San Francisco, CA) as a means of generating and quantifying the reliability of the hypothesized relationships [Sardana, \u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e2025\u003c/span\u003e]. 
The latter does not draw on a custom-trained model to assess uncertainty but rather uses an LLM-as-a-judge architecture as an overarching environment to assess the answers provided by any user-specified LLM, in this case OpenAI's GPT-4 (San Francisco, CA) [Sanderson, \u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e2023\u003c/span\u003e]. Relationships between cAMP and each molecule found in the existing closed-loop model were queried to the TLM with the following sentence structures:\u003cdiv class=\"BlockQuote\"\u003e\u003cp\u003eDoes protein [protein name] regulate messenger molecule cAMP (1) positively, (2) negatively or (3) not at all? Respond with only one choice\u003c/p\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003eAnd conversely,\u003cdiv class=\"BlockQuote\"\u003e\u003cp\u003eDoes messenger molecule cAMP regulate protein [protein name] (1) positively, (2) negatively or (3) not at all? Respond with only one choice.\u003c/p\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003eConfidence of the suggested regulatory relationships was assessed based on two basic premises, namely (i) observed consistency, and (ii) self-reflection as extrinsic and intrinsic evaluations of LLM confidence [Chen and Mueller, \u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e2023\u003c/span\u003e]. The former, observed consistency, consists of assessing the dissimilarity between answers obtained from the LLM using multiple Chain-of-Thought (CoT) [Wei et al., 2020] variants of the initial query using different levels of output randomness or sampling temperature. The latter, self-reflection certainty, consists of simply asking the LLM to intrinsically reflect on how confident it is about whether its own previously generated answer is correct or not. These two measures are combined to produce an aggregate trustworthiness score. 
The distribution of trustworthiness scores obtained for all TLM queries regarding upstream and downstream regulatory actions was then analyzed using Otsu\u0026rsquo;s method [Otsu, \u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e1975\u003c/span\u003e] to generate a threshold value separating background and foreground values. Predicted interactions above the generated threshold (foreground) were retained as candidate regulatory actions in the network. In isolated cases where the necessary balance of upstream regulatory actions is not predicted by the LLM, we conducted a focused search of the Elsevier Biology Knowledge Graph (Elsevier, Amsterdam, NL) [Kamdar et al., \u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e2020\u003c/span\u003e] which is populated by relationships extracted from peer reviewed scientific literature by the natural language processing (NLP) engine MedScan [Novichkova et al., \u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e2003\u003c/span\u003e; Daraselia et al., \u003cspan citationid=\"CR42\" class=\"CitationRef\"\u003e2004\u003c/span\u003e].\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec6\" class=\"Section2\"\u003e\u003ch2\u003e2.4. Decisional logic of the regulatory network\u003c/h2\u003e\u003cp\u003eGiven a candidate network structure, a decisional logic program is required to direct each molecular mediator node\u0026rsquo;s response to incoming control actions. 
Extending foundational work by others using Boolean logic [Albert and Thakar, \u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e2014\u003c/span\u003e], we apply a more granular multi-level state transition logic whereby the actions of upstream nodes expressed above a perception threshold are combined, weighing the competing actions of weak inactivators against strong activators (and vice versa ) to compute a change in the state of the target regulated node [Sedghamiz et al., \u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e2018\u003c/span\u003e] (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e). Sets of integer values for the decisional logic parameters, as well as the values of unobserved states, are estimated by solving a constraint satisfaction (SAT) problem whereby allowable values are those that support predicted network behaviors that do not contradict available observations, with these often being sparse and incomplete [Sedghamiz et al., \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e2019\u003c/span\u003e]. In this work we explore the scenario where no prior experimental results are available. 
Instead we constrain network behaviors to explain three steady state conditions that might be expected of macrophages described qualitatively and incompletely by the relative activation in a subset of markers in: (1) an uninfected baseline state, where levels of each node are at a low basal level of zero, (2) an unmeasured condition resulting in the persistent upregulation of cAMP to a maximum level of 2 and (3) a partially observed state reported piecemeal across multiple literature sources and describing activation levels in a subset of markers corresponding to a state of persistent Mtb infection [Russell et al., \u003cspan citationid=\"CR45\" class=\"CitationRef\"\u003e2009\u003c/span\u003e; Wang et al., \u003cspan citationid=\"CR46\" class=\"CitationRef\"\u003e2012\u003c/span\u003e; Rothchild et al., \u003cspan citationid=\"CR47\" class=\"CitationRef\"\u003e2014\u003c/span\u003e; Silv\u0026eacute;rio et al., \u003cspan citationid=\"CR48\" class=\"CitationRef\"\u003e2021\u003c/span\u003e; Haile et al., 2021; Queval et al., \u003cspan citationid=\"CR50\" class=\"CitationRef\"\u003e2016\u003c/span\u003e; Ayalew et al., \u003cspan citationid=\"CR51\" class=\"CitationRef\"\u003e2024\u003c/span\u003e]. As this parameter estimation may also result in changes to the network architecture, additional constraints were applied to enforce structural integrity and to ensure overall network stability, for example disallowing the formation of new source or sink nodes and nodes subject to purely inhibitory regulation [Reimer et al., \u003cspan citationid=\"CR52\" class=\"CitationRef\"\u003e2025\u003c/span\u003e]. Of the competing models that complied with these constraints, a subset of 100 network solutions were chosen for further study of signaling dynamics.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec7\" class=\"Section2\"\u003e\u003ch2\u003e2.5. 
Simulation of cAMP signaling responses\u003c/h2\u003e\u003cp\u003eUsing the state transition logic defined by the parameter sets described above, an increase or decrease in the activation level of each node is computed and this change is applied in an upcoming iteration using a synchronous, asynchronous or priority-based scheduling scheme [Lyman et al., \u003cspan citationid=\"CR53\" class=\"CitationRef\"\u003e2021\u003c/span\u003e]. In this work, a synchronous update scheme was applied whereby all incremental changes proposed by the state transition logic are enacted at all nodes simultaneously to update the network state from one iteration to the next. This update strategy delivers improved computational efficiency when the focus is on the achievability of migration from one steady state to another rather than the exact sequence of events along the transition path. On this basis, a simulation-based sensitivity analysis was applied to a subset of 48 candidate network models which exactly supported the upregulation of cAMP as an achievable persistent state. Specifically, for each model the predicted response to exogenous upregulation of each node was simulated one node at a time and results surveyed across all models to identify perturbations most likely to produce a long-lasting upregulation of cAMP. Perturbations were introduced as continuous perturbations, lasting throughout the simulation time course of 100 iterations, as well as limited pulses where exogenous upregulation was maintained for 50 iterations then discontinued.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec8\" class=\"Section2\"\u003e\u003ch2\u003e2.6. 
THP-1 human monocytic cell culture\u003c/h2\u003e\u003cp\u003eHuman THP-1 monocytes were obtained from ATCC (TIB-202) and regularly maintained in RPMI-1640 media supplemented with 10% fetal bovine serum (FBS), 1 mM sodium pyruvate, 2.5 mM N-2-hydroxyethylpiperazine-N-2-ethane sulfonic acid (HEPES), and 1x non-essential amino acids (Thermo Scientific #11140050) at 37 °C in 5% CO\u003csub\u003e2\u003c/sub\u003e. Cells were regularly counted via trypan blue staining in a hemocytometer and passages were performed as necessary to maintain cell density below 1 x 10\u003csup\u003e6\u003c/sup\u003e cells/mL. To differentiate THP-1 monocytes to macrophages, cells were seeded in a 12-well plate at a density between 1\u0026ndash;3 x 10\u003csup\u003e5\u003c/sup\u003e cells/mL. Cells were exposed to 80 ng/mL of phorbol 12-myristate 13-acetate (PMA) for 24 hours and allowed to rest in PMA-free media for an additional 24 hours before further experiments.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec9\" class=\"Section2\"\u003e\u003ch2\u003e2.7. Measurement of cAMP levels in THP-1 cells\u003c/h2\u003e\u003cp\u003eNon-differentiated THP-1 cells were seeded in a 24-well plate at a density of 2.0 x 10\u003csup\u003e5\u003c/sup\u003e cells/mL and PMA differentiated as described above. Following a 24-hour rest period in PMA-free media, 100 ng/mL LPS or 100 \u0026micro;M forskolin (FSK), an adenylyl cyclase activator [Seamon and Daly, \u003cspan citationid=\"CR54\" class=\"CitationRef\"\u003e1981\u003c/span\u003e], was added to each treatment well. Cyclic AMP levels were measured using a commercial cAMP ELISA kit (Cayman Chemical Cat. # 581001). Cell lysate was collected via acid lysis as recommended by the kit protocol. Each sample was diluted 1:10 and processed in accordance with the manufacturer\u0026rsquo;s instructions. 
LPS samples were collected at 1- and 2-hour timepoints; FSK samples were collected at 10- and 30-minute timepoints (\u003cb\u003eSupplementary Figure \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e\u003c/b\u003e).\u003c/p\u003e\u003c/div\u003e"},{"header":"3. Results","content":"\u003cdiv id=\"Sec11\" class=\"Section2\"\u003e\u003ch2\u003e3.1. A cAMP regulatory network produced from prior knowledge\u003c/h2\u003e\u003cp\u003eIn this proof-of-concept analysis, NER was applied to two expert-curated review papers describing host immune response to Mtb infection [Dey and Bishai, 2014] and transmission through cough [Naqvi et al., \u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e2023\u003c/span\u003e], resulting in the recovery of 22 molecular species related to persistence of infection and 9 molecular species associated with cough (Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e, Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e\u003cb\u003e(a), (b)\u003c/b\u003e). Between 2 and 8 (on average 5) of these 31 molecular entities were associated with any one of 44 pathway gene sets with enrichment scores significant at an adjusted p-value q\u0026thinsp;\u0026lt;\u0026thinsp;0.05. These 44 annotated reference pathway gene sets consisted of between 5 and 182 genes as documented in either the Gene Ontology or Reactome component databases available in the Pathway Commons environment (\u003cb\u003eSupplementary Table \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e(a)\u003c/b\u003e). Pathways enriched by as many as 7 to 8 of the 31 extracted entities involved detection by cytosolic sensors of pathogen-associated DNA (REAC:R-HSA-1834949), cytoplasmic pattern recognition receptor signaling (GO:0002753), and Type I interferon production and regulation (GO:0032606, GO:0032479, GO:0032481), followed by response to Gram-positive bacterium (GO:0050830). 
Also well represented were pathways overseeing the production and regulation of IL-1b and IL-8, enriched by 6 of 31 text-mined entities.\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003e\u003cem\u003eAn initial set of literature-informed regulatory network nodes.\u003c/em\u003e List of proteins/signaling molecules of interest extracted using NER from Dey and Bishai (2014) and Naqvi et al. (\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e2023\u003c/span\u003e) along with the number of pathways documented in Pathway Commons with which they are associated.\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"3\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eUniProt Name\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eFull Protein Name\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003eDocumented pathways\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colspan=\"2\" nameend=\"c2\" namest=\"c1\"\u003e\u003cp\u003e\u003cspan type=\"ItalicUnderline\" class=\"ItalicUnderline\" name=\"Emphasis\"\u003eDey and Bishai (2014)\u003c/span\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eVPS33B\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c2\"\u003e\u003cp\u003eVacuolar protein sorting-associated protein 33B\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e0\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eCA2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCarbonic anhydrase 2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e0\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eHAMP\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eHepcidin\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e0\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eTNF\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eTumor necrosis factor\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e16\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eMYD88\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eMyeloid differentiation primary response protein\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e27\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eCGAS\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCyclic GMP-AMP synthase\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e7\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eTLR9\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eToll-like receptor 
9\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e12\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eESX1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eHomeobox protein\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e0\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003ePTPA\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eSerine/ threonine-protein phosphatase 2A activator\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e0\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eIRF3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eInterferon regulatory factor 3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e23\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eSAMHD1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eDeoxynucleoside triphosphate triphosphohydrolase\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e0\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eCRP\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eC-reactive protein\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e5\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eAKAP7\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eA-kinase anchor protein 7\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c3\"\u003e\u003cp\u003e0\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003ecAMP\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCyclic AMP\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eAIM2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eInterferon-inducible protein\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e10\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eNLRP3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eNACHT, LRR and PYD domains-containing protein 3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e8\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eDDX41\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eProbable ATP-dependent RNA helicase\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e4\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eTREX1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eThree-prime repair exonuclease 1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e13\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eCD14\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eMonocyte differentiation antigen CD14\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c3\"\u003e\u003cp\u003e12\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eIFI16\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eGamma-interferon-inducible protein 16\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e11\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003ePI3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eElafin\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e0\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eTBK1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eSerine/ threonine-protein kinase\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e24\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colspan=\"2\" nameend=\"c2\" namest=\"c1\"\u003e\u003cp\u003eNaqvi et al. 
(\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e2023\u003c/span\u003e)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eSTAT1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eSignal transducer and activator of transcription 1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e13\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eNTM\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eNeurotrimin\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e0\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eASIC2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eAcid sensing ion channel subunit 2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e2\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eTRPV1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eTransient receptor potential cation channel subfamily V member 1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e4\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eTRPM8\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eTransient receptor potential cation channel subfamily M member 8\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e3\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eTRPA1\u003c/p\u003e\u003c/td\u003e\u003ctd 
align=\"left\" colname=\"c2\"\u003e\u003cp\u003eTransient receptor potential cation channel subfamily A member 1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e3\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eASIC3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eAcid sensing ion channel subunit 3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e2\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eIL6\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eInterleukin 6\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e15\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003eExtending this analysis to discover relationships that traverse annotated pathway gene sets, we applied tools from the INDRA Python library to extract documented relationships linking these molecules from the Biological Expression Language (BEL) Large Corpus and the BioPAX/Pathway Commons databases. These relationships were reconciled and expressed as INDRA statements, where each statement assigned a source molecular mediator to its known target with a specific standardized relationship type (e.g. IncreaseAmount). For the purposes of this project, we retained only those relationships related to regulatory control of a target molecular species, specifically those INDRA statements labelled Activation, Inhibition, IncreaseAmount and DecreaseAmount of target. 
The resulting initial network consisted of our original 31 entities as well as their documented first neighbors, for a total of 399 molecular mediator nodes linked by 560 INDRA relationship statements (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e\u003cb\u003e(a)\u003c/b\u003e). Of these 399 mediator nodes, 132 were source nodes, such as cAMP, feeding into the network (no upstream mediator or zero indegree) and 203 were reporter or sink nodes regulated by the network (no downstream target or zero outdegree). To support regulatory stability, recall that we require all network mediator nodes to regulate and be regulated, that is, to be part of a feedback regulatory motif. As such, we removed all source and reporter or sink nodes to produce a pruned, fully closed-loop network consisting of 62 molecular species, including 11 of our original 31 text-extracted entities. After removal of duplicate entries from the various component databases in the Pathway Commons environment, these 62 nodes were linked by 149 regulatory relationships (directed edges) (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e\u003cb\u003e(b), Supplementary Table \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e(b)\u003c/b\u003e). Recall that cAMP was devoid of known upstream regulatory actions from other nodes in the initial network. As such, it was removed during this editing process and subsequently reintroduced by linking cAMP to putative mediators proposed by an LLM as described in the next section.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec12\" class=\"Section2\"\u003e\u003ch2\u003e3.2. 
LLM predictions of network interactions for closed-loop regulation of cAMP\u003c/h2\u003e\u003cp\u003eTo re-introduce cAMP into the closed-loop regulatory network, a large language model (LLM) was used to predict how cAMP would interact with other nodes in the network, with upstream interactions regulating cAMP being of special interest. Due to the tendency for LLMs to generate highly speculative information, the believability of these predictions was assessed using metrics incorporated into the Trustworthy Language Model (TLM) (Cleanlab, San Francisco, CA) (\u003cb\u003eSection 2.3\u003c/b\u003e). Of the 84 queries submitted to the LLM, 61 predicted either a positive or negative regulatory relationship (\u003cb\u003eSupplementary Table \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e(c)\u003c/b\u003e) with trustworthiness scores ranging from 0.134 to 0.989. Applying Otsu\u0026rsquo;s method [Otsu, \u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e1975\u003c/span\u003e] for separating foreground values from background to these 61 trustworthiness values, we obtain 13 new putative relationships involving cAMP with a trustworthiness greater than or equal to the threshold value of 0.623. Of these, cAMP is a predicted target of upstream-mediated downregulation by CXCL12, ERK and TNF (trustworthiness 0.793, 0.763 and 0.763, respectively). With all incoming regulatory actions being inhibitory, cAMP activation will inevitably decrease to its floor value with no means of recovering. This violates one of our criteria, according to which all network nodes must be the target of at least one positive regulatory action [Reimer et al., \u003cspan citationid=\"CR52\" class=\"CitationRef\"\u003e2025\u003c/span\u003e]. This unilateral downregulation was also the case for network elements IL-1b, ERBB2, Insulin (INS), NFKBIA and AGT. 
For these five elements, we conducted an additional interrogation of the LLM focused on their upstream regulation, yielding an additional 7 positive regulatory relationships at high trustworthiness for IL-1b, NFKBIA and AGT. The LLM also predicted 5 relationships positively mediating ERBB2 and INS but with low trustworthiness (\u003cb\u003eSupplementary Table \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e(d)\u003c/b\u003e). These were included as highly speculative and tagged as available for removal should they not prove helpful in explaining the available data (\u003cb\u003eSection 2.4\u003c/b\u003e). Finally, since no positive upstream regulators of cAMP were predicted by the LLM, we conducted a focused search of the Elsevier Biology Knowledge Graph (Elsevier, Amsterdam, NL), which is populated by automated text-mining of the peer-reviewed literature (\u003cb\u003eSection 2.3\u003c/b\u003e). This yielded positive regulation of cAMP by EGFR and MAPK1 as reported in 4 and 25 journal publications respectively (\u003cb\u003eSupplementary Table \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e(e)\u003c/b\u003e). These and other inclusions necessary to satisfy a closed-loop architecture resulted in a final network consisting of 178 regulatory relationships linking 63 molecular species (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e).\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec13\" class=\"Section2\"\u003e\u003ch2\u003e3.3. Reverse engineering and simulation of regulatory behavior\u003c/h2\u003e\u003cp\u003eWe expect this network model to support pathway-informed explanations of partially observed immune responses to Mtb infection. 
As we are demonstrating the use of this approach in the extreme case of a potentially novel pathogen where little if any experimental data is available, we turn to fundamental behaviors such as a stable baseline resting state and reports of increased or decreased expression of measurable illness markers. In this example use case we define three stable persistent states that must be supported by any regulatory logic applied to this immune network circuitry (Section 2.4). These consist of a baseline resting state where all 63 molecular mediators are expressed at their basal state of 0 (constraint C1), a largely unmeasured state supporting a persistent activation of cAMP to 2 (constraint C2), and a partially observed state corresponding to persistent Mtb infection (constraint C3). The latter manifested as maximal activation to level 2 of IL-6, STAT3, IL-17a, SerpinA3, IL-18, TNF, IL-18, CCL2, IFNg, TLR2, CD40, TLR9, CXCL8 concurrent with baseline activation (0) of CXCL12 (\u003cb\u003eSupplementary Table \u003cspan refid=\"MOESM2\" class=\"InternalRef\"\u003eS2\u003c/span\u003e\u003c/b\u003e). Formulating a constraint satisfaction optimization problem, we solve for values of decisional weights and threshold values for every regulatory relationship, as well as feasible values of unmeasured mediators, with each condition treated as a steady state, that is, one where the next desired state of every node is its current state. As this problem is highly underdetermined mathematically, there might exist a large number of solutions that satisfy these conditions equally well. Interestingly, when the network was required to satisfy these 3 conditions as strict steady states, no parameter sets were found for this network configuration where the next predicted state remained the current state for all nodes. Only when allowing a 5% deviation from steady state, or 6 bits across all nodes, was a family of parameter sets recovered. 
Here we analyze a random subset of 100 logic parameter sets and corresponding predicted states for unmeasured mediators from this broader family of solutions (\u003cb\u003eSupplementary Table S3\u003c/b\u003e). Of these 100 models, only 48 fully supported a persistently elevated state of 2 (high with zero deviation from steady state) for cAMP, with 57 to 60 (92\u0026ndash;97%) of the remaining 62 markers also occupying a stable steady state (\u003cb\u003eSupplementary Table S4\u003c/b\u003e). These 3 to 5 non-compliant fluctuating nodes typically consisted of TLR9, STAT1, CD40, CD14, and EGFR. For these 48 regulatory models, a stable concurrent activation at a maximal level of 2 was also predicted for 9 other nodes, namely NFKBIA (43/48 models), TNFRSF1B (43/48 models), AKT2 (42/48 models), PTEN (42/48 models), TGFA (42/48 models), IL-18 (42/48 models), TNF (43/48 models), INS (41/48 models), and EGFR (41/48 models) (\u003cb\u003eSupplementary Table S5\u003c/b\u003e) (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e).\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eDuring the reverse engineering of the regulatory logic, relationships extracted from documented pathways were forcibly retained in the network, as were text-mined relationships (Elsevier Biology Knowledge Graph) and LLM-posited relationships involving cAMP as a target with a high trustworthiness (\u0026gt;\u0026thinsp;0.623). Other LLM-posited relationships, where cAMP was not a regulatory target, where trustworthiness was low, or where the relationship was not predicted to exist, were allowed to be discarded during identification of the regulatory logic. Across the 48 network models, the LLM-predicted positive regulation of NOS3 and ADIPOQ by cAMP was unanimously retained. Interestingly, cAMP regulation of ADIPOQ was retained as being useful in explaining the partially observed conditions despite a trustworthiness score of 0.483, well below the threshold of 0.623. 
Also retained unanimously were positive regulation of IL-1b by MAPK14, of NFKBIA by TNF, and of AGT by IFNg (\u003cb\u003eSupplementary Table S6\u003c/b\u003e). None of the relationships predicted by the LLM to be nonexistent were retained unanimously; however, positive regulation of ERBB2 by AKT and of insulin (INS) by TNF were conserved in 46 and 41 of the 48 models respectively.\u003c/p\u003e\u003cp\u003eTo distinguish the causal upregulation of cAMP from concurrent increased expression, a series of simulations was conducted where the network was initialized at its baseline state with all markers at basal activation levels of 0, then each marker in turn was exogenously upregulated to a maximal activation of 2 and held at that level either as a transient pulse over 50 iterations or for the entire duration of 100 iterations. The network state was updated from one iteration to the next using the basic synchronous update scheme, since the focus was the reachability of a persistent maximal activation of cAMP rather than the exact path traversed. 
The typical responses of cAMP to these perturbations consisted of 5 characteristic trajectories across all 48 models, namely (a) a stable and persistent upregulation, (b) a transient upregulation only, (c) a transient upregulation with a single oscillation, (d) a transient upregulation with sustained slow low-medium oscillation and (e) a transient upregulation with sustained rapid medium-high oscillation (\u003cb\u003eSupplementary Figure \u003cspan refid=\"MOESM2\" class=\"InternalRef\"\u003eS2\u003c/span\u003e\u003c/b\u003e).\u003c/p\u003e\u003cp\u003eThese in silico single-mediator perturbations predict that persistent upregulation of cAMP could result causally from the exogenous addition of only 4 markers, namely IL-6 (48/48 continuous; 4/48 pulse), EGFR (48/48 continuous; 4/48 pulse), MAPK1 (47/48 continuous; 4/48 pulse), and IL-18 (43/48 continuous; 2/48 pulse) (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e) (\u003cb\u003eSupplementary Table S7\u003c/b\u003e). Among the network models where persistent activation of cAMP was predicted, the concurrent response of the broader network varied from widely unresponsive (model 42) (\u003cb\u003eSupplementary Figure S3\u003c/b\u003e) to highly oscillatory (model 53) (\u003cb\u003eSupplementary Figure S4\u003c/b\u003e).\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec14\" class=\"Section2\"\u003e\u003ch2\u003e3.4. Indirect experimental validation in THP-1 macrophage cell line\u003c/h2\u003e\u003cp\u003eAll 48 models predict that prolonged exposure to exogenous IL-6 induces self-sustaining elevated IL-6 and a persistent elevation of cAMP levels. As such, IL-6 was selected as the perturbation for experimental validation over EGFR, which was also predicted to produce persistent elevation of cAMP but was not predicted to achieve a strictly self-sustaining steady state. 
As lipopolysaccharide (LPS) is a known inducer of IL-6 and more readily available as a reagent, it was used in this work as an indirect means of experimentally creating an elevated IL-6 medium (\u003cb\u003eSupplementary Figure \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e\u003c/b\u003e). A one-time transient challenge with LPS was applied to a culture of THP-1 macrophages and cAMP levels measured (\u003cb\u003eSupplementary Figure S5\u003c/b\u003e). Results of the LPS pulse experiments are presented in Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003e. LPS induced a transient increase in cAMP levels 60 min after exposure; however, levels had decreased by 120 min. To validate our cAMP readout, as a control we also exposed THP-1 cells to forskolin, an activator of adenylyl cyclase, which produced an increase in cAMP levels as expected [Seamon and Daly, \u003cspan citationid=\"CR54\" class=\"CitationRef\"\u003e1981\u003c/span\u003e]. Our experiments using LPS as an indirect inducer of IL-6 suggest that the cAMP response is transient. Recall that only 4 of 48 models predicted that a transient pulse in IL-6 would be sufficient to produce a lasting elevation of cAMP levels. In contrast, all 48 models predicted that prolonged exposure to IL-6 would produce a persistent elevation of cAMP.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003c/div\u003e"},{"header":"4. Discussion","content":"\u003cp\u003eIn this work we explore how a hypothesis-driven approach based on prior knowledge alone may inform experimental design in the absence of the often-abundant data typically required by data-driven statistical analyses. This is especially relevant to the design of effective strategies for crafting rapid responses to the emergence of a novel pathogen. 
Specifically, we explore the regulation of cAMP dynamics within macrophages by assembling and using a minimal regulatory network model to predict shifts in the expression patterns of molecular pathway mediators. Ultimately, experimental validation of these predictions, i.e. within a live cell model, can further inform network behavior and improve the pace of focused experimental studies. As cAMP is a key signaling molecule targeted by Mtb infection, the specific objective of this proof-of-concept use case was to identify key mediators of cAMP signaling in the hope of understanding some of the strategies used by Mtb and how these might be disrupted. In contrast with most pathway analytical approaches, closed-loop regulatory feedback and dynamic stability are rigorously represented and enforced in the architecture and programming of the network model. Moreover, assembling the network model de novo from established prior knowledge of pathway structures makes such models inherently interpretable mechanistically, in addition to enabling their use with very sparse and partially observed data unsuitable for empirical data mining.\u003c/p\u003e\n\u003cp\u003eAs a case in point, we explore cAMP regulation in the context of Mtb infection in the absence of experimental data. Results of qualitative simulations conducted in a family of competing mechanistically informed network models have been supported at least in part by experimental validation in cell culture. The network presented here is a minimal model, based on entities extracted from a very small reference corpus. As we continue to explore the mechanisms of Mtb-host interaction, our ongoing work will increase the size of this expert-curated corpus to include peer-reviewed publications from a broader selection of authors/groups to reduce bias. 
Though a proof-of-concept investigation, the partial validation of in silico experiments nonetheless offers focused avenues for further investigation. For example, these preliminary simulation results suggest that in addition to cAMP produced endogenously by Mtb [Agarwal et al., 2009], the pathogen\u0026rsquo;s recruitment of the IL-6 response in the host serves only to exacerbate this by further driving increased cAMP production by resident macrophages. Notably, IL-6 is reported to be involved in macrophage polarization, which is a key factor in determining the outcome of an Mtb infection [Peyron et al., 2008; Russell et al., 2019; Martinez et al., 2013; Fernando et al., 2014; Sanmarco et al., 2017]. Interestingly, the predicted regulation of cAMP by IL-6 runs contrary to the current literature, in which some studies establish cAMP as the regulator of IL-6 [Luan et al., 2023; Chio et al., 2004]. However, in the context of an Mtb infection within a macrophage, this relationship has not been studied.\u003c/p\u003e\n\u003cp\u003eImportantly, these same simulations also suggest that the networked effects of increased IL-6 and cAMP include an upregulation of MAPK3 and STAT3, both apoptosis inhibitors [Seimon et al., 2009; Liu et al., 2003], along with a concurrent downregulation of RIPK1, a motivator of apoptosis. Similar network propagation of elevated IL-6 and cAMP results in downregulation of IL-18, which along with RIPK1 is a known motivator of cellular necrosis [Ju et al., 2022]. By highlighting the role of IL-6 in potentially exacerbating elevated cAMP production, these simulation experiments suggest a self-inflicted compounding role of IL-6 in disrupting apoptotic and necrotic programmed cell death in infected macrophages. Indirect effects such as these, which arise as a result of pathway signal propagation, require a network theoretical approach if they are to be captured and understood. 
\u003c/p\u003e\n\u003cp\u003eTo produce these results, it was necessary to fill gaps in our existing pathway knowledge by proposing possible regulatory actions that could provide a closed loop regulatory architecture. We used a general-purpose generative large language model (LLM) for this purpose, assessing the credibility of these predictions by means of a quantitative trustworthiness score [Chen and Mueller, 2023]. It was interesting to observe in this admittedly limited analysis that although many of the hypothesized relationships with very high trustworthiness scores were frequently retained as useful in explaining specified network dynamics, this was not a hard and fast rule. Indeed, while 6 of the 9 relationships retained in more than 40 of 48 models were assigned high trustworthiness scores, two highly retained relationships were predicted to be non-existent, and one unanimously retained relationship scored far below the trustworthiness threshold. Moreover, 6 of the 16 relationships (~38%) predicted with high trustworthiness were unanimously rejected. While there is undoubtedly room for improvement, it is important to remember that the LLM used was not domain-focused and that reliability was based on an LLM-as-a-judge introspective approach purposely devoid of a Gold Standard. As we do not require the universality of LLMs, we are currently investigating the use of smaller domain-specific and even task-specific language models [Sinha et al., 2024]. As LLMs remain hypothesis generation engines, we propose that a more robust approach might be to combine such model-based credibility measures with the practical contribution of a novel regulatory relationship in explaining the observed and expected dynamics of the system. 
\u003c/p\u003e\n\u003cp\u003eWhile the limited experimental validation conducted here offered evidence that a one-time pulse perturbation with IL-6 using LPS did result in an upregulation of cAMP, this upregulation appears to be transient. This would essentially invalidate the 4 candidates in our pool of 48 competing models where a pulse perturbation was predicted to be sufficient to trigger persistent cAMP upregulation, favouring instead the remaining 44 models predicting that a continued stimulation with IL-6 would be needed. This exemplifies how even a very partially observed single experiment such as this one can be used to incrementally reduce the pool of competing network models. Our group has been exploring the use of numerical techniques to accelerate this process of elimination by using the model pool to predict which experiments would best discriminate between candidate models, so as to converge most efficiently on the smallest number of alternatives [Videla et al., 2015]. Model-informed methods such as these emphasize that not all data is equally informative and that the conventional focus on generating more data might instead be redirected to generating the right kind of data. Similarly, the measurement of multiple network mediators in addition to the cAMP measured here would also introduce much more stringent requirements on model selection, and multiplex analyses will be a growing component in our continuing work. 
\u0026nbsp;As we continue to refine and broaden the capabilities of our live cell visualization of cAMP dynamics, we argue that the methods and tools presented here constitute a practical framework for harnessing prior knowledge and sparse partially observed data to deliver rational pathway-informed hypotheses that in turn serve to guide and improve the efficiency of experimental laboratory work aimed at unraveling the complexities of the host-pathogen interaction between Mtb and the macrophage.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eAcknowledgements and Funding Information\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis work was supported by the University of Saskatchewan\u0026rsquo;s Vaccine and Infectious Disease Organization (VIDO). VIDO receives operational funding from the Canada Foundation for Innovation (CFI) through the Major Science Initiatives Fund and from the Government of Saskatchewan through Innovation Saskatchewan and the Ministry of Agriculture. This article is submitted with the permission of the Director of VIDO. Research in N.D\u0026rsquo;s lab was supported by funding from Canadian Institutes of Health Research (ARB-185715 and ARB-192058); Natural Sciences and Engineering Research Council of Canada (RGPIN-2023-05746) and Saskatchewan Health Research Foundation (6239). 
The authors also wish to thank Kim Chiok, Michelle Gerber, Neil Rawlyk, Nandini, and Shamsuddeen Ma\u0026rsquo;aruf from the Dhar lab.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting Interests Statement\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthor contributions\u003c/strong\u003e.\u003c/p\u003e\n\u003cp\u003eChris Chen, Joyce Reimer, Pranta Saha: Methodology, Software, Formal analysis, Data Curation, Writing - Original Draft\u003c/p\u003e\n\u003cp\u003eShaun Wachter: Validation, Investigation, Resources, Data Curation\u003c/p\u003e\n\u003cp\u003eJeff Chen, Neeraj Dhar: Conceptualization, Methodology, Investigation, Resources, Writing - Review \u0026amp; Editing, Funding acquisition, Supervision, Project administration\u003c/p\u003e\n\u003cp\u003eGordon Broderick: Conceptualization, Methodology, Formal analysis, Investigation, Resources, Writing - Original Draft, Writing - Review \u0026amp; Editing, Funding acquisition, Supervision, Project administration.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eData availability statement\u003c/strong\u003e.\u003c/p\u003e\n\u003cp\u003eAll data used to conduct the analyses may be found as Supporting Information.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAdditional Information\u003c/strong\u003e\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eAdditional File: Supplementary_Tables.xlsx\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eSupplementary Table S1\u003c/em\u003e. (a) Pathway membership of NER extracted entities as documented in the Pathway Commons database. 
(b) Interaction relationships extracted by INDRA from the Pathway Commons database (* denotes contradictory database entries), (c) Interaction relationships involving cAMP predicted by the CleanLab Trustworthy LLM (CleanLab, San Francisco, CA), (d) Additional interaction relationships required to balance positive and negative regulation predicted by the CleanLab Trustworthy LLM (CleanLab, San Francisco, CA), (e) Additional interaction relationships required to provide positive regulation of cAMP extracted by text-mining to the EmBiology database (Elsevier, Amsterdam, NL).\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eSupplementary Table S2\u003c/em\u003e. Stable persistence behaviors expected of the regulatory network (Low=0, Mid-range=1, High=2, Unknown=-99)\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eSupplementary Table S3\u003c/em\u003e. Activation profiles predicted by a sample of 100 competing models satisfying all 3 steady state conditions within a 5% tolerance (6 bits across 63 state variables).\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eSupplementary Table S4\u003c/em\u003e. Extent of deviation from steady state in a sample of 100 competing models satisfying all 3 steady state conditions within a 5% tolerance (6 bits across 63 state variables).\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eSupplementary Table S5\u003c/em\u003e. Subset of 48 models that exactly satisfy a strongly upregulated steady state for cAMP only from a sample of 100 competing models.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eSupplementary Table S6\u003c/em\u003e. Retention of inferred relationships in the subset of 48 models that exactly satisfy a strongly upregulated steady state for cAMP only from a sample of 100 competing models.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eSupplementary Table S7\u003c/em\u003e. 
Simulated response to persistent and pulse perturbation in 48 of 100 competing models supporting cAMP steady state.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n \u003cli\u003eTuberculosis (TB). https://www.who.int/news-room/fact-sheets/detail/tuberculosis. Accessed January 4, 2025\u003c/li\u003e\n \u003cli\u003eCambier CJ, Falkow S, Ramakrishnan L. 2014. Host Evasion and Exploitation Schemes of Mycobacterium tuberculosis. \u003cem\u003eCell\u003c/em\u003e 159(7):1497-1509.\u003c/li\u003e\n \u003cli\u003eChandra P, Grigsby SJ, Philips JA. 2022. Immune evasion and provocation by Mycobacterium tuberculosis. \u003cem\u003eNat Rev Microbiol\u003c/em\u003e 20:750-766.\u003c/li\u003e\n \u003cli\u003eKJ, Keshavjee S, Rich ML. 2015. Multidrug-Resistant Tuberculosis and Extensively Drug-Resistant Tuberculosis. \u003cem\u003eCold Spring Harb Perspect Med\u003c/em\u003e 5:a017863.\u003c/li\u003e\n \u003cli\u003eRaker VK, Becker C, Steinbrink K. 2016. The cAMP Pathway as Therapeutic Target in Autoimmune and Inflammatory Diseases. \u003cem\u003eFront Immunol\u003c/em\u003e 7:123.\u003c/li\u003e\n \u003cli\u003eMcDonough KA, Rodriguez A. 2012. The myriad roles of cyclic AMP in microbial pathogens: from signal to sword. \u003cem\u003eNat Rev Microbiol\u0026nbsp;\u003c/em\u003e10:27-38.\u003c/li\u003e\n \u003cli\u003eLin CT, Chen YC, Jinn TR, Wu CC, Hong YM, Wu WH. 2013. Role of the cAMP-Dependent Carbon Catabolite Repression in Capsular Polysaccharide Biosynthesis in Klebsiella pneumoniae. \u003cem\u003ePLOS ONE\u003c/em\u003e 8:e54430.\u003c/li\u003e\n \u003cli\u003eJoseph E, Bernsley C, Guiso N, Ullmann A. 1982. Multiple regulation of the activity of adenylate cyclase in Escherichia coli. \u003cem\u003eMolec Gen Genet\u003c/em\u003e 185:262-268.\u003c/li\u003e\n \u003cli\u003eSmith RS, Wolfgang MC, Lory S. 2004. An Adenylate Cyclase-Controlled Signaling Network Regulates Pseudomonas aeruginosa Virulence in a Mouse Model of Acute Pneumonia. 
\u003cem\u003eInfection and Immunity\u003c/em\u003e 72:1677-1684.\u003c/li\u003e\n \u003cli\u003eShenoy AR, Visweswariah SS. 2006. Mycobacterial adenylyl cyclases: Biochemical diversity and structural plasticity. \u003cem\u003eFEBS Letters\u003c/em\u003e 580:3344-3352\u003c/li\u003e\n \u003cli\u003eKurpad SS, Dhar N. Playing Telephone: How Secondary Messengers Influence Host\u0026ndash;Pathogen Interactions in Tuberculosis. ACS Infectious Diseases. 2025 Jun 20.\u003c/li\u003e\n \u003cli\u003eAgarwal N, Lamichhane G, Gupta R, Nolan S, Bishai WR. 2009. Cyclic AMP intoxication of macrophages by a Mycobacterium tuberculosis adenylate cyclase. \u003cem\u003eNature\u003c/em\u003e 460:98-102.\u003c/li\u003e\n \u003cli\u003eGilbert D, Fuss H, Gu X, Orton R, Robinson S, Vyshemirsky V, Kurth MJ, Downes CS, Dubitzky W. Computational methodologies for modelling, analysis and simulation of signalling networks. Briefings in Bioinformatics. 2006 Dec 1;7(4):339-53.\u003c/li\u003e\n \u003cli\u003eRachel T, Brombacher E, W\u0026ouml;hrle S, Gro\u0026szlig; O, Kreutz C. Dynamic modelling of signalling pathways when ordinary differential equations are not feasible. Bioinformatics. 2024 Dec;40(12):btae683.\u003c/li\u003e\n \u003cli\u003eThomas R, Thieffry D, Kaufman M. Dynamical behaviour of biological regulatory networks\u0026mdash;I. Biological role of feedback loops and practical use of the concept of the loop-characteristic state. Bulletin of mathematical biology. 1995 Mar;57:247-76.\u003c/li\u003e\n \u003cli\u003eThomas R, Kaufman M. Multistationarity, the basis of cell differentiation and memory. II. Logical analysis of regulatory networks in terms of feedback circuits. Chaos: An Interdisciplinary Journal of Nonlinear Science. 2001 Mar 1;11(1):180-95.\u003c/li\u003e\n \u003cli\u003eChaouiya C, Remy E. Logical modelling of regulatory networks, methods and applications. Bulletin of mathematical biology. 
2013 Jun;75:891-5.\u003c/li\u003e\n \u003cli\u003eAbou-Jaoud\u0026eacute; W, Traynard P, Monteiro PT, Saez-Rodriguez J, Helikar T, Thieffry D, Chaouiya C. Logical modeling and dynamical analysis of cellular networks. Frontiers in genetics. 2016 May 31;7:94.\u003c/li\u003e\n \u003cli\u003eSedghamiz H, Chen W, Rice M, Whitley D, Broderick G. Selecting optimal models based on efficiency and robustness in multi-valued biological networks. In2017 IEEE 17th international conference on bioinformatics and bioengineering (BIBE) 2017 Oct 23 (pp. 200-205). IEEE.\u003c/li\u003e\n \u003cli\u003eSedghamiz H, Morris M, Craddock TJ, Whitley D, Broderick G. Bio-modelchecker: using bounded constraint satisfaction to seamlessly integrate observed behavior with prior knowledge of biological networks. Frontiers in Bioengineering and Biotechnology. 2019 Mar 26;7:48.\u003c/li\u003e\n \u003cli\u003eDey B, Bishai WR. Crosstalk between Mycobacterium tuberculosis and the host cell. In: Seminars in immunology 2014 Dec 1 (Vol. 26, No. 6, pp. 486-496). Academic Press.\u003c/li\u003e\n \u003cli\u003eNaqvi KF, Mazzone SB, Shiloh MU. Infectious and inflammatory pathways to cough. Annual review of physiology. 2023 Feb 10;85(1):71-91.\u003c/li\u003e\n \u003cli\u003eSmith R. An overview of the Tesseract OCR engine. In: Ninth international conference on document analysis and recognition (ICDAR 2007) 2007 Sep 23 (Vol. 2, pp. 629-633). IEEE.\u003c/li\u003e\n \u003cli\u003eBatra P, Phalnikar N, Kurmi D, Tembhurne J, Sahare P, Diwan T. OCR-MRD: performance analysis of different optical character recognition engines for medical report digitization. International Journal of Information Technology. 2024 Jan;16(1):447-55.\u003c/li\u003e\n \u003cli\u003eOnyenwe I, Ogbonna S, Onyedimma E, Ikechukwu-Onyenwe O, Nwafor C. Developing Smart Web-Search using Regex. arXiv preprint arXiv:2110.04767. 2021 Oct 10.\u003c/li\u003e\n \u003cli\u003eNeumann M, King D, Beltagy I, Ammar W. 
ScispaCy: fast and robust models for biomedical natural language processing. arXiv preprint arXiv:1902.07669. 2019 Feb 20.\u003c/li\u003e\n \u003cli\u003eHellmann D. The Python standard library by example. Addison-Wesley Professional; 2011 Jun 1.\u003c/li\u003e\n \u003cli\u003eMubeen S, Hoyt CT, Gem\u0026uuml;nd A, Hofmann-Apitius M, Fr\u0026ouml;hlich H, Domingo-Fern\u0026aacute;ndez D. The impact of pathway database choice on statistical enrichment analysis and predictive modeling. Frontiers in genetics. 2019 Nov 22;10:1203.\u003c/li\u003e\n \u003cli\u003eRodchenkov I, Babur O, Luna A, Aksoy BA, Wong JV, Fong D, Franz M, Siper MC, Cheung M, Wrana M, Mistry H. Pathway Commons 2019 Update: integration, analysis and exploration of pathway data. Nucleic acids research. 2020 Jan 8;48(D1):D489-97.\u003c/li\u003e\n \u003cli\u003eFisher RA. Statistical methods for research workers. In Breakthroughs in statistics: Methodology and distribution 1970 (pp. 66-70). New York, NY: Springer New York.\u003c/li\u003e\n \u003cli\u003eBenjamini Y, Yekutieli D. The control of the false discovery rate in multiple testing under dependency. Annals of statistics. 2001 Aug 1:1165-88.\u003c/li\u003e\n \u003cli\u003eBachman JA, Gyori BM, Sorger PK. Automated assembly of molecular mechanisms at scale from text mining and curated databases. Molecular Systems Biology. 2023 May 9;19(5):e11325.\u003c/li\u003e\n \u003cli\u003eHoyt CT, Domingo-Fern\u0026aacute;ndez D, Aldisi R, Xu L, Kolpeja K, Spalek S, Wollert E, Bachman J, Gyori BM, Greene P, Hofmann-Apitius M. Re-curation and rational enrichment of knowledge graphs in Biological Expression Language. Database. 2019;2019:baz068.\u003c/li\u003e\n \u003cli\u003eKarin O, Swisa A, Glaser B, Dor Y, Alon U. Dynamical compensation in physiological circuits. Molecular systems biology. 2016 Nov;12(11):886.\u003c/li\u003e\n \u003cli\u003eSardana A. Real-Time Evaluation Models for RAG: Who Detects Hallucinations Best?. arXiv preprint arXiv:2503.21157. 
2025 Mar 27.\u003c/li\u003e\n \u003cli\u003eChen J, Mueller J. Quantifying uncertainty in answers from any language model and enhancing their trustworthiness. arXiv preprint arXiv:2308.16175. 2023 Aug 30.\u003c/li\u003e\n \u003cli\u003eSanderson K. GPT-4 is here: what scientists think. Nature. 2023 Mar 30;615(7954):773.\u003c/li\u003e\n \u003cli\u003eWei J, Wang X, Schuurmans D, Bosma M, Xia F, Chi E, Le QV, Zhou D. Chain-of-thought prompting elicits reasoning in large language models. Advances in neural information processing systems. 2022 Dec 6;35:24824-37.\u003c/li\u003e\n \u003cli\u003eOtsu N. A threshold selection method from gray-level histograms. Automatica. 1975 Jun;11(285-296):23-7.\u003c/li\u003e\n \u003cli\u003eKamdar MR, Stanley CE, Carroll M, Wogulis L, Dowling W, Deus HF, Samarasinghe M. Text Snippets to Corroborate Medical Relations: An Unsupervised Approach using a Knowledge Graph and Embeddings. AMIA Jt Summits Transl Sci Proc. 2020 May 30;2020:288-297.\u003c/li\u003e\n \u003cli\u003eNovichkova S, Egorov S, Daraselia N. MedScan, a natural language processing engine for MEDLINE abstracts. Bioinformatics. 2003 Sep 1;19(13):1699-706.\u003c/li\u003e\n \u003cli\u003eDaraselia N, Yuryev A, Egorov S, Novichkova S, Nikitin A, Mazo I. Extracting human protein interactions from MEDLINE using a full-sentence parser. Bioinformatics. 2004 Mar 22;20(5):604-11.\u003c/li\u003e\n \u003cli\u003eAlbert R, Thakar J. Boolean modeling: a logic‐based dynamic approach for understanding signaling and regulatory networks and for making useful predictions. Wiley Interdisciplinary Reviews: Systems Biology and Medicine. 2014 Sep;6(5):353-69.\u003c/li\u003e\n \u003cli\u003eSedghamiz H, Morris M, Craddock TJ, Whitley D, Broderick G. High-fidelity discrete modeling of the HPA axis: a study of regulatory plasticity in biology. BMC systems biology. 2018 Dec;12:1-6.\u003c/li\u003e\n \u003cli\u003eRussell DG, Cardona PJ, Kim MJ, Allain S, Altare F. 2009. 
Foamy macrophages and the progression of the human TB granuloma. \u003cem\u003eNat Immunol\u003c/em\u003e 10:943-948.\u003c/li\u003e\n \u003cli\u003eWang JY, Chang HC, Liu JL, Shu CC, Lee CH, Wang JT, Lee LN. 2012. Expression of toll-like receptor 2 and plasma level of interleukin-10 are associated with outcome in tuberculosis. \u003cem\u003eEur J Clin Microbiol Infect Dis\u003c/em\u003e 31:2327-2333.\u003c/li\u003e\n \u003cli\u003eRothchild AC, Jayaraman P, Nunes-Alves C, Behar SM. 2014. iNKT Cell Production of GM-CSF Controls Mycobacterium tuberculosis. \u003cem\u003ePLOS Pathogens\u003c/em\u003e.;10:e1003805.\u003c/li\u003e\n \u003cli\u003eSilv\u0026eacute;rio D, Gon\u0026ccedil;alves R, Appelberg R, Saraiva M. 2021. Advances on the Role and Applications of Interleukin-1 in Tuberculosis. \u003cem\u003emBio\u003c/em\u003e 12:e03134-21.\u003c/li\u003e\n \u003cli\u003eHaileMariam M, Yu Y, Singh H, Teklu T, Wondale B, Worku A, Zewude A, Mounaud S, Tsitrin T, Legesse M, Gobena A, Pieper R. 2021. Protein and Microbial Biomarkers in Sputum Discern Acute and Latent Tuberculosis in Investigation of Pastoral Ethiopian Cohort. \u003cem\u003eFront Cell Infect Microbiol\u003c/em\u003e 11.\u003c/li\u003e\n \u003cli\u003eQueval CJ, Song OR, Deboos\u0026egrave;re N, Delorme V, Debrie AS, Iantomasi R, Veyron-Churlet R, Jouny S, Redhage K, Deloison G, Baulard A, Chamaillard M, Locht C, Brodin P. 2016. STAT3 Represses Nitric Oxide Synthesis in Human Macrophages upon Mycobacterium tuberculosis Infection. \u003cem\u003eSci Rep\u003c/em\u003e 6:29297.\u003c/li\u003e\n \u003cli\u003eAyalew S, Wegayehu T, Wondale B, Tarekegn A, Tessema B, Admasu F, Piantadosi A, Sahi M, Gebresilase TT, Fredolini C, Mihret A. 2024. Candidate serum protein biomarkers for active pulmonary tuberculosis diagnosis in tuberculosis endemic settings. 
\u003cem\u003eBMC Infect Dis\u003c/em\u003e 24:1-15.\u003c/li\u003e\n \u003cli\u003eReimer J, Page J, Saha P, Shen S, Zhu X, Qian S, Mammen M, Qu J, Sethi S, Broderick GJ. Leveraging Dynamic Stability to Infer Regulation in Protein-Protein Interaction Networks: A Study of Infectious Vulnerability in COPD. bioRxiv. 2025:2025-05.\u003c/li\u003e\n \u003cli\u003eLyman CA, Morris MM, Richman S, Cao H, Scerri A, Cheadle C, Broderick G. High fidelity modeling of pulse dynamics using logic networks. In2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) 2021 Dec 9 (pp. 197-204). IEEE.\u003c/li\u003e\n \u003cli\u003eSeamon K, Daly JW. Activation of adenylate cyclase by the diterpene forskolin does not require the guanine nucleotide regulatory protein. Journal of Biological Chemistry. 1981 Oct 10;256(19):9799-801.\u003c/li\u003e\n \u003cli\u003eBarde I, Laurenti E, Verp S, Wiznerowicz M, Offner S, Viornery A, Galy A, Trumpp A, Trono D. 2011. Lineage- and stage-restricted lentiviral vectors for the gene therapy of chronic granulomatous disease. \u003cem\u003eGene Ther\u003c/em\u003e 18:1087-1097.\u003c/li\u003e\n \u003cli\u003eDull T, Zufferey R, Kelly M, Mandel RJ, Nguyen M, Trono D, Naldini L.1998. A Third-Generation Lentivirus Vector with a Conditional Packaging System. \u003cem\u003eJournal of Virology\u003c/em\u003e 72:8463-8471.\u003c/li\u003e\n \u003cli\u003ePeyron P, Vaubourgeix J, Poquet Y, Levillain F, Botanch C, Bardou F, Daff\u0026eacute; M, Emile JF, Marchou B, Cardona PJ, de Chastellier C, Altare F. 2008. Foamy Macrophages from Tuberculous Patients\u0026rsquo; Granulomas Constitute a Nutrient-Rich Reservoir for M. tuberculosis Persistence. \u003cem\u003ePLOS Pathogens\u003c/em\u003e 4:e1000204.\u003c/li\u003e\n \u003cli\u003eRussell DG, Huang L, VanderVen BC. 2019. Immunometabolism at the interface between macrophages and pathogens. 
\u003cem\u003eNat Rev Immunol\u003c/em\u003e 19:291-304.\u003c/li\u003e\n \u003cli\u003eMartinez AN, Mehra S, Kaushal D. 2013. Role of Interleukin 6 in Innate Immunity to Mycobacterium tuberculosis Infection. \u003cem\u003eJ Infect Dis\u003c/em\u003e 207:1253-1261.\u003c/li\u003e\n \u003cli\u003eFernando MR, Reyes JL, Iannuzzi J, Leung G, McKay DM. 2014. The Pro-Inflammatory Cytokine, Interleukin-6, Enhances the Polarization of Alternatively Activated Macrophages. \u003cem\u003ePLOS ONE\u003c/em\u003e 9:e94188.\u003c/li\u003e\n \u003cli\u003eSanmarco LM, Ponce NE, Visconti LM, Eberhardt N, Theumer MG, Minguez \u0026Aacute;R, Aoki MP. 2017. IL-6 promotes M2 macrophage polarization by modulating purinergic signaling and regulates the lethal release of nitric oxide during \u003cem\u003eTrypanosoma cruzi\u003c/em\u003e infection. \u003cem\u003eBiochimica et Biophysica Acta (BBA) - Molecular Basis of Disease\u003c/em\u003e 1863:857-869.\u003c/li\u003e\n \u003cli\u003eLuan D, Dadpey B, Zaid J, Bridge-Comer PE, DeLuca JH, Xia W, Castle J, Reilly SM. 2023. Adipocyte-Secreted IL-6 Sensitizes Macrophages to IL-4 Signaling. Diabetes 72:367-374.\u003c/li\u003e\n \u003cli\u003eChio CC, Chang YH, Hsu YW, Chi KH, Lin WW. 2004. PKA-dependent activation of PKC, p38 MAPK and IKK in macrophage: implication in the induction of inducible nitric oxide synthase and interleukin-6 by dibutyryl cAMP. Cellular Signalling 16:565-575.\u003c/li\u003e\n \u003cli\u003eSeimon TA, Wang Y, Han S, Senokuchi T, Schrijvers DM, Kuriakose G, Tall AR, Tabas IA. Macrophage deficiency of p38\u0026alpha; MAPK promotes apoptosis and plaque necrosis in advanced atherosclerotic lesions in mice. The Journal of clinical investigation. 2009 Apr 1;119(4):886-98.\u003c/li\u003e\n \u003cli\u003eLiu H, Ma Y, Cole SM, Zander C, Chen KH, Karras J, Pope RM. Serine phosphorylation of STAT3 is essential for Mcl-1 expression and macrophage survival. Blood. 
2003 Jul 1;102(1):344-52.\u003c/li\u003e\n \u003cli\u003eJu E, Park KA, Shen HM, Hur GM. The resurrection of RIP kinase 1 as an early cell death checkpoint regulator\u0026mdash;a potential target for therapy in the necroptosis era. Experimental \u0026amp; Molecular Medicine. 2022 Sep;54(9):1401-11.\u003c/li\u003e\n \u003cli\u003eSinha N, Jain V, Chadha A. Are Small Language Models Ready to Compete with Large Language Models for Practical Applications?. arXiv preprint arXiv:2406.11402. 2024 Jun 17.\u003c/li\u003e\n \u003cli\u003eVidela S, Konokotina I, Alexopoulos LG, Saez-Rodriguez J, Schaub T, Siegel A, Guziolowski C. Designing experiments to discriminate families of logic models. Frontiers in bioengineering and biotechnology. 2015 Sep 4;3:131.\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"scientific-reports","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"scirep","sideBox":"Learn more about [Scientific Reports](http://www.nature.com/srep/)","snPcode":"","submissionUrl":"","title":"Scientific Reports","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"Scientific Reports","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"Mycobacterium tuberculosis, cAMP, signaling dynamics, regulatory network, simulation, pathway logic","lastPublishedDoi":"10.21203/rs.3.rs-7265502/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-7265502/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003e\u003cem\u003eMycobacterium tuberculosis\u003c/em\u003e (Mtb), the causative agent of Tuberculosis, resides in host lung macrophages and has evolved unique processes to hijack host signaling pathways to facilitate its survival and propagation within macrophages. Notably, Mtb exports cyclic AMP (cAMP), a key regulatory signaling molecule, during infection. As can often be the case, experimental data exploring immune modulation by cAMP during Mtb infection are sparse, largely cross-sectional and offer only very partial coverage. Data-poor conditions such as significantly challenge conventional data-driven analyses. Accordingly, we apply a hypothesis driven approach to construct a mechanistically informed network model from prior knowledge of pathway signaling recovered from manually curated pathway schema and extracted from literature. Undocumented pathway elements are hypothesized under strict confidence measures using generative artificial intelligence to ensure a closed loop architecture consistent with homeostatic stability. Simulated perturbations to using the most plausible network models highlight the impact of IL-6 on cAMP response. 
Subsequent experimental validation using human THP-1 monocytes differentiated to macrophages supported this effect. These results suggest that the de novo creation mechanistically informed network model from prior knowledge may support early explorations of complex pathway dynamics, such as intracellular cAMP signaling during Mtb infection, when experimental data is sparse or unavailable.\u003c/p\u003e","manuscriptTitle":"Applying prior knowledge of regulatory signaling to investigate macrophage cAMP dynamics during Mycobacterium tuberculosis infection","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-09-11 10:56:02","doi":"10.21203/rs.3.rs-7265502/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"decision","content":"Revision requested","date":"2026-02-19T05:40:56+00:00","index":"","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-12-10T21:45:45+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-12-10T05:17:49+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-12-08T09:30:00+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"264417010017021773286610911803780606974","date":"2025-12-01T19:40:49+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"215878990456127243770598364003543460617","date":"2025-12-01T16:49:28+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"228143636492836196456557234975161823841","date":"2025-11-30T00:35:52+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"289245320775321406318577010272465785083","date":"2025-11-29T14:07:01+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"6491601892643404030794605423539002592","date":"2025-11-29T13:07:27+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"261105446634704242640929236786217663179","date":"2025-11-29T11:05:11+00:00","index
":"hide","fulltext":""},{"type":"editorInvited","content":"","date":"2025-10-30T13:18:31+00:00","index":"","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-09-24T06:40:26+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"49727983110543904581046782443916005050","date":"2025-09-04T05:55:41+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"169224378825177637150867320932724162594","date":"2025-09-04T04:23:43+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2025-09-03T20:52:59+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2025-08-09T09:50:40+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2025-08-07T05:31:34+00:00","index":"","fulltext":""},{"type":"submitted","content":"Scientific Reports","date":"2025-07-31T20:32:45+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"scientific-reports","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"scirep","sideBox":"Learn more about [Scientific Reports](http://www.nature.com/srep/)","snPcode":"","submissionUrl":"","title":"Scientific Reports","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"Scientific Reports","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"740a4dde-2c8e-45d3-951c-ab2ea246d561","owner":[],"postedDate":"September 11th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"under-review","subjectAreas":[{"id":54405936,"name":"Biological sciences/Computational biology and bioinformatics"},{"id":54405937,"name":"Biological sciences/Immunology"},{"id":54405938,"name":"Biological sciences/Microbiology"}],"tags":[],"updatedAt":"2026-05-01T08:38:45+00:00","versionOfRecord":[],"versionCreatedAt":"2025-09-11 10:56:02","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-7265502","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-7265502","identity":"rs-7265502","version":["v1"]},"buildId":"XKTyCvWXoU3ODBz1xrDgd","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
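The Discussion describes how the transient cAMP response observed after a single LPS pulse rules out the 4 of 48 candidate models that predicted a lasting elevation from a pulse alone. The elimination step can be sketched roughly as follows. This is only an illustration under stated assumptions: the `ModelPrediction` record, its field names, and the choice of which model IDs carry the pulse-sufficient prediction are hypothetical and are not taken from the paper's software or from Supplementary Table S7. Only the counts (48 models, 4 pulse-sufficient) come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrediction:
    """Qualitative prediction of one candidate network model (hypothetical layout)."""
    model_id: int
    # Does the model predict a lasting cAMP elevation after a single IL-6 pulse?
    pulse_gives_persistent_camp: bool


def consistent_models(pool, observed_pulse_persistent):
    """Keep only the models whose pulse prediction matches the observation."""
    return [m for m in pool
            if m.pulse_gives_persistent_camp == observed_pulse_persistent]


# Illustrative pool mirroring the counts in the text: 48 models, 4 of which
# predict that a pulse alone is sufficient. The IDs chosen here are made up.
pool = [ModelPrediction(model_id=i, pulse_gives_persistent_camp=(i < 4))
        for i in range(48)]

# Observation from the LPS pulse experiment: the cAMP increase was transient.
remaining = consistent_models(pool, observed_pulse_persistent=False)
print(f"{len(pool)} candidate models, {len(remaining)} remain after the pulse experiment")
```

Each additional measured mediator would add another field to the record and another condition to the filter, which is why multiplex measurements are expected to narrow the pool faster than a single cAMP readout.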