Phage-display immunoprecipitation sequencing reveals distinct antibody repertoire patterns across a rural-urban gradient

preprint OA: gold CC-BY-NC-4.0
📄 Open PDF Full text JSON View at publisher
Full text 49,677 characters · extracted from oa-pdf · 7 sections · click to expand

Abstract

Environmental microbial exposures are hypothesized to drive immune profile differences across rural-urban gradients, yet direct evidence linking exposures to immune phenotypes remains limited. Here, we employed phage-displayed immunoprecipitation sequencing (PhIP-seq) to profile antibody repertoires against 357,192 epitopes from >700 organisms in 62 age-matched healthy adults from rural Senegal (n=29), urban Senegal (n=20), and urban Netherlands (n=13). PhIP-Seq analysis identified two major patterns of antibody repertoire variation: geographically restricted signatures distinguishing Senegal from the Netherlands, and a trajectory associated with rural-urban gradients spanning from rural Senegal to urban Netherlands. Integration of PhIP-seq data with high-dimensional immunological profiling revealed immune signatures correlated with the rural-urban antibody repertoire trajectory. Our findings demonstrate that antibody repertoire profiling captures distinct microbial exposure histories across both geographic regions and rural-urban gradients and correlates with variation in immune signatures. This work establishes PhIP-seq as a powerful tool for linking environmental exposure histories to population-level immune landscapes and provides insights into immune adaptation across diverse environments. .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted August 27, 2025. ; https://doi.org/10.1101/2025.08.26.671278doi: bioRxiv preprint 2

Introduction

Environmental exposures and lifestyle factors drive variation in immune profiles (1-4) both between geographical regions (5) and across rural-urban gradients within countries (3), influencing disease susceptibility and vaccine responses (6). Understanding these immune variations is critical for developing tailored interventions (7), yet the specific environmental drivers and their mechanisms remain poorly defined. Emerging evidence reveals multiple contributing factors. Gut microbiome composition varies markedly between rural and urban populations, with rural Tanzanians exhibiting greater microbial diversity and enrichment of Prevotella compared to urban populations with higher Bacteroides abundance (8, 9) reflecting broader differences between low- and middle-income countries (LMICs) versus high-income countries (HICs) (8). Lifestyle scores also correlate with immune phenotypes, as rural-associated lifestyle scores predict more activated immune profiles than urban-associated scores (10). Additionally, dietary transitions from traditional to Western diets promote inflammatory status (11). Specific microbial exposures profoundly shape immune profiles through diverse mechanisms. Repeated malaria infections drive distinct transcriptomic profiles in immune cells of Kenyan children (12-14), while Mycobacterium exposure— both through BCG vaccination and environmental encounters— remodels innate immune responses (15). Fungal β-glucan exposure enhances monocyte responses to unrelated pathogens (16), and helminth infections selectively expand type 2 and regulatory T and B cell populations (17). These exposure-driven immune remodeling extend beyond LMIC settings: cytomegalovirus-seropositive individuals in HICs exhibit reduced vaccine responses compared to seronegative counterparts, exemplified by diminished Ebola vaccine immunogenicity in exposed versus naïve British adults (10). Current methodological limitations have precluded direct assessment of microbial exposure differences across rural-urban gradients at the individual level. Phage-displayed immunoprecipitation sequencing (PhIP-seq) now enables scalable, high-resolution profiling of individual antibody repertoires as molecular records of microbial encounters (18-20). This approach has been used to investigate autoimmunity (12-14), microbiome-directed responses (21), and population-level antibody diversity in Northern Netherlands (22), yet its potential for comparing global populations remains unrealized. Here, we employed PhIP-seq to profile antibody repertoires across individuals residing in rural Senegal, urban Senegal, and urban Netherlands, revealing distinct population-specific antibody epitope patterns. Furthermore, integration with comprehensive multi-omics datasets demonstrate shared variation between antibody repertoires and baseline immune signatures. This work provides insights into how environmental microbial exposures histories shape immune landscapes across rural-urban gradients in understudied populations. .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted August 27, 2025. ; https://doi.org/10.1101/2025.08.26.671278doi: bioRxiv preprint 3

Results

Study population To investigate antibody repertoire profiles along the rural-urban gradient, we analyzed 62 age- matched healthy adults from rural Senegal (n=29), urban Senegalese (n=20), and urban Netherlands (n=13) (Fig. 1A). Participants showed no significant age differences across groups (median 27 years, range 18-43 years; Kruskal-Wallis P=0.55). Sex distribution was balanced within Senegalese cohorts (rural: 48.3% female; urban: 50.0% female) but skewed toward females in the Dutch cohort (76.9% female; Chi-square P=0.069). All Senegalese participants had maintained residence in their respective environments for ≥10 years. As described previously (7), rural Senegalese participants exhibited typical rural lifestyle features, including larger household sizes and traditional housing with earth floors and mud walls compared to their urban counterparts. Figure 1. Overview of experiment and PhIP-seq data A, Graphical overview of the experiment. Serum from individuals residing in rural and urban Senegal and urban Netherlands was taken. The serum was incubated with a peptide phage library, containing 357,192 epitopes, and immunoprecipitation was performed to capture phages bound by antibodies. The samples were sequenced before bioinformatic analysis to annotate the epitopes (using protein BLAST) and extract patterns of antibody repertoire across the three groups. Created using Biorender.com. B, Boxplot showing the number of significant anti-epitopes found per individual. Centre line shows the median and box limits show the 25th and 75th percentiles. The whiskers extend to 1.5 times the interquartile range from the 25th and 75th percentiles. C, Histogram of the number of individuals an anti-epitope was detected in. The Y-axis is log10 transform and shows the number of anti-epitopes. D, PCA plot generated from the filtered PhIP-seq dataset, where individuals are coloured by residence. The large circles represent the centroid of the different groups in the PC1 and PC2 space. P values denote group centroid comparisons with the PERMANOVA test. .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted August 27, 2025. ; https://doi.org/10.1101/2025.08.26.671278doi: bioRxiv preprint 4 E, Boxplot showing the PC2 values of each individual, grouped by residence. Linear trend test and Tukey post hoc test for pairwise comparisons P values are indicated. ***P/FDR < 0.001. Antibody repertoires vary across the rural-urban gradient PhIP-seq analysis detected antibodies against 17,139 epitopes from over 700 organisms across all samples (Fig.1A, Supplementary File 1). Individual antibody repertoires contained a median of 609 significant anti-epitope responses (Fig.1B), with substantial heterogeneity in epitope recognition: 12,647 epitopes were detected in at most 1 individual, while 60 epitopes were recognized by over 50 individuals (80%) (Fig. 1C). Total anti-epitope numbers did not differ significantly between groups (Supplementary Fig. 1). Principal component analysis (PCA) revealed significant differences in overall antibody repertoires between Dutch and rural Senegalese populations (PERMANOVA P<0.001). Notably, we observed that rural Senegalese, urban Senegalese, and urban Dutch populations were ordered sequentially along the second principal component (PC2) (Figs. 1D, E), suggesting an underlying rural-urban trajectory of microbial exposure histories. Age and sex showed no significant associations with antibody profiles (Supplementary Figs. 2A, B). Hierarchical clustering of antibody repertoire profiles revealed several exposure history clusters across the residence groups (Fig. 2A). Clusters 2 and 4 showed geographically restricted signatures, with epitopes enriched mostly in Senegalese or Dutch participants, respectively. Cluster 3 demonstrated a clear rural-urban gradient, showing highest detection frequencies in rural Senegalese, followed by urban Senegalese, and lowest in urban Dutch subjects. To quantify these observed clustering patterns, we performed linear trend analysis across the rural-urban gradient. Eighty-six anti-epitopes showed significant rural-urban associations: 28 increased in frequency from rural Senegal to urban Netherlands, while 58 decreased (Fig. 2B, Supplementary Figs. 3-5, Supplementary File 2). These results demonstrate that differences in microbial exposure histories resulted in distinct antibody repertoire signatures spanning both geographic areas and rural-urban gradients. .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted August 27, 2025. ; https://doi.org/10.1101/2025.08.26.671278doi: bioRxiv preprint 5 .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted August 27, 2025. ; https://doi.org/10.1101/2025.08.26.671278doi: bioRxiv preprint 6 Figure 2. Antibody repertoire analysis across the rural-urban axis A, Clustered heatmap showing whether antibody (Ab) against a particular epitope was detected (red) or undetected (blue) in an individual. The heatmap is sliced column wise by the residence of the individual, and the rows sliced by k- means clustering on the epitopes. Selected epitopes are annotated on the rows and are coloured according to the type of organism. B, Barplot showing the percentage of individuals per residence group who have antibodies against epitopes found to have a significantly linear trend across the rural-urban axis. For each epitope, the protein and organism of origin is detailed. The organism text is coloured by organism type, in the same fashion as panel A. C, Barplot showing the percentage of epitopes from a particular organism that were found to have a significant linear trend across the rural-urban axis. The direction and colour of the bars denotes whether the anti-epitopes frequencies trend towards the rural Senegalese or urban Dutch direction. The organism names are coloured by whether they are viral, bacterial, parasitical or mammalian, and the most common protein of the significant anti-epitopes are also denoted. Centre line shows the median and box limits show the 25th and 75th percentiles. The whiskers extend to 1.5 times the interquartile range from the 25th and 75th percentiles. The number of significant anti- epitopes for the most common protein (per organism) is also shown. Linear trend test and Tukey post hoc test for pairwise comparisons P values are indicated. Linear trend p-values adjusted for multiple comparisons using Benjamini- Hochberg. NS P/FDR >= 0.05, *P/FDR < 0.05, **P/FDR < 0.01, and ***P/FDR < 0.001. Pathogen-specific antibody patterns varied across the rural-urban gradient. Rural Senegalese populations showed highest detection for Shigella spp., herpes simplex virus, and cytomegalovirus (CMV) epitopes, while urban Dutch individuals exhibited highest detection for rhinovirus B, Epstein-Barr virus (EBV), and Haemophilus influenzae (Fig. 2B). Interestingly, antibodies against epitopes from Staphylococcus aureus (S. aureus), Phocaeicola vulgatus and Streptococcus pyogenes proteins showed opposing patterns across the gradient (Fig. 2B), with S. aureus protein A anti-epitopes ranking among the most prevalent repertoires (detected in 61/62 individuals; Supplementary File 1). Since individual epitopes from the same pathogen sometimes showed divergent trends along rural-urban gradients, we assessed pathogen-level enrichment by calculating the proportion of each repertoire targeting organisms with significant linear associations (Supplementary File 2). Analysis of the four organisms with highest median trend Z-values for each direction of association confirmed epitope level findings: Shigella sonnei, S. flexneri, and human gammaherpesvirus 8 showed significant rural-trending responses, while EBV and Streptococcus dysgalactiae trended toward urban Dutch subjects (Fig. 2C). Geographic restriction in anti-epitope detection was particularly striking—nearly all Senegalese individuals (48/49) recognized Shigella sonnei epitopes compared to only three Dutch participants. Human gammaherpesvirus 8 were detected exclusively in rural Senegalese (14/29 individuals) but were completely absent from urban Senegalese and Dutch groups (Fig. 2C). .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted August 27, 2025. ; https://doi.org/10.1101/2025.08.26.671278doi: bioRxiv preprint 7 PhIP-seq anti-epitope trajectory was correlated with immune-signatures of urbanization Figure 3. Associations between PhIP-seq anti-epitope trajectory and immunological parameters. A, Boxplots showing IgG1 galactosylation levels across the residence area (left); pairwise t-test p-values are shown. Scatter plot showing correlation between IgG1 galactosylation levels and PhIP-seq PC2 (right); Spearman’s correlation coefficient (ρ) and p-value are shown. B, Analysis scheme for integrating anti-epitope trajectory obtained from PhIP-seq (PhIP-seq PC2) with the previously published immune features trajectory obtained from immune profiling datasets (rural-urban lambda metric). Created using Biorender.com. C, Correlation between PhIP-seq PC2 scores and rural-urban lambda metric. Spearman correlation coefficient (rho) and p-value are shown with a fitted regression line. D, Immune features significantly correlated with PhIP-seq PC2 scores (FDR < 0.1), sorted by magnitude and direction of correlation coefficient. .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted August 27, 2025. ; https://doi.org/10.1101/2025.08.26.671278doi: bioRxiv preprint 8 Having established distinct antibody profiles across residence groups, we investigated whether these exposure signatures correlate with immune activation. IgG1 Fc galactosylation, which decreases with immune activation across populations (23), showed a clear rural-urban gradient: lowest in rural Senegalese, intermediate in urban Senegalese, and highest in urban Dutch participants (Fig. 3A). This pattern suggests increasing immune activation from urban to rural settings. Importantly, IgG1 galactosylation levels showed strong negative correlation with the PhIP-seq PC2 scores (Spearman's ρ=−0.64, P<0.0001), suggesting that population-specific microbial exposure histories modulate immune We next integrated PhIP-seq data with comprehensive immune profiling data from the same individuals (Fig. 3B; n = 18 rural Senegalese, 8 urban Senegalese, 4 urban Dutch). Our previously established rural-urban immune trajectory lambda metric (7)—derived from unsupervised integration of multiple immunological datasets—correlated significantly with PhIP- seq PC2 scores (Spearman's ρ=0.55, P=0.0021; Fig. 3C, Supplementary Fig. 6A). This convergence demonstrates that microbial exposure histories captured by antibody repertoires align with immune signatures. To identify immune features driving this association, we correlated PhIP-seq PC2 scores with all immune parameters from our published dataset. Nine immune features exhibited significant associations (at FDR<0.1; Supplementary Fig. 6B), revealing distinct immunological signatures linked to exposure patterns. Rural-associated antibody profiles correlated with higher frequencies of ILC2s, CD11c + B cells, and CD56dimCD16+ NK cells, as well as elevated serum IL-8 levels (Fig. 3D, Supplementary Fig. 6C). Conversely, urban-associated profiles correlated with increased ILC3 and NKp46+ NK cell frequencies alongside higher IgG1 Fc galactosylation. These findings demonstrate that serum antibody repertoires serve as robust proxies for microbial exposure histories and their downstream immunological consequences.

Discussion

Using PhIP-seq, we profiled antibody repertoires of individuals across the rural-urban gradient and revealed distinct antigen exposure histories that correlate with environmental and lifestyle factors. Repertoires detected among rural Senegalese individuals were enriched for antibodies against pathogens associated with poor sanitation and endemic exposure, including Shigella species, human herpesviruses, and Plasmodium falciparum. Conversely common respiratory pathogens including Rhinovirus B, Epstein-Barr virus, and Haemophilus influenzae. These patterns align with known epidemiological differences in infectious disease burden between LMICs and HICs. Interestingly, antibodies against human gammaherpesvirus 8–also known as Kaposi’s sarcoma herpesvirus (KSHV)–were detected among rural Senegalese participants but not their urban counterparts. KSHV is known to be highly prevalent in Sub-Saharan Africa and its higher prevalence in rural regions has been described previously (24). Studies have linked higher KSHV prevalence to factors typically associated with rural living, including denser household, .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted August 27, 2025. ; https://doi.org/10.1101/2025.08.26.671278doi: bioRxiv preprint 9 which facilitate viral transmission (25), and Schistosoma infections, which induced viral reactivation (26)— factors also present in rural Senegal. It is interesting to note that antibodies against certain epitopes are enriched in particular populations and that antibodies against epitopes from the same organism can display opposite enrichments. Enrichment of antibodies against particular epitopes has been reported in the context of disease, with antibodies against Epstein-Barr virus lytic phase proteins significantly enriched in individuals with primary sclerosing cholangitis compared to control (27) and antibodies against bacterial flagellin epitopes enriched in individuals with Crohn's disease (12). For this paper, the antibody response S. aureus stands out, simultaneously detected in most individuals for some epitopes and for others enriched towards both ends of the rural-urban axis. The implications of these results, especially for vaccine development, is worth investigating further. The differences in immune profile across the rural-urban axis is hypothesised to be driven, at least in part, by pathogen exposure in the environment (7, 23), as suggested by differences in infectious disease burden between LMICs and HICs (28, 29). Here, we demonstrated the convergence between the independently constructed trajectories of antibody repertoire profiles and immune remodeling along the rural-urban gradients. This analysis further revealed specific immune features associated with rural-urban exposure histories. For example, ILC2s, which was associated with rural repertoire profiles, were known to be expanded following helminth infections (17). Conversely, urban repertoire profiles were associated with higher frequencies of NK cells expressing NKp46, critical NK cell marker for controlling viruses such as influenza (30). Future studies with larger sample size will provide greater statistical power to link specific microbial epitopes to immune parameters to further our understanding of mechanistic link between exposures and specific factors driving it.(28, 29). Together these findings highlight the need to utilise this technology to further investigate the link between pathogen exposures and immune adaptations. This work can be extended upon in multiple ways, the first being to increase sample size to numbers similar to other papers (12, 22), allowing greater statistical rigour and overcoming interpretation issues confounded by individual library size, such as exposure diversity and read count levels. The second expansion would be in the pathogen epitopes investigated. Most of the epitopes are from microbiome organisms, reflecting the background of PhIP-seq in studying inflammatory bowel diseases (12, 21). When assessing differences across the rural-urban axis it would be beneficial to include epitopes from pathogens common to LMICs, but not HICs, such as helminths (29). Nevertheless, the fact that most of our findings align with epidemiological observations, such as higher detection of KSHV in rural Senegal, highlight the strength of PhIP- seq as a tool for a broad, systematic survey of microbial exposures and might inform regional disease screening policies. In conclusion, PhIP-seq robustly captures differential microbial exposure patterns across rural- urban gradients, providing insights into how environmental exposures shape population-specific .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted August 27, 2025. ; https://doi.org/10.1101/2025.08.26.671278doi: bioRxiv preprint 10 immune signatures. This approach establishes a framework for understanding immune adaptation in diverse global populations and environments. .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted August 27, 2025. ; https://doi.org/10.1101/2025.08.26.671278doi: bioRxiv preprint 11

Materials and methods

Study design Participants were recruited between November 2018 and August 2019 from three sites: rural Senegal (Pakh and Richard Toll; n=72), urban Senegal (Dakar; n=46), and the Netherlands (Leiden; n=28). Recruitment was coordinated through local healthcare centers in Senegal and Leiden University Medical Center. Written informed consent was obtained from all participants. The study was approved by the ethics committees of Cheikh Anta Diop University of Dakar (0339/2018/CER/UCAD) and Leiden University Medical Center (NL66287.058.18, NL- OMON48968). Rural Senegalese participants originated from agricultural villages in the Senegal River basin, where schistosomiasis has historically been highly endemic. Urban Senegalese participants were recruited from Dakar, a non-endemic area with improved sanitation and consistent access to clean water. Retrospective surveys in Dakar reported intestinal parasite prevalence of ~19– 26%, although estimates may be biased by sampling from symptomatic patients. Rural and urban classifications were confirmed using questionnaire data on household size, animal contact, housing conditions, and amenities. The Senegalese population is ethnically diverse (predominantly Wolof, Pular, and Serer), though cultural admixture is common; most participants were Wolof. Inclusion criteria were: age 18–40 years, residence in the study area ≥10 years, and absence of chronic illnesses (e.g. diabetes, hypertension). Participants underwent clinical examination and completed standardized questionnaires covering demographic, socioeconomic, lifestyle, and medical history information. Exclusion criteria were evidence of infection with Plasmodium spp. (thick smear, rapid diagnostic test), Schistosoma spp. (Kato-Katz and urine filtration, 12 µm filters), or intestinal helminths (Ascaris lumbricoides, Trichuris trichiura, hookworm) as determined by stool microscopy. To ensure comparable sample quality across sites, technical staff received centralized training at Leiden University Medical Center and adhered to standardized operating procedures (SOPs) for sample collection and processing. Venous blood was drawn into heparin, EDTA, and dry tubes for cell isolation, full blood count, and serology testing. Description of immune-profiling datasets Five datasets from our previously published study (7) were re-analyzed in this manuscript: peripheral blood immune cell phenotyping (ex vivo; EXV), cytokine production following 6-hour stimulation with PMA/ionomycin or monophosphoryl lipid-A (MPL), IgG Fc glycan profiling (GLY), and serum cytokine levels (CYT). Detailed methodological description for the acquisition, pre-processing, and analysis for each dataset as well as the integrative analysis of these datasets can be found elsewhere (7). Briefly, the EXV, PMA, and MPL datasets were obtained from mass cytometry analysis of peripheral blood mononuclear cells (PBMCs). Cryopreserved PBMCs (3×10⁶ cells) were .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted August 27, 2025. ; https://doi.org/10.1101/2025.08.26.671278doi: bioRxiv preprint 12 cultured with either PMA/ionomycin or MPL for 6 hours (Brefeldin-A added for the final 4 hours), then stained using two standardized antibody panels for surface markers, intracellular cytokines, and nuclear antigens, with dead cell discrimination via intercalator staining. Cells were acquired on a Helios mass cytometer with automated tuning, normalization using calibration beads, and quality control through FlowJo gating and specialized R packages. Batch effects in stimulated samples were corrected using CytoNorm, followed by two-step unsupervised clustering via FlowSOM and FastPG to identify immune cell clusters. IgG Fc-glycopeptides (GLY) were profiled by affinity purification on Protein G beads, trypsin digestion, and LC–MS analysis. LacyTools was used for glycopeptide quantification, yielding subclass-specific Fc glycosylation profiles for IgG1 (24 glycopeptides) and IgG2 (14 glycopeptides); IgG3/IgG4 peptides were excluded due to allotype overlap. From these data, glycosylation traits—including fucosylation, galactosylation, sialylation, bisection, and the sialic acid/galactose ratio—were derived using standard formulas based on relative glycopeptide abundances. Plasma cytokines (CYT) were quantified using the R&D Systems Luminex Discovery Assay Human Premixed Multi-Analyte Kit (CCL3, CCL8, CCL24, complement component C5a, CXCL10, HGF, IL-3, IL-8, IL-18, vascular endothelial growth factor, CCL2, CCL7, CCL20, CXCL9, CXCL13, IFN-γ, IL-6, IL-10, and S100A8) with reduced reagent volumes. Analytes undetectable in >40% of samples were excluded, and values below the lower limit of detection (LLOD) were imputed as half the LLOD. Finally, these immune-profiling datasets were integrated using MOFA2 R package–an unsupervised method that learns latent factors capturing variations across datasets. Latent factors were further reduced into two dimensions using multidimensional scaling and then used as input for trajectory analysis using principal curve (princurve R package), which have previously revealed a trajectory of immune variation across the rural-urban gradient. Phage immunoprecipitation sequencing (PhIP-Seq) PhIP-Seq assays were conducted following the protocol described by Vogl et al. (2021), using the same core parameters with a few minor modifications: a 4,000-fold coverage of phages per variant and 2 µg of antibodies per reaction. The microbiota library, consisting of 244,000 variants, was mixed with two additional libraries: a 100,000-variant pool containing peptides derived from allergen and environmental compound databases (as described by Leviatan et al.) (31), and a 13,192-variant pool primarily composed of Coronavirus antigens (developed by Klompus et al.) (32). Bead washing steps were automated using a Tecan Fluent 1080 liquid- handling robot. PCR amplifications were performed using primers specified in Vogl et al. (2021) (21), except for PCR2, where 10-nucleotide unique dual index primers (purchased from IDT) were used. PhIP-Seq data processing Samples were sequenced with Illumina Novaseq X. Raw reads bam files were then converted to fastq format using bamtofastq (bedtools v2.31.0) and subsequently analyzed through a .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted August 27, 2025. ; https://doi.org/10.1101/2025.08.26.671278doi: bioRxiv preprint 13 previously established workflow. Reads were mapped to the three antigen libraries mentioned above [refs], and raw counts for each peptide were obtained. To assess the enrichment of antibody bound peptides, peptide counts were normalized by comparing them to the oligopeptide counts in the initial library as previously done. The phage library was sequenced before immunoprecipitation and peptide counts were modeled as a generalized Poisson distribution to form the null model. For each sample, peptide enrichment significance (p-values) was computed while controlling for Family Wise Error Rate (FWER) with the Bonferroni method. Annotation of epitopes The agilent library peptides were run against the Uniref90 (33) database to annotate the protein and organism of origin. Some of the results from the Uniref90 database did not have a specific tax, showing up as “Bacteria” or “root”. The amino acid sequences from the non-specific agilent and the entire twist peptide libraries were analysed with blastP against non-redundant protein sequences and the result XML files retrieved. For each epitope, the scientific names of the top 15 results (in terms of E-value) were extracted. The organism origin of the epitope was assigned to be the most common scientific name in the top 15 list. Ties were broken randomly. Manual standardization of the organism names was also carried out, for example removing strain information from the scientific name and ensuring continuity between the names derived from the Uniref90 database and blast. For example, cytomegalovirus is named “Human betaherpesvirus 5” in blast results and “Human cytomegalovirus” in the Uniref90 results. This was standardised to “Human cytomegalovirus”. Statistical analysis All statistical analyses were performed in R (v4.4.1). All data points represent one biological replicate, and technical replicates were not performed. No sample points were omitted from analysis. All testing was two sided. The PhIP-seq dataset was treated as a binary matrix showing presence/absence of antibody against a particular epitope. Individuals who had a significant enrichment for a particular epitope were treated as though an antibody against that epitope was detected in the individual. Principal component analysis was performed using the R stats package on a subset of the binary PhIP-seq dataset where epitopes were removed that were expressed in less than 25% of the individuals across all three residence groups. Significant pairwise differences between residence and sex groups in the PC space were assessed using PERMANOVA, as implemented in the adonis2 function of the Vegan R package. To assess whether the frequency of epitopes showed a linear trend across the residence groups a Cochran-Armitage test was performed across all the individuals and epitopes, with p- values adjusted using Benjamini-Hochberg procedure to account for multiple testing. For testing linear trend across the continuous data, orthogonal polynomial contrast was utilised where features which were significant for linear (but not quadratic) trend were treated as significant. .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted August 27, 2025. ; https://doi.org/10.1101/2025.08.26.671278doi: bioRxiv preprint 14

Acknowledgements

We thank the laboratory members of the M.Y. group for their critical feedback. We would like to acknowledge all staff members in Senegal (field workers, medical staff of the health Centre of Richard Toll, staff of the Immunology Laboratory at Cheikh Anta Diop University, Senegal) and the Netherlands who helped to make this study possible. Last, we would like to thank all volunteers who participated in this study, without whom the study could not have been performed. Funding: This study is part of the EDCTP2 program supported by the European Union (grant no. TMA20216CDF-1595). M.Y. was supported by a NWO Spinoza Prize 2021, an NWO- WOTRO Science for Global Development Programme (no. W 07.30318.019), and an ERC Advanced Grant (no. 101055179). M.D.M. were supported by the Silicon Valley Community Foundation (DAF2020-217471). Author contributions: M.Y., M.M., and M.D.M. were responsible for the study design. R.F.L. and M.D.M. were responsible for the statistical analysis and data interpretation and prepared the first draft. M.Y., M.M., and M.D.M. acquired funding. G.I. and T.V. generated the data and optimized the PhIP-seq experimental protocols. All authors reviewed the manuscript. Competing interests: The authors declare that they have no competing interests. The funding sources had no role in collecting, analyzing, interpreting, or reporting the data. The opinions expressed in this document reflect only the author’s view. The funding sources are not responsible for any use that may be made of the information that it contains. Data and materials availability: Data are available upon reasonable request to the authors. .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted August 27, 2025. ; https://doi.org/10.1101/2025.08.26.671278doi: bioRxiv preprint 15

References

1. Brodin P, Jojic V, Gao T, Bhattacharya S, Angel CJ, Furman D, et al. Variation in the human immune system is largely driven by non-heritable influences. Cell. 2015;160(1-2):37-47. 2. Smolen KK, Cai B, Fortuno ESR, Gelinas L, Larsen M, Speert DP, et al. Single-cell analysis of innate cytokine responses to pattern recognition receptor stimulation in children across four continents. J Immunol. 2014;193(6):3003-12. 3. de Ruiter K, Jochems SP, L. TD, Stam KA, König M, van Unen V, et al. Helminth infections drive heterogeneity in human type 2 and regulatory cells. Sci transl med. 2020;12(524). 4. Carr EJ, Dooley J, Garcia-Perez JE, Lagou V, Lee JC, Wouters C, et al. The cellular composition of the human immune system is shaped by age and cohabitation. Nat Immunol. 2016;17(4):461-8. 5. Mouwenda YD, Betouke Ongwe ME, Sonnet F, Stam KA, Labuda LA, De Vries S, et al. Characterization of T cell responses to co-administered hookworm vaccine candidates Na-GST- 1 and Na-APR-1 in healthy adults in Gabon. PLoS Negl Trop Dis. 2021;15(10):e0009732. 6. van Dorst MMAR, Pyuza JJ, Nkurunungi G, Kullaya VI, Smits HH, Hogendoorn PCW, et al. Immunological factors linked to geographical variation in vaccine responses. Nature Reviews Immunology. 2024;24(4):250-63. 7. Manurung MD, Heieis GA, König M, Azimi S, Ndao M, Veldhuizen T, et al. Systems analysis unravels a common rural-urban gradient in immunological profile, function, and metabolic dependencies. Sci Adv. 2025;11(18). 8. Yatsunenko T, Rey FE, Manary MJ, Trehan I, Dominguez-Bello MG, Contreras M, et al. Human gut microbiome viewed across age and geography. Nature. 2012;486(7402):222-7. 9. Pyuza JJ, van Dorst M, Barnett D, Stam K, Manurung M, Wammes L, et al. Tanzanian gut microbiota profiles linked to high but rapidly waning yellow fever antibody titers. NPJ Biofilms Microbiomes. 2025;11(1):110. 10. Bowyer G, Sharpe H, Venkatraman N, Ndiaye PB, Wade D, Brenner N, et al. Reduced Ebola vaccine responses in CMV+ young adults is associated with expansion of CD57+KLRG1+ T cells. J Exp Med. 2020;217(7). 11. Temba GS, Pecht T, Kullaya VI, Vadaq N, Mosha MV, Ulas T, et al. Immune and metabolic effects of African heritage diets versus Western diets in men: a randomized controlled trial. Nature Medicine. 2025;31(5):1698-711. 12. Bourgonje AR, Andreu-Sanchez S, Vogl T, Hu S, Vich Vila A, Gacesa R, et al. Phage- display immunoprecipitation sequencing of the antibody epitope repertoire in inflammatory bowel disease reveals distinct antibody signatures. Immunity. 2023;56(6):1393-409 e6. 13. Klepper A, Asaki J, Kung AF, Vazquez SE, Bodansky A, Mitchell A, et al. Novel autoantibody targets identified in patients with autoimmune hepatitis (AIH) by PhIP-Seq reveals pathogenic insights. medRxiv. 2024. 14. Vazquez SE, Mann SA, Bodansky A, Kung AF, Quandt Z, Ferre EMN, et al. Autoantibody discovery across monogenic, acquired, and COVID-19-associated autoimmunity with scalable PhIP-seq. eLife. 2022. 15. Mai D, Jahn A, Murray T, Morikubo M, Lim PN, Cervantes MM, et al. Exposure to Mycobacterium remodels alveolar macrophages and the early innate response to Mycobacterium tuberculosis infection. PLOS Pathogens. 2024;20(1):e1011871. 16. Netea MG, Domínguez-Andrés J, Barreiro LB, Chavakis T, Divangahi M, Fuchs E, et al. Defining trained immunity and its role in health and disease. Nature Reviews Immunology. 2020;20(6):375-88. .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted August 27, 2025. ; https://doi.org/10.1101/2025.08.26.671278doi: bioRxiv preprint 16 17. de Ruiter K, Jochems SP, Tahapary DL, Stam KA, König M, van Unen V, et al. Helminth infections drive heterogeneity in human type 2 and regulatory cells. Science Translational Medicine. 2020;12(524):eaaw3703. 18. Mohan D, Wansley DL, Sie BM, Noon MS, Baer AN, Laserson U, et al. PhIP-Seq characterization of serum antibodies using oligonucleotide-encoded peptidomes. Nat Protoc. 2018;13(9):1958-78. 19. Larman HB, Laserson U, Querol L, Verhaeghen K, Solimini NL, Jing Xu G, et al. PhIP- Seq characterization of autoantibodies from patients with multiple sclerosis, type 1 diabetes and rheumatoid arthritis. Journal of autoimmunity. 2013;43:1-9. 20. Larman HB, Zhao Z, Laserson U, Li MZ, Ciccia A, Gakidis MA, et al. Autoantigen discovery with a synthetic human peptidome. Nat Biotechnol. 2011;29(6):535-41. 21. Vogl T, Klompus S, Leviatan S, Kalka IN, Weinberger A, Wijmenga C, et al. Population- wide diversity and stability of serum antibody epitope repertoires against human microbiota. Nat Med. 2021;27(8):1442-50. 22. Andreu-Sanchez S, Bourgonje AR, Vogl T, Kurilshikov A, Leviatan S, Ruiz-Moreno AJ, et al. Phage display sequencing reveals that genetic, environmental, and intrinsic factors influence variation of human antibody epitope repertoire. Immunity. 2023;56(6):1376-92 e8. 23. de Jong SE, Selman MH, Adegnika AA, Amoah AS, van Riet E, Kruize YC, et al. IgG1 Fc N-glycan galactosylation as a biomarker for immune activation. Sci Rep. 2016;6:28207. 24. Pfeiffer RM, Wheeler WA, Mbisa G, Whitby D, Goedert JJ, de Thé G, et al. Geographic Heterogeneity of Prevalence of the Human Herpesvirus 8 in Sub-Saharan Africa: Clues About Etiology. Annals of Epidemiology. 2010;20(12):958-63. 25. Sabourin KR, Marshall VA, Eaton W, Kimono B, Mugisha J, Miley WJ, et al. Factors affecting Kaposi's sarcoma-associated herpesvirus transmission in rural Ugandan households, a longitudinal study. Infect Agent Cancer. 2024;19(1):49. 26. Nalwoga A, Webb EL, Chihota B, Miley W, Walusimbi B, Nassuuna J, et al. Kaposi's sarcoma-associated herpesvirus seropositivity is associated with parasite infections in Ugandan fishing communities on Lake Victoria islands. PLoS Negl Trop Dis. 2019;13(10):e0007776. 27. ElAbd H, Pesesky M, Innocenti G, Chung BK, Mahdy AKH, Kriukova V, et al. T and B cell responses against Epstein-Barr virus in primary sclerosing cholangitis. Nat Med. 2025;31(7):2306-16. 28. Scott TA, Baker KS, Trotter C, Jenkins C, Mostowy S, Hawkey J, et al. Shigella sonnei: epidemiology, evolution, pathogenesis, resistance and host interactions. Nat Rev Microbiol. 2025;23(5):303-17. 29. Chen J, Gong Y, Chen Q, Li S, Zhou Y. Global burden of soil-transmitted helminth infections, 1990-2021. Infect Dis Poverty. 2024;13(1):77. 30. Bar-On Y, Glasner A, Meningher T, Achdout H, Gur C, Lankry D, et al. Neuraminidase- Mediated, NKp46-Dependent Immune-Evasion Mechanism of Influenza Viruses. Cell Reports. 2013;3(4):1044-50. 31. Leviatan S, Kalka IN, Vogl T, Klompas S, Weinberger A, Segal E. BIPS-A code base for designing and coding of a Phage ImmunoPrecipitation Oligo Library. PLoS Comput Biol. 2022;18(11):e1010663. 32. Klompus S, Leviatan S, Vogl T, Mazor RD, Kalka IN, Stoler-Barak L, et al. Cross- reactive antibodies against human coronaviruses and the animal coronavirome suggest diagnostics for future zoonotic spillovers. Sci Immunol. 2021;6(61). 33. Suzek BE, Wang Y, Huang H, McGarvey PB, Wu CH, UniProt C. UniRef clusters: a comprehensive and scalable alternative for improving sequence similarity searches. Bioinformatics. 2015;31(6):926-32. .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted August 27, 2025. ; https://doi.org/10.1101/2025.08.26.671278doi: bioRxiv preprint 17 Supplementary information Supplementary Figure 1. Boxplot showing the number of significant anti-epitopes identified for each individual, grouped by the residence of individual. Kruskal Wallis and pairwise Wilcoxon Rank Sum p values are indicated. .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted August 27, 2025. ; https://doi.org/10.1101/2025.08.26.671278doi: bioRxiv preprint 18 Supplementary Figure 2. PCA plot generated from the filtered PhIP-seq dataset, where individuals are coloured by sex (A) and age (B). The large circles represent the centroid of the different groups in the PC1 and PC2 space. P values denote group centroid comparisons with the PERMANOVA test (left) or Pearson correlation test (right). .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted August 27, 2025. ; https://doi.org/10.1101/2025.08.26.671278doi: bioRxiv preprint 19 Supplementary Figure 3. Clustered heatmap showing whether antibody against a particular epitope was detected (red) or undetected (blue) in an individual. The heatmap is sliced column wise by the residence of the individual, and the rows sliced by k-means clustering on the epitopes. Epitope IDs, protein and organism of origin are annotated on the rows and are coloured according to their type, as in Figure 2A & B. .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted August 27, 2025. ; https://doi.org/10.1101/2025.08.26.671278doi: bioRxiv preprint 20 Supplementary Figure 4. Barplots of all the anti-epitopes which had a significant linear trend towards the rural Senegalese side of the rural-urban axis. Barplots show the percentage of individuals per group who have antibodies against that particular epitope .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted August 27, 2025. ; https://doi.org/10.1101/2025.08.26.671278doi: bioRxiv preprint 21 Supplementary Figure 5. Barplots of all the anti-epitopes which had a significant linear trend towards the urban Dutch side of the rural-urban axis. Barplots show the percentage of individuals per group who have antibodies against that particular epitope .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted August 27, 2025. ; https://doi.org/10.1101/2025.08.26.671278doi: bioRxiv preprint 22 Supplementary Figure 6. A, PCA plot of PhIP-seq dataset, subsetted for individuals with corresponding high-dimensional immune-profiling datasets. B, Volcano plot of correlations between PhIP-seq PC2 trajectory with immune parameters. C, Scatter plot showing immune parameters with statistically significant correlation (FDR<0.1) with PhIP-seq PC2 trajectory. .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted August 27, 2025. ; https://doi.org/10.1101/2025.08.26.671278doi: bioRxiv preprint 23 Supplementary File 1. File containing all epitopes in which antibodies were detected in at least one individual. Information included is whether the organism name was derived from blastP or not, the number of individuals in which antibody was detected, the amino acid sequence and the epitope ID. Supplementary File 2. File containing all the epitopes which were found to have a significant linear trend across the rural-urban axis. Statistics from the Cochran-Armitage test results can be found, as well as the epitope ID and the protein and organism from which the epitope is from. .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted August 27, 2025. ; https://doi.org/10.1101/2025.08.26.671278doi: bioRxiv preprint

Text is read by the "Ask this paper" AI Q&A widget below. Extraction quality varies by source — PMC NXML preserves structure cleanly, OA-HTML may include some navigation residue, and OA-PDF can have broken hyphenation. The publisher copy (via DOI) is the canonical version.

My notes (saved in your browser only)

Ask this paper AI returns verbatim quotes from the full text · source: oa-pdf

Answers must be backed by verbatim quotes from this paper's full text. Hallucinated quotes are dropped automatically; if no verbatim passage answers the question, we say so. How this works

Citation neighborhood (no data yet)

We don't have any in-corpus citations linked to this paper yet. This is a recent paper (2025) — citers typically take a year or two to land, and the OpenAlex reference graph may still be filling in.

Source provenance

europepmc
last seen: 2026-05-20T01:45:00.602351+00:00
unpaywall
last seen: 2026-05-21T05:10:58.409756+00:00
License: CC-BY-NC-4.0