Abstract

Immunity to Mycobacterium tuberculosis (Mtb), like immunity to many other pathogens, is encoded jointly by the
antigen specificities and functions of responding CD4 T cells. However, these features span a
large two-dimensional possibility space – defined on one axis by the Mtb proteome, and on the
other by the T cell transcriptome – that exceeds the dimensionality of existing technologies. Here
we present an approach (“CRESTA”) that combines highly-multiplexed DNA-barcoded epitope
probes, single cell sequencing, and clonal analysis of T Cell Receptors (TCRs) to robustly detect
rare antigen-specific CD4 T cells across hundreds of epitopes simultaneously and reveal their
transcriptome-wide phenotypes. By comprehensively assaying known epitopes in Mtb-infected
participants, we reveal polyclonal and multi-epitope responses across a spectrum of
differentiation states, uncover previously-unobserved phenotypic diversity within and between
epitopes, and increase the total number of known Mtb epitope-mapped TCRα:βs by ~8-fold. We
expect CRESTA to enable high-dimensional analyses of CD4 T cell responses in various settings,
including infection, cancer, autoimmunity and allergy.

Introduction

Helper T cell immunity resides in populations of CD4+ cells that have clonally expanded in
response to particular MHC class II-bound peptide antigens, and differentiated to acquire
specialized effector functions 1–4. Although these features together theoretically encode an
individual’s state of immunity, they are difficult to measure in an integrated and comprehensive
way because existing technologies do not scale well to the large diversity of possible CD4+ T cell
antigen specificities and functions. Traditional assays for the identification of antigen-specific T
cells include those that measure peptide-stimulated cytokine production (e.g., immunospot
assays and flow cytometric detection of intracellular cytokines5,6) or upregulated markers7, as well
(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
The copyright holder for this preprintthis version posted November 8, 2024. ; https://doi.org/10.1101/2024.11.05.622086doi: bioRxiv preprint
as assays that use fluorescently- or isotopically-labeled peptide:MHC probes to directly detect
antigen-binding T cells 8,9. Although widely useful, these approaches have a limited ability to
multiplex across antigens and phenotypic markers, meaning that available sample volumes and
cell numbers quickly become limiting, typically precluding comprehensive analyses10.

More recently, DNA-barcoded peptide:MHC probes have been used to bypass the spectral limits
of cytometric approaches (which generally peak at 10–50-plex), and have enabled CD8 T cell
responses to be resolved across a multiplexity of up to ~1000 distinct peptide:MHCs 11. While it
represents a major advance, this system has two key limitations: (i) by itself, it does not enable
the simultaneous capture of T cell phenotypic information and (ii) it has not yet been adapted to
the analysis of CD4 T cells. The latter likely reflects a combination of factors, including: (i) the
lower frequencies of epitope-specific CD4 T cells, (ii) the generally lower affinity of CD4 T cell
receptors (TCRs) for their MHC class II-restricted antigens, and (iii) greater challenges in
identifying MHC class II:peptide binding pairs and constructing the corresponding probes. Indeed,
despite considerable efforts to enable the multiplexed detection of CD4 T cell antigen-specificity
9,12–18, the greatest reported scale at which epitopes have been resolved simultaneously in a
primary sample using any approach is 6-plex19. A powerful assay developed recently uses an
elegant cell interaction reporter system to enable the genome-scale discovery of CD4 T cell
antigens across 100,000s of candidate epitopes 20; however, it requires substantial genetic
engineering of T cells and therefore does not enable the analysis of primary and/or polyclonal
samples (currently it has been limited to the mapping of TCRs). Overcoming the barriers to
highly-multiplexed analysis of CD4 T cell epitopes in primary samples would represent an important
advance with widespread applications across the many settings in which helper T cell immunity
is implicated.

The opportunities that would be enabled by high-dimensional assays of CD4 T cell specificity and
phenotype are exemplified by Mycobacterium tuberculosis (Mtb), a pathogen responsible for 1.3
million global deaths in 2022, and for which an effective vaccine would have a major impact21. No
vaccine for Mtb has been approved since BCG (~100 years ago), although there are several
experimental formulations under development 22–24. A major challenge in the field has been
selecting which Mtb antigens (from among >4000 proteins) to include in a vaccine, and what T
cell functions it should elicit. Previous attempts to define protective immunity to Mtb have shown
mixed success and have focused on diversity in one dimension at a time – e.g., using proteins/lysates
to identify diverse CD4 T cell states (e.g. cytokine polyfunctionality25,26), or IFNγ production
in response to diverse sets of peptides predicted to bind Human Leukocyte Antigen (HLA)
proteins27,28.

More recently, sequence clustering approaches have been used to analyze the TCRs of T cells
enriched for Mtb specificity and reveal public TCR groups29,30, including groups that are positively
and negatively associated with disease progression 31. Importantly, these findings indicate that
particular antigen-specificities may be important in protection from Mtb. However, the analysis of
TCRs alone does not enable the efficient identification of the cognate Mtb epitopes (which have
been mapped for only a minority of TCR groups), nor does it capture the corresponding T cell
phenotypes. These two missing features – antigen-specificities and phenotypes – are critical
attributes if a correlate of protection is to be maximally actionable in guiding next-generation
vaccine design.

Here, we present an assay that can measure reactivity across 100s of HLA class II:peptide
antigen specificities simultaneously within a single PBMC sample, and associate each
epitope-specific response with a transcriptome-wide T cell phenotype. We use this assay to generate a
comprehensive portrait of the CD4 T cell response to Mtb in infected participants – molecularly
resolved across 100s of epitopes and 1,000s of transcripts – and reveal previously undescribed
polyclonality, phenotypic diversity and TCR sequence features within this response.

Results

Simultaneous, highly-multiplexed measurement of CD4 T cell epitope-specificity and
transcriptional state enabled by clonal analysis

To enable the integrated, high-dimensional analysis of CD4 T cell epitope-specificities and
transcriptional states, we adapted an approach developed previously for the highly-multiplexed
analysis of CD8 T cell epitope-specificities11. That assay involved generating DNA-barcoded
probes of MHC class I:peptide complexes multimerized onto dextran backbones, incubating T
cells with pools of probes, using fluorescence to sort probe-binding cells, and analyzing binding
by deep sequencing of the DNA barcodes. From this starting point, we introduced three main
modifications, each of which was critical to enable our analysis of CD4 T cell responses. First, in
place of MHC class I, we used MHC class II:peptide complexes, prepared by exchanging peptides
of interest into HLA reagents bearing peptides tethered by a protease-cleavable linker 32. Second,
we used single cell sequencing in place of bulk sequencing, to enable T cell transcriptional states
to be detected and linked to epitope specificity. Third, instead of cell sorting, we introduced an
antigen-specific clonal expansion step to overcome the low frequencies of circulating
epitope-specific CD4 T cells, and then leveraged this expansion to boost analytical power by developing
a “pseudobulk” approach in which we aggregated single cell data at the clonal level. Together,
we refer to this workflow as the Clonally-Resolved Epitope-Specificity and Transcriptome Assay
(CRESTA) (Figure 1).

We applied CRESTA to study the T cell response in HLA-typed (Supplementary Table 1),
Quantiferon-positive, HIV-negative participants from a cohort in Western Kenya, sampled
following recent (≤3 months) household exposure to active pulmonary tuberculosis 33. To
represent the known antigen space, we used the Immune Epitope DataBase (IEDB) to
comprehensively identify a total of 206 Mtb peptide:HLA pairs known to generate human T cell
responses in the context of 4 class II restriction elements that were prevalent in this cohort
(HLA-DQB1*06:02, HLA-DRB1*15:03, HLA-DRB1*11:01, or HLA-DRB5*01:01). We also included 32
peptide binders for a fifth prevalent HLA – HLA-DRB1*01:02 – identified in an in vitro peptide:HLA
binding assay. Finally, we selected a total of 85 additional peptide:HLA pairs as controls, which
included CLIP-tethered (uncleaved) versions of each HLA, Mtb epitopes restricted by additional
alleles (not expressed by the participants of interest), and IEDB epitopes from Influenza A virus
(see Supplementary Table 2 for a complete list of peptide:HLA pairs). We expanded
antigen-specific T cells from 5–10 × 10^6 PBMCs for 7–9 days using a master pool of all peptides. We
then analyzed two aliquots of each sample: the first (“antigen-specificity aliquot”) was stained with
pooled DNA-barcoded peptide:HLA class II multimers corresponding to the participants’ HLA
types; the second (“gene expression aliquot”) was not stained but instead briefly restimulated with
PMA/ionomycin to enhance antigen-responsive gene expression. This dual aliquot strategy was
motivated by initial experiments (not shown) indicating that PMA/ionomycin greatly enhanced our
ability to resolve T cell states, but led to diminished multimer binding, likely due to downregulation
of the TCR34. Both aliquots of each sample were analyzed by single cell sequencing using the
10X Genomics Chromium platform to recover multimer barcode counts, transcriptome-wide
mRNA abundance data and paired, full-length TCRα:β sequences on 1,000s–10,000s of single
cells. We then informatically matched TCR sequences between the two aliquots to identify cells
from common clones and thereby unite antigen-specificity and gene expression data for each.
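
The clone-matching step can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the field names `tcr_alpha` and `tcr_beta`, the function names, and the use of exact identity of the paired sequences as the definition of a clone are all assumptions made for illustration.

```python
from collections import defaultdict


def clone_key(cell):
    """Clone identity (assumed): the exact paired TCR alpha and beta sequences."""
    return (cell["tcr_alpha"], cell["tcr_beta"])


def group_by_clone(cells):
    """Map each TCR alpha:beta pair to the list of cells that carry it."""
    clones = defaultdict(list)
    for cell in cells:
        clones[clone_key(cell)].append(cell)
    return clones


def match_clones(specificity_cells, expression_cells):
    """Return clones seen in BOTH aliquots, with the cells from each aliquot."""
    spec = group_by_clone(specificity_cells)
    expr = group_by_clone(expression_cells)
    return {
        key: {"specificity": spec[key], "expression": expr[key]}
        for key in spec.keys() & expr.keys()
    }
```

Keying on the paired chains rather than on either chain alone keeps cells that happen to share only one chain from being merged into the same clone.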

In a representative participant (ID:40059), we recovered data on a total of 28,129 cells – 17,312
and 10,817 for the antigen-specificity and gene expression aliquots, respectively – the former
aliquot having been stained with a pool of 48 multimers (containing DRB1*15:03, DRB1*11:01 or
DQB1*06:02, all expressed by the participant). We reasoned that TCRα:β sequences could be
used to assign single cells into clonal families – each representing the progeny of a single
Mtb-specific precursor – that each share a common epitope-specificity and differentiation state. To
test this hypothesis, for each aliquot we organized individual cells into clones based on shared
TCRα:β sequences, retained clones that contained ≥3 individual cells in each aliquot, and used
a Kruskal-Wallis test to quantify the extent to which signal from peptide:HLA multimers and
transcripts was driven by clonal identity (Figure 2). This analysis included a total of 170 clones,
each of which contained 6–616 individual cells across the two aliquots (median=45) (Figure 2a).
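
To make the clonal partitioning test concrete, the sketch below computes the Kruskal-Wallis H statistic for a single feature (one multimer barcode or one transcript) across clones. It is an illustration under stated assumptions rather than the authors' code: the data layout (a dict mapping clone to per-cell values), the function names and the `min_cells` parameter are hypothetical, the tie correction is omitted, and the conversion of H to a p-value (chi-square with k−1 degrees of freedom) is left out. In practice a library routine such as `scipy.stats.kruskal` would return both the statistic and the p-value.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (without tie correction) for a list of groups."""
    values = [v for group in groups for v in group]
    ranks = average_ranks(values)
    n = len(values)
    total, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


def clonal_h(values_by_clone, min_cells=3):
    """H statistic across clones, keeping only clones with >= min_cells cells."""
    groups = [v for v in values_by_clone.values() if len(v) >= min_cells]
    return kruskal_wallis_h(groups)
```

A large H indicates that the per-cell values of the feature are ordered by clone far more than expected by chance, which is the sense in which signal is "driven by clonal identity".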

Consistent with the hypothesis, we observed strong partitioning of multimer binding by clone. In
particular, we identified 10 of the 48 multimers to be binding in this participant (p-values ranging
from ~1e-5 to 1e-298), of which 3, 4 and 3 were restricted by DRB1*15:03, DRB1*11:01 and
DQB1*06:02, respectively (Figure 2b). To identify the individual binding clones for each of these
epitopes, we applied one-tailed Wilcoxon tests post-hoc to each binding multimer, in which we
compared the binding of each individual clone to the overall distribution (Figure 2c). This analysis
identified a total of 54 binding clones (32% of all clones), ranging from 1–19 per multimer, revealing
extensive polyclonality in the response to individual Mtb epitopes. Binding clones were
non-overlapping between multimers, with the exception of 3 pairs of multimers that featured
overlapping peptides restricted by the same HLA and shared overlapping patterns of clonal
recognition (discussed further below), supporting the fidelity of the process. Moreover, the TCR
sequences of epitope-binding clones showed significant homology within epitopes and
recapitulated known public motifs, as discussed further below.
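
The post-hoc step of calling individual binding clones can be sketched as a one-sided rank-sum comparison. The code below is a hedged illustration only: it assumes that "the overall distribution" means all cells outside the clone being tested, it uses the Mann-Whitney U formulation with a normal approximation and no tie or continuity correction, and the function names and the `alpha` threshold are hypothetical rather than taken from the paper.

```python
import math


def one_sided_rank_sum(clone_values, other_values):
    """Test whether clone_values tend to exceed other_values.

    Returns (U, p): the Mann-Whitney U statistic for the clone and a
    one-sided p-value from the normal approximation.
    """
    n, m = len(clone_values), len(other_values)
    u = 0.0
    for x in clone_values:
        for y in other_values:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    mean = n * m / 2
    sd = math.sqrt(n * m * (n + m + 1) / 12)
    z = (u - mean) / sd
    # Upper-tail probability of a standard normal at z.
    p = 0.5 * math.erfc(z / math.sqrt(2))
    return u, p


def binding_clones(counts_by_clone, alpha=0.05):
    """Return clones whose barcode counts are shifted above all other cells."""
    hits = []
    for clone, values in counts_by_clone.items():
        others = [v for c, vs in counts_by_clone.items() if c != clone for v in vs]
        _, p = one_sided_rank_sum(values, others)
        if p < alpha:
            hits.append(clone)
    return hits
```

The test is one-sided because only clones with elevated multimer barcode counts are of interest; a clone with unusually low counts is simply a non-binder.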

Similarly, we observed strong partitioning of transcript abundance across clonotypes: applied to
the same set of 170 clones, the Kruskal-Wallis test detected 155 genes at a Bonferroni-corrected
threshold of p<0.05 and fold-difference threshold of 10 (Figure 2d). These genes, which we
hereafter refer to as Clonal Differentiation Genes (CDGs), were strongly enriched for genes
implicated in helper T cell function, and prominently featured cytokines and chemokines known
to be produced by particular subsets, including IFNG, IL17A/F, IL1A, CCL3, CCL4, CCL5, GNLY,
CXCL8, CCL3L1, CCL4L2 and IL22. We also observed patterns that were consistent with
the maintenance of distinct differentiation states across clones, for example the anti-correlated
expression of the Th1 chemokine CCL4 and the Th17 cytokine IL17A (Figure 2e). Together,
these results establish CRESTA as a method for robustly resolving antigen-specific CD4 T cells
across many antigen-specificities simultaneously from a single PBMC sample, and associating
each with a transcriptome-wide gene expression state.

Comprehensive, hypothesis-free phenotyping of the Mtb-specific response using CRESTA

To perform a deep, multi-participant analysis of the epitope-specific CD4 T cell response to Mtb
infection, we applied CRESTA to a total of 5 individuals with latent tuberculosis infection (LTBI)
(ID:40059, analyzed above, and 4 additional participants, IDs: 30128, 30133, 30129, 30168) from
the Kenyan cohort. We stained
cells from each participant with multiplexed probesets matching their HLA types (32–173
multimers per participant, across 5 distinct HLA class II alleles), and recovered single cell
sequencing data on a total of 106,673 cells (4,400–14,496 and 7,866–13,475 per participant for
the antigen-specificity and gene expression aliquots, respectively). In total, this yielded 1,496
evaluable clones (169–454 per participant).

Consistent with our observations above, epitope staining (analyzed as described in Figure 2b)
partitioned strongly by clonal identity and revealed polyclonal responses across all participants.
The highest dimensional staining was 173-plex in participant ID:30168, in which we observed
robust binding across 15 distinct epitopes (Figure 3a), with a total of 126 epitope-binding clones.
Analysis of these profiles revealed 7 epitope clusters (each containing 1–5 epitopes) within which
there was extensive sharing of binding clones (Figure 3b). All 7 clusters were internally HLA
matched and comprised sets of overlapping peptide sequences, indicative of a common core
peptide epitope in each (but with possible, clone-specific additional contributions from
flanking/polymorphic residues). Importantly, no clones were positive for epitopes from more than
1 cluster, indicating high assay specificity.

Across all 5 participants, we detected a total of 19 epitopes that showed significant binding to ≥1
clone in ≥1 participant (Figure 3c). Based on analysis of shared clones, these epitopes partitioned
into 11 clusters, each of which exclusively contained members sharing a common HLA and
overlapping peptide sequences, and we again found no examples of clones binding across
clusters. For simplicity, in all downstream analyses, we refer to these clusters as “epitopes”, even
though some comprise signal from several multimer probes that overlap a common core epitope.
In total, we detected 293 epitope-specific clones in the antigen specificity aliquot, ranging from
4–139 per participant and 4–93 per epitope (Figure 3d). This number already represents nearly an
order of magnitude increase over the total number of TCRα:β sequence pairs mapped to
individual HLA II-restricted Mycobacterial epitopes in all prior studies to date (36 TCRs identified
in 5 studies from 2009–2023: IEDB, queried 6/29/2024). Finally, across the range of 32–173-plex
staining, we saw no evidence that the intensity of multimer probe binding was inversely correlated
with the number of probes pooled for any given participant (Figure 3e), indicating that we have
yet to approach the limit of assay multiplexing.

Next, we used the clonally resolved expression analysis described above in Figure 2d to identify
CDGs from the “gene expression aliquot” from each participant, and again observed these to be
strongly enriched for known CD4 T cell differentiation genes. For each CDG, we collapsed
individual cell expression values into per-clone medians, and then used UMAP to generate a
2-dimensional representation of the CDG-wide expression profile of each clone. This allowed us to
cluster clones according to gene expression in a hypothesis-free fashion (Figure 4a shows
participant ID:30133 as an example), and revealed distinct clusters whose expression profiles
corresponded to those of naive, Th1, Th2 and Th17 cells. Beyond canonical gene expression
patterns (i.e. IL2RA–, IFNG+, IL4+, IL17+ in naive, Th1, Th2 and Th17 cells, respectively),
hypothesis-free correlation analysis across all CDGs revealed that these clusters were each
characterized by broad state-specific expression patterns, which included previously-described
genes (e.g. TNF, CCL3, CCL4, HOPX in Th1 clones; IL3, IL5, IL13, GATA3, IL17RB in Th2
clones; IL17F, IL22, RORC, CCL20, NR4A2 in Th17 clones) as well as genes not previously
associated with these T cell subsets (e.g. PTGS2 in Th2 clones; TGIF1 in Th17 clones) (Figure
4b). We also observed clones with a Treg-like expression pattern, characterized by high
expression of FOXP3 and CTLA4, although these did not form a distinct cluster.
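
The per-clone "pseudobulk" collapse described in this paragraph can be sketched as follows. The record layout (`clone`, `expr`), the function name, and the choice to treat a gene missing from a cell as zero expression are assumptions for illustration; the UMAP embedding itself is not shown, and the paper does not state which implementation was used.

```python
from statistics import median


def clone_medians(cells, genes):
    """Collapse single-cell expression into one median profile per clone.

    `cells` is a list of dicts with a clone label and a gene -> value mapping;
    genes absent from a cell are treated as zero (an assumption).
    """
    by_clone = {}
    for cell in cells:
        by_clone.setdefault(cell["clone"], []).append(cell["expr"])
    return {
        clone: [median(expr.get(gene, 0.0) for expr in exprs) for gene in genes]
        for clone, exprs in by_clone.items()
    }
```

Each resulting vector (one value per CDG) is a row of the clone-by-gene matrix that a dimensionality reduction method such as UMAP would take as input.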

Among a total of 1,496 clones analyzed for gene expression across the 5 participants, 1,344
(90%) could be assigned a differentiation state, comprising Th1 (74%), Th2 (2%), Th17 (9%),
Treg (2%) and naive (4%) subsets. The average clone sizes across these diverse states differed
significantly, with Th1 clones being the largest (median = 10 cells), followed by Th17, Th2, Treg
and then naive clones (median = 3 cells) (Figure 4c). Of the 1,496 clones, we were able to assign
Mtb antigen specificity to 243 (16%), across the 11 epitopes described above in Figure 3 (50 of
the 293 clones described in Figure 3d were not evaluable because, although they had ≥3 cells in
their respective antigen-specificity aliquot, they had <3 cells in the gene expression aliquot).
Comparing these epitope-mapped clones to the total population, we observed a marked further
skewing towards the Th1 state (97% of multimer-binding clones), at the expense of all other
subsets (Figure 4d). As expected, no naive clones were found to be multimer binding. For this
dataset, we therefore conclude that CD4 T cells specific for Mtb epitopes are strongly
biased towards the Th1 subset.

Analysis of phenotypic heterogeneity within and between Mtb epitope-specific T cell responses

Despite the predominance of Th1 clones, we also detected rare Mtb-specific clones with Th17
(n=2) and Treg (n=2) phenotypes. A closer examination of these clones confirmed the expression
of a range of transcripts corresponding to their respective states, and revealed that these patterns
were broadly consistent across the individual constituent cells of each clone (Figure 5a).
Notably, each of these clones existed alongside others specific for the same Mtb peptide:HLA
epitope in the same participant, including the DRB1*15:03_HVSFVMAYPEMLAA (PE13) and
DRB1*01:02_NIAFASGFRAIN (Rv0293c) epitopes. Together, this represents the first evidence
of which we are aware that T cells recognizing the same Mtb epitope in the same individual can
adopt highly-divergent differentiation states.

We next tested the hypothesis that, even within a Th1-dominated response, T cells recognizing
different epitopes can have divergent gene expression programs. To address this in a systematic
way, we considered all cases in which we observed ≥2 distinct reactive epitopes within a
participant, each of which was recognized by ≥3 clones. These criteria focused the analysis on a
total of 230 clones across 9 epitopes in 3 participants. We began by identifying CDGs as above,
yielding a total of 81, 118 and 139 genes in participants ID:40059, ID:30129 and ID:30168,
respectively. Focusing on these subsets of genes, we then used Kruskal-Wallis tests to compare
the median gene expression of each clone across epitopes, measuring whether expression was
correlated with epitope specificity. We detected strong correlations between gene expression and
epitope specificity within all 3 participants, with p-values up to 1e-5 (Figure 5b). At thresholds of
p2, we identified 11, 17 and 27 epitope-linked genes in the 3
respective participants. These sets overlapped significantly: 5 genes were common to ≥2
participants – IFNG, CCL3L1, GNLY, CCL4, LTB – all of which have known roles in Th1 effector
function.

To visualize how clones specific for different epitopes cluster in gene expression space, we used
UMAP to render the CDG-wide expression profiles of each clone in 2 dimensions (Figure 5c,
upper row). Rather than a random distribution of clones specific for different epitopes (indicated
in different colors) across this space, in each participant we observed clustering of clones
according to their epitope specificity, consistent with our Kruskal-Wallis analysis. Prominent
among these were clusters of: (i) DRB1*15:03_HVSFVMAYPEMLAA-specific clones in
participant ID:40059 enriched for GNLY expression (green dots in Figure 5c, upper left), (ii)
DRB1*11:01_VDLAKSLRIAAKIYS-specific clones in participant ID:30129 enriched for CCL3L1
expression (yellow dots in Figure 5c, upper middle), and (iii) DQB1*06:02_EQQWNFAGIEAAA-,
DRB1*15:03_AAVVRFQEAANKQK- and DRB1*15:03_HVSFVMAYPEMLAA-specific clones
with high, high and low IFNG expression, respectively (cyan, red and green dots in Figure 5c,
upper right). Together these results reveal that different epitopes can program diverse Th1 gene
expression states within the same Mtb response, and that these are characterized by the
differential expression of important effector genes.

Analysis of TCR sequence clustering within the anti-Mtb response across epitopes and
participants

An alternative to the approach described here for the multiplexed detection of
epitope-specific T cell responses has been the identification of clusters of homologous TCR sequences,
followed in some cases by screening assays to map their specificities 30. Applied to Mtb, this
approach has been used to identify reactivities associated with protection from disease
progression31. However, it remains unclear how TCRs identified by such homology-based
approaches map onto the overall epitope-specific response.

The identification here of an unprecedented number of TCRα:β pairs mapped across a panel of
Mtb peptide:HLA epitopes presented an opportunity to assess the degree of TCR sequence
homology within epitope-specific T cell responses to Mtb in a minimally-biased way. To enable
robust statistical power, we focused on the 8 HLA:peptide epitopes for which we detected ≥10
multimer-binding clones across participants – with the requirement that each clone have a single
α and single β chain (which excluded 19% of clones, for which we sequenced 1 or >2 chains), to
eliminate ambiguity about the chains involved in epitope binding – yielding a total of 206 TCRα:β
pairs. We then used TCRdist 35 to perform comprehensive pairwise sequence similarity
measurements within each epitope. To rigorously quantify significance, we constructed a null
distribution of >1e12 pairwise TCRdist measurements on a large set of TCRα:βs
randomly-sampled from an unenriched repertoire. We used this distribution to transform TCRdist measures
into p-values and applied Benjamini–Hochberg adjustment for the number of pairwise
measurements performed for each epitope.
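
The conversion of distances into adjusted p-values can be illustrated with a short sketch. It is not the authors' implementation: the function names are hypothetical, the null distribution is represented as a small in-memory sorted list (the paper's null contains >1e12 measurements and would need a binned or streamed representation), and the +1 pseudocount in the empirical p-value is an assumption. Because a smaller TCRdist value means greater similarity, the p-value is taken from the lower tail of the null.

```python
from bisect import bisect_right


def empirical_p(distance, sorted_null):
    """Fraction of null distances <= the observed one, with a +1 pseudocount."""
    n_le = bisect_right(sorted_null, distance)
    return (n_le + 1) / (len(sorted_null) + 1)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonic adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In this scheme the adjustment would be applied separately to the set of pairwise comparisons within each epitope, matching the per-epitope correction described above.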

We detected significant TCR homologies for all 8 of the Mtb epitopes analyzed – evident both as
the formation of clusters at an adjusted threshold of p<0.1 (Figure 6a), as well as overall
deviations from the null p-value distribution (Figure 6b). However, the extent of this clustering
differed markedly by epitope, encompassing up to 10/14 (71%) of TCRs specific for the esxH
epitope DRB1*11:01_HEANTMAMMARDTAE, and as few as 2/13 (15%) for the CFP-10 epitope
DQB1*06:02_ISTNIRQAGVQYSR. The PE13 epitope DRB1*15:03_HVSFVMAYPEMLAA, for
which we detected the largest number of clustered clones overall, was notable for a large and
unusually tight cluster of highly-homologous TCRs that was dominated by clones from two of the
three participants that reacted to that epitope. For these TCRs, we observed a precise match to
V/J segment usage (α: TRAV25/27, TRAJ52/40; β: TRBV9) and CDR3 motifs (α: CAG***S/TY,
β: CASSVAL*G) described previously30, further supporting the fidelity of our multiplexed
epitope-specific assay (Figure 6c). Overall, across the 8 epitopes, 99/206 (48%) epitope-specific TCRs
were part of detectable homology clusters. Importantly, however, this degree of clustering was
highly-dependent on filtering for single epitope binding to restrict the number of TCR comparisons.
When testing an oligoclonal scenario in which epitope-specific TCRs were diluted to 5% with
random TCRs – designed to simulate more traditional enrichment methods where TCRs specific
for >20 epitopes may be analyzed together – detectable clusters remained in only 5 of the 8
epitopes, and comprised just 24% of all TCRs.

Many of the observed clusters included TCRs from different participants, underscoring their public
nature, and indeed for 5 of the 7 epitopes that contained clones from >1 participant, we detected
no difference between the distributions of intra-participant v inter-participant TCR distances
(0.16<p<1 by Kolmogorov–Smirnov test), indicating that features recognizing the common
peptide:HLA epitope often predominate over any participant-to-participant differences in TCR
repertoires. A striking exception to this was the DRB1*15:03_HVSFVMAYPEMLAA epitope:
whereas the TCRs from participants 40059 and 30129 clustered indistinguishably (red v blue dots
in the left box of Figure 6a: p(Inter v Intra | 40059v30129) = 0.69), the TCRs from 30168 (green
dots) were an outlier (p(Inter v Intra | 30168v40059) = 0.001; p(Inter v Intra | 30168v30129) =
0.0005), driven by a dearth of 30168 epitope-specific TCRs in the epitope’s central homology
cluster, and a correspondingly greater frequency of non-clustered TCRs. Intriguingly, this outlier
participant (30168) is a homozygote for the restricting allele (DRB1*15:03), and in fact represents
the only case among the 7 DRB1-restricted epitopes for which the participant was homozygous.
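
The intra- versus inter-participant comparison can be sketched in two steps: partition all pairwise TCR distances by whether the two TCRs come from the same participant, then measure the largest gap between the two empirical distributions. The code below is an illustration under assumptions: `distance` is a hypothetical stand-in for a TCRdist-style metric, the input layout and function names are invented for the example, and only the Kolmogorov-Smirnov statistic is computed (a routine such as `scipy.stats.ks_2samp` would also supply the p-value).

```python
from itertools import combinations


def split_distances(tcrs, distance):
    """Split all pairwise distances into intra- and inter-participant lists.

    `tcrs` is a list of (participant_id, tcr) tuples; `distance` is any
    symmetric function of two TCRs (hypothetical stand-in for TCRdist).
    """
    intra, inter = [], []
    for (p1, t1), (p2, t2) in combinations(tcrs, 2):
        (intra if p1 == p2 else inter).append(distance(t1, t2))
    return intra, inter


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum ECDF difference."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both samples past every value equal to x.
        while i < len(a) and a[i] == x:
            i += 1
        while j < len(b) and b[j] == x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d
```

A statistic near 0 means that TCRs from different participants are as similar to each other as TCRs from the same participant, which is the pattern reported for most epitopes here.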
359
Together, these findings indicate that public TCR sequence features arise in the response to all 360
or most Mtb epitopes: occasionally (e.g. for the PE13 epitope) these are prominent and can 361
characterize the majority of responding TCRs, however for most epitopes, sequence homologies 362
are relatively rare and likely only detectable when the analysis is focused on TCRs from epitope-363
specific cells (e.g. by filtering on probe binding). 364
365
Discussion

In this study, we present a platform – the Clonally-Resolved Epitope-Specificity and
Transcriptome Assay (CRESTA) – that enables the integrated analysis of CD4 T cell specificities
and phenotypes at a dimensionality that greatly exceeds what was previously possible. By
applying this assay to study the response across hundreds of HLA class II-restricted epitopes in
Mtb-infected participants, we generate a portrait of CD4 T cell immunity to Mtb at unprecedented
breadth and resolution. This analysis reveals previously undescribed features including extensive
intra-epitope polyclonality and phenotypic heterogeneity, both within and between epitopes. In
the process, we also map hundreds of new TCRα:β pairs to Mtb peptide:HLA epitopes, and describe
how public epitope-specific sequence features vary across epitopes and individuals. More
generally, since its assay targets are fully customizable, we expect CRESTA to be broadly
applicable beyond Mtb, and to enable similar insights in other research/disease settings in which
antigen-specific CD4 T cell responses are implicated.
381
Our finding that human Mtb epitope-specific T cells can occupy a range of differentiation states –
including Th17, Treg, and a range of Th1 substates – within the same host, and sometimes even
against the same epitope (Figures 4, 5), illuminates layers of diversity in this response that were
not previously understood. Especially notable is the related observation that different epitope
specificities can elicit markedly different Th1 phenotypes within the same response (Figure 5b,c).
Together, these findings indicate that existing approaches for studying Mtb-specific T cell
responses – which typically rely on antigen pools and the detection of a limited number of
phenotypic markers (e.g. IFNγ, TNF) 28,36 – capture only a subset of the response. The finding
that different Mtb antigens can elicit T cells with different phenotypes/functions also offers a new
class of mechanism to inform the interpretation of prior results that point to antigen-specific
protection, including: (i) that different Mtb antigens are under purifying v diversifying evolutionary
selection pressure 37, and (ii) that particular TCR clusters (i.e. epitope-specific responses) are
associated with protection v non-protection from disease 31. Particularly intriguing is our finding
that the same PE13 epitope that was associated with protection in the latter study can be uniquely
associated with high expression of GNLY (Figure 5c), a pore-forming effector found in cytotoxic
granules that can kill Mtb bacteria at high concentrations38,39. More generally, our findings invite
future applications of CRESTA to larger cohorts of Mtb-infected participants, including
comparisons of progressors v controllers, to explore more comprehensively which T cell
epitope-specificities and phenotypes may be associated with protection from disease.

bioRxiv preprint doi: https://doi.org/10.1101/2024.11.05.622086; this version posted November 8, 2024. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
402
Our unbiased, epitope-resolved analysis of >200 TCRα:β sequences identified by CRESTA
(Figure 6) yields several important insights. First is the finding that, when filtering stringently on
TCRα:β pairs specific for individual Mtb epitopes, some degree of homology is detectable for all
epitopes. This observation is consistent with the first-principles consideration that, although most
epitopes are likely capable of recognition by many possible TCR binding modes, each binding
sequence is likely to be surrounded by a cluster of similar sequences separated by amino acid
substitutions that preserve binding. Second, however, is the countervailing observation that, even
in our stringent setting (filtering on individual epitopes), these homologies encompass only a
minority of antigen-specific TCRs. Moreover, when extrapolating to the more typical scenario
where oligoclonal/polyclonal repertoires are analyzed, such homologies remained detectable in
only rare cases – implying a limit to the sensitivity of approaches that rely on TCR clustering alone.
Third, although the most common finding was indistinguishable TCR similarity patterns within v
between individuals, our observation of a striking exception – in which one participant showed
dramatically weaker TCR clustering for the PE13 epitope – represents the best evidence of which
we are aware that there can be large individual-specific differences in the TCR sequences raised
to a fixed peptide:HLA antigen. Conceivable interpretations include: (i) individual-to-individual
differences in the T cell repertoire, shaped by other HLAs, self-peptides and/or antigen exposures;
(ii) individual-to-individual differences in the milieu in which the epitope-specific T cells were
primed, potentially altering the affinity threshold for T cell activation; and (iii) a gene-dose effect of
the restricting HLA, wherein homozygosity may increase the density of peptide:HLA complexes and
thereby decrease the affinity threshold for T cell activation. Finally, CRESTA’s ability to generate
large numbers of TCRα:β sequences mapped to individual peptide:HLA epitopes positions it as a
powerful tool for TCR discovery: as of June 2024, considering HLA class II:peptide-specific TCRα:β
pairs in the IEDB, this study alone generated ~8X more pairs for Mtb, and >10% of the total number
of pairs identified across all research fields. Potential future applications include the development
of TCR-based therapies, as well as the training of next-generation models for predicting antigen
specificity from TCR sequences.
430
A cornerstone of CRESTA is clonal analysis, which we use to (i) make rare epitope-specific CD4
T cells detectable within limited sample volumes, and (ii) enable robust, multi-datapoint-based
inferences about their antigen specificity and phenotype from single cell sequencing data. At the
same time, the use of clonal expansion has the potential to limit the assay in several ways. First,
it is likely that the efficiency of such expansion varies according to the state of the precursor T
cells, meaning that highly proliferative subtypes may reach detectable levels before less
proliferative subtypes do, biasing the representation of cells analyzed. Indeed, within cells
expanded from Mtb-infected participants, we observe a significant association between clone size
and helper T cell state (Figure 4c). Nonetheless, CRESTA successfully detects cells across the
Th state spectrum, including numerous clones of the less proliferative Treg and Th2 subtypes
(Figure 4), and its sensitivity to less expanded clones is likely to increase as the throughput of
single cell sequencing technologies continues to grow. Importantly, the impact of such biases can
be controlled by comparing clones of interest against bystander clones within the same expanded
samples, as we exemplify in Figure 4d. It is also noteworthy that the sensitivity of common
(non-expansion-based) T cell specificity assays is often also biased to particular phenotypes (e.g.
ELISpot, which depends on cytokine secretion). A second potential limitation of clonal expansion is
the possibility of introducing artefactual gene expression patterns. While we have not directly
quantified these effects, our observation of gene expression patterns that partition strongly
according to clonal identity within a mixed culture (Figures 2d,e) – and recapitulate the known Th
subsets upon unsupervised multi-gene analyses (Figure 4a,b) – indicates that initial
differentiation states persist during expansion to at least a substantial degree. In the future, any
remaining impact may be mitigated by reducing the degree of expansion (our observed clone size
distributions indicate that even an order of magnitude less expansion is unlikely to substantially
impact assay sensitivity), and/or by using other sample types (e.g. bronchoalveolar lavage) that
are more enriched for the relevant antigen-specificities.
456
While it represents a major advance beyond what has been enabled in prior studies of CD4 T
cells, the application of CRESTA at a multiplexity of up to ~170 epitopes in this work does not yet
reach the scale that is needed for broad proteome-scale epitope discovery and characterization.
However, since (i) we expect DNA barcoding and sequencing to be intrinsically highly scalable
(>100,000s-plex), (ii) we saw no evidence for loss of binding signal as plexity increased up to
~170 (Figure 3e), and (iii) analogous approaches for CD8 T cells have been successfully
implemented up to ~1000-plex (even without the clonal analysis enhancement we describe here)11,
we expect CRESTA to scale well beyond the plexity that we have demonstrated here. At that point,
new bottlenecks to be overcome will be the efficient identification of candidate peptide:HLA pairs
(e.g. using in silico prediction and/or high-throughput binding screens), and the preparation of the
corresponding probes in large numbers (e.g. using microfluidic automation). Realizing these
advances could enable powerful new studies of CD4 T cell specificities and functions, and their
interactions, at a truly genome-wide scale in diverse disease states.
470
Figures and legends

Figure 1: Workflow of the Clonally-Resolved Epitope-Specificity and Transcriptome Assay
(CRESTA). To enable the integrated, high-dimensional analysis of CD4 T cell epitope-specificities
and transcriptional states, we begin with the high-throughput assembly of hundreds of
peptide:MHC probes, each multimerized and DNA-barcoded using a streptavidin-bearing dextran
backbone construct (upper left). Pools of these probes are used to stain T cells clonally expanded
from PBMC samples using peptides (lower left, center), and then single cell sequencing is used
to read out epitope-level specificities and transcriptome-wide expression profiles (upper right).
Finally, to interpret the resulting data, we apply a “pseudobulk” analysis that aggregates cells into
clones according to shared TCRα:β sequences (lower right), enabling inferences about
epitope-specificity and transcriptional state that are robust to the noise inherent in single cell data.
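The “pseudobulk” step described in this legend can be illustrated with a minimal sketch. It assumes each cell is a record carrying its TCRα and TCRβ identifiers plus one measured value (a probe or gene count); the field names, the function name and the toy values are hypothetical, and the study itself used custom R code rather than this Python illustration.

```python
from collections import defaultdict
from statistics import median


def pseudobulk_by_clone(cells, min_cells=3):
    """Group cells by their TCR alpha:beta pair and summarise each clone.

    Clones with fewer than `min_cells` cells are dropped, mirroring the
    minimum clone size filter mentioned in the Figure 2 legend.
    """
    values_by_clone = defaultdict(list)
    for cell in cells:
        clone_id = (cell["tcr_alpha"], cell["tcr_beta"])
        values_by_clone[clone_id].append(cell["value"])
    return {
        clone_id: {"n_cells": len(values), "median": median(values)}
        for clone_id, values in values_by_clone.items()
        if len(values) >= min_cells
    }


# Toy data (hypothetical): two expanded clones and one singleton.
cells = [
    {"tcr_alpha": "TRA_A", "tcr_beta": "TRB_A", "value": 12},
    {"tcr_alpha": "TRA_A", "tcr_beta": "TRB_A", "value": 0},
    {"tcr_alpha": "TRA_A", "tcr_beta": "TRB_A", "value": 15},
    {"tcr_alpha": "TRA_A", "tcr_beta": "TRB_A", "value": 14},
    {"tcr_alpha": "TRA_B", "tcr_beta": "TRB_B", "value": 1},
    {"tcr_alpha": "TRA_B", "tcr_beta": "TRB_B", "value": 0},
    {"tcr_alpha": "TRA_B", "tcr_beta": "TRB_B", "value": 2},
    {"tcr_alpha": "TRA_C", "tcr_beta": "TRB_C", "value": 30},
]
clones = pseudobulk_by_clone(cells)
for clone_id, summary in clones.items():
    print(clone_id, summary)
```

In the toy data, the single dropout cell (value 0) in the first clone barely moves the clonal median, which is the sense in which clone-level summaries are robust to single cell noise.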
501
Figure 2: Clonal analysis enables robust, high-dimensional analysis of CD4 T cell Mtb
epitope-specificity and transcriptome-wide gene expression, simultaneously.
Peptide-expanded PBMCs from an LTBI participant (ID:40059) were stained with a pool of 48
DNA-barcoded peptide:MHC multimer probes corresponding to all Mtb T cell epitopes in IEDB known
to be restricted by either HLA-DRB1*15:03, HLA-DRB1*11:01 or HLA-DQB1*06:02, and analyzed
by single cell sequencing (“antigen-specificity aliquot”). A second aliquot that was unstained but
stimulated with PMA/ionomycin was also assayed (“gene expression aliquot”). A total of 28,129
cells across the 2 aliquots were collapsed into clones based on identical TCRα:β sequences, 170
of which contained ≥3 cells in each aliquot (≥6 total). (a) Shown is the distribution of cell numbers
in each of the 170 clones. (b) To identify which of the 48 epitopes were recognized, we applied a
Kruskal-Wallis test to data from the “antigen-specificity aliquot” to determine whether, for each
multimer probe, staining across cells partitioned in a clonally-restricted way (p-value, y-axis), and
to quantify the magnitude of such partitioning (fold-difference, x-axis). (c) For significant epitopes,
we next identified their particular binding clones using a Wilcoxon test to compare the distribution
of probe binding to each clone against the distribution for all other clones. Shown, by way of
example, are 2 of the significant epitopes identified in (b), with their respective significant clones
indicated in yellow. (d) To identify genes whose expression varies by clone (“clonal differentiation
genes / CDGs”), we applied the same clonally-resolved analysis described in (b), but this time to
gene expression values measured in the “gene expression aliquot”. (e) Examples of the
clonally-resolved expression patterns of 2 CDGs with anti-correlated abundance, corresponding to
the Th17 (IL17A) and Th1 (CCL4) states, respectively. Vertical alignment of the 220 clones is
consistent between panels (a), (c) and (e).
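The clonally-resolved test in panel (b) asks whether the staining of one probe differs between clones more than expected by chance. As a rough illustration only, the sketch below computes the Kruskal-Wallis H statistic (with the standard tie correction) from per-clone lists of probe counts. The function names and toy counts are hypothetical, and the p-value step (comparing H to a chi-square distribution with k-1 degrees of freedom for k clones) is left out to keep the example dependency-free.

```python
from collections import Counter, defaultdict


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic for a dict mapping clone -> list of values."""
    labels, values = [], []
    for clone, clone_values in groups.items():
        labels.extend([clone] * len(clone_values))
        values.extend(clone_values)
    n = len(values)
    rank_sums = defaultdict(float)
    for label, rank in zip(labels, average_ranks(values)):
        rank_sums[label] += rank
    h = 12 / (n * (n + 1)) * sum(
        rank_sums[clone] ** 2 / len(groups[clone]) for clone in groups
    ) - 3 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n) over tie groups.
    ties = Counter(values)
    correction = 1 - sum(t ** 3 - t for t in ties.values()) / (n ** 3 - n)
    return h / correction if correction > 0 else 0.0


# Toy probe counts per clone (hypothetical values): clone_3 stains brightly.
probe_counts = {
    "clone_1": [1, 2, 3],
    "clone_2": [4, 5, 6],
    "clone_3": [7, 8, 9],
}
h_stat = kruskal_wallis_h(probe_counts)
print(f"H = {h_stat:.2f}")
```

A large H for a probe indicates clone-restricted staining; panel (c) then follows up with a clone-versus-rest comparison to identify which clones are responsible.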
526
Figure 3: CRESTA epitope detection is specific up to at least 173-plex and reveals
highly-polyclonal, multi-epitope responses to Mtb. (a) LTBI+ participant ID:30168 was analyzed as
described in Figure 2a-c, this time using a pool of 173 DNA-barcoded multimer probes restricted
by HLA-DRB1*15:03, HLA-DRB5*01:01 or HLA-DQB1*06:02. Shown is a volcano plot revealing
the significant binding of 15 probes. (b) For the analysis in (a), boxplots show all 126 (of 454 total)
clones for which we detected ≥1 significant probe binding event (highlighted in yellow) across the
15 probes. Based on these binding profiles, the 15 probes cluster into 7 groups (demarcated by
horizontal boxes), whose members were uniformly HLA matched and contained overlapping
peptide sequences (in 2 cases these groups contained identical replicates). Notably, none of the
126 clones showed significant staining across >1 of these groups (100% specificity on this
measure). (c) The assay and analysis described in (a, b) were applied to a total of 5 participants
(IDs: 30128, 30133, 40059, 30129, 30168), which revealed a total of 19 unique probes that had
significant binding to ≥1 clone in ≥1 participant. Each of these probes is depicted as a node in a
force-directed graph that connects probes whose binding clones overlap. This analysis yielded a
total of 11 clusters (distinguished by color) which again correspond precisely to sets of probes
with shared HLA and overlapping peptide sequences. (d) Shown is the number of significant
binding clones for each of the 5 participants and 11 epitope clusters described in (c). Relative to
(c), peptide sequences are trimmed to show only the longest subsequence common to all
members of each cluster, and bold type indicates the HLA-binding core 9mer predicted by
netMHCIIpan. (e) For each binding probe across the 5 participants (colored as in (d), and plotted
by participant on the x-axis), shown is the intensity of probe staining (y-axis), which shows no
decline with increasing plexity.
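One simple way to think about the probe clusters in panel (c) is as connected components of a graph in which two probes are linked whenever they share at least one binding clone. The sketch below makes that assumption explicit; the legend describes a force-directed layout rather than a specific clustering algorithm, so the component-based grouping, the function name and the toy probe and clone labels are all illustrative.

```python
def cluster_probes(binding):
    """Group probes into connected components of the clone-overlap graph.

    `binding` maps probe -> set of clone identifiers that bind it.
    Two probes are linked if their clone sets intersect.
    """
    probes = sorted(binding)
    parent = {probe: probe for probe in probes}

    def find(probe):
        # Union-find root lookup with path halving.
        while parent[probe] != probe:
            parent[probe] = parent[parent[probe]]
            probe = parent[probe]
        return probe

    for i, first in enumerate(probes):
        for second in probes[i + 1:]:
            if binding[first] & binding[second]:
                parent[find(first)] = find(second)

    components = {}
    for probe in probes:
        components.setdefault(find(probe), []).append(probe)
    return sorted(sorted(members) for members in components.values())


# Toy binding calls (hypothetical): probes 1-2 and 3-4 share clones.
binding = {
    "probe_1": {"clone_a", "clone_b"},
    "probe_2": {"clone_b"},
    "probe_3": {"clone_c"},
    "probe_4": {"clone_c", "clone_d"},
    "probe_5": {"clone_e"},
}
probe_clusters = cluster_probes(binding)
print(probe_clusters)
```

Under this view, probes carrying overlapping peptides on the same HLA end up in one component because the same clones bind each of them.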
550
Figure 4: CRESTA reveals a spectrum of CD4 T cell differentiation states and localizes the
epitope-specific Mtb response predominantly, but not exclusively, within the Th1
compartment. (a) Shown, for a representative LTBI participant (ID:30133; 217 clones), is a
clustering analysis in which each clone is represented as a circle (sized according to its number
of constituent cells) and positioned in 2 dimensions using a UMAP projection of its median
expression of each Clonal Differentiation Gene (n=763 CDGs). Green circles (upper left) identify
multimer-binding clones, orange ovals (lower left) demarcate inferred T cell subsets, and blue-red
coloring represents the median clonal mRNA abundance for the genes indicated in the
upper-right corner of each plot. (b) For the Th subset-specific cytokine genes IFNγ, IL4 and IL17A
(represented in purple), Pearson correlation analysis across all clones and CDGs was used to
identify additional genes with correlated expression (represented in green). (c) Clones across 5
participants (n=1,496) were assigned to the T-helper (Th) states Th1, Th2, Th17, Treg or naive
(or left unassigned); the distribution of cell numbers per clone across the 6 Th states is shown. (d)
The 1,496 clones described in (c) were further classified according to whether or not they bound
an Mtb peptide:HLA multimer. Shown are the distributions across Th states of all clones (upper)
or of multimer-binding clones (lower), which were compared by Fisher’s exact test.
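To make the comparison in panel (d) concrete, the sketch below computes a two-sided Fisher’s exact p-value for a 2x2 table. This is a deliberate simplification: the legend compares distributions across six Th states, whereas the example collapses them to “Th1 v other” for multimer-binding v non-binding clones. The function name and the counts are hypothetical.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1, total = a + c, a + b + c + d

    def pmf(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = pmf(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # The small relative tolerance guards against floating point ties.
    return sum(
        pmf(x) for x in range(lowest, highest + 1)
        if pmf(x) <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: rows = multimer-binding / non-binding clones,
# columns = Th1 / other Th states.
p_value = fisher_exact_2x2(9, 1, 20, 70)
print(f"p = {p_value:.3g}")
```

A full analysis of the six-state distribution would need an r x c generalisation of the test, which is what statistical packages typically provide.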
567
Figure 5: CRESTA reveals phenotypic heterogeneity within and between the CD4 T cell
responses to individual Mtb epitopes. (a) Expression profiles of epitope-binding T cells for two
selected individuals/epitopes, showing genes representative of the Th1 (IFNG, CCL4), Th2 (IL4,
IL5, GATA3, CCL1), Th17 (IL17A, IL17F, IL22) and Treg (FOXP3, CTLA4) subsets. Each column
shows data from an individual epitope-specific clone (n=33 clones); clones comprise 4-347
individual cells, quantified in the barplot. Violin plots show the distribution of expression values for
cells in each clone and are shaded according to median expression values. (b) For all cases in
which we observed ≥2 distinct reactive epitopes each recognized by ≥3 clones within a participant
(total = 230 clones across 9 epitopes in 3 participants), we identified CDGs within each participant
and then applied Kruskal-Wallis tests to determine whether the expression of each CDG across
clones was associated with epitope specificity. Highlighted are 5 unique genes that were significant
in ≥2 participants (using thresholds shown in the yellow box). (c) Gene expression states for the
clones described in (b) were rendered in 2 dimensions using UMAP across all CDGs, and colored
according to epitope specificity (dot plots, upper row). For each participant (column), expression
of a selected gene observed to have strong epitope-dependent expression is shown across all
clones (dot plots, center row), and compared between epitopes with the lowest v highest
expression (violin plots, lower row). Violin plots are colored by epitope.
587
Figure 6: Analysis of 206 Mtb epitope-mapped TCRα:β pairs reveals that sequence
publicity is widespread, but varies in magnitude across epitopes and participants. (a) For
8 Mtb epitopes (columns) for which CRESTA identified ≥10 epitope-specific clones, TCRα:β
sequences were compared within and between participants using comprehensive pairwise
TCRdist measurements. Each TCR is shown as a single dot (colored to distinguish participants)
in a force-directed graph in which proximity indicates the degree of sequence similarity. Line
segments connect TCR pairs whose TCRdist scores are significant, based on a large set of
distances randomly sampled from an unenriched repertoire, with Benjamini–Hochberg correction
for the total number of comparisons made for each epitope (FDR<0.1). The number of
epitope-specific TCRs for each participant, and the total number of TCRs within significant
clusters, are shown at the top and bottom of each box, respectively. (b) Shown, for the analysis
described in (a), are the full distributions of TCRdist p-values, this time unadjusted (x-axis =
expected, y-axis = observed), for all comparisons (black), and comparisons within (pink) and
between (gray) participants. (c) Logos showing sequence features of the significantly clustered
TCRs for each epitope shown in (a). V and J segments are shown for cases where >50% of
clustered TCRs contain the same segment. CDR3 letters are sized according to
conservation/entropy, and colored by amino acid properties (polar=red, non-polar=green,
negatively charged=gold, positively charged=blue, aromatic=purple).
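The significance calls in panel (a) combine two ideas: scoring each observed distance against a background of distances from an unenriched repertoire, and then controlling the false discovery rate with the Benjamini–Hochberg procedure. The sketch below illustrates both under stated assumptions: the empirical p-value definition (fraction of background distances at least as small as the observed one, with a +1 pseudocount), the function names and the toy distances are hypothetical and are not taken from the study's code.

```python
from bisect import bisect_right


def empirical_p_values(observed, background):
    """Fraction of background distances <= each observed distance.

    Smaller distances mean more similar TCRs, so small p-values flag
    pairs that are closer than expected. A +1 pseudocount avoids p = 0.
    """
    ordered = sorted(background)
    return [
        (bisect_right(ordered, distance) + 1) / (len(ordered) + 1)
        for distance in observed
    ]


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Toy example (hypothetical distances).
background = [10, 20, 30, 40, 50, 60, 70, 80, 90]
observed = [15, 95]
p_raw = empirical_p_values(observed, background)
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [q < 0.1 for q in q_values]
print(p_raw, q_values, significant)
```

Applying the correction per epitope, as the legend describes, means `benjamini_hochberg` would be called once for each epitope's set of pairwise comparisons.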
606
Methods

Study cohort, PBMC collection and processing

Household contacts (HHCs) of newly diagnosed active pulmonary TB cases were referred to the
Kenya Medical Research Institute (KEMRI) Clinical Research Center in Kisumu, Kenya, and their
demographic and medical history data were collected. HHCs were persons who shared the same
home residence as the index case for ≥5 nights during the 30 days prior to the date of TB
diagnosis of the index case, and were enrolled no more than 3 months (mean: 18 days; range:
1–77 days) after the index case began TB treatment. Participants were recruited from two
community-based health clinics located in Kisumu City and Kombewa, Kisumu County. All enrolled
individuals met the following inclusion criteria: ≥13 years of age at the time of enrollment, positive
QuantiFERON TB Gold in Tube (QFT) result, seronegative for HIV antibodies, no previous history
of diagnosis or treatment for active TB disease or LTBI, normal chest X-ray, and not pregnant. All
participants were presumed to be BCG vaccinated due to the Kenyan policy of BCG vaccination at
birth and high BCG coverage rates throughout Kenya. All participants gave written informed
consent for the study, which was approved by the KEMRI/CDC Scientific and Ethics Review Unit
and the Institutional Review Board at Emory University, USA.
631
Blood samples were collected from participants in sodium heparin or lithium heparin Vacutainer
CPT Mononuclear Cell Preparation Tubes (BD Biosciences or Greiner Bio-One). PBMC were
isolated by density centrifugation and rested in complete media (RPMI 1640 containing L-glutamine
supplemented with 10% heat-inactivated fetal bovine serum (FBS), 1% PenStrep, 1% Hepes)
before counting. PBMC isolation was initiated <2 hours after the blood was drawn. Isolated PBMC
were cryopreserved in 90% heat-inactivated fetal calf serum/10% DMSO, and kept in liquid
nitrogen (and shipped on dry ice) until they were thawed for study at the TGen laboratory.
639
Cell culture and T cell expansion

PBMCs were thawed in a 37°C water bath and then washed and plated at 1.25 x 10^6 cells/mL in
24-well flat-bottomed plates in complete RPMI (RPMI-1640 with 10% AB human serum, 0.8 mM
sodium pyruvate, 0.8x non-essential amino acids, 80 U/mL penicillin-streptomycin, 0.4x HEPES,
200 mM L-glutamine, 0.07x 2-Mercaptoethanol; hereafter “cRPMI”) at 37°C with 5% CO2. On day
2, pools of peptides dissolved in DMSO (up to 579 peptides – which included all peptides from
multimers used for staining – each at a final concentration of 0.47 μg/mL; with a maximum final
DMSO concentration of 1.3%) were added. On day 3, taking care to not disturb the cells, 50% of
the media was exchanged. Cells were split as needed on days 4-5, after which 50% of the media
was again exchanged. Cultures were maintained for a total of 8-10 days. Media included
recombinant human interleukin-2 (IL-2) at 230 IU/mL – 1,025 IU/mL (Biolegend), with the lower
concentrations used on days 1-2, followed by higher concentrations throughout the remainder of
the expansion. At the end of the culture, cells were harvested by collecting the supernatant,
treating wells with 2 mM EDTA in PBS for 2-4 minutes and adding the detached cell suspension
to the collected supernatant. After combining all wells from the same donor, cells were spun down
at 300 x g for 5 min at room temperature, resuspended in storage media (9:1 cRPMI and DMSO)
and stored in liquid nitrogen until further use, or resuspended in cRPMI and used directly in the
CRESTA assay.
660
Production of HLA:peptide complexes

Peptides were custom ordered from Millipore-Sigma (PEPscreen, unpurified) and reconstituted
to 20 mg/mL in DMSO. CLIP peptide-tethered, biotinylated HLA monomers were obtained from
the NIH Tetramer Core Facility. To generate peptide-bound HLA monomers, we used a protocol
described previously32, consisting of (i) CLIP peptide cleavage, followed by (ii) peptide exchange.
CLIP peptide cleavage was performed by incubating the HLA monomer (at a final concentration
of 0.4 mg/mL) with 3C protease (HRV-3C protease, Sigma Aldrich) in 3C cleavage buffer (0.05 M
Tris, pH 7.5, 150 mM NaCl) or thrombin protease in 10X Thrombin cleavage buffer (Thrombin
restriction grade, Millipore) overnight at room temperature. To perform peptide exchange, the
cleaved monomer was incubated at a final concentration of 0.2 mg/mL with individual peptides
(at a final concentration of 76-666 μg/mL) in 50 mM citrate buffer pH 5.2 containing 100 mM NaCl,
2 mM EDTA, 0.2x protease inhibitor (Promega), at reaction scales of 30–1000 μL depending on
desired yield, at 30°C for 4 days. Following incubation, HLA:peptide complexes were cleared of
excess peptide and concentrated using dPBS-rinsed Amicon Ultra-0.5 Centrifugal 10 kDa filters
(spun twice at 14,000 x g for 15 min at 4°C). Finally, the cleared HLA:peptide monomer products
were quantified using a Nanodrop 1000 spectrophotometer reading at 280 nm, and stored in the
presence of 0.75x protease inhibitor cocktail (50X protease inhibitor, Promega) at -80°C for up to
12 months.
681
Generation of DNA-barcoded HLA:peptide probes

Barcoding DNA sequences compatible with the 10X Chromium Single Cell 5’ chemistry were
designed according to the 69 mer construct recommended by 10X Genomics (Surface Protein
Labeling Protocol CG000186), utilizing a 3’ Capture Sequence and containing 15 mer barcodes
from the 10X barcodes whitelist (Demonstrated Protocol, CG000193). These sequences were
purchased from IDT as 5’ biotinylated DNA oligos with standard desalting. For each desired
multimer probe, 1 μL of streptavidin-bearing dextran backbone (Klickmer APC or PE, Immunodex,
0.16 μM) was barcoded by incubating it with 1 μL of barcoding oligonucleotide (at 0.32 μM in 1x
TE) at 4°C for 30 min, in a Lo-bind (Eppendorf) 96 well plate (stoichiometry = 1 dextran : 2 oligos).
After incubation, 2 μL of the desired HLA:peptide monomer (at 3.2 μM, prepared as above) was
added to the barcoded dextramer and incubated for 30 min at RT (stoichiometry = 1 dextran : 2
oligos : 20 HLA:peptide monomers). Binding reactions were quenched by adding 1 μL free
D-biotin (5 μM) in excess at 4°C for 30 min. These individual probe constructs (“multimers”) were
stored for up to 1 week at 4°C prior to cell staining. On the day of cell staining, the total volume
(5 μL) of each desired multimer (up to 173 – see Supplementary Table 2 for the probes used for
each donor) was pooled and concentrated using a Vivaspin2 column (#VS0241; Sartorius). The
column was pre-washed with 1x PBS, followed by 2 mL of barcode buffer (dPBS with 0.5% BSA
and 5 μg/mL Herring DNA (Promega)) and then stored at 4°C prior to loading with the probe pool.
Once the probe pool was added, the column was spun at 3,300 rpm for 10–20 minutes (until the
liquid level reached ~50-100 μL), and then inverted and spun at 2,000 x g for 5 minutes to recover
the pool, which was kept on ice until cells were ready for staining (up to 2 hours).
705
PMA/ionomycin stimulation (“gene expression aliquot”)

Expanded cells were thawed, washed with cRPMI, and then resuspended in cRPMI with 200
IU/mL IL-2 at 1.25 x 10^6 cells/mL in 24-well flat-bottomed plates and incubated at 37°C with 5%
CO2 overnight. Cells were then stimulated with PMA and ionomycin at final concentrations of 0.04
μM and 0.67 μM, respectively (Biolegend, Catalog #423301), and then incubated for 1-1.5 hrs at
37°C. Stimulated cells were collected into 15 mL conical tubes, including detachment from plates
with 2 mM EDTA in PBS for 2-4 minutes, then centrifuged at 200 x g for 5 minutes. Cells were
washed a total of three times, in the following buffers, with spinning at 300 x g for 5 min at RT
between each: (i) 5 mL of cRPMI; (ii) 2.5 mL cRPMI + 2.5 mL EasySep Buffer (#20144; StemCell);
and (iii) 5 mL of Loading Buffer (1x PBS + 0.04% BSA). Following the final wash, cells were
resuspended in 100 μL of Loading Buffer and a 10 μL aliquot was taken for cell counting prior to
single cell partitioning.
720
DNA-barcoded peptide:HLA class II multimer staining (“antigen-specificity aliquot”)

Expanded cells were thawed and rested overnight as described above (PMA/ionomycin
stimulation section). The next day, cells were collected into 15 mL conical tubes, including
detachment from plates with 2 mM EDTA in PBS for 2-4 minutes, then centrifuged at 200 x g for
5 minutes. Supernatant was discarded without disturbing the cell pellet, and cells were washed 2
times in cRPMI (300 x g for 5 min) and the pellet resuspended (via flicking) in 100 μL cRPMI.
Resuspended cells were treated with 0.003 M of the protein kinase inhibitor Dasatinib (Axon
Medchem, VA) with gentle swirling at 37°C, for 10 min with a loose cap, and then 20 min with a
tight cap. The multimer probe pool (50-100 μL total volume, prepared as above) was then added,
resulting in a final concentration of ~0.8–1.1 nM for each multimer, and cells were incubated for
30 min at RT with a tight cap. After incubation, cells were spun at 4°C (300 x g for 5 min) for 3
washes: first in 5 mL of cRPMI, then in 2.5 mL cRPMI + 2.5 mL EasySep Buffer (#20144;
StemCell) and finally in 5 mL Loading Buffer (1x PBS + 0.04% BSA). Following the last wash,
cells were resuspended in 100 μL Loading Buffer and a 10 μL aliquot was taken for cell counting
prior to single cell partitioning.
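The stated per-multimer staining concentration can be checked with simple arithmetic. The sketch below assumes that each probe contributes the full amount of dextran backbone used to build it (1 μL at 0.16 μM, i.e. 0.16 pmol), that nothing is lost when the pool is concentrated, and that the staining volume is the 100 μL cell suspension plus the 50-100 μL probe pool. The function name is hypothetical.

```python
def final_concentration_nM(amount_pmol, total_volume_uL):
    """Convert an amount (pmol) in a volume (uL) to nM.

    pmol / uL equals umol / L (uM); multiplying by 1000 gives nM.
    """
    return amount_pmol / total_volume_uL * 1000.0


# Amount of dextran backbone per probe: 1 uL x 0.16 uM = 0.16 pmol.
dextran_pmol = 1.0 * 0.16
cell_suspension_uL = 100.0

# The concentrated probe pool is recovered in 50-100 uL.
concentrations = {
    pool_uL: final_concentration_nM(dextran_pmol, cell_suspension_uL + pool_uL)
    for pool_uL in (50.0, 100.0)
}
for pool_uL, conc in concentrations.items():
    print(f"pool {pool_uL:.0f} uL -> {conc:.2f} nM per multimer")
```

Under these assumptions the two pool volumes bracket the ~0.8–1.1 nM range given in the text.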
737
738
Generation and sequencing of single cell libraries 739
740
To generate single cell libraries, 10,000–40,000 cells per participant aliquot were loaded into the 741
Chromium Controller (10X Genomics), and processed according to the manufacturer’s 742
instructions (Chromium Next GEM Single Cell 5' Reagent Kits v2 protocol) to prepare libraries 743
corresponding to the feature/probe barcodes (antigen -specificity aliquot), 5′ gene expression 744
(gene expression aliquot), and VDJ-T repertoire (both aliquots). Fragment size and quantity was 745
assessed on the 4200 TapeStation (#G2991A; Agilent) using high sensitivity DNA tapes (#5067–4626;
Agilent). Before sequencing, qPCR was performed after serial dilution of the libraries
(#KK4824 – 07960140001; Kapa Biosystems). Libraries were sequenced on Illumina
NextSeq1000 or NovaSeq instruments using the read configurations and PhiX loading
recommended by 10X Genomics.


Analysis of single cell sequencing data

Data processing and merging: Following Illumina BCL file conversion, fastq files were processed
through the CellRanger Multi Pipeline (10X Genomics) to perform demultiplexing, alignment,
filtering, barcode and UMI counting, and VDJ assembly. We then developed custom R code for
all downstream analytical steps. We first merged the filtered VDJ clonotype sequences (from the
filtered_contig_annotations.csv file) with the raw counts matrix of transcripts and/or probe
barcodes (raw_feature_bc_matrix), by matching on cell barcodes and retaining only those cells
associated with ≥1 filtered VDJ sequence.
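
As an illustration of the merge described above, the following Python sketch (not the authors' R code) joins contig annotations to per-cell counts by cell barcode and keeps only cells with at least one filtered contig. The column names barcode, chain and cdr3 follow the CellRanger filtered_contig_annotations.csv layout; the function name and the dictionary representation of the counts matrix are assumptions made for brevity.

```python
import csv
import io


def merge_vdj_with_counts(contig_csv_text, counts_by_barcode):
    """Join filtered VDJ contigs to per-cell counts on the cell barcode.

    contig_csv_text: text of a contig annotation table with at least the
        columns "barcode", "chain" and "cdr3".
    counts_by_barcode: hypothetical stand-in for the raw counts matrix,
        given as {barcode: {feature_name: count}}.
    Only barcodes with >=1 filtered contig are retained.
    """
    chains_by_barcode = {}
    for row in csv.DictReader(io.StringIO(contig_csv_text)):
        chains_by_barcode.setdefault(row["barcode"], []).append(
            (row["chain"], row["cdr3"])
        )

    merged = {}
    for barcode, chains in chains_by_barcode.items():
        if barcode in counts_by_barcode:
            merged[barcode] = {
                "chains": chains,
                "counts": counts_by_barcode[barcode],
            }
    return merged
```

Barcodes that appear only in the counts matrix (for example, empty droplets) are dropped, which mirrors the "≥1 filtered VDJ sequence" criterion.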

Clonotype collapsing and shortlisting: Since some CellRanger VDJ-T clonotype calls contain the
same TCR chain sequence both alone and paired with other sequence(s) (likely attributable to a
combination of: (i) droplets containing >1 cell (multiplets), and (ii) the fact that TCR chain
sequences that are present do not always successfully amplify/assemble in every cell), we
developed a method to improve assignment of TCR chains to clonotypes that involved clustering
chains according to their correlated occurrence across cells (collapseclonotypes.R). This allowed
us to more accurately identify chains belonging to the same clone, by merging cells that were
assigned to different clonotypes but shared ≥1 chain, and by excluding cells that contained chains
from >1 cluster (as likely multiplets). For each aliquot (antigen-specificity or gene expression) we
then focused only on collapsed clones containing ≥3 cells (listclonotypes.R).
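
The collapsing logic can be sketched as follows. This is a simplified Python stand-in, not collapseclonotypes.R: it assumes that two chains belong to the same cluster when they co-occur in at least a fraction min_frac of the cells carrying the rarer chain, and it joins chains with a union-find structure. The linkage rule, the threshold and all names are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def collapse_clonotypes(cells, min_frac=0.5, min_cells=3):
    """cells: {barcode: set of chain identifiers detected in that cell}.

    Returns (clones, multiplets), where clones maps a sorted tuple of
    chains to the barcodes assigned to it (clones with < min_cells cells
    are dropped) and multiplets lists barcodes with chains from more
    than one cluster.
    """
    n_cells = Counter()   # number of cells carrying each chain
    n_pairs = Counter()   # number of cells carrying each pair of chains
    for chains in cells.values():
        n_cells.update(chains)
        n_pairs.update(combinations(sorted(chains), 2))

    parent = {chain: chain for chain in n_cells}

    def find(chain):
        # Union-find lookup with path halving.
        while parent[chain] != chain:
            parent[chain] = parent[parent[chain]]
            chain = parent[chain]
        return chain

    # Link chains whose occurrence across cells is strongly correlated.
    for (a, b), n_ab in n_pairs.items():
        if n_ab / min(n_cells[a], n_cells[b]) >= min_frac:
            parent[find(a)] = find(b)

    members = {}
    for chain in n_cells:
        members.setdefault(find(chain), []).append(chain)

    clones, multiplets = {}, []
    for barcode, chains in cells.items():
        if not chains:
            continue
        roots = {find(chain) for chain in chains}
        if len(roots) > 1:
            multiplets.append(barcode)  # chains from more than one cluster
        else:
            key = tuple(sorted(members[roots.pop()]))
            clones.setdefault(key, []).append(barcode)

    kept = {k: v for k, v in clones.items() if len(v) >= min_cells}
    return kept, multiplets
```

In this sketch, a cell that recovered only one chain of a clone is still assigned to that clone, while a droplet carrying chains from two otherwise unlinked clusters is flagged as a likely multiplet.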

Normalization of multimer probe signal: We developed a method to normalize multimer-to-multimer
and cell-to-cell differences in the multimer assay yield (normmaster.R). First, we
normalized the read counts for each multimer to a fixed depth such that the sums for all multimers
were the same (equal to the number of cells in the data matrix). For each multimer, we then
normalized in the other dimension (across cells) by dividing the resulting values for each cell by
the median value of all other cells. To reduce the impact of high-staining outliers (e.g. caused by
multimer aggregation) on visualization, for each multimer we applied the value of the cell at the
top 0.1 percentile to all cells with greater staining. Finally, for each multimer, we divided values
by the multimer max value, to scale values onto the [0,1] interval prior to visualization
(stackedplot.R) and statistical analysis.
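
A minimal Python sketch of three of these steps for a single multimer is given below: depth normalization, capping of high-staining outliers, and scaling onto [0,1]. It is not normmaster.R. The median-division step is deliberately left out because its exact reference set is not specified in enough detail here, and the nearest-rank definition of the percentile is an assumption.

```python
import math


def normalize_multimer(counts, top_fraction=0.001):
    """counts: raw read counts for ONE multimer across all cells.

    top_fraction=0.001 corresponds to the top 0.1 percentile.
    Returns values scaled onto [0, 1].
    """
    n = len(counts)
    total = sum(counts)
    if total == 0:
        return [0.0] * n

    # Step 1: fixed depth, so that the values sum to the number of cells.
    depth = [c * n / total for c in counts]

    # Step 3: cap outliers at the value of the cell sitting at the top
    # percentile (nearest-rank definition, an assumption of this sketch).
    rank = max(math.ceil((1 - top_fraction) * n), 1)
    cap = sorted(depth)[rank - 1]
    capped = [min(v, cap) for v in depth]

    # Step 4: divide by the multimer maximum to scale onto [0, 1].
    top = max(capped)
    return [v / top for v in capped]
```

With only a handful of cells the default cap has no effect, since the top 0.1 percentile then coincides with the maximum; capping only becomes visible on matrices with thousands of cells.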

Identification of staining multimers / clonotypes and Clonal Differentiation Genes: To identify
binding multimers and Clonal Differentiation Genes (CDGs), we applied Kruskal-Wallis tests to
quantify the influence of clonal identity on (i) the normalized multimer probe signal, or (ii) raw
reads, for each multimer or gene, respectively (testclonalgenes.R). To identify the individual
binding clones for each binding epitope, we then applied one-tailed Wilcoxon tests post-hoc to
each binding multimer, in which we compared the normalized multimer probe signals in each
individual clone against the signal of the central 20% of cells across all clones (testsigclones.R).
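
The post-hoc comparison can be illustrated with the Python sketch below, which is not testsigclones.R. It assumes that the "central 20%" means the middle fifth of cells after ranking by signal, and it computes a one-sided Wilcoxon rank-sum (Mann-Whitney U) p-value using the normal approximation without tie or continuity correction. Both choices, and the function names, are assumptions; an exact test would differ for small clones.

```python
from statistics import NormalDist


def central_fraction(values, fraction=0.2):
    """Return the middle `fraction` of values after sorting."""
    ordered = sorted(values)
    n = len(ordered)
    k = max(1, round(fraction * n))
    lo = (n - k) // 2
    return ordered[lo:lo + k]


def rank_sum_greater(sample, reference):
    """One-sided p-value for `sample` being shifted above `reference`."""
    sample, reference = list(sample), list(reference)
    pooled = sorted(sample + reference)

    # Assign average ranks to tied values.
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    n1, n2 = len(sample), len(reference)
    u = sum(rank_of[v] for v in sample) - n1 * (n1 + 1) / 2
    mean = n1 * n2 / 2
    sd = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    return 1 - NormalDist().cdf((u - mean) / sd)
```

A typical use would be rank_sum_greater(clone_signal, central_fraction(all_cell_signal)) for each clone and each binding multimer, followed by multiple-testing correction.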

UMAP analysis, transcript correlation analysis and identification of T cell subsets: Beginning with
either all clones (Figure 4) or all epitope-mapped clones (Figure 5c), we identified significant
CDGs as above (with max fold-difference > 10 and Bonferroni-adjusted p-values < 0.05), and
then generated log10(median+1) values (“clone medians”) for each gene in each clone. To the
resulting gene-by-clone table, we applied UMAP to generate a 2-dimensional rendering using the
R umap package. For the lineage-defining genes IFNG, IL4 and IL17A, we generated Pearson
correlation coefficients (r) across clone median values, against every other significant CDG.
Genes with r > 0.5 are depicted as a force-directed graph rendered using the R igraph package.
Using clone median values, we identified Th2, Th17 and Treg clones as IL4+GATA3+,
IL17A+RORC+, and FOXP3+CTLA4+, respectively. Among the remainder, we identified Th1
clones as IFNG+CCL4+, and finally, among the remainder, naive clones as IL2RA–.
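
The clone-median calculation and the sequential subset assignment can be sketched in Python as below. This is an illustration rather than the authors' R code: it assumes that "+" means a clone median above a threshold (zero by default), that the Th2, Th17 and Treg rules are checked in the order listed, and that clones matching no rule are left unassigned. None of these details are stated in the text.

```python
import math
from statistics import median


def clone_medians(reads_by_gene):
    """reads_by_gene: {gene: [raw reads in each cell of one clone]}.

    Returns {gene: log10(median + 1)}, the "clone median" for each gene.
    """
    return {
        gene: math.log10(median(reads) + 1)
        for gene, reads in reads_by_gene.items()
    }


def classify_clone(medians, threshold=0.0):
    """Assign a subset label from clone medians by sequential gating."""

    def pos(gene):
        return medians.get(gene, 0.0) > threshold

    if pos("IL4") and pos("GATA3"):
        return "Th2"
    if pos("IL17A") and pos("RORC"):
        return "Th17"
    if pos("FOXP3") and pos("CTLA4"):
        return "Treg"
    # Remaining clones only: Th1, then naive.
    if pos("IFNG") and pos("CCL4"):
        return "Th1"
    if not pos("IL2RA"):
        return "naive"
    return "unassigned"
```

Because the gating is sequential, a clone that is both IL4+GATA3+ and IFNG+CCL4+ is labeled Th2 in this sketch.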

Analysis of TCRα:β homologies using TCRdist: To analyze TCR homologies, we developed our
own computationally efficient implementation of the TCRdist metric 35 to quantify the degree of
sequence similarity between pairs of TCRα:β heterodimers. We focused on epitope-specific
clones with exactly 1 TCRα and 1 TCRβ chain, and, within each epitope, performed TCRdist
measurements between all pairs of binding TCRs. We then mapped each TCRdist value to a
p-value using a reference distribution of TCRdists measured on a large, unenriched repertoire, as
follows. Beginning with 1e4 randomly-sampled TCRαs and TCRβs, we calculated all ~5e7
pairwise TCRdists for each chain type. We then calculated the frequencies of all possible α:β
chain TCRdist values by considering all combinations of α and β TCRdists, allowing us to
efficiently estimate frequencies down to ~1e-12. Epitope-specific sets of p-values were adjusted
using the Benjamini–Hochberg procedure. We visualized the CDR3 sequences of significantly
clustered TCRs using the ggseqlogo R package, after multiple sequence alignment. Positions in
which there was an alignment gap for the majority of CDR3s were excluded from display.
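
The mapping from TCRdist values to p-values can be sketched in Python as follows. The sketch assumes that the paired α:β distance is the sum of the single-chain distances, that α and β distances combine independently in the background repertoire, and that the p-value is the background frequency of distances at or below the observed value. These assumptions, and all function names, are illustrative and not taken from the authors' implementation.

```python
from bisect import bisect_right
from collections import Counter
from itertools import accumulate


def paired_distance_cdf(alpha_dists, beta_dists):
    """Combine single-chain background distances into a paired CDF.

    Every alpha distance is combined with every beta distance, so
    frequencies as small as 1 / (len(alpha) * len(beta)) are resolved.
    """
    fa, fb = Counter(alpha_dists), Counter(beta_dists)
    na, nb = len(alpha_dists), len(beta_dists)
    freq = Counter()
    for da, ca in fa.items():
        for db, cb in fb.items():
            freq[da + db] += (ca / na) * (cb / nb)
    values = sorted(freq)
    cum = list(accumulate(freq[v] for v in values))
    return values, cum


def p_value(dist, values, cum):
    """Background probability of a paired distance <= dist."""
    i = bisect_right(values, dist)
    return cum[i - 1] if i else 0.0


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running = min(running, pvals[i] * m / rank)
        adjusted[i] = running
    return adjusted
```

Working on the histograms of distinct distance values rather than on individual pairs keeps the combination step cheap, which is one way to reach very small frequencies without enumerating every α:β pair combination.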


References

1. Mosmann, T. R. & Coffman, R. L. TH1 and TH2 cells: different patterns of lymphokine
secretion lead to different functional properties. Annu. Rev. Immunol. 7, 145–173 (1989).
2. Zinkernagel, R. M. & Doherty, P. C. Restriction of in vitro T cell-mediated cytotoxicity in
lymphocytic choriomeningitis within a syngeneic or semiallogeneic system. Nature 248,
701–702 (1974).
3. Babbitt, B. P., Allen, P. M., Matsueda, G., Haber, E. & Unanue, E. R. Binding of
immunogenic peptides to Ia histocompatibility molecules. Nature 317, 359–361 (1985).
4. Davis, M. M. & Bjorkman, P. J. T-cell antigen receptor genes and T-cell recognition. Nature
334, 395–402 (1988).
5. Lovelace, P. & Maecker, H. T. Multiparameter Intracellular Cytokine Staining. Methods Mol.
Biol. 1678, 151–166 (2018).
6. Sidney, J., Peters, B. & Sette, A. Epitope prediction and identification- adaptive T cell
responses in humans. Semin. Immunol. 50, 101418 (2020).
7. Dan, J. M. et al. A Cytokine-Independent Approach To Identify Antigen-Specific Human
Germinal Center T Follicular Helper Cells and Rare Antigen-Specific CD4+ T Cells in Blood.
J. Immunol. 197, 983–993 (2016).
8. Altman, J. D. et al. Phenotypic analysis of antigen-specific T lymphocytes. Science 274,
94–96 (1996).
9. Newell, E. W., Klein, L. O., Yu, W. & Davis, M. M. Simultaneous detection of many T-cell
specificities using combinatorial tetramer staining. Nat. Methods 6, 497–499 (2009).
10. Bentzen, A. K. & Hadrup, S. R. Evolution of MHC-based technologies used for detection of
antigen-responsive T cells. Cancer Immunol. Immunother. 66, 657–666 (2017).
11. Bentzen, A. K. et al. Large-scale detection of antigen-specific T cells using peptide-MHC-I
multimers labeled with DNA barcodes. Nat. Biotechnol. 34, 1037–1045 (2016).
12. Magnin, M., Guillaume, P., Coukos, G., Harari, A. & Schmidt, J. High-throughput
identification of human antigen-specific CD8 and CD4 T cells using soluble pMHC
multimers. Methods Enzymol. 631, 21–42 (2020).
13. Lantz, O. & Teyton, L. Identification of T cell antigens in the 21st century, as difficult as
ever. Semin. Immunol. 60, 101659 (2022).
14. Rockinger, G. A. et al. Optimized combinatorial pMHC class II multimer labeling for
precision immune monitoring of tumor-specific CD4 T cells in patients. J Immunother
Cancer 8, (2020).
15. Yang, J. et al. Multiplex mapping of CD4 T cell epitopes using class II tetramers. Clin.
Immunol. 120, 21–32 (2006).
16. Davis, M. M., Altman, J. D. & Newell, E. W. Interrogating the repertoire: broadening the
scope of peptide-MHC multimer analysis. Nat. Rev. Immunol. 11, 551–558 (2011).
17. Ge, X. et al. Peptide-MHC cellular microarray with innovative data analysis system for
simultaneously detecting multiple CD4 T-cell responses. PLoS One 5, e11355 (2010).
18. Justesen, S., Harndahl, M., Lamberth, K., Nielsen, L.-L. B. & Buus, S. Functional
recombinant MHC class II molecules and high-throughput peptide-binding assays.
Immunome Res. 5, 2 (2009).
19. Uchtenhagen, H. et al. Efficient ex vivo analysis of CD4+ T-cell responses using
combinatorial HLA class II tetramer staining. Nat. Commun. 7, 12614 (2016).
20. Dezfulian, M. H. et al. TScan-II: A genome-scale platform for the de novo identification of
CD4 T cell epitopes. Cell 186, 5569–5586.e21 (2023).
21. Fletcher, H. A. & Schrager, L. TB vaccine development and the End TB Strategy:
importance and current status. Trans. R. Soc. Trop. Med. Hyg. 110, 212–218 (2016).
22. Hansen, S. G. et al. Prevention of tuberculosis in rhesus macaques by a
cytomegalovirus-based vaccine. Nat. Med. 24, 130–143 (2018).
23. Schrager, L. K., Harris, R. C. & Vekemans, J. Research and development of new
tuberculosis vaccines: a review. F1000Res. 7, 1732 (2018).
24. Tait, D. R. et al. Final Analysis of a Trial of M72/AS01 Vaccine to Prevent Tuberculosis. N.
Engl. J. Med. 381, 2429–2439 (2019).
25. Lewinsohn, D. A., Lewinsohn, D. M. & Scriba, T. J. Polyfunctional CD4+ T Cells As Targets
for Tuberculosis Vaccination. Front. Immunol. 8, 1262 (2017).
26. Seder, R. A., Darrah, P. A. & Roederer, M. T-cell quality in memory and protection:
implications for vaccine design. Nat. Rev. Immunol. 8, 247–258 (2008).
27. Lindestam Arlehamn, C. S. et al. Memory T cells in latent Mycobacterium tuberculosis
infection are directed against three antigenic islands and largely contained in a
CXCR3+CCR6+ Th1 subset. PLoS Pathog. 9, e1003130 (2013).
28. Lindestam Arlehamn, C. S. et al. A Quantitative Analysis of Complexity of Human
Pathogen-Specific CD4 T Cell Responses in Healthy M. tuberculosis Infected South
Africans. PLoS Pathog. 12, e1005760 (2016).
29. Huang, H., Wang, C., Rubelt, F., Scriba, T. J. & Davis, M. M. Analyzing the Mycobacterium
tuberculosis immune response by T-cell receptor clustering with GLIPH2 and genome-wide
antigen screening. Nat. Biotechnol. 38, 1194–1202 (2020).
30. Glanville, J. et al. Identifying specificity groups in the T cell receptor repertoire. Nature 547,
94–98 (2017).
31. Musvosvi, M. et al. T cell receptor repertoires associated with control and disease
progression following Mycobacterium tuberculosis infection. Nat. Med. 29, 258–269 (2023).
32. Willis, R. A. et al. Production of Class II MHC Proteins in Lentiviral Vector-Transduced
HEK-293T Cells for Tetramer Staining Reagents. Curr Protoc 1, e36 (2021).
33. Ogongo, P. et al. Rare Variable Antigens induce predominant Th17 responses in human
infection. bioRxiv (2024) doi:10.1101/2024.03.05.583634.
34. Gallegos, A. M. et al. Control of T cell antigen reactivity via programmed TCR
downregulation. Nat. Immunol. 17, 379–386 (2016).
35. Dash, P. et al. Quantifiable predictive features define epitope-specific T cell receptor
repertoires. Nature 547, 89–93 (2017).
36. Lewinsohn, D. M. et al. Human Mycobacterium tuberculosis CD8 T Cell Antigens/Epitopes
Identified by a Proteomic Peptide Library. PLoS One 8, e67016 (2013).
37. Coscolla, M. et al. M. tuberculosis T Cell Epitope Analysis Reveals Paucity of Antigenic
Variation and Identifies Rare Variable TB Antigens. Cell Host Microbe 18, 538–548 (2015).
38. Dotiwala, F. & Lieberman, J. Granulysin: killer lymphocyte safeguard against microbes.
Curr. Opin. Immunol. 60, 19–29 (2019).
39. Stenger, S. et al. An antimicrobial activity of cytolytic T cells mediated by granulysin.
Science 282, 121–125 (1998).