Section 2
Participant recruitment for the TRiPP project took place across 3 sites: University of Oxford, UK; Boston Children's Hospital, USA; and Instituto de Biologia Molecular e Celular, Portugal. Participants were women of reproductive age (18-50 years) with and without chronic pelvic pain (CPP). The CPP group comprised 4 pelvic pain subgroups with different underlying diagnoses: (1) Endometriosis-associated pain (EAP): individuals with a prior surgical diagnosis of endometriosis and at least 1 type of pelvic pain rated >4/10; (2) Endometriosis-associated pain with comorbid bladder pain (EABP): participants meeting EAP criteria with additional bladder-related pain and urinary symptoms (frequency and/or urgency); (3) Bladder pain syndrome (BPS): individuals reporting bladder pain >4/10 and urinary symptoms (frequency and/or urgency) without a prior surgical diagnosis of endometriosis; and (4) Pelvic pain without bladder or urinary symptoms and no prior endometriosis diagnosis (PP). All CPP participants had reported at least 1 pelvic pain score of >4/10. Participants in the control group had no history of endometriosis, no urinary symptoms, and reported pelvic pain of <3/10 on a numerical rating scale (NRS). The full methodology and selection criteria for the cohort are described in the study protocol (Demetriou et al., 2022).
Chronic pelvic pain participants were asked to complete a set of questionnaires selected to capture a full picture of their pain experience, other potentially relevant clinical variables, quality of life, pain interference, and other measures previously described as being relevant to visceral pain (summarized in Table 1). Pain-free controls completed the same measures except those capturing additional detail about pelvic pain (eg, the painDETECT scale).
Summary of assessment domains, measurement tools with scoring procedures, and interpretation guidelines for all clinical, psychological, and physiological variables.
A 20-minute 12-lead electrocardiogram (ECG) was recorded 12 with participants at rest and lying down, before and after the CPM paradigm (described below), to assess heart rate.
A 24-hour profile of cortisol levels was assessed using saliva samples. Participants were asked to provide 5 saliva samples using a saliva kit during a normal day (on which no study testing took place): (1) immediately on waking, (2) 30 to 45 minutes after waking, (3) before lunch, (4) before dinner, and (5) at bedtime. They were asked to record the exact time of each sample and to store the samples in the fridge until returning them to the researchers for processing. In addition, saliva samples were collected before and after physiological testing.
The German Neuropathic Pain Network quantitative sensory testing (QST) paradigm 36 , 45 was conducted by trained team members. These tests include assessment of thermal and mechanical detection and pain thresholds, as well as vibration detection, mechanical sensitivity, wind-up ratio, and pressure pain thresholds, giving 13 measures overall. We conducted testing on the dorsum of the right foot (control site) and lower abdomen/pelvis below the umbilicus (test site). All testing was conducted in a temperature-controlled room at approximately 20°C. For participants from the Institute of Molecular and Cell Biology (IBMC), testing was conducted in Portuguese, with the script forward and backward translated for accuracy. A full description of the paradigm, analysis, and results is published in Coxon et al. (2023). 9
The conditioned pain modulation (CPM) paradigm assesses the efficiency of the body's endogenous pain inhibitory pathways. Participants underwent CPM testing in a temperature-controlled room (20°C). Pressure pain thresholds (PPT) were assessed using a force dial algometer applied 3 times to the right dorsal foot, with the mean pressure (PPT1 average) recorded as baseline. A pressure cuff conditioning stimulus (CS) was then applied to the left arm, inflated until participants reported pain, and maintained for 60 seconds. Before deflation, participants rated their pain (0-10), and the CS pressure was recorded. Immediately after, the algometer was reapplied to the foot to determine PPT2 average. After a 10-minute rest, the procedure was repeated, recording the CS pain rating and PPT3 average to assess CPM effects. A full description of the CPM paradigm and analysis is provided in Demetriou et al. (2025). 12
We used a previously developed noninvasive bladder paradigm in this study. 48 Participants are asked first to void their bladder and then to drink 20 fl. oz (US) of water in 5 minutes. They are instructed to inform the researcher when they reach certain sensations, at which point the time (since onset of drinking), pain intensity rating (NRS 0 [no pain] to 10 [worst pain imaginable]), and urgency rating (NRS 0 [no urgency] to 10 [worst urgency imaginable]) are recorded. These sensations are First Sensation (described as “when riding in a car, the driver pulls over to a rest-stop to urinate, you would go as well”), First Urge (“when riding in a car, you would initiate the request to find a rest-stop to urinate”), and Maximum Tolerance (“when riding in a car, you would urinate on the side of the road in bumper-to-bumper traffic”). Once participants have reached Maximum Tolerance (or after 2 hours, whichever occurs first), they are asked to void, and the volume of urine is recorded by the researcher.
As illustrated in Figure 1 , our approach to analysis was in 3 stages. First (stage 1), to address aim 1, we compared those with CPP with the pain-free control group. For these analyses, we used data from the measures that all participants completed, excluding those measures that had been used to define the groups (eg, NRS of pelvic pain symptoms). To address aim 2 (stratification of women with CPP into meaningful subgroups), we then focussed only on those with CPP. Our stage 2 analyses took a data-driven approach to identify subgroups within those with CPP. For this stage, we used data from the physiological assessments and questionnaire measures relevant to underlying pain mechanisms (eg, painDETECT, sleep, psychological variables) but again did not include those that were used to define the groups or that would have been expected to align with specific diagnostic groups (eg, measures related to bladder symptoms). Finally, in stage 3, our analyses aimed to better understand the clinical characteristics of the identified subgroups. For these analyses, we therefore did use our fuller set of clinical data including measures of pain intensity and diagnostic groupings.
Overview of the study aims and analysis pipeline.
A between-group analysis was run using independent t-tests with Bonferroni correction (significance at the 0.005 level) to explore any significant effects between the CPP and control groups. The variables included at this stage were questionnaire measures of sleep and fatigue, mental wellbeing (anxiety and depression scores), personality, and pain catastrophising (PCS), as well as physiological and biological measures (average R-R intervals, heart rate variability [HRV], cortisol profiles, CPM response, QST profiles). All the standardised questionnaire data were scored as per the published algorithms for each questionnaire (Table 1). At this stage, the painDETECT and Widespreadness assessments were not included, as only CPP participants were asked to complete them.
All physiological data were processed and analysed as per recommended guidelines in the literature. For QST data, raw values were z-transformed as per published literature using age- and sex-matched reference data for the foot (control site) 45 and trunk (test site). 43 The z-values are signed such that positive values indicate a gain of function and negative values a loss of function.
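The sign convention for the QST z-values can be illustrated with a minimal sketch. It assumes the usual form z = (participant value − reference mean) / reference SD, with the sign flipped for measures where a higher raw value means reduced sensitivity. The function name, its parameters, and the reference numbers are hypothetical placeholders rather than the published reference data, and any log-transformation of raw values is left out.

```python
def qst_z_score(value, ref_mean, ref_sd, higher_means_gain=True):
    """Standardize one QST value against reference data.

    Assumption: z = (value - ref_mean) / ref_sd, with the sign flipped for
    measures where a higher raw value means reduced sensitivity, so that a
    positive z always indicates gain of function.
    """
    if ref_sd <= 0:
        raise ValueError("reference SD must be positive")
    z = (value - ref_mean) / ref_sd
    return z if higher_means_gain else -z


# Hypothetical reference values for illustration only (not published norms).
# For a pain threshold, a LOWER raw value means MORE sensitivity (gain),
# so the sign is flipped.
z_ppt = qst_z_score(value=300.0, ref_mean=500.0, ref_sd=100.0,
                    higher_means_gain=False)
print(z_ppt)  # 2.0, i.e. gain of function under this convention
```
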
The CPM effect was quantified using both the absolute difference in pressure pain threshold (PPT2 average − PPT1 average) and the percentage change [(PPT2 average − PPT1 average)/PPT1 average] × 100. 28 The absolute difference was used to determine the presence of a “true” CPM effect. R-R intervals and HRV were calculated from the ECG recordings for the first 5 minutes before and after the CPM paradigm using LabChart software, with both baseline and change in R-R interval calculated. The saliva samples for the cortisol profiles were analysed to extract cortisol levels (nmol/L) before and after the CPM paradigm, as well as for the 24-hour profile. For the latter, the area under the curve was calculated across the 5 time-points.
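The two CPM quantities and the cortisol area under the curve can be sketched as follows. The CPM formulas are taken directly from the text. The trapezoidal rule for the AUC is an assumption, since the method is not stated, and all readings, sample times, and function names below are hypothetical.

```python
from statistics import mean


def cpm_effect(ppt_baseline, ppt_conditioned):
    """Return (absolute difference, percentage change) in pressure pain threshold."""
    if ppt_baseline == 0:
        raise ValueError("baseline PPT must be non-zero")
    absolute = ppt_conditioned - ppt_baseline
    percent = absolute / ppt_baseline * 100
    return absolute, percent


def trapezoid_auc(times_h, values):
    """Area under the curve by the trapezoidal rule (assumed method)."""
    if len(times_h) != len(values) or len(times_h) < 2:
        raise ValueError("need at least two matching time/value pairs")
    return sum((t1 - t0) * (v0 + v1) / 2
               for t0, t1, v0, v1 in zip(times_h, times_h[1:], values, values[1:]))


# Hypothetical algometer readings: three baseline applications, then three
# applications after the conditioning stimulus.
ppt1_average = mean([38, 40, 42])
ppt2_average = mean([48, 50, 52])
absolute_change, percent_change = cpm_effect(ppt1_average, ppt2_average)

# Hypothetical 24-hour profile: hours since waking and cortisol in nmol/L.
sample_times_h = [0.0, 0.5, 5.0, 10.0, 15.0]
cortisol_nmol_l = [10.0, 16.0, 6.0, 4.0, 2.0]
cortisol_auc = trapezoid_auc(sample_times_h, cortisol_nmol_l)

print(absolute_change, percent_change, cortisol_auc)
```

Using the recorded sample times rather than nominal ones matters here, because the trapezoid widths depend on the actual intervals between samples.
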
A latent profile analysis (LPA) was used to stratify CPP participants based on a selected set of the measures used in the study. Latent profile analysis is a statistical model used to identify profiles within a heterogeneous population based on a set of continuous variables. Therefore, the selected variables were composed of a set of continuous measures that were either found to be significantly different to the control group or were considered potentially mechanistically important due to evidence from other published literature. As we were aiming to identify mechanistically relevant subgroups, we specifically did not include clinical variables such as pain intensity or diagnostic category, nor did we include quality of life (QoL) measures (Fig. 1 ).
The data set was preprocessed, and missing values were imputed using mean imputation. The selected variables were then standardized into z-scores (mean of 0 and standard deviation of 1). The LPA was conducted using the Gaussian Mixture Model approach, with 1 to 5 profiles fitted to the data and the Bayesian Information Criterion (BIC) and Akaike Information Criterion (AIC) calculated for each model. The optimal number of clusters was determined by comparing BIC and AIC values across the different numbers of profiles, and the model with the lowest BIC was selected for further analysis. In addition, we adopted a minimum profile size criterion, ensuring that each profile contained at least 5% of the total sample to reduce the risk of spurious classes. 38
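The preprocessing and model-selection rules described above can be sketched in a few lines. This is an illustration only: the mixture-model fitting itself is omitted, the log-likelihoods, parameter counts, and profile sizes are invented, the use of the population SD for standardization is an assumption, and all function names are hypothetical. The sample size of 108 is the sum of the three reported cluster sizes.

```python
import math
from statistics import mean, pstdev


def impute_and_standardize(column):
    """Mean-impute missing entries (None), then convert the column to z-scores."""
    observed = [x for x in column if x is not None]
    m = mean(observed)
    filled = [m if x is None else x for x in column]
    sd = pstdev(filled)  # population SD: an assumption, not stated in the text
    if sd == 0:
        raise ValueError("column has zero variance")
    return [(x - m) / sd for x in filled]


def aic(log_likelihood, n_params):
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    return n_params * math.log(n_obs) - 2 * log_likelihood


def select_n_profiles(candidates, n_obs, min_share=0.05):
    """Pick the number of profiles with the lowest BIC among models whose
    smallest profile holds at least `min_share` of the sample.

    candidates maps k -> (log_likelihood, n_params, smallest_profile_size).
    """
    eligible = {k: bic(ll, p, n_obs)
                for k, (ll, p, smallest) in candidates.items()
                if smallest / n_obs >= min_share}
    return min(eligible, key=eligible.get)


z = impute_and_standardize([1.0, None, 3.0])

# Invented fit statistics for models with 1 to 4 profiles.
fits = {
    1: (-3300.0, 44, 108),
    2: (-3150.0, 89, 40),
    3: (-3000.0, 134, 11),
    4: (-2850.0, 179, 3),  # lowest BIC, but smallest profile is under 5%
}
best_k = select_n_profiles(fits, n_obs=108)
print(best_k)
```

In this toy example the 4-profile model has the lowest BIC but is rejected by the 5% size rule, so the 3-profile model is chosen.
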
In addition to LPA, we conducted an exploratory K-means clustering as a sensitivity analysis to evaluate the robustness of the identified profiles, using the same variables as for LPA. Because K-means requires the number of clusters to be specified in advance, we set k to match the number of clusters identified by LPA. K-means clustering also requires complete data across all variables, and therefore, this analysis was restricted to the subset of participants with no missing values (n = 21). The resulting clusters were compared with the LPA profiles using 2 agreement indices: the Adjusted Rand Index and Normalized Mutual Information.
The final stage of analysis aimed to explore clinically relevant differences between the identified subgroups. Analyses of variance (ANOVAs), post hoc Tukey HSD (Honestly Significant Difference) tests, and Kruskal–Wallis H tests were used for comparisons between the profiles regarding diagnostic group (EAP, EABP, BPS, PP), pelvic pain symptoms, bladder sensitivity, comorbidities, QoL, and pain interference in work, daily activities, sleep, exercising, and social activities.
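For readers unfamiliar with the Adjusted Rand Index, a minimal sketch of the standard pair-counting formula is given below. The label vectors are invented, the function name is hypothetical, and Normalized Mutual Information is not shown. Because the index compares partitions rather than label values, clusters that are numbered differently by the two methods can still agree perfectly.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two partitions of the same participants."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label vectors must have the same length")
    n = len(labels_a)
    if n < 2:
        raise ValueError("need at least two participants")
    # Pairs of participants placed together by both methods
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs placed together by each method separately
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = together_a * together_b / comb(n, 2)
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both partitions put everyone together
    return (together_both - expected) / (maximum - expected)


# Invented labels for 8 participants.
lpa_labels = [1, 1, 1, 2, 2, 3, 3, 3]
kmeans_relabelled = [3, 3, 3, 1, 1, 2, 2, 2]  # same grouping, different numbering
kmeans_partial = [3, 3, 1, 1, 1, 2, 2, 2]     # one participant grouped differently

ari_same = adjusted_rand_index(lpa_labels, kmeans_relabelled)
ari_partial = adjusted_rand_index(lpa_labels, kmeans_partial)
print(ari_same, round(ari_partial, 3))
```
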
Section 3
Comparing those with CPP with controls, we identified significant differences (P < 0.005) between the groups for all questionnaire measures except the Big 5 Inventory for personality. Specifically, those with CPP reported higher levels of fatigue (t = 5.16, P < 0.001, d = 0.95, CI [0.57, 1.32]); poorer sleep (t = −4.21, P < 0.001, d = −0.78, CI [−1.15, −0.40]); and higher levels of anxiety (t = 2.94, P = 0.004, d = 0.54, CI [0.17, 0.91]), depression (t = 2.88, P = 0.005, d = 0.53, CI [0.16, 0.90]), and pain catastrophising (t = 6.39, P 0.005): abdominal pain (t = 3.32, P < 0.001, d = 0.51, CI [0.14, 0.86]), indigestion (t = 4.42, P < 0.001, d = 0.65, CI [0.29, 1.01]), constipation (t = 3.82, P < 0.001, d = 0.58, CI [0.22, 0.94]), and GSRS Total (t = 4.03, P < 0.001, d = 0.62, CI [0.25, 0.98]) (Fig. 2 and Table 2).
Analysis stage I. Group comparisons between women with chronic pelvic pain (CPP; red) and controls (blue) across domains. (A) Sleep quality (ASCQ-Me v2 Sleep Impact Short Form); (B) Anxiety and depression symptoms (Hospital Anxiety and Depression Scale); (C) Fatigue levels (Neuro-QOL v1 Fatigue); (D and E) Childhood and recent trauma exposure, reported as number of events and burden scores; (F) Big 5 personality traits (Extraversion, Agreeableness, Conscientiousness, Neuroticism, and Openness); (G) Pain catastrophizing (Pain Catastrophizing Scale total score); (H) Conditioned pain modulation (CPM) response using pressure pain thresholds (PPT); (I) Autonomic reactivity as measured by changes in heart rate variability (average R-R intervals); (J) Salivary cortisol response to pain stimulus (pre–post CPM difference); (K) Salivary cortisol 24-hour profile (area under the curve; AUC); (L and M) Z-score profiles of somatosensory function from quantitative sensory testing (QST) on the lower abdomen (L) and foot (M), including thermal, mechanical, and pain sensitivity. Boxplots depict group medians, interquartile ranges, and outliers.
Stage I statistical analysis results from independent t-tests comparing participants with chronic pelvic pain and controls.
Values represent group means for each assessment, followed by results from independent samples t-tests: t-value, degrees of freedom (df), associated P-value, Cohen d (effect size), and the 95% confidence interval for the effect size (lower and upper bounds).
AUC, area under the curve; CPM, conditioned pain modulation; CPP, chronic pelvic pain; GSRS, gastrointestinal symptom rating scale; HADS, hospital anxiety and depression scale; PCS, pain catastrophizing scale.
By contrast, we found only limited differences between the groups for the physiological measures assessed. Four components of the QST paradigm were significantly different, with the CPP group exhibiting loss of function for thermal sensory limen on both the abdomen (t = −22.8, P = 0.032) and foot (t = −3.5, P = 0.004), and vibration detection on the foot (t = −3.0, P = 0.017) and gain of function for pressure pain threshold on the abdomen (t = −3.0, P = 0.012). 9 There were no significant differences in measures of autonomic nervous system activity, CPM, or cortisol profiles (P > 0.01) (Fig. 2 and Table 2).
Based on the results of the LPA, the 3-cluster model was selected as the best solution because it had the lowest values for both the BIC (BIC = 6367.85) and the AIC (AIC = 3956.61) compared with models with fewer or more profiles. These values indicate that the model achieves the best balance between goodness-of-fit and model simplicity. The 3 identified clusters were characterized by distinct patterns of means across the variables included in the analysis (Fig. 3 and Table 3) and comprised 43 participants in cluster 1, 11 participants in cluster 2, and 54 participants in cluster 3.
Analysis stage II. Differences across 3 latent profile analysis (LPA) clusters (Cluster 1 = dark red, Cluster 2 = medium red, Cluster 3 = light red) across domains. (A) Pain catastrophizing (Pain Catastrophizing Scale); (B) Fatigue (Neuro-QOL v1 Fatigue t-score); (C) Childhood trauma burden; (D) Pain nature (painDETECT score); (E) Pain widespreadness (number of body areas with persistent pain); (F) Autonomic function measured by heart rate variability (difference in average R-R intervals pre- and post-CPM); (G) Conditioned pain modulation (CPM) response using pressure pain thresholds (PPT); (H) Salivary cortisol levels 24 hours profile (area under the curve [AUC]); (I) Salivary cortisol response to pain stimulus (pre–post CPM difference); (J) Anxiety and depression (HADS subscales); (K) Personality dimensions (Big 5 Inventory: Extraversion, Agreeableness, Conscientiousness, Neuroticism, Openness); (L) z-scores from QST domains: CS non-noxious, CS noxious, TS thermal detection, TS thermal pain, TS mechanical detection, and TS mechanical pain. Boxplots show the distribution of scores per cluster with median, interquartile range, and outliers.
Means of all study variables for each latent profile analysis cluster and results of 1-way analyses of variance comparing the 3 clusters.
Significant results ( P < 0.05) are shown in bold.
ANOVA, analysis of variance; AUC, area under the curve; CPM, conditioned pain modulation; GSRS, gastrointestinal symptom rating scale; HADS, hospital anxiety and depression scale; PCS, pain catastrophizing scale; QST, quantitative sensory testing; SF-36, 36-Item Short Form Health Survey.
An ANOVA revealed significant differences between clusters for a number of variables across our domains. Specifically, we found significant differences in questionnaire measures of: PCS (F [2, 105] = 20.57, P < 0.0001), fatigue scores (F [2, 105] = 15.92, P < 0.0001), depression (F [2, 105] = 14.39, P < 0.0001), anxiety levels (F [2, 105] = 16.21, P < 0.0001), and the neuroticism domain of the personality scale (F [2, 105] = 10.90, P < 0.0001). Similarly, significant differences were observed in pain-relevant measures of the painDETECT questionnaire (F [2, 105] = 11.55, P < 0.0001) and Widespreadness (F [2, 105] = 9.88, P < 0.001).
However, statistical analysis of the physiological measures revealed significant effects only for certain QST measures: thermal pain on the test site (F [2, 105] = 9.62, P < 0.001), thermal detection on the test site (F [2, 105] = 6.38, P < 0.01), and noxious stimuli on the control site (F [2, 105] = 7.47, P < 0.01). No significant differences were observed in any of the other physiological measures.
Post hoc Tukey HSD tests were conducted for the observed significant effects (Fig. 3) to determine which specific clusters differed from each other. In summary, cluster 1 scored significantly higher than cluster 3 on the painDETECT scale as well as on measures of widespreadness, pain catastrophising, fatigue, anxiety and depression, and the neuroticism domain of the personality scale. Moreover, cluster 1 had significantly higher scores than cluster 3 on the QST measures of thermal pain sensitivity (test site), thermal detection (test site), and noxious stimuli detection (control site). The only significant difference between cluster 1 and cluster 2 was a higher pain catastrophising score in cluster 1. No significant differences were observed between clusters 2 and 3 across any of the variables.
The K-means model was run with 3 clusters to align with the solution identified by LPA, using the same 22 standardized variables. Because K-means requires complete data across all variables, this analysis was limited to participants with no missing values (n = 21). The resulting clusters showed partial but not strong agreement with the LPA profiles, with an Adjusted Rand Index (ARI) of 0.22 and a Normalized Mutual Information (NMI) of 0.37. Despite these modest agreement metrics, the overlap table demonstrated a reasonable correspondence between the 2 methods: participants in LPA cluster 1 were mostly classified into K-means cluster 3, while those in LPA cluster 3 were split between clusters 1 and 2 (see Supplementary Table S1, http://links.lww.com/PAIN/C416 ).
Descriptive exploration of the clusters regarding the clinical diagnostic group of the patients (EAP, EABP, BPS, PP) showed that all diagnostic groups were represented in each cluster (cluster 1: EAP = 30.2%, EABP = 39.5%, BPS = 25.6%, PP = 4.7%; cluster 2: EAP = 45.5%, EABP = 18.2%, BPS = 27.3%, PP = 9.1%; cluster 3: EAP = 46.3%, EABP = 13%, BPS = 22.2%, PP = 18.5%) (Fig. 4 ).
Analysis stage III. Clusters 1, 2, and 3 as defined by the LPA analysis across clinically relevant measures. Cluster groups are colored in shades of red, from darkest (Cluster 1) to lightest (Cluster 3). (A) Gastrointestinal Symptom Rating Scale (GSRS) scores by cluster for the assessed GSRS domains; (B) Scores for pain domains: dysmenorrhea, dyspareunia during intercourse, dyspareunia postintercourse, and pelvic pain during gynaecological examination; (C) SF-36 quality of life domain scores: physical functioning, physical health limitations, emotional limitations, energy/fatigue, emotional wellbeing, social functioning, pain, and general health; (D) Percentage of participants in each cluster reporting clinical comorbidities (anxiety, depression, IBS, migraine, asthma, eczema, PCOS); (E) Distribution of clinical diagnostic categories across clusters; (F) Pain ratings at first bladder sensation and first urge (visceral sensitivity); (G) Stacked bar chart showing percentage distribution of pain interference ratings (Not at all—Extremely) across 6 domains: work/school, daily activities, sleep, exercise/sports, and social activities. All boxplots show medians, interquartile ranges, and outliers. BPS, bladder pain syndrome; EABP, endometriosis-associated bladder pain; EAP, endometriosis-associated pain; PP, primary pain; SF-36, 36-item short form health survey.
Similarly, participants in each diagnostic group were spread across the 3 clusters (EAP: cluster 1: 30.2%, cluster 2: 11.6%, cluster 3: 58.1%; EABP: cluster 1: 65.4%, cluster 2: 7.7%, cluster 3: 26.9%; BPS: cluster 1: 42.3%, cluster 2: 11.5%, cluster 3: 46.2%; PP: cluster 1: 15.4%, cluster 2: 7.7%, cluster 3: 76.9%).
The results from 1-way ANOVAs revealed significant effects for dyspareunia during intercourse (F [2,67] = 6.48, P = 0.003), dyspareunia postintercourse (F [2,67] = 6.50, P < 0.001), and noncyclical pelvic pain (F [2,78] = 6.48, P = 0.003); but not for dysmenorrhea (F [2,103] = 2.15, P = 0.129) (Fig. 4 ).
Post hoc comparisons using the Tukey HSD test indicated that cluster 1 had significantly higher pain intensity levels than cluster 3 for dyspareunia during intercourse (P = 0.002, CI [0.87, 4.34]), dyspareunia postintercourse (P < 0.001, CI [1.40, 4.83]), and noncyclical pelvic pain (P = 0.002, CI [0.64, 3.20]).
A 1-way ANOVA showed significant differences between the profiles for the GSRS abdominal pain (F [2,93] = 9.11, P < 0.001) and diarrhoea (F [2,96] = 5.09, P = 0.008) subscales (Fig. 4).
Post hoc comparisons using the Tukey test revealed that cluster 1 scored significantly higher than cluster 3 on abdominal pain (P < 0.001, CI [−1.67, −0.47]) and total GSRS score (P = 0.011, CI [0.12, 1.06]). On the diarrhoea subscale, cluster 2 scored significantly higher than cluster 3 (P = 0.007, CI [0.24, 1.87]).
A 1-way ANOVA was conducted to explore effects between the clusters, revealing significant effects in pain intensity at first sensation (F [2, 43] = 9.20, P < 0.001) and first urge (F [2, 42] = 5.92, P = 0.005). Post hoc tests showed that clusters 1 and 2 had significantly higher pain intensity scores than cluster 3 for first sensation (cluster 1: P < 0.001, CI [−3.67, −0.85]; cluster 2: P = 0.045, CI [−5.75, −0.05]) and first urge (cluster 1: P = 0.018, CI [−3.46, −0.25]; cluster 2: P = 0.045, CI [−6.47, −0.05]).
A 1-way ANOVA revealed no significant differences between the 3 clusters for any of the 5 cortisol measurement timepoints, nor for the morning rise (timepoint 2 − timepoint 1) (all P > 0.05) (Table 4 ). Similarly, no significant differences were observed in baseline HRV between the clusters ( P > 0.05) (Table 4 ).
Stages II and III statistical analysis results of 1-way analyses of variance comparing the 3 clusters from the latent profile analysis.
Each row presents the F-statistic, degrees of freedom between and within groups, associated P-value, effect size (eta squared, η²), and the 95% confidence interval of η² for each variable.
AUC, area under the curve; CPM, conditioned pain modulation; GSRS, gastrointestinal symptom rating scale; HADS, hospital anxiety and depression scale.
Across all 3 clusters, the most common comorbidities, each reported by more than 10% of all CPP participants, were anxiety, depression, migraine, IBS, asthma, eczema, and polycystic ovary syndrome (PCOS). A chi-square test of independence comparing the frequencies of these comorbidities between clusters revealed a significant effect for a depression diagnosis (χ² [2] = 11.82, P = 0.003, Cramér V [effect size] = 0.331). No other significant differences were identified.
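The reported effect size can be related to the chi-square statistic through the standard Cramér V formula, V = sqrt(χ² / (n × min(r − 1, c − 1))). The sketch below assumes a 3 × 2 table (3 clusters by depression diagnosis present or absent) and n = 108, the sum of the three reported cluster sizes. Both are assumptions about the underlying table, and the function name is hypothetical.

```python
import math


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramér V effect size for a chi-square test of independence."""
    k = min(n_rows - 1, n_cols - 1)
    if n <= 0 or k <= 0:
        raise ValueError("need a positive sample size and at least a 2 x 2 table")
    return math.sqrt(chi2 / (n * k))


# Chi-square value from the text; n and table shape are assumptions.
v = cramers_v(chi2=11.82, n=108, n_rows=3, n_cols=2)
print(round(v, 3))  # 0.331
```

Under these assumptions the formula reproduces the reported value of 0.331.
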
One-way ANOVAs of the 36-item short form (SF-36) subscales revealed significant effects between the clusters for Physical Functioning (F [2,103] = 19.43, P < 0.001), Physical Health Limitations (F [2,102] = 8.53, P < 0.001), Energy/Fatigue (F [2,103] = 23.19, P < 0.001), Emotional Wellbeing (F [2,103] = 11.31, P < 0.001), Social Functioning (F [2,103] = 15.66, P < 0.001), Pain (F [2,103] = 14.19, P < 0.001), and General Health (F [2,103] = 12.46, P < 0.001) but not for Emotional Limitations (F [2,102] = 3.20, P = 0.045).
Post hoc comparisons using the Tukey HSD test indicated that cluster 1 had significantly lower QoL for Physical Functioning ( P < 0.001, CI [−32.70, −14.13]), Physical Health Limitations ( P < 0.001, CI [−56.60, −15.00]), Energy/Fatigue ( P < 0.001, CI [−32.70, −14.13]), Emotional Wellbeing ( P < 0.001, CI [−29.60, −14.28]), Social Functioning ( P < 0.001, CI [−43.81, −17.52]), Pain ( P < 0.001, CI [−41.29, −14.71]), and General Health ( P < 0.001, CI [−35.31, −12.35]). No significant differences were observed between clusters 2 and 3.
Kruskal–Wallis H tests were used to compare pain interference between clusters. Significant differences were observed across clusters in interference related to work/school activities, H(2) = 19.48, P < 0.001; daily activities, H(2) = 16.29, P < 0.001; sleep, H(2) = 10.99, P = 0.004; sexual intercourse, H(2) = 13.55, P = 0.001; and exercise or sport, H(2) = 18.86, P < 0.001. Post hoc pairwise Mann–Whitney U tests with Bonferroni correction (P < 0.01) indicated that participants in cluster 1 reported significantly greater pain interference in work/school (U = 192.5, P < 0.001, r ≈ 0.51) and daily activities (U = 210.0, P < 0.001, r ≈ 0.53) compared with cluster 3.
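As a rough worked check on how an H statistic maps to a P value, the sketch below assumes the usual chi-square approximation for the Kruskal–Wallis test with k − 1 = 2 degrees of freedom, for which the upper-tail probability has the closed form exp(−H/2). Whether the original analysis used this approximation or an exact method is not stated, and the function name is hypothetical.

```python
import math


def kruskal_p_value_df2(h):
    """Upper-tail chi-square probability for 2 degrees of freedom.

    For df = 2 the chi-square survival function is exactly exp(-x / 2).
    """
    if h < 0:
        raise ValueError("H must be non-negative")
    return math.exp(-h / 2)


# H statistics as reported in the text for the five interference domains.
reported_h = {
    "work/school": 19.48,
    "daily activities": 16.29,
    "sleep": 10.99,
    "sexual intercourse": 13.55,
    "exercise or sport": 18.86,
}
p_values = {domain: kruskal_p_value_df2(h) for domain, h in reported_h.items()}
for domain, p in p_values.items():
    print(f"{domain}: P = {p:.4f}")
```
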
Section 4
As expected, our study demonstrated perturbations in pain-relevant systems between women with CPP and pelvic pain-free women. These differences were more pronounced for questionnaire measures than physiological tests. Moreover, we demonstrated that it is possible to use these measures to identify clusters within our CPP population, which seem to represent different clinical phenotypes that are not driven by a clinical diagnosis of endometriosis or IC/BPS.
Although all the questionnaire measures were able to identify significant differences between women with CPP and controls, it was surprising that, of the physiological measures, only specific QST components differed between the 2 groups. 9 Our results are in contrast to the existing literature for other chronic pain conditions such as IBS, low back pain, and fibromyalgia, in which physiological measures such as CPM, cortisol, and ECG do differ significantly when compared with a healthy population. 1 , 2 , 19 , 20 , 26 , 39 , 42 , 46 However, it should be noted that in our cohort, there is wide variability across all physiological measures assessed, especially in the women with CPP, potentially explaining the lack of significant difference at a group level. One explanation for this may be the influence of hormonal state on our chosen physiological measures. 10 Although hormonal variation can also affect some of the measures we assessed with questionnaires (eg, mood, fatigue), most of the tools we chose for these assessments were those that assessed trait rather than state. We specifically chose not to control for endogenous hormonal state or exogenous hormone use, as we aimed to identify a stratification method that was stable across time and useful in clinical settings.
Interestingly, the LPA results reflected this heterogeneity within the CPP group by classifying participants into 3 clusters. However, similar to stage 1 of the analysis, physiological measures of heart rate, cortisol levels, and CPM do not seem to differentiate the identified clusters. Previous work by the MAPP consortium has shown that structural and functional brain differences are associated with distinct symptom subgroups in chronic pelvic pain, 29 , 50 suggesting that neuroimaging could provide valuable mechanistic insights into the profiles identified in our study. The original design of TRiPP had aimed to collect fMRI data with the specific aim of exploring potential central nervous system markers of any identified subgroups; however, restrictions during and immediately after the COVID-19 pandemic limited the data we were able to collect, and we therefore cannot address this question within TRiPP itself. Given the findings presented here and from MAPP previously, we believe this to be an important area for future investigation.
However, the questionnaire measures, including assessments of pain characteristics such as painDETECT and widespreadness, seem to be driving the stratification. Considering clinical translatability, this is an important finding, as the use of questionnaires plus potentially a simplified sensory test would be easier (and cheaper) to implement in a clinical setting than a battery of more complex, time-consuming tests that require expensive equipment/specialist training or lead to a delay in receiving results (eg, cortisol analysis).
We specifically chose not to include measures of symptoms or diagnoses within the data at stage II so that these factors did not influence clustering. Similarly, we excluded measures that might align more with one diagnostic group than another, such as bladder sensitivity and sleep (in case this was disproportionately affected by nocturia). In line with our hypothesis that factors other than the peripheral pathology are of importance in determining subphenotypes, we found our 4 diagnostic groups spread across the 3 clusters. Interestingly, the clusters we have identified align with those found in other studies. Exploring the QoL and pain interference data, our analysis suggests that cluster 1 comprises most of those with high-impact pain. It should be noted that although high-impact chronic pain was originally defined as persistent pain with “substantial restriction of participation in work, social, and self-care activities for 6 months or more,” 13 , 31 more recently a timeframe of 3 months has been used, in line with our assessments. 30 , 44 The MAPP network recently demonstrated that high-impact pain in a cohort of patients with urological chronic pelvic pain syndrome (64% female) was associated with both widespread pain and pain in response to consuming a standardised volume of water, as well as pelvic floor tenderness (not assessed in our study). 49 Our findings suggest that these factors are important in those with chronic pelvic pain more broadly, not just urological pelvic pain.
Our clusters also have similarities with nociplastic pain (cluster 1) and nociceptive pain (cluster 3). It is important to note that there currently are no diagnostic criteria published for nociplastic pain in visceral pain conditions and some of the criteria used for musculoskeletal pain are perhaps less applicable (eg, most visceral pain is regional in distribution rather than discretely localised). 32 However, features such as fatigue, widespread pain, psychological distress, and increased sensitivity to a noxious stimulus at a distant site would all be consistent with this classification. On the other hand, cluster 3 demonstrated a low painDETECT score, pain localised to the pelvis, low levels of psychological distress, and low pain scores in response to visceral sensations, all of which would point more to a nociceptive phenotype. 18
The heterogeneity seen within the clusters in physiological measures potentially suggests that multiple different mechanisms could generate each phenotype. This would be consistent with other published literature suggesting, for example, that both top-down and bottom-up processes can lead to nociplastic pain. 18 However, recent work by the MAPP network has clearly illustrated the relationship between a widespread pain phenotype in urological chronic pelvic pain and the response to a variety of treatments. 17 Surgical procedures (including ablation/excision of endometriosis and hysterectomy) that are commonly used in the management of CPP in women are associated with complications and significant financial cost, but are frequently unsuccessful in improving pain. 24 We therefore believe that there is an urgent need to explore whether this widespread pain phenotype predicts treatment response in a broader population of patients with CPP including those with endometriosis. One study exploring hysterectomy as a treatment for CPP would support this strategy. 5 To date, there is very limited research on the role of a multidisciplinary pain management approach in CPP despite this being the recommended approach for other forms of chronic pain, particularly when nociplastic features are present.
Although our study provides valuable insights, it has some limitations. The rate of recruitment of participants for this part of the TRiPP study was severely affected by the COVID-19 pandemic, and as a result, the sample size is smaller than originally planned. Although this may contribute to the lack of significant differences seen in the physiological assessments, we would have expected a bigger impact on the questionnaire measures, which are arguably less sensitive.
Although the results of the K-means sensitivity analysis provided additional support for the robustness of the latent profiles, the modest agreement metrics likely reflected the reduced sample size available for complete-case clustering. These findings support the appropriateness of LPA for our data set, while also highlighting the value of complementary approaches for sensitivity testing. However, future research should replicate our findings in larger samples and conduct sensitivity analyses using alternative data-driven clustering approaches to further evaluate their validity.
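To illustrate what an agreement metric between two clustering solutions measures, the following is a minimal sketch of the adjusted Rand index (ARI), which compares two sets of cluster assignments while correcting for chance agreement. The choice of ARI is an assumption made for illustration; the specific agreement metrics used in our sensitivity analysis are not restated here, and the function name and example labels below are hypothetical rather than study data.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected agreement between two cluster assignments.

    Illustrative only: ARI is assumed here as an example agreement
    metric; it is not necessarily the metric used in the study.
    Returns 1.0 for identical partitions (up to relabelling) and
    values near 0 for agreement no better than chance.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("Both labelings must cover the same participants.")
    n = len(labels_a)
    if n < 2:
        raise ValueError("At least two participants are required.")

    # Contingency counts: how many participants share each pair of labels.
    joint = Counter(zip(labels_a, labels_b))
    sum_joint = sum(comb(c, 2) for c in joint.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = sum_a * sum_b / comb(n, 2)  # agreement expected by chance
    maximum = (sum_a + sum_b) / 2          # agreement if partitions matched
    if maximum == expected:
        # Degenerate case (e.g. both labelings put everyone in one cluster).
        return 1.0
    return (sum_joint - expected) / (maximum - expected)


# Hypothetical assignments for six participants (not study data).
lpa_profiles = [0, 0, 1, 1, 2, 2]
kmeans_clusters = [2, 2, 0, 0, 1, 1]  # same grouping, different label names
ari_same = adjusted_rand_index(lpa_profiles, kmeans_clusters)
ari_crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

Because the index depends only on which participants are grouped together, it is unaffected by the arbitrary numbering of clusters, which is why it is a common choice when comparing solutions from different algorithms. With complete-case data, only participants assigned a cluster by both methods can be compared, so a reduced sample limits the precision of any such metric.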
The cross-sectional design of our study also limits the causal inferences that can be drawn. Future research needs to prospectively validate these clusters and to study them longitudinally to understand their stability and trajectories over time. Equally important is the investigation of how these clusters respond to different treatments. Interestingly, work from the MAPP network does suggest their subgroups are stable over time. 33
Although clusters 1 and 3 seem to relate to other published work, cluster 2 is harder to interpret, particularly as it is smaller than the other clusters. The potential role of the autonomic nervous system and HPA axis in differentiating this cluster (Fig. 3) should be explored in a larger sample. Work in urological pelvic pain also suggests that there is a clinical phenotype characterised by dysfunction of the pelvic floor muscles. 22 This is a clinical finding seen in association with endometriosis and other types of CPPS too, 40 and thus, future work should consider determining whether this is a distinct phenotype of CPP. We anticipate that the recent addition of a standardised tool for clinical assessments to the EPHect tools 35 will facilitate collaborative work in this area.
Section 5
Our study demonstrates differences in pain-relevant systems between women with CPP and pain-free controls, and how these can be used to stratify women with CPP into subgroups. These subgroups seem to be clinically meaningful and align with work in other forms of chronic pain. We believe that our findings support the need for a different, more personalised, and more nuanced therapeutic strategy, potentially taking a pain-focussed rather than a pelvis-focussed approach to those with high-impact pain. Further clinical research is urgently needed in this area given the significant burden of CPP in women.
Intro
Globally, chronic pelvic pain (CPP) affects up to 26.6% of women and those born female, 3 , 14 , 51 , 52 affecting quality of life and incurring substantial healthcare and economic costs. 25 Despite the high prevalence, underlying mechanisms remain poorly understood, and existing management strategies are unsatisfactory.
Chronic pelvic pain is classified as a secondary pain condition if associated with an underlying pathology such as endometriosis (International Classification of Diseases [ICD-11], Code: MG30), 47 with therapeutic approaches focussing on the pathology. For those without identifiable pathology, symptom constellations determine the appropriate primary pain condition, eg, interstitial cystitis/bladder pain syndrome (IC/BPS). Whether primary or secondary, patients with CPP are predominantly seen by the clinicians responsible for the pelvic organ(s) considered most likely the cause of their pain, eg, gynaecologists, urologists, and gastroenterologists. Despite increasing evidence of similarities with other chronic pain conditions, a pain-focussed approach usually comes late in the journey (if at all) for these women. 34
Women with CPP commonly describe many years of pain, which frequently persists or recurs despite recommended treatment of any identified associated pathology. 7 , 8 , 23 , 34 Thus, there is an urgent need to identify alternative approaches for stratifying those with CPP into clinically meaningful groups and to define appropriate treatment algorithms for these groups. Work to date has explored possible approaches for subgrouping those with specific types of CPP, including IC/BPS, 22 , 37 , 49 vulvodynia, 4 , 21 , 41 and endometriosis-associated pain, 27 and for CPP more generally. 6 These approaches seem to be able to identify subgroups with high-impact pain and, in IC/BPS, to impact on treatment response. 17 These studies have predominantly used questionnaire measures, or where clinical/psychophysical assessments have been used, these have focussed on the pelvis (eg, bladder capacity; sensitivity of the bladder to filling, the pelvic floor muscles to pressure, and the vulva to experimental stimuli). Given increasing awareness that factors outside the pelvis are of importance in predicting response to treatment, 5 , 17 it is plausible that taking a broader approach, as has been performed for other types of chronic pain, 15 , 16 may give greater insights. However, relatively little is known about the responses of those with CPP to many pain-relevant psychophysical assessments.
To address this knowledge gap, in this article, we use data from the Translational Research in Pelvic Pain (TRiPP) project ( https://www.imi-paincare.eu/PROJECT/TRIPP/ ), a project which aimed to take a deep-phenotyping approach to improve understanding of CPP in women, including better methods of stratification. 11 We aimed first to determine whether we can demonstrate perturbations in the function of pain-relevant systems in women with CPP compared with pain-free women and, second, to explore whether we can use these data to stratify women with CPP into meaningful subgroups. We hypothesised that at a group level (comparing those with CPP with pain-free controls), we would see differences in the assessed measures. However, we expected that there would be significant heterogeneity in all measures for those with CPP and that this variation would allow the identification of subgroups using a data-driven approach. A main hypothesis of our consortium was that these subgroups may be independent from the diagnostic label.
Appendix
Supplemental digital content associated with this article can be found online at http://links.lww.com/PAIN/C416 .
COI Statement
L.D., L.C., E.E., K.K., D.P., K.P., E.T., A.C., J.F.G., P.A.M., C.E.L., L.A.N., Q.A., J.B., K.G., A.S., L.H., M.K., J.M., C.S., A.F.V.: No competing interests. A.H.: Employee of Bayer AG, Germany. A.W.H.: Dr Horne reports receiving grants from the UK Research and Innovation, UK National Institute for Health and Care Research, Scotland's chief scientist's office, Wellbeing of Women, and Roche Diagnostics; consultancy fees from Roche Diagnostics, Gesynta, and Thramex; lecture fees from Gedeon Richter; having a pending patent (UK Patent 2217921·2); serving as President-elect of the World Endometriosis Society, co–editor in chief of Reproduction and Fertility, trustee and medical adviser to Endometriosis UK, and specialty adviser to the Scottish government's Chief Medical Officer for obstetrics and gynaecology. E.M.P.Z.: Esther M. Pogatzki-Zahn received financial support from Grünenthal, Germany, for research activities and advisory and lecture fees from Grünenthal, Germany, MSD/MERCK, Germany, Merz Pharmaceuticals GmbH, and Medtronic, Switzerland. In addition, she receives scientific support from the German Research Foundation (DFG), the Federal Ministry of Education and Research (BMBF), the Federal Joint Committee (G-BA), and the Innovative Medicines Initiative 2 Joint Undertaking under Grant Agreement No 777500. This Joint Undertaking receives support from the European Union's Horizon 2020 research and innovation program and EFPIA. R.D.T.: Dr. Treede reports grants from IMI2 PainCare project of EU, grants from TEVA, Esteve, during the conduct of the study; personal fees from Bayer, Grünenthal, GSK, Sanofi, Merz, and Vertex, outside the submitted work. In addition, Dr. Treede has a patent DE 103 31 250.1-35 with royalties paid to MRC Systems. J.V.: Has received research funding from Viatris and consultancy fees from Grünenthal, AstraZeneca, and Merz Pharmaceuticals, outside the submitted work.
C.M.B.: Research grants from Bayer Healthcare, MDNA Life Sciences, Roche Diagnostics, European Commission, NIH. His employer has received consultancy fees from Myovant and ObsEva for work outside of this project. F.C.: Consultant and/or investigator for Allergan (AbbVie), Astellas, Bayer, Ipsen, and Recordati. S.A.M.: Advisory board member for AbbVie and Roche; receives research funding from the National Institutes of Health, the US Department of Defence, the J. Willard and Alice S. Marriott Foundation, and AbbVie. None are related to the presented work. K.Z.: Reports grant funding from EU Horizon 2020, NIH US, Wellbeing of Women, Bayer AG, Roche Diagnostics, Evotec-Lab282, and MDNA Life Sciences, outside the submitted work. J.N.: Employee of Merz Therapeutics GmbH and shareholder of Bayer AG, Germany, and shareholder of Eli Lilly. K.V.: Declares research funding from UKRI, NIHR, NIH US, and Bayer AG outside of the submitted work, and honoraria for consultancy and talks and associated travel expenses paid to her institution from Bayer AG, AbbVie, Reckitts, and Eli Lilly.