Methods
Participants were primarily recruited from university students through an open invitation. Individuals who met the inclusion criteria were invited to participate voluntarily. A convenience sampling method was used, and all eligible individuals who agreed to participate were included. The study sample consisted of 139 individuals over the age of 18 who had primary dysmenorrhea. A pelvic examination is not required to initiate treatment for classic primary dysmenorrhea. While there is no specific test to diagnose primary dysmenorrhea, individuals presenting with the classical symptoms are appropriate candidates for empirical therapy. Individuals with the following characteristics were considered to have primary dysmenorrhea: (1) menstrual pain beginning shortly after menarche or within the first 2 years; (2) pain-like discomfort beginning just before or at the onset of menstruation; (3) pain-like discomfort in the lower abdomen, the back, or both, potentially extending to the inner thighs; (4) discomfort typically subsiding within 72 hours; (5) intermittent and spasmodic discomfort; (6) consistent discomfort recurring from one menstrual cycle to the next; and (7) supplementary symptoms, which may include nausea, vomiting, fatigue, headaches, dizziness, and disruptions in sleep patterns [ 10 ].
Individuals were included in the study if they (1) were aged 18–45 years; (2) met the aforementioned description of dysmenorrhea; (3) had no diagnosed chronic illness or history of psychological disorders, past or present; and (4) could speak, read, and write Turkish. Individuals were excluded from the study if they (1) had cognitive or psychiatric conditions such as bipolar disorder, psychosis, somatic symptom disorder, moderate or severe depression, or eating disorders; (2) had taken part in psychotherapy for premenstrual symptoms (current or past); (3) were pregnant or breastfeeding; (4) had acute suicidal tendencies; (5) had gynecological conditions (oophorectomy, hysterectomy, polycystic ovary syndrome, gynecological cancer, infertility, endometriosis); (6) had used, or changed their use of, antidepressants, oral contraceptives, hormones (e.g., thyroid hormones), or benzodiazepines/antipsychotics within the last 3 months; (7) had any neurological disorder; or (8) could not speak, read, or write Turkish.
This study was approved by the clinical research ethics committee of Tokat Gaziosmanpaşa University (16 March 2023; decision no. 83116987-215). Informed consent was obtained from all individuals. This study was registered at ClinicalTrials.gov on 31 March 2023 ( NCT05829512 ).
Permission was obtained from the developer of the Working Ability, Location, Intensity, Days of Pain, Dysmenorrhea (WaLIDD) prior to the translation process [ 5 ]. The process of translation and cross-cultural adaptation of the Turkish WaLIDD was carried out in accordance with established guidelines [ 13 ]:
The translation of WaLIDD into Turkish was conducted by two bilingual translators. While one of these translators knew about the study, the other did not.
The two translations obtained were evaluated and turned into a single translation.
The Turkish version created was translated back into English by two native English translators.
The expert committee reviewed the original version of WaLIDD and all of its translations and agreed on the most appropriate translation.
To test the comprehensibility of WaLIDD, the prefinal version was completed by 30 individuals.
Outcome
The WaLIDD questionnaire comprises four items. It was prepared as a scale-type instrument that amalgamates characteristics of dysmenorrhea: (1) The questionnaire assesses the number of anatomical sites of pain, including the lumbar region, lower abdomen, inguinal region, and lower extremities, where discomfort is experienced during dysmenorrhea; (2) The Wong-Baker pain rating scale is employed to evaluate the intensity of discomfort, ranging from “does not hurt” to “hurts a lot more,” as reported by individuals experiencing dysmenorrhea; (3) The questionnaire records the number of days individuals experience pain during their menstrual period, categorized into intervals such as 0 days, 1–2 days, 3–4 days, and 5 or more days; and (4) The questionnaire evaluates the frequency of pain hindering daily activities, ranging from “never” to “always,” as reported by individuals experiencing dysmenorrhea. Each item is assigned a score between 0 and 3, with the total score ranging from 0 to 12 points. A higher score indicates a more severe level of dysmenorrhea [ 5 ].
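The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and parameter names are hypothetical, and the mapping of days of pain to item scores (0 days = 0, 1–2 days = 1, 3–4 days = 2, 5 or more days = 3) is inferred from the listed intervals and the 0–3 item range rather than quoted from the instrument.

```python
def days_item_score(days_of_pain: int) -> int:
    """Map days of pain to a 0-3 item score (assumed mapping, see lead-in)."""
    if days_of_pain < 0:
        raise ValueError("days_of_pain must be non-negative")
    if days_of_pain == 0:
        return 0
    if days_of_pain <= 2:
        return 1
    if days_of_pain <= 4:
        return 2
    return 3


def walidd_total(working_ability: int, location: int, intensity: int, days: int) -> int:
    """Sum the four WaLIDD item scores (each 0-3) into a 0-12 total."""
    items = {
        "working_ability": working_ability,
        "location": location,
        "intensity": intensity,
        "days": days,
    }
    for name, score in items.items():
        if not isinstance(score, int) or not 0 <= score <= 3:
            raise ValueError(f"{name} must be an integer from 0 to 3")
    return sum(items.values())


# Hypothetical respondent: pain sometimes limits work (2), two pain sites (2),
# moderate intensity (2), and 3 days of pain (item score 2).
total = walidd_total(working_ability=2, location=2, intensity=2, days=days_item_score(3))
print(total)
```

A higher total corresponds to more severe dysmenorrhea, with 12 as the maximum.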
The Premenstrual Syndrome Impact Questionnaire (PMS-IQ) comprises 18 items. It examines functional interactions in daily life as well as psychological stress and was formulated specifically for premenstrual symptoms. Given the intricate and multifaceted nature of the disorder, the PMS-IQ streamlines the diagnostic process by evaluating the impact of symptoms and aids in treatment planning and assessment. The questionnaire employs a 4-point Likert-type response system. A higher score indicates a greater influence of premenstrual symptoms [ 14 ].
The Pain Disability Index (PDI) is a self-administered scale, developed by Pollard, designed to measure the degree to which pain resulting from a persistent condition impacts the patient’s daily life and the level of disability incurred as a result [ 15 ]. The scale comprises seven questions. Patients are asked to evaluate the impact of pain on seven functional activities of daily living, including family-home responsibilities, leisure time, occupation, social activity, sexual life, and self-care. They assign a score from 0 to 10 for each question (0 = no hindrance, 5 = moderately prevents, 10 = completely inadequate). The total score of the PDI ranges from 0 to 70. A total score of 40 or higher indicates high disability, suggesting severe impairment due to pain. There exists a Turkish version of the PDI with established validity and reliability through research studies [ 16 ].
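As a rough illustration of the PDI scoring described above, the sketch below sums seven 0–10 ratings and flags totals of 40 or higher. The function name and return shape are hypothetical choices made for this example.

```python
def pdi_total(item_scores):
    """Return (total, high_disability) for seven PDI ratings, each 0-10."""
    if len(item_scores) != 7:
        raise ValueError("the PDI has exactly seven items")
    for score in item_scores:
        if not 0 <= score <= 10:
            raise ValueError("each PDI item must be between 0 and 10")
    total = sum(item_scores)
    # A total of 40 or higher is described in the text as high disability.
    return total, total >= 40


print(pdi_total([6, 6, 6, 6, 6, 5, 5]))
```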
The Big Five Inventory-10 (BFI-10) was introduced to the literature by Rammstedt and John as a concise alternative to the longer BFI-44 [ 17 ]. The scale comprises 10 items representing 5 sub-dimensions. Respondents evaluate each item using a 5-point Likert-type rating, ranging from “Disagree Strongly” to “Agree Strongly”. Statements numbered 1, 3, 4, 5, and 7 were reversed. Validity and reliability analyses have been performed on the Turkish version of the inventory [ 18 ].
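The reverse-scoring step can be sketched as follows. The example assumes that responses are coded 1 ("Disagree Strongly") to 5 ("Agree Strongly"), so that a reversed item becomes 6 minus the raw response; this coding and the function name are assumptions made for illustration.

```python
# Item numbers that are reverse-scored, as listed in the text.
REVERSED_ITEMS = {1, 3, 4, 5, 7}


def reverse_score_bfi10(responses):
    """Apply reverse scoring to a list of ten BFI-10 responses coded 1-5."""
    if len(responses) != 10:
        raise ValueError("the BFI-10 has exactly ten items")
    if any(not 1 <= r <= 5 for r in responses):
        raise ValueError("responses must be coded from 1 to 5")
    return [
        6 - r if item_number in REVERSED_ITEMS else r
        for item_number, r in enumerate(responses, start=1)
    ]


print(reverse_score_bfi10([1, 2, 3, 4, 5, 1, 2, 3, 4, 5]))
```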
Statistical analysis was conducted using the Statistical Package for Social Sciences (SPSS), version 22.0, on a Windows computer. Descriptive statistics were presented as mean ± standard deviation (X ± SD), median, or percentage (%). The parametric or nonparametric distribution of the data was assessed using the One-Sample Kolmogorov-Smirnov test. To evaluate the reliability of the WaLIDD questionnaire, internal consistency and test-retest analyses were performed, with the retest administered after a 7-day interval. Test-retest reliability was measured using the Intraclass Correlation Coefficient (ICC), while internal consistency was determined through Cronbach’s α value. ICC values of < 0.50, 0.50–0.75, 0.75–0.90, and > 0.90 represent weak, moderate, good, and excellent reliability, respectively [ 19 ]. A Cronbach α value greater than 0.70 is considered sufficient [ 20 ].
Systematic variation and agreement between test and retest scores were evaluated with Bland-Altman plots and a t-test.
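For readers who want to reproduce the internal-consistency step outside SPSS, the following is a minimal sketch of the standard Cronbach's α formula, α = k/(k − 1) × (1 − Σ item variances / variance of total scores), together with a helper that applies the ICC bands quoted above. The function names are hypothetical, the handling of values that fall exactly on a band boundary is an assumption, and the ICC itself is not computed here because the text does not state which ICC model was used.

```python
from statistics import variance


def cronbach_alpha(item_columns):
    """Cronbach's alpha from k lists, one per item, aligned by respondent."""
    k = len(item_columns)
    if k < 2:
        raise ValueError("at least two items are required")
    n = len(item_columns[0])
    if any(len(column) != n for column in item_columns):
        raise ValueError("all items need the same number of respondents")
    totals = [sum(scores) for scores in zip(*item_columns)]
    item_variance_sum = sum(variance(column) for column in item_columns)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def interpret_icc(icc):
    """Label an ICC using the bands in the text (boundary handling assumed)."""
    if icc < 0.50:
        return "weak"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.90:
        return "good"
    return "excellent"


# Toy data: two items answered by three respondents.
alpha = cronbach_alpha([[1, 2, 3], [2, 1, 3]])
print(round(alpha, 3), interpret_icc(0.778))
```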
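The quantities displayed in a Bland-Altman plot can be sketched as below: the bias is the mean of the test minus retest differences, and the 95% limits of agreement are the bias ± 1.96 standard deviations of those differences. The function name and the toy scores are hypothetical, and the plotting itself is omitted.

```python
from statistics import mean, stdev


def bland_altman_limits(test_scores, retest_scores, z=1.96):
    """Return (bias, lower limit, upper limit) for paired test-retest scores."""
    if len(test_scores) != len(retest_scores):
        raise ValueError("test and retest scores must be paired")
    differences = [t - r for t, r in zip(test_scores, retest_scores)]
    bias = mean(differences)
    half_width = z * stdev(differences)
    return bias, bias - half_width, bias + half_width


# Toy data for four respondents (not study data).
bias, lower, upper = bland_altman_limits([8, 7, 9, 6], [7, 7, 8, 7])
print(round(bias, 2), round(lower, 2), round(upper, 2))
```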
Reproducibility was evaluated through standard error of measurement (SEM) and minimum detectable change (MDC), calculated using the following formulas [ 19 ]:
$$\mathrm{MDC}_{95} = z \times \mathrm{SEM} \times \sqrt{2}$$
where z = 1.96 (based on 95% confidence) and SEM is the standard error of measurement.
$$\mathrm{SEM} = \mathrm{SD} \times \sqrt{1 - \mathrm{ICC}}$$
where SD is the standard deviation of the participants' scores and ICC is the reliability coefficient.
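The two formulas can be checked numerically with a short sketch. The inputs are the values reported in Table 2 (SDs of 1.55 and 1.61, ICC of 0.778). Which SD entered the original calculation is not stated, so averaging the baseline and retest SDs is an assumption made here; the function names are also hypothetical.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM = SD * sqrt(1 - ICC)."""
    return sd * math.sqrt(1.0 - icc)


def minimal_detectable_change(sem, z=1.96):
    """MDC95 = z * SEM * sqrt(2)."""
    return z * sem * math.sqrt(2.0)


# Assumption: use the mean of the baseline and retest SDs from Table 2.
mean_sd = (1.55 + 1.61) / 2
sem = standard_error_of_measurement(mean_sd, icc=0.778)
mdc = minimal_detectable_change(sem)
print(round(sem, 2), round(mdc, 2))  # prints 0.74 2.06
```

Under this assumption the sketch agrees with the SEM and MDC values reported in Table 2.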
Content validity was assessed by evaluating the appropriateness of the questionnaire for dysmenorrhea and the target population, its relevance to dysmenorrhea, and its comprehensiveness in covering dysmenorrhea-related aspects. The following questions were asked: “Do you think the purpose of this scale is related to the assessment of dysmenorrhea?”, “Do you think the items of this scale represent you and your condition?”, “Do you think the items of this scale are related to your dysmenorrhea?”, and “Do you think these items comprehensively address your dysmenorrhea?”. More than 90% positive responses were considered acceptable for content validity [ 20 ].
Construct validity of the WaLIDD was assessed through Pearson correlation analysis after computing the total scores from all questionnaires. Correlation was interpreted as excellent ( r = 0.81–1.00), very good ( r = 0.61–0.80), good ( r = 0.41–0.60), poor ( r = 0.21–0.40), or bad ( r = 0–0.20) [ 21 ].
Percentages of the minimum and maximum scores of the WaLIDD were calculated for ceiling and floor effects [ 20 ].
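A minimal sketch of the correlation step follows: a plain Pearson coefficient plus a helper that applies the interpretation bands above. Two assumptions are made for illustration: the bands are applied to the absolute value of r so that negative coefficients can be labeled, and values falling between two printed bands (for example 0.805) are assigned to the lower band. The function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation of two equally long sequences (non-constant)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def interpret_r(r):
    """Label a correlation using the bands in the text, applied to |r|."""
    size = abs(r)
    if size > 0.80:
        return "excellent"
    if size > 0.60:
        return "very good"
    if size > 0.40:
        return "good"
    if size > 0.20:
        return "poor"
    return "bad"


print(interpret_r(0.726), interpret_r(0.413), interpret_r(-0.122))
```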
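The floor and ceiling calculation amounts to counting how many respondents reach the lowest and highest possible totals. The sketch below assumes the 0–12 WaLIDD range given earlier; the function name and the toy scores are hypothetical.

```python
def floor_ceiling_effects(total_scores, min_score=0, max_score=12):
    """Return (floor %, ceiling %) of respondents at the scale extremes."""
    n = len(total_scores)
    if n == 0:
        raise ValueError("at least one score is required")
    floor_pct = 100.0 * sum(1 for s in total_scores if s == min_score) / n
    ceiling_pct = 100.0 * sum(1 for s in total_scores if s == max_score) / n
    return floor_pct, ceiling_pct


# Toy data for ten respondents (not study data).
print(floor_ceiling_effects([12, 7, 8, 5, 9, 6, 7, 8, 10, 4]))
```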
The statistical significance level was set at p < 0.05.
Results
A total of 126 individuals participated in the study, of whom 113 were included. Thirteen individuals were excluded because of endometriosis ( n = 1), infection ( n = 2), polycystic ovary syndrome ( n = 6), or ovarian cyst ( n = 4) (Fig. 1 ). Detailed demographic data of individuals are summarized in Table 1 .
Fig. 1 Flowchart of individuals
Table 1 Demographic information of individuals of test group and retest group

| Numeric variables | Mean (SD) (test) ( n = 113) | Mean (SD) (retest) ( n = 99) |
| --- | --- | --- |
| Age (years) | 20.69 (2.12) | 20.80 (2.16) |
| Menarche age (years) | 13.09 (1.38) | 13.14 (1.39) |
| Weight (kg) | 57.38 (10.12) | 57.46 (10.59) |
| Height (cm) | 163.21 (5.98) | 163.24 (5.97) |
| BMI (kg/m 2 ) | 21.50 (3.40) | 21.52 (3.56) |

| Categorical variables | n (%) (test) | n (%) (retest) |
| --- | --- | --- |
| Menstrual cycle: 20 days and less | 7 (6.2%) | 5 (5.1%) |
| Menstrual cycle: 21–33 days | 95 (84.1%) | 84 (84.8%) |
| Menstrual cycle: 34 days and more | 11 (9.7%) | 10 (10.1%) |
| Menstrual order: Yes | 88 (77.9%) | 77 (77.8%) |
| Menstrual order: No | 25 (22.1%) | 22 (22.2%) |
| Menstrual drug use: Yes | 46 (40.7%) | 39 (39.4%) |
| Menstrual drug use: No | 22 (19.5%) | 18 (18.2%) |
| Menstrual drug use: Sometimes | 45 (39.8%) | 42 (42.4%) |
| Smoking: Yes | 20 (17.7%) | 16 (16.2%) |
| Smoking: No | 93 (82.3%) | 83 (83.8%) |
| Use of alcohol: Yes | 1 (0.9%) | 1 (1.0%) |
| Use of alcohol: No | 112 (99.1%) | 98 (99.0%) |

SD: Standard deviation; kg: kilogram; cm: centimeter; m: meter
The Cronbach’s α value of the WaLIDD was determined to be 0.875, indicating a high level of internal consistency. For test-retest reliability, the overall ICC was 0.778 (95% CI 0.686–0.845), indicating moderate-to-good reliability. The SEM and MDC values were 0.74 and 2.06, respectively, supporting the scale’s stability over time (Table 2 ). Agreement between test-retest scores was further supported by the Bland-Altman plot (Fig. 2 ). The item-total correlation coefficients ranged between 0.008 and 0.440, suggesting that each item contributed adequately to the overall consistency of the scale (Table 3 ).
Table 2 Test–retest reliability and internal consistency of the WaLIDD ( n = 113)

| | Baseline Mean ± SD | Retest Mean ± SD | p | Test–retest (ICC and 95% CI) | SEM | MDC | Internal consistency (Cronbach’s α) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| WaLIDD | 7.65 ± 1.55 | 7.53 ± 1.61 | 0.255 | 0.778 (0.686–0.845) | 0.74 | 2.06 | 0.875 |

WaLIDD: Working Ability, Location, Intensity, Days of Pain, Dysmenorrhea; SD: standard deviation; ICC: intraclass correlation coefficient; CI: confidence intervals; SEM: standard error of measurement; MDC: minimal detectable change
Fig. 2 Bland–Altman plots of the WaLIDD test–retest scores ( n = 99)
Table 3 Mean scores, corrected item-total correlations and Cronbach’s α if item deleted results for the WaLIDD ( n = 113)

| Item | Mean | SD | Corrected item-total correlation | Cronbach’s α if item deleted |
| --- | --- | --- | --- | --- |
| 1 | 2.36 | 0.68 | 0.337 | 0.113 |
| 2 | 1.92 | 0.60 | 0.076 | 0.411 |
| 3 | 1.57 | 0.76 | 0.008 | 0.527 |
| 4 | 1.74 | 0.57 | 0.440 | 0.040 |

SD: Standard deviation; WaLIDD: Working Ability, Location, Intensity, Days of Pain, Dysmenorrhea
Content validity was evaluated by examining the suitability of the questionnaire for dysmenorrhea and the target group, its relevance to dysmenorrhea, and its adequacy in addressing dysmenorrhea-related factors. All responses obtained were affirmative.
Construct validity was assessed through convergent and divergent validity analyses. The WaLIDD demonstrated a strong positive correlation with the Premenstrual Syndrome Impact Questionnaire (PMS-IQ, r = 0.726), indicating that it effectively measures dysmenorrhea-related symptoms. It showed a moderate correlation with the Pain Disability Index (PDI, r = 0.413), suggesting an association with pain-related disability. Conversely, it demonstrated negligible correlations with the Big Five Inventory-10 (BFI-10, r = 0.088) and its subscales, supporting the divergent validity of the scale (Table 4 ).
Table 4 Correlations between WaLIDD with other questionnaires for convergent validity and divergent validity ( n = 113)

| Questionnaire | WaLIDD |
| --- | --- |
| Convergent validity | |
| PMS-IQ | 0.726** µ |
| - Psychological impact | 0.710** |
| - Recreational and emotional impact | 0.595** |
| - Functional impact | 0.520** |
| PDI | 0.413** |
| Discriminant validity | |
| BFI-10 | 0.088 µ |
| - Extraversion | -0.122 µ |
| - Agreeableness | -0.112 µ |
| - Conscientiousness | -0.037 µ |
| - Neuroticism | 0.296** µ |
| - Openness | 0.197* µ |

WaLIDD: Working Ability, Location, Intensity, Days of Pain, Dysmenorrhea; PMS-IQ: Premenstrual Syndrome Impact Questionnaire; PDI: Pain Disability Index; BFI-10: Big Five Inventory-10
µ : Pearson correlation analysis; *: p < 0.05; **: p < 0.001
Additionally, no floor (0%) or ceiling (0.7%) effects were observed, indicating that the scale effectively captures the full range of symptom severity.
Background
Dysmenorrhea is a gynecological condition characterized by painful menstrual cramps of uterine origin and is most commonly seen in women of reproductive age. Although it is a common condition, it is underdiagnosed and therefore undertreated [ 1 ]. A normal ovulatory menstrual cycle requires a mature hypothalamic-pituitary-ovarian axis and the presence of highly coordinated hormonal feedback loops. The normal menstrual cycle, consisting of three phases (follicular, ovulatory, and luteal), results in the development of a mature follicle and the release of an oocyte in each cycle, with menstruation occurring in the absence of fertilization. While adolescents may initially experience anovulatory cycles following menarche, the majority of cycles become fairly regular, typically lasting between 21 and 45 days, with an average bleeding duration of three to seven days [ 2 ]. Dysmenorrhea symptoms begin before menstruation and sometimes continue up to 72 h after menstruation. Dysmenorrhea typically begins 6–12 months after menarche and affects young women who are active in university education or working life. In this period, it is called primary dysmenorrhea and is usually associated with factors that cause psychological stress such as nutritional disorders, menstrual cycle irregularities, menarche before the age of 12, and excessive menstruation [ 3 , 4 ]. Primary dysmenorrhea manifests as spasmodic, crampy menstrual pain and discomfort without any identifiable pelvic pathology, whereas secondary dysmenorrhea is characterized by menstrual pain associated with specific pelvic conditions such as endometriosis, adenomyosis, or uterine fibroids [ 5 ].
While the worldwide prevalence of dysmenorrhea is between 41.7% and 89.1% [ 6 , 7 ], this rate is reported to vary between 55.5% and 95.6% in Türkiye [ 8 , 9 ]. Some women experience relatively little pain, while others are significantly limited in functioning during menstruation. Among the menstrual symptoms, lower abdominal and back pain cause the greatest workforce loss [ 10 ]. Up to 15% of women with dysmenorrhea suffer symptoms so severe that they are unable to attend work or school [ 11 ]. Dysmenorrhea impairs focus and productivity [ 12 ]. In the United States, menstrual pain and associated symptoms have been reported to cause an estimated 600 million hours of workforce loss per year, corresponding to a financial loss of 2 billion dollars [ 10 ].
The diagnosis of dysmenorrhea is clinical, meaning that it is primarily based on patient-reported symptoms, medical history, and physical examination rather than specific laboratory or imaging tests. At present, there is a lack of consensus regarding the utilization of standardized questionnaires in the diagnostic procedure, along with insufficient confirmatory measures and frameworks to effectively classify the severity of dysmenorrhea. Developed by Teherán et al., Working Ability, Location, Intensity, Days of Pain, Dysmenorrhea (WaLIDD) score is a self-report tool crafted to facilitate the recognition of women experiencing dysmenorrhea and those at an elevated risk of requiring medical leave [ 5 ]. WaLIDD was selected for this study because it provides a multidimensional assessment of dysmenorrhea, including pain severity, its impact on daily life, response to treatment, and associated symptoms. Given the lack of a validated dysmenorrhea-specific questionnaire in Turkish, WaLIDD offers a valuable tool for standardized evaluation in this population. This study seeks to investigate the validity and reliability of the Turkish version of the WaLIDD for assessing women with dysmenorrhea. If the WaLIDD is proven to be valid and reliable, this scale can be used to evaluate Turkish women with dysmenorrhea.
Discussion
This study examined the validity and reliability of the WaLIDD in Turkish women with dysmenorrhea. The Turkish version of WaLIDD demonstrated good reliability and good validity in evaluating women with dysmenorrhea.
In the original version of the WaLIDD, Cronbach’s α was 0.723, showing acceptable internal consistency [ 5 ]. In the current study, Cronbach’s α was 0.875, showing that the WaLIDD had a high level of internal consistency. The total ICC value of the Turkish WaLIDD was at an appropriate level (0.778), and its 95% confidence interval (0.686–0.845) indicated moderate-to-good reliability. These results are consistent with the literature and show that the Turkish version of the WaLIDD has a good level of reliability. The Bland-Altman plots also supported these results.
The Standard Error of Measurement (SEM) is a reliability measure that evaluates response stability across multiple measurements; it is the amount of variation attributable to measurement error. The Minimum Detectable Change (MDC) is an estimate of the smallest change that exceeds measurement error and therefore reflects a noticeable change in the parameter being evaluated. Both measures are criteria of reliability [ 22 ]. In the current study, the SEM was 0.74 points, corresponding to 9.75% of the mean of the WaLIDD values (7.59) and 6.17% of the maximum possible score (12 points). Based on the SEM, the MDC was 2.06 points, corresponding to 27.14% of the mean and 17.17% of the maximum possible score. SEM and MDC values were not examined in the original study of the WaLIDD. However, these SEM and MDC values showed that the WaLIDD is repeatable and reproducible [ 23 ].
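The relative percentages quoted above follow from simple division, as the short check below shows. The helper name is hypothetical; the inputs are the values given in the text (SEM 0.74, MDC 2.06, mean score 7.59, maximum score 12).

```python
def percent_of(value, reference):
    """Express value as a percentage of reference, rounded to 2 decimals."""
    return round(100.0 * value / reference, 2)


MEAN_SCORE = 7.59  # mean of the baseline (7.65) and retest (7.53) means
MAX_SCORE = 12     # highest possible WaLIDD total

sem_pct_of_mean = percent_of(0.74, MEAN_SCORE)
sem_pct_of_max = percent_of(0.74, MAX_SCORE)
mdc_pct_of_mean = percent_of(2.06, MEAN_SCORE)
mdc_pct_of_max = percent_of(2.06, MAX_SCORE)
print(sem_pct_of_mean, sem_pct_of_max, mdc_pct_of_mean, mdc_pct_of_max)
```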
The corrected item-total correlation values of the WaLIDD indicated bad to poor correlations. This suggests that each item assesses dysmenorrhea in a different way. The Cronbach’s α if item deleted values also showed that the WaLIDD should retain all 4 items: if any item is removed, the resulting α (at most 0.527) does not exceed the Cronbach’s α of the total score. Therefore, all items should be included in the WaLIDD.
Correlation analysis to determine the convergent and divergent validity of the questionnaire showed that the WaLIDD had a very good positive correlation with the PMS-IQ, a good correlation with the PDI, and negligible correlations with the BFI-10 and its subscales. This result suggested that dysmenorrhea was not related to personality traits. The correlations of the WaLIDD with the PMS-IQ and PDI indicated good validity. In this study, we examined the correlation between the total score of the WaLIDD and the total score and sub-dimensions of other scales. We acknowledge that the WaLIDD scale does not have sub-dimensions and that it evaluates a specific aspect of women’s health related to dysmenorrhea. The scales used in this study assess different dimensions of health, such as quality of life and general well-being. Although these scales are not directly comparable in terms of content, the decision to correlate the total score of WaLIDD with the sub-dimensions of other scales was driven by the aim to investigate potential associations between the overall impact of dysmenorrhea and various aspects of health. It is important to note that while the correlation between the WaLIDD total score and the sub-dimensions of the other scales may not imply a direct or causal relationship, it serves to explore the broader context of how dysmenorrhea affects women’s health across different dimensions. The findings should be interpreted with caution, recognizing the different conceptual focuses of the scales involved.
A review of the literature reveals numerous studies investigating dysmenorrhea-related pain. However, these studies predominantly utilize scales such as the Verbal Rating Scale (VRS) [ 24 ], Visual Analog Scale (VAS) [ 25 – 28 ], Short-Form McGill Pain Questionnaire (SF-MPQ) [ 25 ], and Numerical Rating Scale (NRS) [ 29 ]. While these tools effectively measure pain intensity (VRS, VAS, NRS) or perceived pain characteristics (SF-MPQ), they do not specifically assess symptoms unique to dysmenorrhea. Given that primary dysmenorrhea is a clinically common condition that does not require additional diagnostic testing, it is often evaluated using general pain scales. Nevertheless, these existing tools may not fully capture the multidimensional impact of dysmenorrhea. The scale evaluated in the present study differs from these general pain assessment tools by incorporating key dimensions such as pain range, pain location, the number of days pain persists, disability, and pain-related limitations. These aspects provide a more comprehensive assessment of dysmenorrhea, addressing critical gaps left by existing scales. Furthermore, a notable lack of dysmenorrhea-specific assessment tools in the literature highlights the necessity of developing such an instrument. Conducting this validation study in the Turkish population is particularly valuable, as it ensures that the scale is culturally and contextually relevant, thereby enhancing its applicability in both clinical and research settings.
Because version studies of the WaLIDD in other populations are lacking, the findings of this study are specific to Turkish women. Further studies in other countries or populations are needed to determine whether the results can be generalized beyond the Turkish context. Future research could also address the cross-cultural applicability of the WaLIDD.
The primary strength of our study is the introduction of WaLIDD, a succinct and easily administered questionnaire, filling a crucial gap as the first questionnaire adapted to evaluate dysmenorrhea in Turkish. This initiative marks a significant advancement in clinical assessment tools for this condition.
A limitation of this study was that responsiveness of WaLIDD was not examined. Therefore, further studies evaluating responsiveness are needed. Further studies may examine the diagnostic accuracy of WaLIDD in predicting dysmenorrhea. In addition, the sensitivity and specificity of WaLIDD can be examined.
Conclusions
In conclusion, the WaLIDD was found to be a well-structured, valid, and reliable instrument that can be used to evaluate women with dysmenorrhea. The Turkish version of the WaLIDD can be used with Turkish women in clinical practice and research.
Supplementary Material
Below is the link to the electronic supplementary material.
Supplementary Material 1