Results
At the time of analysis, 58 participants were scheduled to complete at least 35 consecutive days of daily hormone assessments. Of these, 17 were excluded from analysis due to contributing fewer than 10 days of hormone data and/or lacking observable cyclicity in hormone patterns, as determined by visual inspection. This yielded a final analytic sample of 41 participants who provided sufficient daily hormone data (at least 10 days) and demonstrated confirmed ovulatory cycles, indicated by a clear rise in luteal-phase progesterone. The mean age of the final sample was 27.33 years (SD = 7.61), and the average menses-to-menses cycle length was 28.48 days (SD = 2.78).
Within this group, a subset of 30 participants contributed 33 cycles that had clearly identifiable estimated days of ovulation (EDO), as defined in the Methods. In this subset, the mean luteal phase length was 13.73 days (SD = 1.35), and the mean follicular phase length was 14.78 days (SD = 2.62). The remaining 11 participants contributed 11 cycles. Six of these cycles fell outside the typical 21–35 day range but still showed clear hormonal evidence of EDO. The other five cycles had lengths within range but lacked sufficient data to confirm EDO due to missing values within the ovulatory window, although they still exhibited a luteal progesterone rise. For these five cycles, EDO was estimated by counting backward 15 days from the onset of the subsequent menses, yielding an estimated luteal phase length of 14 days and a mean follicular phase length of 13.5 days (SD = 3.7).
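The 15-day backward count can be illustrated with a minimal sketch. The function name `backward_count_edo` and the example dates are hypothetical and are not part of `menstrualcycleR`; the phase definitions follow those given in the text (follicular phase through EDO inclusive, luteal phase from the day after EDO through the day before the next menses onset).

```python
from datetime import date, timedelta


def backward_count_edo(menses_onset: date, next_menses_onset: date,
                       offset_days: int = 15):
    """Estimate the day of ovulation (EDO) by counting back from the next menses onset.

    Hypothetical helper for illustration only.
    """
    if next_menses_onset <= menses_onset:
        raise ValueError("next_menses_onset must fall after menses_onset")
    edo = next_menses_onset - timedelta(days=offset_days)
    cycle_length = (next_menses_onset - menses_onset).days
    # Follicular phase: menses onset through EDO, inclusive.
    follicular_length = (edo - menses_onset).days + 1
    # Luteal phase: day after EDO through the day before the next menses onset.
    luteal_length = (next_menses_onset - edo).days - 1
    return edo, cycle_length, follicular_length, luteal_length


# Example: a 28-day cycle starting 1 January 2024 (illustrative dates).
edo, cycle_len, fol_len, lut_len = backward_count_edo(date(2024, 1, 1), date(2024, 1, 29))
print(edo, cycle_len, fol_len, lut_len)  # 2024-01-14 28 14 14
```

With a fixed 15-day offset, the luteal phase is always 14 days, so all cycle-length variability is assigned to the follicular phase (cycle length minus 14).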
In our subsample of 33 cycles with hormone-confirmed ovulation, we compared the day of hormone-confirmed ovulation (i.e., the “ground truth”) with the backward-counted estimate. The backward-count estimate differed from the hormone-confirmed ovulation day by an average of 0.30 days (SD = 1.29; raw difference), with a mean absolute difference of 0.97 days (SD = 0.88) (Figure S1). Interestingly, the raw difference, which reflects luteal phase length, was significantly associated with cycle length at the α = .05 level (r = 0.395) (Figure S3). Although the luteal phase is generally considered less variable than the follicular phase, these results suggest that luteal phase variability (as indicated by the raw difference between ovulation detection methods) may be related to overall cycle length. However, the sample size is too small to draw definitive conclusions.
Since analyses restricted to hormone-confirmed cycles yielded results consistent with the combined dataset, subsequent analyses include: 33 cycles with hormone-confirmed EDO ranging from 21-35 days, 6 cycles with hormone-confirmed EDO outside the 21-35 day range, and 5 cycles with 15-day backward count estimated EDO.
As shown in Figure 3, the most notable differences in the E1G trajectory between the cycle day axis and the standardized cycle time axis occur during the follicular phase and periovulatory period. To compare the traditional count-based method with PACTS, we assumed a 28-day cycle and grouped the PACTS-based observations into 28 bins based on the percentage of cycle elapsed (where the first cycle day corresponds to 0%). At several time points, E1G variance was higher on the cycle day axis than at the corresponding standardized cycle time point. This applied to cycle day 6 (18.5% of cycle elapsed), with 39 E1G observations for the count-based method and 22 observations for PACTS-based standardized cycle time (F(38, 21) = 2.63; p = .01); cycle day 9 (F(24, 31) = 3.27; p = .001); cycle day 10 (F(30, 29) = 4.57; p < .0001); cycle day 11 (F(33, 28) = 2.76; p = .004); cycle day 12 (F(35, 39) = 2.24; p < .008); and cycle day 13 (F(36, 35) = 1.91; p < .03). These differences are depicted by the standard deviation shading in Figure 3.
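The correspondence between cycle day and percentage of cycle elapsed can be sketched as follows. The conversion (day − 1)/27 reproduces the reported value for cycle day 6 (18.5%); the nearest-day binning rule and both function names are assumptions made for illustration, since the text does not specify how bin boundaries were drawn.

```python
def percent_elapsed(cycle_day: int, cycle_length: int = 28) -> float:
    """Fraction of the cycle elapsed, with the first cycle day at 0.0."""
    if not 1 <= cycle_day <= cycle_length:
        raise ValueError("cycle_day must lie between 1 and cycle_length")
    return (cycle_day - 1) / (cycle_length - 1)


def nearest_bin(fraction_elapsed: float, n_bins: int = 28) -> int:
    """Map a fraction in [0, 1] to a bin numbered 1..n_bins.

    Assumption: each observation goes to the nearest cycle-day equivalent.
    """
    if not 0.0 <= fraction_elapsed <= 1.0:
        raise ValueError("fraction_elapsed must lie in [0, 1]")
    return int(fraction_elapsed * (n_bins - 1) + 0.5) + 1


print(round(100 * percent_elapsed(6), 1))  # 18.5
print(nearest_bin(percent_elapsed(6)))     # 6
```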
The luteal rise in PDG is consistent whether the hormone trajectory is plotted against the cycle day axis or the standardized cycle time axis. As with the E1G trajectory, the standardized cycle time axis reduces variability primarily at time points during the follicular phase. PDG variance was higher on the cycle day axis than at the corresponding standardized cycle time point for cycle day 12 (F(35, 39) = 2.65; p = .002), cycle day 13 (F(36, 35) = 4.27; p < .0001), and cycle day −12 (F(34, 33) = 2.01; p = .02) (Figure 4).
GAMM-estimated E1G trajectories differed by the time scale used (forward count, backward count, combined cycle day, or PACTS). Effective degrees of freedom (edf) ranged from 7.00 to 8.56, with the forward count model having the lowest edf and the PACTS model the highest. This pattern suggests that PACTS provides greater temporal precision, allowing for more flexible, data-driven spline formation. Notably, the forward count model showed a discontinuity between day +28 and day +1 when time was centered on menses. In Figures 5 and 6, we show model-implied values when each time variable is centered on menses or ovulation. Model estimates are reported in the supplement (Tables S1–S5).
Similar to E1G, PDG trajectories modeled with GAMMs varied by time scale, with edf values ranging from 7.63 to 8.47. The forward count model again had the lowest edf and the PACTS model the highest, reinforcing PACTS’s ability to capture finer temporal resolution. As with E1G, the forward count model displayed a discontinuity between day +28 and day +1 when menses-centered, likely because cycle lengths vary and not every cycle lasts 28 days. Model estimates are reported in the supplement (Tables S6–S10).
Materials
To utilize PACTS, the dataset must include, at minimum, either of the following for each participant: 1) menses onset date and the consecutive estimated date of ovulation, or 2) consecutive menses onset dates. The PACTS method then outlines steps to identify/estimate ovulation and create a standardized cycle time variable across participants.
When computing a standardized menstrual cycle time variable, key criteria must be considered. Outcome data should be enclosed by menses onset or a biomarker of ovulation, as these events define the follicular and luteal phases. This requires collecting self-reported menses onset dates throughout the study, including the next menses onset after study activities conclude. In the default setting of menstrualcycleR , standardized cycle time variables are computed only for cycles lasting 21-35 days. Cycles outside this range are excluded, as unusually short or long cycles exhibit greater variability and may indicate anovulation. If the estimated day of ovulation (EDO) is confirmed for all cycles, users may choose to modify the 21-35-day range to better match the characteristics of their study population. While menses onset cannot be imputed, ovulation may be estimated if consecutive menses onset dates are known. The cycle time variable is scaled from −1 to +1, with zero representing either menses onset or estimated ovulation (LH + 1), depending on the research question. This continuous approach provides greater precision than phase-based analyses, but proper centering is crucial. For example, studies on perimenstrual symptoms (e.g., dysmenorrhea) may center on menses onset, while those on periovulatory behaviors (e.g., reward sensitivity) may center on ovulation ( Peters et al., 2024 ). Using both centering methods in separate models may yield the most comprehensive insights.
In line with Fehring et al. (2006), we define the follicular phase as the first day of menses through the estimated day of ovulation (inclusive), and the luteal phase as the first day after the estimated day of ovulation through the day before the next menses onset (inclusive). While positive LH-surge tests and the nadir in basal body temperature (BBT) do not pinpoint the exact day of ovulation (which would require ultrasound or invasive measures), they are reliable biomarkers indicating that ovulation likely occurred within 24–36 hours of these events (Swiss Precision Diagnostics GmbH., 2008). We define the estimated day of ovulation as one day after a positive LH-surge test (LH + 1) or one day after the BBT nadir (nadir + 1), marking the end of the follicular phase (Figure 1). Of note, the ovulatory window may be broader and depends on the method of ovulation detection (Roos et al., 2015; Su et al., 2017). For more details on appropriate LH concentration threshold cut-offs when choosing an LH-surge test, see Schmalenberger et al. (2021). In `menstrualcycleR`, the parameter `ovtoday` represents the estimated day of ovulation, which needs to be calculated based on the ovulation detection method included in the study (Figure 1).
The time scale of the cycle time variable ranges from −1 to +1 and is centered with zero indicating either menses onset (Figure 2A) or the estimated day of ovulation (Figure 2B). When the time scale is centered on menses onset, −1 indicates the first day of the luteal phase, 0 indicates menses onset, and +1 indicates the estimated day of ovulation (LH + 1). When the time scale is centered on ovulation, −1 indicates the first day of the follicular phase, 0 indicates the estimated day of ovulation, and +1 indicates the last day of the luteal phase. This scaling allows the time variable to index the percentage of phase elapsed, accounting for the fact that phase lengths may vary between cycles. In long data format, in which each row corresponds to a date, each row within a cycle has a unique cycle time value proportional to the percentage of phase elapsed. When centered on menses onset, values from −1 up to but not including 0 are from the luteal phase, and values from 0 to +1 (inclusive of 0) are from the follicular phase. When centered on ovulation, values from −1 to 0 (inclusive of 0) are from the follicular phase, and values above 0 up to +1 are from the luteal phase. If a person provides multiple cycles of data, the pattern spanning from −1 to +1 repeats (Figure 2).
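The endpoint definitions above can be made concrete with a short sketch. It assumes simple linear interpolation within each phase, which satisfies the stated anchor values (−1, 0, +1) but is not confirmed to be the exact formula used in `menstrualcycleR`; the function names are hypothetical.

```python
def scale_ovulation_centered(follicular_len: int, luteal_len: int) -> list[float]:
    """Scaled time for one cycle, ordered from menses onset to the last luteal day.

    Assumption: linear spacing within each phase.
    -1 = first follicular day, 0 = EDO, +1 = last luteal day.
    """
    if follicular_len < 2 or luteal_len < 1:
        raise ValueError("need at least 2 follicular days and 1 luteal day")
    follicular = [(i - (follicular_len - 1)) / (follicular_len - 1)
                  for i in range(follicular_len)]
    luteal = [j / luteal_len for j in range(1, luteal_len + 1)]
    return follicular + luteal


def scale_menses_centered(luteal_len: int, follicular_len: int) -> list[float]:
    """Scaled time from the first luteal day through the following EDO.

    Assumption: linear spacing within each phase.
    -1 = first luteal day, 0 = menses onset, +1 = EDO.
    """
    if follicular_len < 2 or luteal_len < 1:
        raise ValueError("need at least 2 follicular days and 1 luteal day")
    luteal = [(j - 1 - luteal_len) / luteal_len for j in range(1, luteal_len + 1)]
    follicular = [i / (follicular_len - 1) for i in range(follicular_len)]
    return luteal + follicular


# A cycle with a 16-day follicular phase and a 13-day luteal phase (illustrative).
ov = scale_ovulation_centered(16, 13)
print(ov[0], ov[15], ov[-1])  # -1.0 0.0 1.0
```

Because each phase is rescaled by its own length, a value such as −0.5 refers to the midpoint of the follicular phase in the ovulation-centered version regardless of how long that phase lasted.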
Our primary goal in developing PACTS was to improve upon count-based time coding methods by providing a standardized cycle time variable for which a given time value represents a more consistent hormonal meaning across cycles and participants. To determine whether PACTS achieves this goal, we implemented two complementary analyses.
First, we assessed whether PACTS reduces variability in hormone levels at specific points across the cycle by comparing the variance of hormone values for each cycle day (based on combined forward and backward count) to the corresponding PACTS timepoint, using F-tests of variance. This approach is based on the premise that greater error in the time variable (i.e., less accurate estimation of menstrual cycle position) would lead to increased variability in hormone levels at each timepoint. Conversely, a narrower spread of hormone values—indexed by variance and illustrated by standard deviations in figures—would indicate a more accurate alignment of individuals within the cycle.
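As a minimal sketch of this comparison, the variance-ratio statistic and its degrees of freedom can be computed as below. The function name and the toy values are hypothetical and are not study data. A p-value would then be obtained from the F distribution with these degrees of freedom, a step omitted here to keep the example dependency-free.

```python
from statistics import variance


def variance_ratio(count_based: list[float], pacts_based: list[float]):
    """Return (F, df_numerator, df_denominator) for a ratio of sample variances.

    The count-based variance is placed in the numerator, so F > 1 means
    hormone values are more spread out at the count-based time point.
    """
    if len(count_based) < 2 or len(pacts_based) < 2:
        raise ValueError("each group needs at least two observations")
    f_stat = variance(count_based) / variance(pacts_based)
    return f_stat, len(count_based) - 1, len(pacts_based) - 1


# Toy values for illustration only.
f_stat, df1, df2 = variance_ratio([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 3.0, 4.0])
print(f_stat, df1, df2)  # 2.5 4 2
```

The degrees of freedom equal each group's number of observations minus one, which matches the reporting format in the Results (e.g., 39 and 22 observations give F(38, 21)).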
Second, we used generalized additive mixed models (GAMMs) in the ‘mgcv’ R package (Wood, 2006) to assess how well various cycle time variables (forward count, backward count, combined forward/backward count, and PACTS) captured physiological trajectories of urinary metabolites of estradiol (estrone-3-glucuronide, E1G) and progesterone (pregnanediol glucuronide, PDG). GAMMs were fit using restricted maximum likelihood (REML) estimation with a Gaussian distribution and identity link function. Hormone outcomes were log-transformed prior to analysis. Random effects for participant ID and cycle time variables were modeled as smooth terms, with thin plate regression splines as the default basis. Smooth terms were penalized to avoid overfitting. Missing data were handled using listwise deletion; no imputation was performed.
GAMMs are well-suited for modeling outcomes across the menstrual cycle, especially for repeated hormone assessments. Unlike traditional linear mixed-effects models, GAMMs flexibly capture complex, nonlinear relationships without requiring a predefined functional form—essential for modeling dynamic hormonal fluctuations. By incorporating both fixed and random effects as smooth terms, GAMMs account for both population-level trends and individual differences while accommodating the nested structure of repeated measures data.
Compared to other nonlinear modeling approaches, such as piecewise splines and polynomial regression, GAMMs avoid challenges like arbitrary knot placement, boundary instability, and overfitting ( Boyd & Xu, 2009 ). Penalized splines ensure smooth, stable trajectories that reflect physiological patterns, including cycle-specific, data-driven inflection points. Their flexibility, combined with time-centering options (e.g., centering by menses or ovulation through menstrualcycleR ), enhances interpretability. While computationally intensive, GAMMs’ ability to model individual variability and nonlinear dynamics makes them particularly effective for menstrual cycle research.
Together, these analyses assess two key aspects of model performance: local consistency—the degree to which hormone values cluster tightly at each time point—and physiological representativeness—the extent to which model-implied hormone levels reflect underlying hormonal patterns. By examining both, we can determine whether PACTS provides a more reliable representation of hormonal dynamics across individuals and cycles.
This study analyzed hormone data from 58 participants in an ongoing longitudinal study examining menstrual cycle-related mood and behavioral symptoms in naturally cycling individuals with elevated borderline personality disorder (BPD) symptoms. Participants, aged 18–45, were recruited through online advertisements, community fliers, and outpatient psychiatric clinics, targeting individuals with emotional and interpersonal difficulties or suspected BPD/complex Post-Traumatic Stress Disorder (PTSD). Eligibility criteria included endorsing at least three BPD symptoms, natural menstrual cycles, not pregnant or postpartum within the past year, no bipolar I diagnosis, no use of hormonal medications or supplements, and having access to a smartphone for daily surveys. Participants completed baseline surveys and diagnostic interviews, followed by daily self-reports over at least two menstrual cycles. During at least one cycle, they provided dried urine strips for hormonal analysis (LH, creatinine, estradiol and progesterone metabolites), with collection starting at varying cycle phases. Self-reported menses dates, home ovulation test results, and demographic data were also collected.
To assess ovarian hormone levels and determine the date of ovulation, daily measurements of urinary metabolites of estradiol (estrone-3-glucuronide, E1G), progesterone (pregnanediol glucuronide, PDG), LH, and creatinine were taken. Participants used filter paper to collect first-morning urine, ensuring hormone peak capture. Samples were fully dried (at least 24 hours), stored in home freezers, and later stored in a −80°C freezer before being sent to ZRT laboratory (Beaverton, OR), where they were refrigerated prior to analysis. Dried urine collection, a noninvasive and practical method, offers precision comparable to serum and liquid urine samples (Appendix C of Klusmann et al., 2023). Enzyme-linked immunosorbent assays (ELISA) were used to measure E1G, PDG, and LH, while creatinine levels were assessed with a colorimetric assay. Detection limits were 1.07 mIU/mL for LH, 4.02 ng/mL for E1G, 57.58 ng/mL for PDG, and 0.026 mg/mL for creatinine. Method validation work at ZRT laboratory was completed with dried urine samples that were labeled as Low, Mid, and High to represent their concentrations relative to the analytical range of the respective assays (Kim et al., 2025). Intra-assay coefficients of variation were as follows for E1G (Low: Mean = 10 ng/mL, CV = 7.2%; Mid: Mean = 42.0 ng/mL, CV = 5.5%; High: Mean = 91.3 ng/mL, CV = 4.9%) and PDG (Low: Mean = 544.5 ng/mL, CV = 10.7%; Mid: Mean = 2435.2 ng/mL, CV = 10.9%; High: Mean = 3726.5 ng/mL, CV = 6.4%). Inter-assay coefficients of variation for refrigerated samples were as follows for E1G (Low: Mean = 18.1 ng/mL, CV = 14.2%; Mid: Mean = 59.1 ng/mL, CV = 11.6%; High: Mean = 125.1 ng/mL, CV = 8.8%) and PDG (Low: Mean = 1804.7 ng/mL, CV = 11.9%; Mid: Mean = 5783.5 ng/mL, CV = 15.0%; High: Mean = 8926.3 ng/mL, CV = 17.8%). Hormone concentrations were adjusted for creatinine by dividing analyte concentrations by creatinine to ensure consistent volume units.
Participants also tracked the LH surge at home using the Clearblue Digital Ovulation Test (40 mIU/mL threshold for urinary LH) (Swiss Precision Diagnostics GmbH., 2008); if a participant reported multiple consecutive positive tests, the first positive test day was coded as the day of the LH surge.
The gold standard for identifying the exact day of ovulation is daily transvaginal ultrasound, which confirms the timing of follicular rupture. However, when ultrasound is unavailable, the estimated day of ovulation (EDO) can be determined using hormonal patterns. To achieve the highest possible accuracy, authors AN and TAEM visually inspected graphs displaying all available data (with date on the X-axis), including daily urinary E1G, PDG, and LH levels, as well as self-reported positive ClearBlue test results (40 mIU/mL) and menstrual bleeding. Based on data from Roos et al. (2015) , which compared ultrasound-confirmed ovulation to daily urinary and serum hormone measures, the EDO was defined as the first day in each cycle that met the following criteria:
Preovulatory E1G rise (without elevated PDG): In the week leading up to EDO, E1G must have risen above its menstrual baseline in the absence of elevated PDG.
Distinct preovulatory LH surge (without elevated PDG): EDO must be preceded or accompanied by an abrupt rise in LH, occurring in the absence of a distinct PDG rise. The LH surge may occur 1-2 days before its peak.
Post-ovulatory PDG rise: EDO was coded as the first day meeting the above criteria, where the following day showed a deviation from the follicular-phase PDG trend, indicating the expected luteal-phase rise. This typically occurred within 3 days of the LH surge.
Therefore, the estimated day of ovulation was identified as the day between the LH surge (which follows the E1G surge) and the PDG rise. While a common sequence was Day 1: E1G peak → Day 2: LH surge → Day 3: EDO → Day 4: PDG surge, this timing varied across individuals, sometimes unfolding more quickly or more slowly. Some individuals had same-day overlap of the E1G peak and LH surge, with the PDG rise the following day; in such cases, the EDO was coded as the same day as the E1G peak and LH surge. Conversely, a small number of individuals had gaps of 2–3 days between the LH surge and a detectable PDG rise; in such cases, the EDO was coded as the day preceding the detectable PDG rise, even if this was 2–3 days after the LH surge.
Discussion
Scientific progress in menstrual cycle research has been constrained by the mathematical and practical challenges of modeling a dynamic hormonal cycle of varying lengths in a standardized way. Here, we present phase-aligned cycle time scaling (PACTS), a continuous, hormonally aligned cycle time variable that defines time and scales observations relative to both ovulation and menses. Analyzing daily urinary hormone data, we found that PACTS more effectively aligns hormonal trajectories than traditional forward- and backward-counting methods, particularly in the follicular phase ( Figures 3 and 4 ). This is expected, since the follicular phase (time from menses to next ovulation) is the major source of variability in cycle length. To address both the methodological and adoption challenges in menstrual cycle research, we developed PACTS and its companion R package, menstrualcycleR , to automate this standardization and facilitate widespread implementation.
Standardizing cycle time has critical implications when examining outcomes linked to E2 fluctuations across various fields of study, especially around ovulation, where the greatest changes in E2 occur. For example, some people experience behavioral hormone sensitivity in which periovulatory surges in E2 drive increased alcohol use (Martel et al., 2017) and maladaptive reward-related risk-taking behaviors, such as gambling (Peters et al., 2024). Additionally, catamenial epilepsy has a periovulatory subtype (Barone et al., 2023; Herzog et al., 1997). Figure 5, which compares count-based methods to PACTS-standardized cycle time, highlights that E2 variance is highest in the mid-to-late follicular phase. Notably, this variance calls into question the long-established tradition of using forward count to identify the mid-follicular phase and then treating that phase as a person’s “baseline” for comparison of perimenstrual or periovulatory clinical outcomes. Given the E2 variance observed in the mid-to-late follicular phase (Figure 3), it is likely that a count-based mid-follicular baseline represents different E2 profiles across individuals. In sum, our findings demonstrate that traditional count-based methods poorly align the periovulatory rise, peak, and fall of E2. In contrast, PACTS significantly improves alignment between cycles; this could be particularly valuable for future studies in populations with greater cycle length variability, such as those with large age ranges. PACTS standardized cycle time provides a more precise approach that better maps cyclical hormonal fluctuations across time and should, therefore, be prioritized over count-only methods in menstrual cycle research.
While PACTS standardized cycle time improved alignment of follicular phase E2 trajectories relative to count-only methods, alignment of luteal phase P4 trajectories was similar between the approaches (Figures 4 and 6). This suggests that count-based methods, specifically backward count, perform reasonably well in estimating the luteal phase, when P4 rises and falls. Supporting this, a backward count of 15 days aligned closely with hormonally tracked ovulation in our sample (on average, within 0.30 days; SD = 1.35), making a 15-day backward count a good estimate for the final day of the follicular phase, or an estimated day of ovulation. However, this relationship may not generalize to all populations, particularly those with shorter, longer, or more variable cycle lengths (e.g., during puberty or menopause). Additionally, the significant correlation (r = 0.395) we observed between a cycle’s luteal phase length (precisely coded with daily hormone levels) and its overall length (consistent with Bull et al., 2019) underscores the need for further investigation into the sources of variability in luteal phase lengths and whether total cycle length may be used to improve scaling of the luteal phase in the absence of ovulatory biomarkers. In sum, for studies focusing on P4-linked effects or the luteal phase, a backward count remains a viable approach for estimating outcome changes across the luteal phase, though its applicability should be validated for the specific population under study.
Standardizing cycle time may help to improve diagnostic accuracy in cycle-related conditions like Premenstrual Dysphoric Disorder (PMDD) by improving alignment between hormonal changes and symptom timing. According to the DSM-5, PMDD is characterized by distressing affective symptoms that are present in the week before menses, start to improve within a few days after menses onset, and become minimal or absent in the week after menses. Given the centrality of symptom timing and high false positive rates with cross-sectional assessment, the DSM-5 requires the diagnosis to be made provisionally until daily symptom ratings confirm the required pattern (American Psychiatric Association, 2013).
To standardize longitudinal PMDD diagnosis in research, our lab developed the Carolina Premenstrual Assessment Scale (C-PASS), a validated diagnostic algorithm that identifies high- and low-symptom phases using cycle day counting based on DSM-5 wording (Eisenlohr-Moul et al., 2017). The high-risk premenstrual week (days −7 to −1) is estimated via backward counting, which likely captures symptoms in a consistent hormonal state across cycles, given that luteal phase length is relatively stable. However, our findings suggest that the forward count approach used to define the postmenstrual symptom-free period (days 4–10) may introduce bias, particularly in individuals with longer cycles. Indeed, the few false negative cases identified in the validation dataset were determined to have arisen from late symptom clearance (Eisenlohr-Moul et al., 2017), which presumably could have been avoided by better standardization of the follicular resolution phase. Because follicular phase length varies more than luteal phase length, counting forward by a fixed number of days will not always align with the hormonally stable phase in which symptoms are expected to resolve.
This limitation is not unique to PMDD; similar timing challenges likely affect the diagnosis of catamenial epilepsy and menstrual migraine, where identifying a comparative low-risk phase of stable hormones is critical. Further, analyses of temporal subtypes of PMDD and other conditions should be revisited using standardized cycle time. Aligning hormonal meaning across cycles would ensure that individual differences in the cyclical timing of symptoms are not an epiphenomenon of variable ovulation timing and cycle length( Eisenlohr-Moul et al., 2020 ).
More broadly, the adoption of PACTS and its implementation via the menstrualcycleR package enables consistent operationalization of the menstrual cycle across studies. This standardization enhances methodological rigor, improves reproducibility, and addresses one of the primary barriers to cross-study comparisons and meta-analyses: inconsistent definitions of cycle phases (Schmalenberger et al., 2019). By making cycle-based research more directly comparable across studies, PACTS lays the groundwork for robust meta-analyses, which are essential for developing evidence-based clinical guidelines for menstrual cycle-related conditions.
Currently, standardized cycle time in menstrualcycleR is not modeled as circular: values at −1 and +1 represent adjacent days, whereas a circular scale would have these endpoints refer to the same day. Therefore, when centering on menses, periovulatory effects are currently split across the endpoints, while when centering on ovulation, perimenstrual effects are split. This can result in discrepancies at the model’s endpoints, where the values at −1 and +1 may not align, potentially reflecting true hormonal dynamics near cycle transitions or increased model error. While such boundary artifacts are well documented in high-degree polynomial models (e.g., the Runge phenomenon; Boyd & Xu, 2009) and have been observed in menstrual cycle modeling (Nagpal et al., 2024), they are greatly reduced, or absent, when using GAMMs (Chesnaye et al., 2025). Future directions include developing a version of standardized cycle time that treats −1 and +1 as equivalent, allowing for circular modeling approaches such as sine-cosine basis expansions or cyclic splines within GAMMs, where endpoint continuity can be enforced. Given that model estimation is typically most reliable when the time variable is centered, we recommend selecting the centering approach (menses or ovulation) that best matches the research hypothesis (e.g., perimenstrual vs. periovulatory phenomena). In exploratory analyses, trying both centering options may be informative. Regardless, model interpretation should prioritize the centered region of the time variable, where estimation is strongest, while treating endpoints with greater caution. As shown in the supplemental model tables (Tables S1–S5), centering choice can slightly affect model estimates (e.g., effective degrees of freedom), though these differences are typically minimal. Future work in circular cycle modeling aims to reduce boundary-related estimation issues and minimize the impact of centering choice on results.
Additionally, further updates to menstrualcycleR may introduce an option to center mid-follicularly, where hormonal levels are relatively stable, avoiding the splitting of the hormonally dynamic perimenstrual or periovulatory timeframes at the cycle endpoints.
A limitation of the standardized cycle time measure is that it requires outcome data to be bookended by either menses onset or estimated ovulation, as these events define the time variable. In contrast, the forward and backward menses counting variables do not rely on consecutive menses dates. If consecutive menses onsets are not consistently recorded during data collection, and no biomarkers of ovulation are available, using standardized cycle time could result in substantial missing data. Since the count-only variables are indexed to menses onset, they may be appropriate to use if the focus is on perimenstrual phenomena, particularly with combined forward and backward count and backward count-only. However, it is important to note for count variables that statistical inference may become less accurate and reliable further away from menses onset, particularly into the follicular phase. Another limitation, common in menstrual cycle research, is the uncertainty in estimating ovulation. Unlike menses onset, ovulation cannot be reliably self-reported, and detection methods vary in accuracy and precision. While estimating ovulation by counting 15 days backward from menses onset is more accurate than assuming ovulation occurs at the cycle midpoint, biomarkers can provide a more precise estimate of ovulation timing. When biomarkers are unavailable and ovulation is estimated at day −15 before subsequent menses, researchers should acknowledge this imprecision and consider the potential impact on their models.
Because of the dynamic nature and individual variability of the menstrual cycle, there is a need for continuous, standardized measures that account for variability in both phase and cycle length within and between individuals. This paper validates Phase-Aligned Cycle Time Scaling (PACTS) and introduces menstrualcycleR, an R package for calculating standardized menstrual cycle time variables. By demonstrating that PACTS successfully aligns daily hormone data across cycles and improves statistical precision, we provide a resource for menstrual cycle researchers that can be used across disciplines to improve consistency and reproducibility of future work.
Introduction
Despite decades of investigation, scientists have reached no consensus method for operationalizing the menstrual cycle—a recurring, dynamic sequence of hormonal and physiological changes—as a standardized, continuous timeline. Many common approaches fail to model the menstrual cycle as a true cycle. Categorical phase-based methods reduce statistical power ( Segerstrom, 2019 ) and oversimplify the cycle’s complex dynamics. Linear counting methods (e.g., forward- and backward-counting based on calendar day, which we have used ( Eisenlohr-Moul, 2013 ; Kiesner et al., 2016 ) and recommended ( Schmalenberger et al., 2021 )) fail to account for variability in ovulation timing and its impact on cycle length, leading to misalignment of hormonal dynamics ( Meyer et al., 2007 ). Although reasonable solutions have been proposed ( Bigelow & Dunson, 2007 ; Doty, 1979 ; Joyce et al., 2018 ; Kiesner, 2011 ; Meyer et al., 2007 ; Zhang et al., 2000 ), their lack of adoption highlights the need for an accessible tool to standardize and streamline data preparation. Without such methods, measurement inconsistencies will continue to hinder scientific and clinical advancements for cycle-related conditions (e.g., endometriosis, premenstrual dysphoric disorder, catamenial epilepsy).
Thus, we introduce Phase-Aligned Cycle Time Scaling (PACTS) and its companion R package, `menstrualcycleR` , which provides an accessible method for standardizing repeated menstrual cycle observations onto a continuous timeline. PACTS aligns physiological processes across cycles by adjusting for variability in ovulation timing and cycle length.
Below, we review challenges in menstrual cycle measurement, outline our approach, and demonstrate its advantages over phase-based and count-based methods.
The menstrual cycle is a recurring feedback loop between the brain and the ovaries, split into two broad phases: (1) follicular phase, spanning from menses to ovulation; (2) luteal phase, from ovulation to the next menses. During the follicular phase, estradiol (E2) peaks before ovulation, while progesterone (P4) levels remain low. Post-ovulation, the corpus luteum forms, marking the luteal phase onset, characterized by rising E2 and P4 levels followed by a decline leading to menses.
Categorical approaches, as we have recently described for average-length cycles ( Schmalenberger et al., 2021 ), divide the menstrual cycle into discrete subphases (e.g., mid-follicular, peri-ovulatory, mid-luteal, perimenstrual, and sometimes early luteal) based on expected hormonal/physiological shifts. While intuitive, the division of the menstrual cycle into subphases limits science in three key ways. First, subphases reduce the number of observations per person by assigning phase labels only to days near measurable physiological events (e.g., menses, ovulation), where phase estimates are more reliable. Second, they aggregate data within phases to create averages, masking within-phase dynamics. Third, overlapping hormonal profiles across phases (e.g., elevated E2 in both periovulatory and mid-luteal phases) introduce multicollinearity, complicating the estimation of fixed effects (estimates of the sample average for within-person phase contrasts) and random effects (individual variability in those phase contrasts). Shifting from categorical to continuous cycle modeling increases statistical precision and power, and it enables researchers to leverage statistical techniques that model the cycle as a dynamic temporal system with individual differences ( Eisenlohr-Moul et al., 2020 ; Reen & Kiesner, 2016 ; Schmalenberger et al., 2024 ).
Thus far, count-based methods have anchored the cycle at menses only and fail to account for individual and cycle-to-cycle variability in ovulation timing. These methods include forward count (counting from day 1 of menses to the next menses onset), backward count (counting down from day −1, the day prior to menses onset, to the prior menses onset), and forward-and-backward menses count (spanning −15 to +10 around menses onset) ( Schmalenberger et al., 2021 ). Ovulation does not consistently occur at cycle midpoint, and the follicular phase exhibits greater length variability than the luteal phase ( Fehring et al., 2006 ). The resulting misalignment of hormonal dynamics is particularly problematic for phase-based hormonal comparisons (e.g., premenstrual vs. mid-follicular or mid-luteal vs. mid-follicular), in which the mid-follicular phase is often assumed to represent a low-hormone “control” state. If this reference phase is misaligned and does not correspond to low hormonal activity, the validity of comparisons and study conclusions may be compromised. To correct these systematic errors, we propose that a continuous cycle variable must be anchored to both menses and ovulation.
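The limitation of menses-only anchoring can be illustrated with a minimal Python sketch. This is not the `menstrualcycleR` API; the function names and the example cycle lengths are hypothetical and chosen only for illustration. It shows that the same forward-count day receives a different backward-count label depending on cycle length.

```python
def forward_count(cycle_length):
    """Forward count: day 1 is menses onset, the last day is cycle_length."""
    return list(range(1, cycle_length + 1))


def backward_count(cycle_length):
    """Backward count: day -1 is the day before the next menses onset."""
    return list(range(-cycle_length, 0))


def forward_to_backward(forward_day, cycle_length):
    """Convert a forward-count day to its backward-count label."""
    return forward_day - cycle_length - 1


# Hypothetical cycle lengths spanning the typical 21 to 35 day range.
for n in (21, 28, 35):
    print(f"{n}-day cycle: forward day 14 = backward day "
          f"{forward_to_backward(14, n)}")
# 21-day cycle: forward day 14 = backward day -8
# 28-day cycle: forward day 14 = backward day -15
# 35-day cycle: forward day 14 = backward day -22
```

In this sketch, forward day 14 falls 8 days before the next menses in a 21-day cycle but 22 days before it in a 35-day cycle, so a single menses-anchored count cannot place observations at a consistent point relative to ovulation.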
Biomarker-based methods can be used to identify ovulation with varying levels of precision ( Schmalenberger et al., 2021 ; Su et al., 2017 ). The gold standard, transvaginal ultrasound to confirm follicle rupture, is limited by invasiveness and impracticality. Instead, most clinical studies use repeated luteinizing hormone (LH) measurements in blood, urine, or saliva, or basal body temperature (BBT), which reflects P4’s thermogenic effect ( Su et al., 2017 ). To ensure a flexible but standardized hormonal framework, PACTS outlines varying methods to identify the day of ovulation.
Biomarkers (e.g., LH or BBT) are highly recommended for assessing ovulation ( Blake et al., 2016 ), but when they are unavailable, ovulation is often estimated by assigning it to either (1) the midpoint of the individual’s cycle length or (2) 14 days prior to menses onset, regardless of total cycle length. Many studies identify the luteal phase as relatively consistent in length at 14 days compared to the highly variable follicular phase ( Crawford et al., 2017 ; Lenton et al., 1984 ), while others suggest more variable luteal phase lengths across age groups ( Bull et al., 2019 ; Grieger & Norman, 2020 ; Roos et al., 2015 ). Until more studies verify average luteal phase lengths across premenopausal ages by comparing gold-standard methods (e.g., transvaginal ultrasound) with commonly used ovulation detection techniques (e.g., blood, urine, and saliva hormone measurements), counting backward from the next menses remains a more reliable method for estimating ovulation than assuming it occurs at a fixed mid-cycle point. The midpoint assumption systematically misaligns ovulation, particularly when the follicular phase (and therefore the cycle) is very short or very long, whereas a backward count approach better approximates the timing of ovulation. Accordingly, PACTS assigns ovulation to day −15 (where day −1 is the day before menses onset) when ovulation biomarkers are unavailable.
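The difference between the two fallback estimates can be made concrete with a short Python sketch. It is an illustration under stated assumptions, not the `menstrualcycleR` implementation: the function names are hypothetical, the midpoint is taken as the cycle length divided by two and rounded down, and the backward-count estimate places ovulation on day −15 as described above.

```python
def ovulation_midpoint(cycle_length):
    """Midpoint estimate as a forward-count day (assumes rounding down)."""
    return cycle_length // 2


def ovulation_backward(cycle_length, backward_day=-15):
    """Backward-count estimate, converted to a forward-count day.

    Backward day -1 is forward day cycle_length, so backward day -15
    is forward day cycle_length - 14.
    """
    return cycle_length + backward_day + 1


# Hypothetical cycle lengths spanning the typical 21 to 35 day range.
for n in (21, 28, 35):
    mid = ovulation_midpoint(n)
    back = ovulation_backward(n)
    print(f"{n}-day cycle: midpoint day {mid}, "
          f"backward-count day {back}, gap {mid - back}")
# 21-day cycle: midpoint day 10, backward-count day 7, gap 3
# 28-day cycle: midpoint day 14, backward-count day 14, gap 0
# 35-day cycle: midpoint day 17, backward-count day 21, gap -4
```

Under these assumptions the two estimates coincide only for a 28-day cycle. In shorter cycles the midpoint places ovulation later than the backward count, and in longer cycles it places ovulation earlier.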
We developed and validated PACTS and its accompanying R package `menstrualcycleR` ( https://github.com/eisenlohrmoullab/menstrualcycleR ) to provide a standardized, hormonally meaningful, continuous menstrual cycle time variable for summarizing and predicting any repeated measure across the cycle (e.g., pain, blood glucose, affect/mood, heart rate variability, sleep). After explaining the PACTS method and our validation dataset, we demonstrate how PACTS realigns daily hormone data across cycles of varying lengths and improves statistical precision (in predicting daily hormone levels) compared to traditional count methods.