Longitudinal Assessment of Circulating Tumor DNA: A Proposed Statistical Framework

Christopher R. Pretz, Jiemin Liao, Caroline Weipert, Leslie Bucheit, and 2 more

Research Square preprint. This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-3788054/v1. This work is licensed under a CC BY 4.0 License. Status: Posted (Version 1).

Abstract

As circulating tumor DNA (ctDNA) levels can reflect disease progression, achieving a comprehensive understanding of the temporal evolution of ctDNA is key to informing clinical decision making. However, temporal changes can exhibit complex non-linear patterns and differ substantially across patients. Additionally, patient characteristics and outcomes may impact temporal change. Thus, traditional statistical approaches may be inadequate in characterizing ctDNA evolution over time.
In this proof-of-concept study, we propose a new approach based on a hierarchical cubic spline random effects model, which is sufficiently flexible to capture complex temporal ctDNA patterns while supporting the integration of patient characteristics. To demonstrate the benefits of the approach, a retrospective cohort of non-small cell lung cancer patients who received anti-EGFR therapies was analyzed. Model results are presented graphically in the form of patient-level response patterns, where each combination of patient characteristics produces a unique pattern. Patients of various ages, health statuses, and mortality statuses were contrasted; the results provide examples of how the model can further our conceptualization of ctDNA dynamics and demonstrate how results can be used in targeted, patient-centered clinical decision-making.

Subject areas: Biological sciences/Computational biology and bioinformatics; Biological sciences/Computational biology and bioinformatics/Statistical methods

INTRODUCTION

Plasma-based next generation sequencing for comprehensive genomic profiling of oncology patients with advanced solid tumors is recommended by numerous guidelines and expert consensus statements, including those from the National Comprehensive Cancer Network, the American Society of Clinical Oncology, the European Society for Medical Oncology, and the International Association for the Study of Lung Cancer. Plasma-based ‘liquid biopsies’ have many clinical applications in advanced cancers, including genotyping, identification of actionable biomarkers with targeted therapeutic options, early assessment of treatment efficacy, monitoring of response to treatment, and identification of acquired resistance mechanisms. Liquid biopsies have also been applied in cancer screening and early detection, detection of molecular residual disease after curative-intent treatment, and prediction of which patients are at risk of recurrence.
1,2 Additional studies suggest serial measurements of circulating tumor DNA (ctDNA) can provide insight into disease progression. For instance, Sanz-Garcia et al. discussed how temporal ctDNA can have widespread utility in disease management across cancer types, while McLaren et al. proposed that serial ctDNA testing be used to inform radiotherapy dosing. 3,4 In addition to illustrating how serial ctDNA testing can support disease management, these studies highlight the power of providing a custom-tailored, patient-level approach to oncology, a topic central to the goals of this manuscript.

Interpreting serial ctDNA results is currently challenging. ctDNA levels can vary substantially within and between patients, posing distinct obstacles when analyzing these types of data. 4 Similarly, because temporal ctDNA patterns might be impacted by patient characteristics such as age and gender, these factors should be accounted for in analyses. 5,6

To handle the complexities inherent to temporal ctDNA patterns while simultaneously considering how patient characteristics might be associated with these patterns, we propose the adoption of a hierarchical cubic spline random effects model (HCSREM), an extension of a hierarchical linear mixed model. While the model has been modified to meet the challenges posed by serial ctDNA data, the foundations of the model have been widely discussed and applied. 7–17 As these prior works point out, this family of models has many advantages in comparison to traditional longitudinal analyses such as the use of change scores, ratios between baseline measures and future time points, or response profile analysis.
In summary, the main benefits of HCSREM are: (1) biomarker evolution can be captured more accurately; (2) patient characteristics can be directly incorporated; (3) time points between measures need not be equally spaced, and missing data are less problematic; (4) sample size can be preserved, that is, instead of reducing sample size by performing sub-group analyses, a covariate that captures the different strata can be integrated; and (5) longitudinal projections can be generated for existing patients and for new patients (those that do not exist in the data).

To the best of our knowledge, HCSREM or similar models have not been utilized in analyzing complex longitudinal genomic data. As such, the main purpose of this paper is to discuss how to apply this model to serial ctDNA and to demonstrate the benefits of this framework by way of a specific application. Consequently, our results are those of a proof-of-concept study to showcase the methodology and are not intended as a formal analysis exploring a predetermined clinical hypothesis.

METHODS

General Approach

In this study the details of the HCSREM are discussed by demonstrating the model’s utility through analysis of a retrospective real-world cohort of patients who were diagnosed with advanced non-small cell lung cancer (NSCLC). Although the proposed framework can be applied to an assortment of longitudinal biomarkers, the biomarker of interest here is ctDNA level as measured by the maximum variant allele fraction of all somatic variants detected through liquid biopsy. Since a major advantage of the method is the ability to incorporate patient information, several relevant covariates were considered. Finally, to enhance interpretation, model results are displayed graphically in the form of estimated longitudinal projections, each based upon a patient’s set of distinct traits. Within this process, patient-level projections are directly compared, and the comparisons are enhanced by velocity plots, which are defined subsequently.
Data Source and Patient Cohort

The cohort used to illustrate the utility of the methodology is based on observational data and was sourced from the GuardantINFORM anonymized clinical-genomic database, which includes structured commercial payer claims collected from inpatient and outpatient facilities in both academic and community settings. Due to the observational nature of the data, the methods in this study adhered to the Strengthening the Reporting of Observational studies in Epidemiology (STROBE) guidelines where applicable. 18 The GuardantINFORM database is fully de-identified and complies with Sections 164.514 (a)–(b)1ii of the US Health Insurance Portability and Accountability Act (HIPAA) regarding the determination and documentation of statistically de-identified data. The generation of de-identified data sets by Guardant Health for research purposes was approved by the Advarra Institutional Review Board; patient identity protection was maintained throughout the study in a de-identified database. Retrospective analysis of de-identified data is approved under Advarra IRB number Pro00034566. As the data are de-identified, the requirement to obtain patient consent was waived by the approving ethics committee.

Patients selected for the cohort were diagnosed with advanced non-small cell lung cancer (NSCLC) and had at least three Guardant360 (G360) liquid biopsy tests in the US between June 1, 2014 and June 30, 2023. Only patients receiving targeted therapies for EGFR mutations were included, with the following therapies considered: osimertinib, afatinib, dacomitinib, erlotinib, gefitinib, and amivantamab. All patients were required to have at least three blood samples collected while on a specific line of anti-EGFR therapy, or within 30 days prior to initiation of the line of therapy or 30 days after its end. Patients whose first G360 test on the line of therapy was more than 120 days after the start of the line of therapy were excluded.
For patients with multiple lines of therapy meeting these criteria, the earliest line of therapy was selected for study inclusion. Finally, patients with suspected germline mutations were removed from the cohort.

Response Variable and Study Covariates

The response variable, ctDNA measurements captured over time, is reported as a percentage. In instances where samples contained ctDNA levels below the assay’s limit of detection, values were replaced with a ctDNA level of 0.04%, the lowest value in the cohort and consistent with the limit of detection of the test. All covariates except mortality were captured at baseline, where the baseline period is defined as the six months prior to the index date, i.e., the date of the patient’s first G360 test. Baseline covariates include age (in years), line of anti-EGFR therapy, smoking status (yes/no), gender (female/male), and the Van Walraven Elixhauser Comorbidity (ELIX) score specific to lung cancer patients (expressed as a weighted measure across multiple common comorbidities 19 ). As the cohort is based on real-world data, it is not possible to directly align the treatment start date with the patient’s first G360 test, as is achievable in a prospective study. Therefore, the number of days between the first G360 test and the start of treatment was added as a covariate to serve as a statistical control and was set to zero days in the analysis to mimic a post-treatment scenario. Patient mortality, captured as alive vs deceased within the study timeframe, was also included.

Proposed Statistical Model

In this section we provide an overview of the mathematical details of the HCSREM. As previously stated, this model was selected because it is malleable enough to capture variable nonlinear trends and allows for the direct incorporation of patient characteristics in the form of covariates. In addition to these properties, this model can provide a unique corresponding temporal ctDNA pattern for each combination of covariate values.
It is the ability to provide this type of patient-specific information that makes this methodology attractive in targeted oncology efforts.

The model is partitioned into first- and second-level equations, which create the hierarchical structure. The first-level equation assumes the form of a truncated cubic spline and captures how a particular patient’s ctDNA levels change over time (see Eq. (1)). At a high level this is achieved by creating a function that is split into consecutive pieces along the time axis. Within each piece, a cubic polynomial is used to fit the data, and the ends of consecutive cubic polynomials are connected at knots. Knot location, as well as the number of knots, can be strategically devised based on data inspection, though “automated” methods for determining knot quantity and placement exist. Ultimately, a cubic spline model combines the separate pieces to form a single smooth function to represent the data. 20

$$Y_{ij}=\pi_{0i}+\pi_{1i}t_{ij}+\pi_{2i}t_{ij}^{2}+\pi_{3i}t_{ij}^{3}+\sum_{k=1}^{K}\pi_{(k+3)i}\left(t_{ij}-\epsilon_{k}\right)_{+}^{3}+\epsilon_{ij} \tag{1}$$

where

$$\left(t-\epsilon\right)_{+}=\begin{cases}0 & \text{if } t\le \epsilon\\ t-\epsilon & \text{if } t>\epsilon\end{cases}$$

In Eq. (1), ctDNA measurements (or a transformation thereof) captured over time are represented by the \(Y_{ij}\)'s, where \(i\) indexes patients and \(j\) indexes the measurement occasion. Timepoints captured within the patient are given by \(t_{ij}\), and \(\epsilon_{k}\) is the value of the \(k^{th}\) knot (the knots, subscripted by \(k\), are distinct from the error term \(\epsilon_{ij}\)). The \(\pi_{ri}\)'s are the response parameters, indexed by \(r\); each of \(\pi_{0i}\), \(\pi_{1i}\), \(\cdots\), \(\pi_{(K+3)i}\) varies across patients, i.e., they are the random effects. Finally, \(\epsilon_{ij}\) is the error term and is assumed to be normally distributed with a mean of 0 and variance \(\sigma^{2}\).
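As a minimal sketch of the first-level equation, the Python below builds the truncated power basis of Eq. (1) for one time point and evaluates the noise-free trajectory for one patient. The function names, the two knots, and the response-parameter values are hypothetical, chosen only for illustration; they are not estimates from the study.

```python
def truncated_cubic(t, knot):
    """Truncated power term (t - knot)_+^3: zero up to the knot, cubic after it."""
    return max(t - knot, 0.0) ** 3


def spline_basis(t, knots):
    """Basis row of Eq. (1) for one time point: 1, t, t^2, t^3, then one term per knot."""
    return [1.0, t, t ** 2, t ** 3] + [truncated_cubic(t, k) for k in knots]


def trajectory(t, pi, knots):
    """Noise-free part of Eq. (1): basis row weighted by the response parameters."""
    basis = spline_basis(t, knots)
    if len(pi) != len(basis):
        raise ValueError("expected 4 + K response parameters")
    return sum(p * b for p, b in zip(pi, basis))


# Hypothetical values, chosen only to exercise the functions.
KNOTS = [50.0, 125.0]                          # two knots, in days
PI = [-3.0, -0.02, 1e-4, -1e-7, 2e-7, -1e-7]   # pi_0 .. pi_5 for one patient

row_before = spline_basis(40.0, KNOTS)   # before the first knot: both knot terms are zero
row_after = spline_basis(60.0, KNOTS)    # between the knots: only the first knot term is active
y_at_zero = trajectory(0.0, PI, KNOTS)   # at t = 0 only the intercept pi_0 contributes
```

Because each knot term switches on only after its knot, adding a knot changes the curve to the right of that knot without altering the fit to its left, which is what lets the model follow a drop-then-rise pattern.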
The response parameters are especially important as they collectively govern the shape of each patient’s unique longitudinal ctDNA trajectory and serve to bridge the first- and second-level equations. The significance of the second-level equations is that they contain information about individual patient characteristics and associate these characteristics with the response parameters themselves. The second-level equations are given below.

$$\pi_{ri}=\beta_{r0}+\sum_{c=1}^{C_{r}}\beta_{rc}X_{ci}+e_{ri} \tag{2}$$

where \(X_{ci}\) represents a desired patient characteristic, \(\beta_{rc}\) captures the linear relationship between the response parameter and the patient characteristic, \(\beta_{r0}\) is the intercept for each corresponding \(\pi_{ri}\), and \(e_{ri}\) represents a random component; the \(e_{ri}\) are assumed to jointly follow a multivariate normal distribution.

When the model contains covariates, it is referred to as a conditional model; otherwise it is an unconditional model. The unconditional model provides results at the cohort level, and the conditional model is responsible for producing patient-level results.

A final, yet important, element of the proposed methodology comes in the form of velocity plots, which are useful when the direction and speed at which ctDNA levels change at a given point in time, i.e., the instantaneous rate of change (IRC), is of interest. Each model-generated patient trajectory has, at its heart, a cubic spline. An advantageous property of cubic splines is that they are twice differentiable; thus, the IRC at a given time point can be calculated. 21 In the case of the adopted spline model, this amounts to taking the first derivative of Eq.
(1) with respect to time, resulting in:

$$\frac{dY_{ij}}{dt_{ij}}=\pi_{1i}+2\pi_{2i}t_{ij}+3\pi_{3i}t_{ij}^{2}+\sum_{k=1}^{K}3\pi_{(k+3)i}\left(t_{ij}-\epsilon_{k}\right)_{+}^{2} \tag{3}$$

The value of the IRC is given by the slope of the line tangent to the patient trajectory, where positive values indicate that ctDNA levels are increasing, negative values indicate that they are decreasing, and values of zero indicate either that a peak or trough was reached or that the trajectory is flat. The further the IRC value is from zero, the more extreme the rate of change.

STATISTICAL ANALYSIS and RESULTS

Data were extracted using SAS software, version 9.4 (SAS Institute, Cary, NC, USA), and all statistical analysis for the HCSREM was performed using R version 4.1.3. A total of 400 patients with advanced NSCLC were identified from the GuardantINFORM database as having at least three G360 tests. Seventy-three patients were excluded because their first test was more than 120 days after therapy initiation, and five were excluded due to germline mutations. Of the remaining patients, 163 received anti-EGFR therapy, with a total of 561 longitudinal ctDNA measurements; these 163 patients defined the cohort used in the analysis. The mean age of these patients was 61 years, 66% were female, the mean line of anti-EGFR therapy was 1.4, and the mean time between G360 test and treatment initiation was 0.3 days (range −115 to 30 days) (Table 1).

Table 1. Summary of Patient Characteristics (Total N = 163)

| Characteristic | N / Mean | % / Standard deviation |
|---|---|---|
| Age (years) | 61.18 | 10.88 |
| Female | 108 | 66% |
| ELIX score | 1.89 | 1.86 |
| Current or prior smoker | 123 | 75% |
| Line of anti-EGFR therapy | 1.44 | 0.99 |
| Time between G360 test and treatment initiation (days) | 0.29 | 31.98 |
| ctDNA (%)* | 5.66 | 10.59 |
| Deceased at the end of study period | 55 | 33% |

*The ctDNA value was extracted from each test and summarized; the summary therefore includes multiple ctDNA values for each patient.
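Returning briefly to Eq. (3), the sketch below evaluates the instantaneous rate of change for a single trajectory and checks it against a central finite difference of Eq. (1). It is an illustration under stated assumptions, not the code used in the study: the function names, knots, and response-parameter values are hypothetical.

```python
def trajectory(t, pi, knots):
    """Noise-free part of Eq. (1) on the truncated power basis."""
    value = pi[0] + pi[1] * t + pi[2] * t ** 2 + pi[3] * t ** 3
    for coef, knot in zip(pi[4:], knots):
        value += coef * max(t - knot, 0.0) ** 3
    return value


def velocity(t, pi, knots):
    """Instantaneous rate of change from Eq. (3), the first derivative of Eq. (1)."""
    rate = pi[1] + 2.0 * pi[2] * t + 3.0 * pi[3] * t ** 2
    for coef, knot in zip(pi[4:], knots):
        rate += 3.0 * coef * max(t - knot, 0.0) ** 2
    return rate


def central_difference(t, pi, knots, h=1e-3):
    """Numerical slope of the trajectory, used only to sanity-check velocity()."""
    return (trajectory(t + h, pi, knots) - trajectory(t - h, pi, knots)) / (2.0 * h)


# Hypothetical values, not estimates from the study.
KNOTS = [50.0, 125.0]
PI = [-3.0, -0.02, 1e-4, -1e-7, 2e-7, -1e-7]

irc_at_start = velocity(0.0, PI, KNOTS)  # only pi_1 survives at t = 0
max_gap = max(
    abs(velocity(t, PI, KNOTS) - central_difference(t, PI, KNOTS))
    for t in (10.0, 60.0, 200.0)
)
```

Evaluating `velocity` over a grid of days gives the curve drawn in a velocity plot; its zero crossings mark the peaks and troughs of the corresponding response pattern.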
To meet model assumptions, ctDNA levels, expressed as a percentage, were transformed into logits. Histograms in Fig. 1 show that the logit transformation of ctDNA level alleviated the extreme skewness of the raw data. Likewise, the logit-transformed spaghetti plot accentuates the complexity and variability of ctDNA level over time, both within and between patients.

[Insert Fig. 1 ]

In our example we concentrate on conducting an exploratory investigation of the data. To begin, an unconditional model was fit to the transformed data using knots set at 50, 125, 250, 500, 750, 1000, and 1250 days. To ensure consistency, other knot placements were explored, although different placements did little to alter results. Results are presented graphically because spline model parameter estimates are difficult to interpret, although parameter estimates and related output are provided in the supplemental information for reference. 22 The graphical manifestation of the unconditional model, referred to as a response pattern, is presented in Fig. 2.

[Insert Fig. 2 ]

The response pattern (given in black) suggests ctDNA levels drop substantially between the first G360 test and 30 days, then rise rapidly until 150 days, at which point ctDNA levels dip slightly and rise again at around 300 days, although at a less extreme rate. Additionally, from 550 days to 1000 days ctDNA levels drop, and then rise again from 1000 to 1600 days. The corresponding 95% confidence bands widen over time as the number of data points decreases.

The flexibility built into the unconditional model revealed details hidden within the data that simpler models would not detect. Despite this, the unconditional model only estimates the response pattern for the cohort and does not account for the possibility that patients with different characteristics may exhibit different response patterns.
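The limitation just noted is what the second-level equations (Eq. 2) address. As a rough illustration, the sketch below turns a covariate profile into response parameters, evaluates the resulting logit-scale trajectory, and maps it back to a percentage. Everything numeric here is hypothetical: the single knot, the coefficient table, and the use of age as the only covariate are assumptions made for the example, and the random component of Eq. (2) is set to zero. Age is centered at the Table 1 cohort mean.

```python
import math


def percent_to_logit(percent):
    """Map a ctDNA percentage in (0, 100) to the logit scale."""
    p = percent / 100.0
    return math.log(p / (1.0 - p))


def logit_to_percent(z):
    """Inverse of percent_to_logit."""
    return 100.0 / (1.0 + math.exp(-z))


def response_parameters(betas, covariates):
    """Eq. (2) with e_ri = 0: pi_r = beta_r0 + sum over c of beta_rc * X_c."""
    return [row[0] + sum(b * x for b, x in zip(row[1:], covariates)) for row in betas]


def trajectory(t, pi, knots):
    """Noise-free part of Eq. (1) on the truncated power basis."""
    value = pi[0] + pi[1] * t + pi[2] * t ** 2 + pi[3] * t ** 3
    for coef, knot in zip(pi[4:], knots):
        value += coef * max(t - knot, 0.0) ** 3
    return value


MEAN_AGE = 61.18   # cohort mean age from Table 1, used for centering
KNOTS = [50.0]     # hypothetical single knot, in days
# Hypothetical coefficients: one row per response parameter, [beta_r0, beta_r_age].
BETAS = [
    [-3.0, 0.010],
    [-0.02, 0.0004],
    [1e-4, 0.0],
    [-1e-7, 0.0],
    [2e-7, 0.0],
]

pi_mean_age = response_parameters(BETAS, [0.0])              # centered age of zero
pi_age_80 = response_parameters(BETAS, [80.0 - MEAN_AGE])    # an 80-year-old profile

days = (0.0, 30.0, 100.0)
pattern_mean_age = [logit_to_percent(trajectory(t, pi_mean_age, KNOTS)) for t in days]
pattern_age_80 = [logit_to_percent(trajectory(t, pi_age_80, KNOTS)) for t in days]
```

With centered covariates, a profile at the cohort mean reduces each response parameter to its intercept, so `pi_mean_age` is simply the first column of `BETAS`; any other profile shifts the parameters and therefore the whole response pattern.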
To assess the impact of incorporating patient characteristics, a conditional model that incorporated all baseline covariates was fit to the data. As is typical in hierarchical models, all numerical covariates were centered about their respective means (Hofmann and Gavin, 1998). Figure 3 shows how baseline age and health status, as measured by the ELIX score, impact response patterns in female non-smokers receiving their first line of EGFR-TKI treatment. Results are separated by patients who are alive vs deceased. As data become sparse after 400 days, we examine the first 400 days only.

[Insert Fig. 3 ]

The examples presented in Fig. 3 reveal that patients with different characteristics have different response patterns. In the top-left panel, the response curves for a 30-year-old and an 80-year-old with average ELIX scores are contrasted. These results suggest 80-year-old patients did not exhibit an initial post-treatment drop in ctDNA levels, in contrast to 30-year-old patients, who demonstrated a rapid decrease followed by a rapid increase. The top-middle panel indicates that the response pattern for a patient of average age with the maximum ELIX score of 13 appears quite different from that of the same patient with the minimum ELIX score of 0, implying that patients with many comorbidities exhibited a delayed treatment response. In the top-right panel, response patterns are displayed for older patients with a high comorbidity burden and younger patients who are otherwise healthy, illustrating how the combination of age and health status amplifies the disparity in response patterns. Though not shown, beyond 400 days a decreasing trend in ctDNA values is observed for patients who remained alive at the end of the study, while the trend increases for patients who died before study end.

To focus on the response pattern’s behavior, velocity plots that display the IRC for a corresponding response pattern were generated (Fig. 4).

[Insert Fig.
4 ]

In general, information presented in a velocity plot can be gleaned from the response patterns themselves, but the differences in the response patterns are accentuated when examining them through an IRC lens. Thus, comparing velocity plots can provide additional clues as to where response patterns are similar and where they diverge based on the IRC value. Another advantage of velocity plots arises when baseline values differ between response patterns, so that differences between the patterns may simply reflect different biomarker values at the onset. In these instances, using velocity plots to make comparisons may be more appropriate, as the IRC is invariant to the biomarker’s baseline value.

Interpreting a velocity plot is relatively straightforward. To demonstrate, we focus on the far-left panels. In the first 100 days, velocity plots for 80-year-old alive and deceased patients (red curves) exhibited different patterns. For survivors, the IRC was initially positive but slowed to zero around 20 days (indicating a peak in the corresponding response curve, as referenced by the dashed line) and then turned negative, with the fastest rate of decrease (−0.026 logits per day) occurring around 43 days. Beyond 43 days ctDNA levels continued to decrease, and the IRC remained relatively flat past 100 days. In contrast, the velocity plot for 80-year-old deceased patients displays a nearly opposite pattern.

DISCUSSION

The purpose of this paper is to introduce a statistical methodology that can accommodate the analysis of complex longitudinal genomic data. In our example we applied the methodology to observational data and used it for exploratory purposes. However, this methodology is applicable in different data settings and can be used for hypothesis generation, statistical inference, and patient monitoring.
In this section we discuss these applications while considering their limitations, and conclude by discussing model modifications and sample size requirements.

In observational data settings, as was the case in our example, the proposed method is often leveraged as an exploratory tool. When data exploration is the aim, the 95% confidence bands do not retain their traditional inferential meaning and instead are used as ‘guidelines’ for identifying differences in response patterns. Since thousands of response patterns are available (there are 58,464 possible response patterns in our example), exploring results can be daunting. To alleviate this issue, RShiny, Excel, or a similar platform can be used to create an interactive tool that generates a response pattern for each covariate combination, allowing for the interrogation of a large volume of results. During this process, caution should be taken to avoid extrapolating beyond the data range or investigating nonsensical covariate combinations.

Many observations may be made in the data exploration process. Results may support a predefined conjecture, or may reveal unanticipated findings that prompt a follow-up study designed to evaluate them. For instance, results presented in Fig. 3 may motivate a study that assesses whether older patients with more comorbidities exhibit a delayed treatment response in comparison to younger, healthier patients.

This paper has focused on analyzing observational data, but the framework can be applied to representative cohorts as well. If statistical inference is the goal, the potential to generate and compare numerous response patterns means that the number of comparisons should be minimized and based on a priori hypotheses, and that common considerations such as controlling for type I error should be addressed.
Hypotheses may include comparing response patterns between patients with pre-determined sets of covariate values (where other study covariates can be used as statistical controls), but can also concern the nature of the relationship between response pattern behavior and the covariate values themselves.

Patient monitoring is another application of our proposed framework. The general idea is that each response pattern is a reasonable portrayal of a patient as described by his or her own unique set of characteristics, and in this way the same response pattern can serve as a reference for a new patient who shares these characteristics. Additionally, if survival status (deceased or not) is incorporated into the model, reference response patterns for survivors and non-survivors can be created. Thus, if the response pattern of a new patient is consistent with that of a survivor, intervention may be unnecessary, but if the response pattern mirrors that of a non-survivor, intervention may be required. Utilizing velocity plots to compare response patterns can further enhance this process, especially if baseline values between response patterns are dissimilar. To ensure reliable classification, such a monitoring system should undergo internal and external validation. Internal validation may be achieved by creating training and test datasets and then applying, for example, k-fold cross-validation to assess classification accuracy. If an acceptable level of accuracy is achieved, external validation can then be performed by checking whether new patients (i.e., those not involved in cross-validation) are also classified with a high degree of accuracy.

Modifications can be made to improve model performance. For instance, the relationship between the response parameter and covariate(s) may be non-linear, and therefore, imposing a linear restriction as done in Eq. (2) will fail to correctly specify the model.
In such cases, a scatterplot can be used to identify the form of the relationship and appropriate adjustments can be made. Model efficiency (i.e., smaller standard errors) can also be improved via model reduction, especially when multicollinearity is present. Although model reduction can be an arduous task due to the complicated nature of these models, the reduction process can be guided by information criteria such as the Akaike information criterion (AIC) or Bayesian information criterion (BIC), which balance model fit with model simplicity. A further modification, applicable when timepoints are fully captured and equally spaced (not the case in our example), is to account for the correlation structure of the repeated measures within the model. Possible correlation structures include the first-order autoregressive, compound symmetry, and spatial power structures, where the AIC or BIC can facilitate the identification of an optimal structure.

Finally, selecting an appropriate sample size is an important consideration when conducting an analysis. Minimum sample size recommendations for hierarchical models are around 100 patients with at least 3 measurements per patient, though these types of models have been fit with sample sizes as low as 22 individuals. 23–25

CONCLUSION

ctDNA levels can fluctuate substantially over time and from patient to patient, and the results can be difficult to interpret. In this paper we introduce a statistical framework that is flexible and therefore capable of capturing these complexities while accounting for a diverse set of patient traits. Furthermore, we apply the model to an observational dataset consisting of patients with NSCLC who received anti-EGFR therapy. Analytic results are presented graphically and demonstrate the utility of the approach in acquiring a comprehensive understanding of how response patterns evolve and how different patient characteristics influence these evolutions.
In our example we demonstrate how the method can be used as a “high powered” exploratory technique, although we have outlined many other applications. Regardless of the desired purpose, a major advantage of our proposed framework is the ability to generate patient-level results, where such results can add to our understanding of ctDNA dynamics and enhance our ability to integrate ctDNA into clinical decision-making.

Declarations

Software

R Core Team (2022). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.

Acknowledgments

We thank Dr. Aaron Hardin, PhD, for insights regarding ctDNA capture and the limit of detection of the G360 test.

Author Contributions

C.R.P. conceived of the presented model and computational framework, analyzed the data, presented the results, and authored the manuscript with critical input from all co-authors. J.L. created the cohort, aided in the development of the model and computational framework, provided analytic support, and authored the manuscript. C.W., L.D., L.B., & A.D. provided critical content knowledge expertise, guidance, and input, and authored the manuscript.

Data Availability Statement

The datasets generated during and/or analyzed during the current study are not publicly available and cannot be shared due to the use of a third-party healthcare claims database. Researchers interested in replicating our study or pursuing new research topics should contact Guardant Health (https://guardanthealth.com/products/biopharma-solutions/real-world-evidence/) directly.

Additional Information

Competing Interests: All manuscript authors are employed by Guardant Health. Each receives an annual salary, bonus, and stock options that are commensurate with the author’s job description, experience, and level of education.

References

1. Rolfo, C. et al. Liquid Biopsy for Advanced NSCLC: A Consensus Statement From the International Association for the Study of Lung Cancer. Journal of Thoracic Oncology 16, 1647–1662 (2021).
2. Pascual, J. et al. ESMO recommendations on the use of circulating tumour DNA assays for patients with cancer: a report from the ESMO Precision Medicine Working Group. Annals of Oncology 33, 750–768 (2022).
3. McLaren, D. B. & Aitman, T. J. Redefining precision radiotherapy through liquid biopsy. Br J Cancer 129, 900–903 (2023).
4. Sanz-Garcia, E., Zhao, E., Bratman, S. V. & Siu, L. L. Monitoring and adapting cancer treatment using circulating tumor DNA kinetics: Current research, opportunities, and challenges. Sci. Adv. 8, eabi8618 (2022).
5. Ørntoft, M.-B. W. et al. Age-stratified reference intervals unlock the clinical potential of circulating cell-free DNA as a biomarker of poor outcome for healthy individuals and patients with colorectal cancer. Int J Cancer (2020). doi:10.1002/ijc.33434.
6. Huang, R. S. P. et al. Circulating Cell-Free DNA Yield and Circulating-Tumor DNA Quantity from Liquid Biopsies of 12 139 Cancer Patients. Clinical Chemistry 67, 1554–1566 (2021).
7. Raudenbush, S. W. & Bryk, A. S. Hierarchical Linear Models: Applications and Data Analysis Methods. (Sage Publ, Thousand Oaks, Calif., 2010).
8. Singer, J. D. & Willett, J. B. Applied Longitudinal Data Analysis: Modeling Change and Event Occurrence. (Oxford University Press, New York, 2003). doi:10.1093/acprof:oso/9780195152968.001.0001.
9. Hedeker, D. & Gibbons, R. D. Longitudinal Data Analysis. xx, 337 (Wiley-Interscience, Hoboken, NJ, US, 2006).
10. Fitzmaurice, G. M., Laird, N. M. & Ware, J. H. Applied Longitudinal Analysis. (Wiley, Hoboken, 2011).
11. Welham, S., Cullis, B., Gogel, B., Gilmour, A. & Thompson, R. Prediction in linear mixed models. Aust NZ J Stat 46, 325–347 (2004).
12. Mackenzie, M. L., Donovan, C. R. & McArdle, B. H. Regression Spline Mixed Models: A Forestry Example. Journal of Agricultural, Biological, and Environmental Statistics 10, 394–410 (2005).
13. Straube, J., Gorse, A.-D., PROOF Centre of Excellence Team, Huang, B. E. & Lê Cao, K.-A. A Linear Mixed Model Spline Framework for Analysing Time Course ‘Omics’ Data. PLoS ONE 10, e0134540 (2015).
14. Pretz, C. R., Kozlowski, A. J., Chen, Y., Charlifue, S. & Heinemann, A. W. Trajectories of Life Satisfaction After Spinal Cord Injury. Archives of Physical Medicine and Rehabilitation 97, 1706–1713.e1 (2016).
15. Grajeda, L. M. et al. Modelling subject-specific childhood growth using linear mixed-effect models with cubic regression splines. Emerg Themes Epidemiol 13, 1 (2016).
16. Yu, Z. et al. Beyond t test and ANOVA: applications of mixed-effects models for more rigorous statistical analysis in neuroscience research. Neuron 110, 21–35 (2022).
17. Janssen, J. M. et al. Longitudinal nonlinear mixed effects modeling of EGFR mutations in ctDNA as predictor of disease progression in treatment of EGFR-mutant non‐small cell lung cancer. Clinical Translational Sci 15, 1916–1925 (2022).
18. von Elm, E. et al. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies. J Clin Epidemiol 61, 344–349 (2008).
19. van Walraven, C., Austin, P. C., Jennings, A., Quan, H. & Forster, A. J. A modification of the Elixhauser comorbidity measures into a point system for hospital death using administrative data. Med Care 47, 626–633 (2009).
20. Gauthier, J., Wu, Q. V. & Gooley, T. A. Cubic splines to model relationships between continuous variables and outcomes: a guide for clinicians. Bone Marrow Transplant 55, 675–680 (2020).
21. Zhou, S. & Wolfe, D. A. On Derivative Estimation in Spline Regression. Statistica Sinica 10, 93–108 (2000).
22. Shepherd, B. E., Rebeiro, P. F., & Caribbean, Central and South America network for HIV epidemiology. Brief Report: Assessing and Interpreting the Association Between Continuous Covariates and Outcomes in Observational Studies of HIV Using Splines. J Acquir Immune Defic Syndr 74, e60–e63 (2017).
23. Huttenlocher, J., Haight, W., Bryk, A., Seltzer, M. & Lyons, T. Early vocabulary growth: Relation to language input and gender. Developmental Psychology 27, 236–248 (1991).
24. Fan, X. & Fan, X. Power of Latent Growth Modeling for Detecting Linear Growth: Number of Measurements and Comparison with Other Analytic Approaches. The Journal of Experimental Education 73, 121–139 (2005).
25. Muthén, B. O. & Curran, P. J. General longitudinal modeling of individual differences in experimental designs: A latent variable framework for analysis and power estimation. Psychological Methods 2, 371–402 (1997).

Additional Declarations

Competing interest reported. Please note that this is not a formal study but a proof-of-concept manuscript intended to showcase a novel statistical methodology.

Supplementary Files

SupplementarymaterialFileLongitudinalAssessmentofCirculatingTumorDNAAProposedStatisticalFramework.docx
I6XHAwlTycomzxbtd0xWyFQrwjyTVQ2hrP80BCJ8QnHHL7q1gByNR7dMVgJodAfQMiABAs1TRUdsKKW1Y5JcUh4JrlbUtxVc7dgRHpivxZ+lxzO7EPHa2pH4cf5dspxIAqP6BZ/RBcIJPj0m+Z4IPAGBCN9VDPGxTL776GQlISKChmgBepZ5BGqGNkMpwuBr843oV0m3t/g5yWeN/6xPBO6MQIRPRMQQMMqd1Z0t5Qa8RRjajMHPntWPJCu+/0G3V2b47NHK9kRgB4FZP9iZqzV21yf4PBERQ+DxlsmqBd5s20gwm6X5Sf05OsPYRoJ9DZcZg1df5hwpkfrV3NDMcg4Bjn04kuZotvXd9hwH/6Psj6VOzzdDX7Y46gcztGf6io9VnwBjJauZeUt9xyJCaaSr2xXKkQt7jQxmYUw9iFAEfjO2MbuijOBP6pjhU2NO3GcxOMHDSZo2gL0qYeiHTv7PSk7KPUP7Ljaw6xNW1zPyl/pmsiqhknU/EYhIBrsG/5OZwkMEfyJ7kk/NMXIXH69eVY/wutLnDjsr2Q33LHUEZIurOPGdGDtu/ZisPvv3llsmKwHU+yD+XZTym4wSmlnmEZAudlagorFq8C2uI/V7ks+WDL7tLqtqz9c7vfOjG2wnIsa8Ey5ell2f2B1v+bllslIAighuogVoWeYRiDC2CBo1ziP1e5LPGv9ZnwjcGYFdn9Bxa8QfN98yWQmgiFVPZDC7s1Gd4k1nzjN/n+V5kT5HFh+zu4pI/c7w6WXM92cjgN2h/1f+4MMiOOsHdmzk865PRPrnLZNVJNiRYEXy9RRa+hulne8nMwavvqPzRepXc0MzyzkE7vDNykunX71iA3cossVRPzjFs/hY9Qn9YwUk392SyWoXwQ8Yj8Nwrf5ya8bgZ1eU75isZjF4mglqt75jU9Ey8wMA+PH/JFD0PKP07mIDM75bkk2xY+dfwBHdTFZCIu9VBHZXR7sGX2Xs1/8+BYdgjt1yks8Z3sQHcr1juePOCp4iPju8m75kiys7q4hTGYtnmDfsCGUZ4lm0IowncuXt+fyUd/0L8OhlpUifKwbfmy9Svyf57Mlh2++yqrY85fNnIrDjE9gxCy4WuxHllslKASgiuInWaqCNAPnpNGSwqz9f1/gIfXosI/V7kk/Pd77fGwECLN+x2HF9ctnxCY2N+CUgOrhlspKQubO6h5voG8PqCkn6HElWs7uKTFb3sJF34kL2HnW8/GRsZnzXy6kTmajvgLdMVl7onffIYLbDx9PH4rir31BmDF59R+eK1K/mHkmqT9fnK/m/4zcrj0furP6HyI5P6Ecrqz/M8jrJZOURyfciAjgvCWTF8GYMPndWXz/+0WCOoKKOT4oKfWGl3bms2NMLWf+4qWd814Jj/7V1W7/znMlqB70PGqu/RF/Z0q8a/Ai877izEl6ju8sRnO7U5wk7qzvh9UpeZIuzpw1akMyOa8mayaqFTrb9REC/7Fn5kcWqwf+cvPHwjslqdnfZgCebHooACZ0Foo7StNO++l/YWPVdjeMeVTJZRSH55nS0rV/515NluJGrLMH9jslKsuX9cxHArklQOiaVD62cbOygqHlnfVc/rhD/OzxobCYrIZH3LgL6bjX7c95Vg+8y9KZ/FDwid/Z5bwRIDuysbOFYeDZp2PErz6u+S6KNPsbOZLWiwQ8do+9Ws39SsGrwIzDjvDgFc+yWk3zu8pbjPxsBnWw8YWelf7li5ZNBS8uZrFroZNs3BGSEs79SO5kEMll9U1G+vCkCBH5ONp7wzYqEygIyOrG+JFkBuP14iGB+yxtlc5HBLIqnJ9Nhe881UzJZzaCVfROB7wiwOHxFooKLFd/V96rZzwXfpf7/t8uTFatzgh3g6+ObhNP7/7O5XpPJah270kgch8UFehwtowaP/qUv5sAuevOoP3PsllE+d+fJ8a9FADvDvrAdW6jjsiXSvizd0edXJip4XPEJMCS+R5fvmtmgPiIUmZZExS/K
7HZWY2e/hYyw+2pjG+HxSX30E3Z0NlqkXx8c7HjR1REjtoLBYy+thBWp3xE+Lc/5/EwEnpKsOG3q2f9pDcz6hPw4+ggQOS9NVlqV+6SkndXstlFGx71WIoNZbY5Pq9fOeFTuEYNnAeN/Fi/9tpJcpH5H+LQy45ha+WpVbu+0P7mwSCBgCmMrG8+nju5PY1azK8ln55fs2MZumbEX+vpEhT5avrDLX2n8rE8oxs/G8tLcvu6yZKVfs6AACu8oRInKJzDPaOldQNpdmu8XaWye9qe+E6Rw7FGDlJ5qjoYd1IIf9tKaK1K/PT6lb+yNeeGNMeCA/foTA/V/4l06JvgQJJERPbR2uU+U8wqeV+wFWwJvf9V86JQcoz6h+fGJUzxelqy0mhH4EgqnGA16AmTmHhnMZuZ9574ELPSIIY+UnsErMJYWLNIfCa1U1D7KS4mG6np80o/AUzqeFCYnjj/E31V3rY69PvBZFpdZxhF4ur2M+ITQ0KLT243ad++XJ6uRoKLgpY90BAAchQCpxKak16MXGcx2wX6n8ejGH9vV5OsZvHRUOs5V4KzpWWNr7TWeSvU9PhmjPiVesUn4eXLRorKEp7B+snxX8/50exH/I3bNQoY4TYI+UW6RrAhIcn6yMpeyNGDxTLsNjmrn3ipysJLztcZlWxsBdERwlt5avXsGLx2VaGlsTX8aW2tv8eXbNFfLMXFGLaLseHh/h2SlY/lSwNEC0cqdz20Enm4vIz4BAtgL9sFG41S5LFkhjBSn3RFHJwQGkpUv7KYQvnasomRVcipLKzKYWbr5/PVDnyXdeWx6Bi8d3T1ZKSGVEqOS9wgeHp87vdcSrr5Z2QXjnfi+Iy/vYC893xXu6qfYrvrI+2XJCqaVnLRCY4WKk5cKK7yWY5DBW+2iqUBYCjDqk/c1BEYNVP3QRalIR09OVtqR1Oy5JPcd62rJSovHpyfjKzFvJaun2EvPd4Unsfi0bVyarCTYyB3hW1tKEt0IOAqEmV80f1wAAAUOSURBVKxGUJ/ro91yS09Q7Bm8dFRKVtAmgNb0p7G19hmJenzqRxR+LtWXjgeZn3YFJ2Tp4TXK8wm6nH6AqS3oGX+krbRyph1fpF3JrnfiYenXnk/IV5vrRL3s4i72siJjzyegqVOFkm2szFkbE56sMFZ7eUXVGLH1OnJorVKZo3REKHAtDzyv8GF5yucyAuBKkGoFp55OCHQ1fSoZKZH1aJW5HKsVbR+s7WgSEoFbjslxNPJzEZx80XE1tMFIO5RSXz+29X6KrhYHwhs+kRn9lPwRHJCdZMyzAnTJN1vy+LZT8vl5Tr/fxV5W5RzxiSt2VfAflqxWwSiNU6auObS21xg0fUtOVKKbdfEIEIAJVhj1alEAL+2UcQQCZSsZrs7rx404JnzYXYQctcSfsCGQqygIK9mpvnVnDrsbi6DL/OAKP76Ag3BXIqr5IomdvpKfey2x+Xn0Dp52V3paPs17xR1ZrrSXll5X5O35hGL1jD2v8MGYWyYrHBMnqRUMAOOmj3XiWv+sP4sABosuVg1WDuZ1rlV6KYmdkKjnmLNzKgkjH5dwsvJgy9o9Iv/IwmuE7iyvK/2lH3hGDhIffsnFuwrPYGuTttpK97vIV+LtZN0d5e75hF9IncTnlsnqpMBJ+wwCBFwbhGdnkVNAg+BGcNeCxAa+Wboz/cVD6xhwhp79TkUigq5PRtQpiNtgxTw6QWCnYkuPLkkEuszJs5IkeFKi5BS/8MfFnNC2+mJu+GVu2n3RWB070t6TD/rYCWOhj4zaCdIWJZ/n9fR7T+6r9GrlbGFJGzZm9W3HRj9/94Jo6knvYxDAkQgeNujMCk/wU9DBCQhIJK2rSssxV3ggOPsAjWPLubl7zHiHDxVODpRkVNejC47QVoIgoLPrIRiq0LZ7KiG8RFN3qzPsgkJfjwX1OhYVJtT15GMMdIUNd8ZjMyoR8onWVfee3Ffp1corHXvdoWPwRhdXlUxWVyH9AfNg
2Pb7xdNErjnmqhwEUfBQ8MaxeVcA187JJniCAnyokGB8UunR1Vif+FTPPSLQKNGIf+QiSZAcfUEmH/DoQwBmjC2j8pWCu+hEyCdaV91H5T6tVytvzSewS7v4sWNOPWeyOoXsh9LFgBWMnwaBHJNgYK9SkB2RjdU+Y6FF8PQ7xZFkRXLzwb9HF97QAfMqUVp+qau12X4jzwRYYYXulbj82FqyAhMuW0bkoz+Yemyoj5TP8nX6eUTu03od8QHwxa7h98qSyepKtHOuRMAgoKBqAzwBmN0GRe0ryR8aJLpSIcAzz5WllqzgUfLO8NMK2q+Qb4b3nb530+uOLLNjM1nNIpb9E4FABAjWOk5RkCVJUbTzYgU7G9DZrYiuZ1eJg3lKOxPfP+IdXpDVr8bZlSEn8vm21rzCqtTnFfKV+DhRdze9npCxRjOTVQ2ZrE8ELkCAHQLfbAjaBHP7wZpkwg6I+tndFWMI2qWib01XHOXAg44JdbfHqsheO84r8a46gralo3ruV8pn573i+S56vUJWP0cmK49IvicCiUAikAjcDoFMVrdTSTKUCCQCiUAi4BHIZOURyfdEIBFIBBKB2yGQyep2KkmGEoFEIBFIBDwCmaw8IvmeCCQCiUAicDsEMlndTiXJUCKQCCQCiYBHIJOVRyTfE4FEIBFIBG6HQCar26kkGUoEEoFEIBHwCGSy8ojkeyKQCCQCicDtEMhkdTuVJEOJQCKQCCQCHoH/AoHIE1KJ48dnAAAAAElFTkSuQmCC\" width=\"427\" height=\"115\"\u003e\u003c/div\u003e\n\u003c/div\u003e\n\u003cp\u003eWhen the model contains covariates, it is referred to as a conditional model, otherwise it is an unconditional model. The unconditional model provides results at the cohort level and the conditional model is responsible for producing patient-level results.\u003c/p\u003e\n\u003cp\u003eA final, yet important element of the proposed methodology comes in the form of velocity plots, which are useful when examining the direction and speed in which ctDNA levels change at a given point in time i.e. the instantaneous rate of change (IRC) is of interest. Each model generated patient trajectory has at its heart, a cubic spline. An advantageous property of cubic splines is they are twice differentiable, thus, the IRC at a given time point can be calculated.\u003csup\u003e21\u003c/sup\u003e In the case of the adopted spline model, this amounts to taking the first derivative of Eq. 
(\u003cspan class=\"InternalRef\"\u003e1\u003c/span\u003e) with respect to time, resulting in:\u003c/p\u003e\n\u003cdiv id=\"Equ3\" class=\"Equation\"\u003e\n \u003cdiv class=\"mathdisplay\" id=\"FileID_Equ3\" name=\"EquationSource\"\u003e$$\\frac{{dY}_{ij}}{d{t}_{ij}}={\\pi }_{1i}+2{\\pi }_{2i}{t}_{ij}+3{\\pi }_{3i}{t}_{ij}^{2}+\\sum _{k=1}^{K}{3\\pi }_{(k+3)i}{\\left({t}_{ij}-{ϵ}_{k}\\right)}_{+}^{2}$$\u003c/div\u003e\n \u003cdiv class=\"EquationNumber\"\u003e3\u003c/div\u003e\n\u003c/div\u003e\n\u003cp\u003eThe value of the IRC is given by the slope of the line tangent to the patient trajectory: positive values indicate that ctDNA levels are increasing, negative values indicate that they are decreasing, and a value of zero indicates that a peak or trough was reached or that the trajectory is flat. The further the IRC value is from zero, the more extreme the rate of change is.\u003c/p\u003e\n\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e\n \u003ch2\u003eSTATISTICAL ANALYSIS and RESULTS\u003c/h2\u003e\n \u003cp\u003eData were extracted using SAS software package 9.4 (SAS Institute, Cary, NC, USA) and all statistical analyses for the HCSREM were performed using R version 4.1.3. A total of 400 patients with advanced NSCLC were identified from the GuardantINFORM database as having at least three G360 tests. Seventy-three patients were excluded as their first test was more than 120 days after therapy initiation and five were excluded due to germline mutations. Of the remaining patients, 163 received anti-EGFR therapy with a total of 561 ctDNA longitudinal measurements; these 163 patients defined the cohort used in the analysis. 
The average age of these patients was 62 years, 66% of them were females, average line of anti-EGFR therapy was 1 and the average time between G360 test and treatment initiation was 0 days (range \u0026minus;\u0026thinsp;115 days to 30 days) (Table \u003cspan class=\"InternalRef\"\u003e1\u003c/span\u003e).\u003c/p\u003e\n \u003cp\u003e\u003c/p\u003e\u0026nbsp;\u003ctable id=\"Tab1\" border=\"1\"\u003e\n \u003ccaption language=\"En\"\u003e\n \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e\n \u003cdiv class=\"CaptionContent\"\u003e\n \u003cp\u003eSummary of Patient Characteristics\u003c/p\u003e\n \u003c/div\u003e\n \u003c/caption\u003e\n \u003cthead\u003e\n \u003ctr\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eCharacteristics (Total N\u0026thinsp;=\u0026thinsp;163)\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eN/Mean\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003e%/Standard Deviation\u003c/p\u003e\n \u003c/th\u003e\n \u003c/tr\u003e\n \u003c/thead\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eAge (years)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e61.18\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e10.88\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eFemale\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e108\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e66%\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eELIX score\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e1.89\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e1.86\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd 
align=\"left\"\u003e\n \u003cp\u003eCurrent or prior smoker\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e123\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e75%\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eLine of anti-EGFR therapy\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e1.44\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e0.99\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eTime between G360 test and treatment initiation (days)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e0.29\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e31.98\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003ectDNA (%)*\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e5.66\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e10.59\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eDeceased at the end of study period\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e55\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e33%\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n \u003ctfoot\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"3\"\u003e*ctDNA value was extracted from each test and summarized, thus includes multiple ctDNA values for each patient.\u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tfoot\u003e\n \u003c/table\u003e\n \u003cp\u003e\u003c/p\u003e\n \u003cp\u003e[Insert Table \u003cspan class=\"InternalRef\"\u003e1\u003c/span\u003e]\u003c/p\u003e\n \u003cp\u003eTo meet model assumptions, ctDNA levels, 
expressed as a percentage, were transformed into logits. Histograms in Fig. \u003cspan class=\"InternalRef\"\u003e1\u003c/span\u003e show that logit transformation of ctDNA level alleviated the extreme skewness of the raw data. Likewise, the logit-transformed spaghetti plot accentuates the complexity and variability of ctDNA level over time, both within and between patients.\u003c/p\u003e\n \u003cp\u003e[Insert Fig. \u003cspan class=\"InternalRef\"\u003e1\u003c/span\u003e]\u003c/p\u003e\n \u003cp\u003eIn our example we concentrate on conducting an exploratory investigation of the data. To begin, an unconditional model was fit to the transformed data using knots set at 50, 125, 250, 500, 750, 1000, and 1250 days. To ensure consistency, other knot placements were explored, although different placements did little to alter results. Results are presented graphically because spline model parameter estimates are difficult to interpret, although parameter estimates and related output are provided in the supplemental information for reference.\u003csup\u003e22\u003c/sup\u003e The graphical manifestation of the unconditional model, referred to as a response pattern, is presented in Fig. \u003cspan class=\"InternalRef\"\u003e2\u003c/span\u003e.\u003c/p\u003e\n \u003cp\u003e[Insert Fig. \u003cspan class=\"InternalRef\"\u003e2\u003c/span\u003e]\u003c/p\u003e\n \u003cp\u003eThe response pattern (given in black) suggests ctDNA levels drop substantially between the first G360 test and 30 days, then rise rapidly until 150 days, at which point ctDNA levels dip slightly and rise again at around 300 days, although at a less extreme rate. Additionally, from 550 days to 1000 days ctDNA levels drop, and then rise again from 1000 to 1600 days. The corresponding 95% confidence band expands over time as the number of datapoints decreases. The flexibility built into the unconditional model revealed details hidden within the data that simpler models would not detect. 
Despite this, the unconditional model only estimates the response pattern for the cohort and does not account for the possibility that patients with different characteristics may exhibit different response patterns. To assess the impact of incorporating patient characteristics, a conditional model that incorporated all baseline covariates was fit to the data. As is typical in hierarchical models, all numerical covariates were centered about their respective means (Hofmann and Gavin, 1998). Figure \u003cspan class=\"InternalRef\"\u003e3\u003c/span\u003e shows how baseline age and health status, as measured by the ELIX score, impact response patterns in female non-smokers receiving their first line of EGFR-TKI treatment. Results are shown separately for patients who were alive and those who were deceased at the end of the study period. Because data become sparse after 400 days, we examine the first 400 days only.\u003c/p\u003e\n \u003cp\u003e[Insert Fig. \u003cspan class=\"InternalRef\"\u003e3\u003c/span\u003e]\u003c/p\u003e\n \u003cp\u003eExamples presented above reveal that patients with different characteristics have different response patterns. In the top-left panel, the response curves for a 30-year-old and an 80-year-old with average ELIX scores are contrasted. These results suggest 80-year-old patients did not exhibit the initial post-treatment drop in ctDNA levels seen in 30-year-old patients, who demonstrated a rapid decrease followed by a rapid increase. The top-middle panel indicates that response patterns for patients of average age with a maximum ELIX score of 13 appear quite different from those for patients of the same age with a minimum ELIX score of 0, implying patients with many comorbidities exhibited a delayed treatment response. In the top-right panel, response patterns are displayed for older patients with high comorbidity burden and younger patients who are otherwise healthy, illustrating how the age/health status combination amplifies the disparity in response patterns. 
Though not shown, beyond 400 days, a decreasing trend in ctDNA values is observed for patients who remained alive at the end of the study, while the trend increases for patients who died before study end.\u003c/p\u003e\n \u003cp\u003eTo focus on the response pattern\u0026rsquo;s behavior, velocity plots that display the IRC for a corresponding response pattern were generated (Fig. \u003cspan class=\"InternalRef\"\u003e4\u003c/span\u003e).\u003c/p\u003e\n \u003cp\u003e[Insert Fig. \u003cspan class=\"InternalRef\"\u003e4\u003c/span\u003e]\u003c/p\u003e\n \u003cp\u003eIn general, information presented in a velocity plot can be gleaned from the response patterns themselves, but the differences in the response patterns are accentuated when examining them through an IRC lens. Thus, comparing velocity plots can provide additional clues as to where response patterns are similar and where they diverge based on the IRC value. Another advantage of velocity plots arises when baseline values between response patterns are dissimilar, so that differences between response patterns may simply reflect different biomarker values at the outset. In these instances, using velocity plots to make comparisons may be more appropriate as the IRC is invariant to the biomarker\u0026rsquo;s baseline value.\u003c/p\u003e\n \u003cp\u003eInterpreting a velocity plot is relatively straightforward. To demonstrate, we focus on the far-left panels. In the first 100 days, velocity plots for 80-year-old alive and deceased patients (red curves) exhibited different patterns. For survivors, the IRC was initially positive but fell to zero around 20 days (indicating a peak in the corresponding response curve as referenced by the dashed line), and then became negative, with the fastest rate of decrease in ctDNA (-0.026 logits per day) occurring around 43 days. Beyond 43 days ctDNA levels continued to decrease, though more slowly, and the IRC remained relatively flat past 100 days. 
In contrast, the velocity plot for 80-year-old deceased patients displays a nearly opposite pattern.\u003c/p\u003e\n\u003c/div\u003e"},{"header":"DISCUSSION","content":"\u003cp\u003eThe purpose of this paper is to introduce a statistical methodology that can accommodate the analysis of complex longitudinal genomic data. In our example we applied the methodology to observational data and used it for exploratory purposes. However, this methodology is applicable in different data settings and can be used for hypothesis generation, statistical inference, and patient monitoring. In this section we discuss these applications while considering their limitations and conclude with discussing model modifications and sample size requirements.\u003c/p\u003e \u003cp\u003eIn observational data settings, as was the case in our example, the proposed method is often leveraged as an exploratory tool. When data exploration is the aim, the 95% confidence bands fail to retain their traditional inferential meaning and instead are used as \u0026lsquo;guidelines\u0026rsquo; in identifying differences in response patterns. Since thousands of response patterns are available (there are 58,464 possible response patterns in our example), exploring results can be daunting. To alleviate this issue, RShiny, Excel, or a similar platform can be used to create an interactive tool that generates a response pattern for each covariate combination, allowing for the interrogation of a large volume of results. During this process, caution should be taken in extrapolating beyond the data range or investigating nonsensical covariate combinations. Many observations may be made in the data exploration process. For instance, results may support predefined conjecture, or may reveal unanticipated findings prompting further investigation in a follow-up study designed to evaluate the finding. 
For instance, results presented in Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e may motivate a study that assesses whether older patients with more comorbidities exhibit a delayed treatment response in comparison to the response of younger, healthier patients.\u003c/p\u003e \u003cp\u003eThis paper has focused on analyzing observational data, but this framework can be applied to representative cohorts as well. If statistical inference is the goal, the number of comparisons should be kept small and based on a priori hypotheses, because numerous response patterns can be generated and compared, and common considerations such as controlling for type-I error should be addressed. Hypotheses may include comparing response patterns between patients with pre-determined sets of covariate values (where other study covariates can be used as statistical controls) but can also include hypothesizing about the nature of the relationship between response pattern behavior and the covariate values themselves.\u003c/p\u003e \u003cp\u003ePatient monitoring is another application of our proposed framework. The general idea is that each response pattern is a reasonable portrayal of a patient as described by his or her own unique set of characteristics, and in this way the same response pattern can serve as a reference for a new patient who shares these characteristics. Additionally, if survival status (deceased or not) is incorporated into the model, a reference response pattern for survivors and non-survivors can be created. Thus, if the response pattern of a new patient is consistent with that of a survivor, intervention is unnecessary, but if the response pattern mirrors that of a non-survivor, intervention may be required. Utilizing velocity plots to compare response patterns can further enhance this process, especially if baseline values between response patterns are dissimilar. 
To ensure reliable classification, such a monitoring system should undergo internal and external validation. Internal validation may be achieved by creating training and test datasets and then applying, for example, k-fold cross-validation to assess classification accuracy. If an acceptable level of accuracy is achieved, external validation can be accomplished by confirming that new patients (i.e., those not involved in cross-validation) are also classified with a high degree of accuracy.\u003c/p\u003e \u003cp\u003eModifications can be made to improve model performance. For instance, the relationship between the response parameter and covariate(s) may be non-linear, and therefore, imposing a linear restriction as done in Eq.\u0026nbsp;(\u003cspan refid=\"Equ2\" class=\"InternalRef\"\u003e2\u003c/span\u003e) will fail to correctly specify the model. In such cases, a scatterplot can be used to identify the correct relationship and appropriate adjustments can be made. Model efficiency (i.e., smaller standard errors) can also be improved via model reduction, especially when multicollinearity is present. Although model reduction can be an arduous task due to the complicated nature of these models, the reduction process can be guided by information criteria such as the Akaike information criterion (AIC) or Bayesian information criterion (BIC), which balance model fit with model simplicity. A further option, applicable when timepoints are fully captured and equally spaced (not the case in our example), is to account for the correlation structure of the repeated measures within the model. Possible correlation structures include the first-order autoregressive, compound symmetry, and spatial power structures, where the AIC or BIC can facilitate the identification of an optimal structure. Finally, selecting an appropriate sample size is an important consideration when conducting an analysis. 
Minimum sample size recommendations for hierarchical models are around 100 patients with at least 3 measurements per patient, though these types of models have been fit with sample sizes as low as 22 individuals.\u003csup\u003e23\u0026ndash;25\u003c/sup\u003e\u003c/p\u003e"},{"header":"CONCLUSION","content":"\u003cp\u003eChanges in ctDNA levels can fluctuate significantly over time from patient to patient, and the results can be difficult to interpret. In this paper we introduce a statistical framework that is flexible and therefore capable of capturing these complexities while accounting for a diverse set of patient traits. Furthermore, we apply the modeling to analyze an observational dataset consisting of patients with NSCLC who received anti-EGFR therapy. Analytic results are presented graphically and demonstrate the utility of the approach in acquiring a comprehensive understanding of how response patterns evolve and how different patient characteristics influence these evolutions. In our example we demonstrate how the method can be used as a \u0026ldquo;high powered\u0026rdquo; exploratory technique, although we have outlined many other applications. Regardless of the desired purpose, a major advantage of our proposed framework is the ability to generate patient-level results, where such results can add to our understanding of ctDNA dynamics and enhance our ability to integrate ctDNA into clinical decision-making.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eSOFTWARE\u003c/strong\u003e\u003cstrong\u003e\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eR Core Team (2022). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAcknowledgments\u0026nbsp;\u003c/strong\u003e\u003cstrong\u003e\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe thank Dr. 
Aaron Hardin, PhD for insights regarding ctDNA capture and the limit of detection of the G360 test. \u003cstrong\u003e\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthor Contributions\u0026nbsp;\u003c/strong\u003e\u003cstrong\u003e\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eC.R.P. conceived of the presented model and computational framework, analyzed the data, presented the results, and authored the manuscript with critical input from all co-authors. J.L. created the cohort, aided in the development of the model and computational framework, provided analytic support, and authored the manuscript. C.W., L.D., L.B., \u0026amp; A.D. provided critical content knowledge expertise, guidance, and input, and authored the manuscript. \u0026nbsp; \u0026nbsp;\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eData Availability Statement\u003c/strong\u003e\u003cstrong\u003e\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe datasets generated during and/or analyzed during the current study are not publicly available and cannot be shared due to the use of a third-party healthcare claims database. Researchers interested in replicating our study or pursuing new research topics should contact Guardant Health (https://guardanthealth.com/products/biopharma-solutions/real-world-evidence/) directly. \u003cstrong\u003e\u0026nbsp;\u003c/strong\u003e\u003cstrong\u003e\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAdditional Information\u003c/strong\u003e\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eCompeting Interests: All manuscript authors are employed by Guardant Health. Each receives an annual salary, bonus, and stock options that are commensurate with the author\u0026rsquo;s job description, experience, and level of education. \u0026nbsp;\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eRolfo, C. 
\u003cem\u003eet al.\u003c/em\u003e Liquid Biopsy for Advanced NSCLC: A Consensus Statement From the International Association for the Study of Lung Cancer. Journal of Thoracic Oncology 16, 1647\u0026ndash;1662 (2021).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePascual, J. \u003cem\u003eet al.\u003c/em\u003e ESMO recommendations on the use of circulating tumour DNA assays for patients with cancer: a report from the ESMO Precision Medicine Working Group. Annals of Oncology 33, 750\u0026ndash;768 (2022).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMcLaren, D. B. \u0026amp; Aitman, T. J. Redefining precision radiotherapy through liquid biopsy. Br J Cancer 129, 900\u0026ndash;903 (2023).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSanz-Garcia, E., Zhao, E., Bratman, S. V. \u0026amp; Siu, L. L. Monitoring and adapting cancer treatment using circulating tumor DNA kinetics: Current research, opportunities, and challenges. Sci. Adv. 8, eabi8618 (2022).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003e\u0026Oslash;rntoft, M.-B. W. \u003cem\u003eet al.\u003c/em\u003e Age-stratified reference intervals unlock the clinical potential of circulating cell-free DNA as a biomarker of poor outcome for healthy individuals and patients with colorectal cancer. Int J Cancer (2020) doi:\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1002/ijc.33434\u003c/span\u003e\u003cspan address=\"10.1002/ijc.33434\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHuang, R. S. P. \u003cem\u003eet al.\u003c/em\u003e Circulating Cell-Free DNA Yield and Circulating-Tumor DNA Quantity from Liquid Biopsies of 12 139 Cancer Patients. Clinical Chemistry 67, 1554\u0026ndash;1566 (2021).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRaudenbush, S. W. \u0026amp; Bryk, A. S. 
\u003cem\u003eHierarchical Linear Models: Applications and Data Analysis Methods\u003c/em\u003e. (Sage Publ, Thousand Oaks, Calif., 2010).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSinger, J. D. \u0026amp; Willett, J. B. \u003cem\u003eApplied Longitudinal Data Analysis: Modeling Change and Event Occurrence\u003c/em\u003e. (Oxford University Press, New York, 2003). doi:\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1093/acprof:oso/9780195152968.001.0001\u003c/span\u003e\u003cspan address=\"10.1093/acprof:oso/9780195152968.001.0001\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHedeker, D. \u0026amp; Gibbons, R. D. \u003cem\u003eLongitudinal Data Analysis.\u003c/em\u003e xx, 337 (Wiley-Interscience, Hoboken, NJ, US, 2006).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFitzmaurice, G. M., Laird, N. M. \u0026amp; Ware, J. H. \u003cem\u003eApplied Longitudinal Analysis\u003c/em\u003e. (Wiley, Hoboken, 2011).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWelham, S., Cullis, B., Gogel, B., Gilmour, A. \u0026amp; Thompson, R. Prediction in linear mixed models. Aust NZ J Stat 46, 325\u0026ndash;347 (2004).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMackenzie, M. L., Donovan, C. R. \u0026amp; McArdle, B. H. Regression Spline Mixed Models: A Forestry Example. Journal of Agricultural, Biological, and Environmental Statistics 10, 394\u0026ndash;410 (2005).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eStraube, J., Gorse, A.-D., PROOF Centre of Excellence Team, Huang, B. E. \u0026amp; L\u0026ecirc; Cao, K.-A. A Linear Mixed Model Spline Framework for Analysing Time Course \u0026lsquo;Omics\u0026rsquo; Data. \u003cem\u003ePLoS ONE\u003c/em\u003e 10, e0134540 (2015).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePretz, C. R., Kozlowski, A. 
J., Chen, Y., Charlifue, S. \u0026amp; Heinemann, A. W. Trajectories of Life Satisfaction After Spinal Cord Injury. Archives of Physical Medicine and Rehabilitation 97, 1706\u0026ndash;1713.e1 (2016).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGrajeda, L. M. \u003cem\u003eet al.\u003c/em\u003e Modelling subject-specific childhood growth using linear mixed-effect models with cubic regression splines. Emerg Themes Epidemiol 13, 1 (2016).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYu, Z. \u003cem\u003eet al.\u003c/em\u003e Beyond t test and ANOVA: applications of mixed-effects models for more rigorous statistical analysis in neuroscience research. Neuron 110, 21\u0026ndash;35 (2022).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eJanssen, J. M. \u003cem\u003eet al.\u003c/em\u003e Longitudinal nonlinear mixed effects modeling of EGFR mutations in ctDNA as predictor of disease progression in treatment of EGFR-mutant non‐small cell lung cancer. Clinical Translational Sci 15, 1916\u0026ndash;1925 (2022).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003evon Elm, E. \u003cem\u003eet al.\u003c/em\u003e The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies. J Clin Epidemiol 61, 344\u0026ndash;349 (2008).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003evan Walraven, C., Austin, P. C., Jennings, A., Quan, H. \u0026amp; Forster, A. J. A modification of the Elixhauser comorbidity measures into a point system for hospital death using administrative data. Med Care 47, 626\u0026ndash;633 (2009).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGauthier, J., Wu, Q. V. \u0026amp; Gooley, T. A. Cubic splines to model relationships between continuous variables and outcomes: a guide for clinicians. 
Bone Marrow Transplant 55, 675\u0026ndash;680 (2020).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhou, S. \u0026amp; Wolfe, D. A. ON DERIVATIVE ESTIMATION IN SPLINE REGRESSION. Statistica Sinica 10, 93\u0026ndash;108 (2000).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eShepherd, B. E., Rebeiro, P. F., \u0026amp; Caribbean, Central and South America network for HIV epidemiology. Brief Report: Assessing and Interpreting the Association Between Continuous Covariates and Outcomes in Observational Studies of HIV Using Splines. J Acquir Immune Defic Syndr 74, e60\u0026ndash;e63 (2017).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHuttenlocher, J., Haight, W., Bryk, A., Seltzer, M. \u0026amp; Lyons, T. Early vocabulary growth: Relation to language input and gender. Developmental Psychology 27, 236\u0026ndash;248 (1991).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFan, X. \u0026amp; Fan, X. Power of Latent Growth Modeling for Detecting Linear Growth: Number of Measurements and Comparison with Other Analytic Approaches. The Journal of Experimental Education 73, 121\u0026ndash;139 (2005).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMuth\u0026eacute;n, B. O. \u0026amp; Curran, P. J. General longitudinal modeling of individual differences in experimental designs: A latent variable framework for analysis and power estimation. Psychological Methods 2, 371\u0026ndash;402 (1997).\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"","lastPublishedDoi":"10.21203/rs.3.rs-3788054/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-3788054/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eAs circulating tumor DNA (ctDNA) levels can reflect disease progression, achieving a comprehensive understanding of the temporal evolution of ctDNA is key to informing clinical decision making. However, temporal changes can exhibit complex non-linear patterns and differ substantially across patients. Additionally, patient characteristics and outcomes may impact temporal change. Thus, traditional statistical approaches may be inadequate in characterizing ctDNA evolution over time. In this proof-of-concept study, we propose utilizing a new approach using a hierarchical random effects cubic spline model, which is sufficiently flexible to capture complex temporal ctDNA patterns while supporting the integration of patient characteristics. To demonstrate the benefits of the approach, a retrospective cohort of non-small cell lung cancer patients who received anti-EGFR therapies was analyzed. Model results are presented graphically in the form of patient-level response patterns, where each combination of patient characteristics produces a unique pattern. 
Patients with various ages, levels of health status, as well as mortality status were contrasted, where results provide examples of how the model can further our conceptualization of ctDNA dynamics and demonstrates how results can be used in targeted, patient-centered, clinical decision-making.\u003c/p\u003e","manuscriptTitle":"Longitudinal Assessment of Circulating Tumor DNA: A Proposed Statistical Framework","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2024-02-06 17:48:22","doi":"10.21203/rs.3.rs-3788054/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"704cc67b-2f6e-4b04-b5ad-817ace33f960","owner":[],"postedDate":"February 6th, 2024","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[{"id":28558115,"name":"Biological sciences/Computational biology and bioinformatics"},{"id":28558116,"name":"Biological sciences/Computational biology and bioinformatics/Statistical methods"}],"tags":[],"updatedAt":"2025-04-01T11:53:21+00:00","versionOfRecord":[],"versionCreatedAt":"2024-02-06 17:48:22","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-3788054","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-3788054","identity":"rs-3788054","version":["v1"]},"buildId":"qtupq5eGEP_6zYnWcrvyt","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}