Abstract
Background
Quantitative bias analyses often rely on unrealistic assumptions and do not fully reflect the complexities of healthcare data.
Methods
We describe a ‘plasmode’ simulation-based bias analysis for residual confounding from unmeasured variables that leverages granular information from a subset of cohort members. We generated 500 simulated cohorts based on individual-level claims and linked electronic health record (EHR) data identifying new users of varenicline and bupropion from the Mass General Brigham site of the FDA Sentinel Real World Evidence Data Enterprise. Two adverse outcomes were simulated: 1) neuropsychiatric hospitalizations and 2) major adverse cardiovascular events (MACE). Measured confounding factors were identified from information available in claims, including demographics, comorbid conditions, and comedications, and were tailored to each outcome. Residual confounding was simulated using potential confounders measured in EHRs but unmeasured in claims: suicidal ideation for the neuropsychiatric outcome, and body mass index (BMI), blood pressure (BP), and smoking pack-years for the MACE outcome. These simulations retained the correlation between claims-based and EHR-based confounders observed in the empirical data, so that they realistically reflect proxy adjustment for unmeasured confounders. Analyses were conducted in the simulated data with and without adjustment for the EHR-based covariates to evaluate the extent of residual confounding in claims-only analyses.
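To illustrate the general logic of a plasmode simulation, here is a minimal Python sketch. It is not the authors' software. It assumes a synthetic stand-in for the linked claims-EHR subset, one binary claims covariate, one binary EHR-only confounder, and a risk-difference estimand obtained by stratified standardization. All function names, prevalences, and coefficients are hypothetical. The key idea it shows is that exposure and covariates are resampled as intact rows, so their observed correlations are preserved, while only the outcome is simulated with a known treatment effect.

```python
import random
from collections import defaultdict


def make_source_cohort(n, rng):
    """Synthetic stand-in for the linked subset; every probability is hypothetical."""
    cohort = []
    for _ in range(n):
        u = int(rng.random() < 0.10)                        # EHR-only confounder
        c = int(rng.random() < (0.60 if u else 0.15))       # claims proxy correlated with u
        t = int(rng.random() < 0.30 + 0.10 * c + 0.20 * u)  # exposure depends on both
        cohort.append((t, c, u))
    return cohort


def standardized_rd(rows, key):
    """Risk difference standardized over the strata defined by key(c, u)."""
    # Per stratum: [n treated, events treated, n untreated, events untreated]
    cells = defaultdict(lambda: [0, 0, 0, 0])
    for t, c, u, y in rows:
        cell = cells[key(c, u)]
        cell[0 if t else 2] += 1
        cell[1 if t else 3] += y
    total = weighted = 0.0
    for n1, e1, n0, e0 in cells.values():
        if n1 and n0:  # skip strata missing one treatment arm
            weighted += (n1 + n0) * (e1 / n1 - e0 / n0)
            total += n1 + n0
    return weighted / total


def plasmode_relative_bias(source, true_rd=0.02, n_sims=500, seed=0):
    """Return relative bias of (claims-only, claims plus EHR) adjusted estimates."""
    rng = random.Random(seed)
    claims_only, with_ehr = [], []
    for _ in range(n_sims):
        rows = []
        # Resample whole rows so exposure-covariate correlations are kept.
        for t, c, u in rng.choices(source, k=len(source)):
            # Hypothetical additive outcome model with a known treatment effect.
            risk = 0.05 + true_rd * t + 0.03 * c + 0.10 * u
            rows.append((t, c, u, int(rng.random() < risk)))
        claims_only.append(standardized_rd(rows, lambda c, u: c))
        with_ehr.append(standardized_rd(rows, lambda c, u: (c, u)))

    def rel_bias(estimates):
        return (sum(estimates) / len(estimates) - true_rd) / true_rd

    return rel_bias(claims_only), rel_bias(with_ehr)


source = make_source_cohort(5000, random.Random(1))
bias_claims, bias_ehr = plasmode_relative_bias(source, n_sims=200)
print(f"claims-only: {bias_claims:.2f}, claims + EHR: {bias_ehr:.2f}")
```

In this toy setup the EHR-only confounder is deliberately strong, so the claims-only analysis is visibly biased. With real data, the size of that gap is the quantity the bias analysis is meant to estimate.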
Results
Across 500 simulations, the median absolute standardized mean difference (ASMD) between treatment groups in the unadjusted sample was 0.16 for suicidal ideation and <0.1 for BMI, BP, and smoking pack-years. For both outcomes, adjustment for claims-based variables yielded relative bias close to 0, indicating that EHR-measured confounders that were unmeasured in claims were unlikely to result in strong residual confounding within realistic simulations informed by empirical data.
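As a reminder of how a covariate balance metric of this kind is typically computed, the sketch below uses one common definition of the ASMD: the absolute difference in group means divided by the square root of the average of the two group variances. This is an assumption, since the exact formula used in the study is not stated here, and the function name and example values are hypothetical.

```python
from statistics import mean, median, variance


def asmd(treated, comparator):
    """Absolute standardized mean difference between two groups."""
    pooled_sd = ((variance(treated) + variance(comparator)) / 2) ** 0.5
    return abs(mean(treated) - mean(comparator)) / pooled_sd


# Group means 2 and 3, both sample variances 1, so the ASMD is 1.0.
print(asmd([1, 2, 3], [2, 3, 4]))  # → 1.0

# Summarizing across simulated cohorts: one ASMD per cohort, then the median.
per_simulation = [0.14, 0.16, 0.18]  # hypothetical values
print(median(per_simulation))  # → 0.16
```

An ASMD below 0.1 is a widely used rule of thumb for acceptable balance, which is the threshold the results above refer to.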
Conclusion
The proposed approach provides a method for quantifying bias in non-randomized studies threatened by unavailability of potentially important confounding variables.
Key points
Residual confounding by unmeasured factors is a central threat in pharmacoepidemiology that is almost always acknowledged in published studies but seldom quantified.
We describe a plasmode simulation-based approach to systematically design quantitative bias analyses that reflect the complexities of routinely collected healthcare data by leveraging detailed electronic health records from a subset of cohort members.
We provide open-source software code to enable other researchers to adopt this method in future studies and improve the reliability of their findings.
Plain language summary
This study introduces a new way for researchers to better understand and measure bias caused by missing health information in large insurance databases. Using detailed hospital records alongside insurance claims data, we created realistic computer simulations to test how much of the observed risk in safety studies could be explained away by missing important health factors, like depression or smoking habits, that aren’t always recorded in insurance data. The approach is flexible, uses real patient data, and helps researchers make stronger, more reliable conclusions about risks and benefits of treatments, even when some patient information is not available in all records.
Competing Interest Statement
Dr. Desai reports serving as Principal Investigator on investigator-initiated grants to the Brigham and Women’s Hospital from Novartis, Vertex, and Bayer on unrelated projects. Dr. Ball is an author on US Patent 9,075,796, “Text mining for large medical text datasets and corresponding medical text classification using informative feature selection.” At present this patent is not licensed and does not generate royalties. Dr. Wang reports ad hoc consulting to Exponent Inc, Cytel Inc, and MITRE, an FFRDC for the Centers for Medicare and Medicaid Services, for unrelated work.
Funding Statement
This project was supported by Task Order 75F40123F19010 under Master Agreement 75F40119D10037 from the US Food and Drug Administration (FDA).
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Data used for the study were simulated from an identifiable individual-level patient cohort. Use of these data was approved by the Brigham & Women's Hospital IRB as a Public Health Surveillance activity per Health and Human Services regulations set forth in 45 CFR 46 supporting the FDA Sentinel Initiative.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Footnotes
* Primary Investigator contact information: rdesai{at}bwh.harvard.edu
Joint Authorship: N/A
Funding: This project was supported by Task Order 75F40123F19010 under Master Agreement 75F40119D10037 from the US Food and Drug Administration (FDA).
Conflict of Interest: Dr. Desai reports serving as Principal Investigator on investigator-initiated grants to the Brigham and Women’s Hospital from Novartis, Vertex, and Bayer on unrelated projects. Dr. Ball is an author on US Patent 9,075,796, “Text mining for large medical text datasets and corresponding medical text classification using informative feature selection.” At present this patent is not licensed and does not generate royalties. Dr. Wang reports ad hoc consulting to Exponent Inc, Cytel Inc, and MITRE, an FFRDC for the Centers for Medicare and Medicaid Services, for unrelated work.
Disclaimer: Representatives of the FDA reviewed a draft of the manuscript for presence of confidential information and accuracy regarding statement of any FDA policy. The views are those of the author(s) and do not necessarily represent the official views of, nor an endorsement, by FDA/HHS, or the U.S. Government.
Ethics: This Sentinel project is a public health surveillance activity conducted under the authority of the Food and Drug Administration and, accordingly, is not subject to Institutional Review Board oversight per Basic HHS Policy for Protection of Human Research Subjects, 45 CFR §46.102(l)(2).
Data Availability
All data produced in the present study are available upon reasonable request to the authors.