{"paper_id":"4c601dc7-75ae-4790-b42e-4fa1e6883f46","body_text":"npj | women's health Perspective\nhttps://doi.org/10.1038/s44294-024-00019-x\nData-driven insights can transform\nwomen’s reproductive health\nTomiko T. Oskotsky1, Ophelia Yin2, Umair Khan1, Leen Arnaout1 & Marina Sirota1,3\nThis perspective explores the transformative potential of data-driven insights to understand and\naddress women’s reproductive health conditions. Historically, clinical studies often excluded women,\nhindering comprehensive research into conditions such as adverse pregnancy outcomes and\nendometriosis. Recent advances in technology (e.g., next-generation sequencing techniques,\nelectronic medical records (EMRs), computational power) provide unprecedented opportunities for\nresearch in women’s reproductive health. Studies of molecular data, including large-scale\nmeta-analyses, provide valuable insights into conditions like preterm birth and preeclampsia. Moreover,\nEMRs and other clinical data sources enable researchers to study populations of individuals,\nuncovering trends and associations in women’s reproductive health conditions. Despite these\nadvancements, challenges such as data completeness, accuracy, and representation persist. 
We\nemphasize the importance of holistic approaches, greater inclusion, and refining and expanding on\nhow we leverage data and computational integrative approaches for discoveries so that we can benefit\nnot only women’s reproductive health but overall human health.\nMedicine relies on evidence from research to guide its practice, but\nhistorically, clinical studies routinely excluded women for reasons\nincluding hormonal variability, potential harm to fetuses, and the belief\nthat findings from research on men could be extrapolated to women 1.\nThese rationales and assumptions have hindered the study of how\nconditions like heart disease, diabetes, and Alzheimer’s disease may\naffect women differently than men, as well as the study of conditions\nassociated with women’s reproductive health, including adverse pregnancy\noutcomes, infertility, preterm birth (PTB), pre-eclampsia,\nrecurrent pregnancy loss, endometriosis, adenomyosis, fibroids, and\nothers 1,2. In addition, representation of women in clinical trials has been\ntraditionally lacking. Policy change is gradually resulting in improved\nrepresentation of women in clinical trials 3; nevertheless, research on\nwomen’s health conditions, particularly women’s reproductive health,\nremains underfunded and underprioritized 4–8.\nWith advances in technology over time, ever-growing amounts of data\nhave become available for basic science and translational research, such as\nmolecular measurements (genomics, bulk and single-cell transcriptomics,\nand proteomics) as well as epidemiological and clinical data, including electronic\nmedical records, clinical notes, images, and clinical trial data. Moreover,\nsignificantly greater computational power has allowed faster processing and\nanalysis of large amounts of data. 
These advances provide tremendous\nopportunities to investigate a myriad of scientific questions in order to better\nunderstand disease, discover novel diagnostics and therapeutics, make\nstrides in precision medicine, and more within many areas, including\nreproductive health sciences and women’s health more broadly.\n1Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA. 2Maternal–Fetal Medicine, Department of\nObstetrics, Gynecology & Reproductive Sciences, University of California, San Francisco, San Francisco, CA, USA. 3Department of Pediatrics, University of\nCalifornia, San Francisco, San Francisco, CA, USA. e-mail: tomiko.oskotsky@ucsf.edu; marina.sirota@ucsf.edu\nnpj Women's Health | (2024) 2:14\nThe advent of next-generation sequencing techniques and public data-sharing\nrepositories have led to vast amounts of molecular data becoming\nwidely available in recent years, enabling numerous studies and\nmeta-analyses to gain insights into women’s health conditions (Fig. 1). For\nexample, transcriptomics analyses have helped to enhance our understanding\nof endometriosis, a disorder affecting approximately 10% of\nwomen with pelvic pain and/or infertility whose diagnoses are made on\naverage a decade after onset of their pain 9. A study of eutopic endometrial\ntranscriptomics data leveraging whole tissue deconvolution and single-cell\nRNA sequencing (scRNAseq) analytic techniques shed light on the\nimmune as well as non-immune cells that most likely contribute to the\npro-inflammatory nature associated with this disorder 10. This endometrial\nexpression data has been used to query a repository of drug expression\ndata to identify and validate therapeutic candidates to treat endometriosis\nbased on expression reversal. Fenoprofen, a non-steroidal anti-inflammatory\ndrug (NSAID) rarely prescribed for endometriosis, was identified as a\ntop candidate and tested in an animal model of endometriosis, which\ndemonstrated its ability to successfully alleviate endometriosis-associated\nvaginal hyperalgesia 11. 
With regard to PTB, a condition that affects ~10% of\ninfants born each year and is the leading cause of infant morbidity and\nmortality worldwide 12, a meta-analysis of maternal and fetal transcriptomics\ndata found that immune signals are largely misregulated in women who end\nup delivering preterm, with a reversed signal observed in babies 13. This\nmaternal expression signature was further used to query a repository of drug\nexpression data to identify and validate therapeutic candidates to prevent\nPTB based on expression reversal. The study focused its validation efforts on\nlansoprazole, a proton-pump inhibitor, which has a strong reversal score\nand a good safety profile. Lansoprazole was tested in an animal inflammation\nmodel using LPS, which showed a significant increase in fetal viability\ncompared with LPS treatment alone 14.\nA number of large-scale genetic studies have explored genomic\nloci associated with PTB, including a landmark study by Zhang et al., which\nanalyzed 43,568 women of European ancestry using gestational duration\nas a continuous trait and term or preterm birth as a binary outcome 15. In the\ndiscovery and replication data sets, four loci (EBF1, EEFSEC, AGTR2, and\nWNT4) were significantly associated with gestational duration, and functional\nanalysis showed that an implicated variant in WNT4 alters the\nbinding of the estrogen receptor. To probe the role of environmental\nexposures in pregnancy outcomes, an analysis of 590 matched maternal and\ncord blood samples (total 295 pairs) using non-targeted analysis (NTA) was\nable to examine the differences in chemical abundance between maternal\nand cord blood samples and to hypothesize which chemicals are able to cross the\nplacenta 16. 
This has inspired further large-scale integrative analyses of\nwhole-genome sequences, RNAseq, and DNA methylation data to identify\ngenomic variants and biomarker genes associated with PTB, such as\nKnijnenburg et al.’s study of 270 PTB and 521 control family trios 17. In this\nstudy, they identified 72 candidate biomarker genes for very early PTB,\nassociated with growth signaling and immunity-related pathways such as\nNotch1 and IFN-γ signaling. In addition, they identified PTB-associated\ngenes RAB31 and RBPJ from all three data modalities.\nIn the microbiome space, there has been increased interest in the past\nfew decades in characterizing microbiome profiles across body sites in the\ncontext of pregnancy outcomes and identifying specific microbes that can be\nassociated with PTB. A meta-analysis of vaginal microbiome 16S rRNA\nsequencing data from five different studies confirmed that multiple known\nbacteria (e.g. Atopobium spp. and Prevotella spp.) and some novel organisms\n(Clostridium sensu stricto and Olsenella) are associated with PTB, and\ndetermined that diversity in the composition of the microbiome early\nduring pregnancy was associated with PTB 18. A study by Huang et al.\nintegrated cross-sectional and longitudinal vaginal microbiome data from\n12 previously published datasets and leveraged machine learning (ML)\nmodels to predict PTB from vaginal microbiome compositions, showing\nthat the vaginal microbiome is a strong predictor of early PTB 19.\nA microbiome project led by our team applied the novel technique\nMaLiAmPi 20 to aggregate and harmonize vaginal microbiome 16S rRNA\nsequencing data from a total of 11 different studies to see if PTB could be\nsuccessfully predicted from microbiome data. 
The ability to harmonize 16S\ndata across various studies marks a major contribution to the field, allowing\nresearchers to collate larger datasets and ask more advanced questions about\nthe effect of other factors, such as race and sampling time, on PTB. A\ncrowdsourcing strategy in the form of a DREAM challenge invited the\ncomputational and scientific communities to develop and apply ML\nalgorithms using this vaginal microbiome data to predict PTB. Model\nperformance was assessed by challenge organizers using a held-out validation\ndataset not available to challenge participants. Over 300 individuals\nengaged in this challenge, and top-performing models from this challenge\nachieved excellent prediction performance with an area under the receiver\noperating characteristic curve (AUROC) of up to 0.87. Moreover, features\nsuch as alpha diversity, VALENCIA community state types, and microbial\ncomposition were found to be important for the top-performing models 21.\nFig. 1 | Data-driven approach to women’s health. This diagram showcases a number of types of data that can be leveraged to improve women’s health research, including\ngenomics, transcriptomics, proteomics, microbiome, sociocultural, environmental exposures, EMRs and imaging. Created with BioRender.com.\nThe above serves as a model for the translation of both new and publicly\navailable molecular data into clinically relevant predictive models and a\nbetter understanding of the treatment and prevention of PTB. Moreover,\nstudies are expanding beyond the associations between PTB and the vaginal\nmicrobiome: for example, DiGiulio et al. studied the dynamics of vaginal,\ndistal gut, saliva, and tooth/gum microbiota throughout pregnancy in PTB\nvs. term birth (TB) cohorts 22. 
In addition, other cohorts and studies have been established,\nsupplementing vaginal microbiome data with investigations of oral\nand gut microbiome changes, among other microbiomes, in PTB vs. TB\npregnancies 23,24. Advancements in genomic sequencing, such as\nwhole-genome shotgun sequencing, allow scientists to go beyond ecological\ncommunity characterization in PTB-associated microbiomes, exploring\nspecies-level genetic profiles and trends that may be associated with PTB.\nLiao et al. introduced the term “microdiversity” to describe genomic\nmolecular diversity in their study that explored how evolutionary processes\ndrive mutagenesis, nucleotide diversity, and antimicrobial resistance in\nspecific species and in the vaginal microbial ecosystem 25.\nBeyond preterm birth, analyses of molecular data have improved our\nunderstanding of other reproductive health conditions, including\npreeclampsia. The pregnancy-specific hypertensive disorders of\npreeclampsia, severe preeclampsia, and eclampsia affect ~6% of the US\npopulation and confer significant obstetric morbidity and mortality 26.\nAccurate diagnostic tools, preventative measures, and therapeutic\ntreatments for preeclampsia have remained elusive in part due to\nheterogeneity in its clinical presentation. Recently, computational approaches\nhave made great strides in differentiating subtypes of preeclampsia using\ntranscriptional analyses, effectively grouping the disorder into maternal,\nimmunologic, and canonical groups based on gene expression 27 as well as\nearly (before 34 weeks gestation) vs. late (at or after 34 weeks gestation) onset\npreeclampsia 28. 
Another recent study identified 946 unique differentially\nexpressed genes in preeclampsia cited by prior microarray studies, defined\nthe “ignorome”, which included 445 candidate genes that had never been\nexperimentally explored, and utilized a biomedical knowledge graph to\nreveal 53 clinically relevant and biologically actionable mechanistic\nassociations 29. As technology has advanced from large chip microarray to\nbulk RNA sequencing and now to single-cell RNA sequencing, so too has\nour ability to gain more granular insight into the disorder. Most recently,\nimmune profiling of peripheral blood mononuclear cells in preeclampsia\nand single-cell analyses of preeclampsia placentas offer mechanistic insight\ninto individual cell-type contributions to the disorder 30, lending hypotheses\nthat can be tested in cell culture or animal models of preeclampsia. Taken\ntogether, the approaches demonstrate our ability to leverage molecular data\nto better understand the nature of this complicated condition.\nA growing amount of clinical data has become available in this\nmillennium since 2004, when the Bush administration outlined the Health\nInformation Technology plan to ensure Americans would have electronic\nhealth records to enable improved quality, affordability, and efficiency of\nhealth care 31, and 2009, when the Obama administration prioritized and\nfinancially incentivized the transition from written to digital medical records\nas part of the Health Information Technology for Economic and Clinical\nHealth (HITECH) Act 32. Like written medical records, electronic medical\nrecords (EMR) capture clinical data on patient populations, including\ndemographics, diagnosis codes, medication orders, and laboratory tests for\npatient care purposes. However, unlike their written counterpart, electronic\nrecords can be more readily de-identified and analyzed. 
Using\nadvanced computational approaches, researchers have been able to leverage\nbillions of data points on millions of patients from sources such as EMRs,\nregistries, and claims databases for clinical and translational research. Access\nto de-identified health records of individuals is currently limited and can be\nexpensive to acquire through commercial sources. The availability of EMR\ndata currently tends to be restricted to those who have affiliations with\nhealthcare institutions, although there are efforts to make health records data\navailable more broadly to those outside these settings 33.\nAnalyses of EMRs have provided critical information about the incidence\nand prevalence of women’s health conditions and revealed associated\ndiagnoses. With respect to endometriosis, EMR studies have delivered new\ninsights across all these fronts. A decade-long retrospective cohort study\ncompleted using EMR found that the incidence rate of endometriosis\ndeclined from 2006 to 2015 while the frequency of chronic pelvic pain\ndiagnoses increased, indicating a potential shift in diagnosis patterns or a\nrelative change in the percentage of patients with endometriosis-associated\nconditions 34. Another study investigating the validity of self-reported\nendometriosis by comparing it against medical record data found that\nself-reported diagnoses were reasonably accurate, with concordance ranging from 72% to 95%\nacross four international cohorts 35. As for phenotyping\nefforts, an analysis of medical record data from several hundred patients\nfound a number of composite “pointers”, such as the onset of pain and\nmenstrual symptoms within the same year, to be significantly correlated with\nendometriosis years before an official diagnosis 36. 
Moreover, when the\nCOVID-19 global pandemic arose and dramatically changed clinical\npractice as well as the health of the population, researchers were able to\npromptly explore EMRs and investigate how the pandemic impacted\nwomen’s health. As there was concern that pregnancy might be a risk factor for\nsevere COVID-19, one cohort study analyzed EMRs of over 20,000 women\nfrom 82 healthcare centers across the U.S. during the first several months of\nthe pandemic and found no difference in the risk of severe COVID-19 or\nmortality in pregnant versus non-pregnant women 37. Another study\nexplored pregnancy-related complications and maternal death in a\nhealthcare database of 463 hospitals, with 849,544 women who were\npregnant before the pandemic and 805,324 women who were pregnant\nduring the pandemic. This study found that while the rates of several outcomes,\nincluding preterm birth, fetal deaths, and stillbirths, were unchanged,\nthere were increases in maternal mortality during delivery\nhospitalization, pregnancy-related hypertensive disorders (i.e., gestational\nhypertension, pre-eclampsia, and eclampsia), and hemorrhage during the\npandemic compared with before 38. With regard to preventive care, the effect\nof COVID-19 stay-at-home orders on the rate of cervical cancer screening\ntests was explored in a study of a large EMR database of nearly 1.5 million women, which\nfound that the cervical cancer screening rate decreased significantly by ~80%\nduring the lockdown compared to the year before the pandemic but\nreturned to near baseline levels after the stay-at-home orders were lifted 39.\nEMRs have also been leveraged to study the effects of various therapeutics in\nthe context of pregnancy outcomes. For instance, a recent study explored the\npotential effects of selective serotonin reuptake inhibitor (SSRI) medications\nfor the treatment of depression, which have been previously associated with\nPTB. 
This retrospective cohort study, utilizing a sizeable primary care EHR\ndataset that included 216,070 deliveries of 176,866 patients over a 23-year\nperiod and a large-scale propensity score matching method that included all\ndemographic and clinical covariates, found that the risk of PTB is associated\nmore with depression than with antidepressant treatment 40.\nWhile some previous observational studies found associations between\nexposure to antidepressants during pregnancy and increased risk of PTB 41,42,\nthe findings from this larger observational study could provide hope for\nthose concerned about continuing an antidepressant therapeutic regimen\nduring pregnancy and motivate additional studies, particularly clinical trials,\nfor further investigation. EMR data have also been used in efforts to predict\noutcomes of interest. One EMR-based study successfully leveraged the\nrecords of over 35,000 deliveries and found that when machine learning\nmodels were applied to this data, the models could not only successfully\npredict singleton PTB but outperform comparable models trained using\nonly known PTB risk factors. Moreover, the prediction models were validated\non a cohort of nearly 6000 deliveries from a different healthcare center\nwith accuracy of the models maintained in this independent cohort 43. Of\ncourse, there are many limitations to leveraging EMR data, including data\nmissingness. Nonetheless, it is an incredible opportunity to leverage real-world\npatient data to impact disease diagnostics and therapeutics, as\ndemonstrated by the examples above, especially in the area of women’s\nreproductive health.\nOther sources of data have been investigated to better understand\nwomen’s reproductive health conditions, including patient registries and\nenvironmental exposure databases. Huang et al. 
linked the birth cohort file\nmaintained by the California Office of Statewide Health Planning and\nDevelopment across 1.8 million births and the CalEnviroScreen 3.0 dataset\nfrom California Communities Environmental Health Screening Tool and\nfound associations of Pollution Burden, particulate matter ≤2.5 μm\n(PM2.5), and Drinking Water Scores with PTB. Additional findings suggest\nthat certain drinking water contaminants, such as arsenic and nitrate, are\nassociated with higher rates of PTB in California 44.\nMolecular, clinical, and other data hold great potential for research on preterm birth, preeclampsia,\nendometriosis, and other women’s reproductive health disorders.\nAdvanced computational\nmodels, machine learning approaches, and drug treatment identification\nenable researchers and clinicians to gain a better understanding and\nimprove outcomes for these conditions. However, there are some limitations\nthat should be recognized. Public data often suffers from incomplete\nand sometimes inaccurate meta-data. The populations that are captured in\nthese datasets are often not representative of the general population.\nTherefore, we need data collection efforts to prioritize having an adequately\nbroad representation of people from different backgrounds to reduce disparities\nand ensure that research findings and any resulting advances in\nhealthcare practices benefit not just a subset of individuals but everyone 45,46.\nPregnant and lactating individuals should be specifically included in prospective\nstudies and clinical trials, as our experience with the recent COVID-19\npandemic showed: they were excluded from almost all vaccine and\ntreatment trials, leaving gaps in the data needed to provide counseling in\npregnancy 47. 
Other areas in which we lack data in pregnancy include\nimmunologics utilized for autoimmune disease and organ transplant, as well\nas the best treatment for the pregnant person with significant medical\ncomorbidities. As there is a lack of diversity not just among those who\nparticipate in and are represented in research but also among those who conduct\nresearch work, there should also be efforts to train, recruit, and support\nresearchers from underrepresented backgrounds 48.\nIt is also important to note that issues of data quality and bias,\nwhich must be tackled in all data-driven efforts, are equally relevant in\nwomen’s health research. In observational studies, selection bias\n(which itself has historically led to the exclusion of women from health\nresearch, as discussed in this perspective) can skew the composition of\nstudy populations along any number of demographic or clinical axes\nand profoundly affect the generalizability of findings 49. Both clinical\nand experimental efforts can be prone to measurement errors, stemming\nfrom myriad causes such as mistakes in preparation or data\ncollection and instrumentation flaws, which can then lead to misleading\nconclusions 50. Furthermore, confounding variables present a pervasive\nchallenge throughout science, potentially masking the true effects\nof the variable of interest by being associated with both the exposure\nand the outcome 51. In response to these challenges, we advocate for the\ncontinued improvement of research methods through the development\nand incorporation of standardized protocols 52 and validation\nefforts 53. 
Moreover, the adoption of transparent reporting practices,\nsuch as those laid out by the CONSORT and STROBE initiatives 54 or the\nCell Press STAR Methods model 55, will enhance reproducibility and\nunderpin the integrity and credibility of data-driven findings in\nwomen’s reproductive health.\nWhile advancements on the data collection and technical analysis\nmethods fronts are essential to exploring concerns in women’s health, it is\ncrucial to consider the impact of social determinants of health on patients’\npresentations and clinical outcomes. For example, patients of low\nsocioeconomic status who rely on Medicare or Medicaid or are under- or\nuninsured may not have reliable access to a physician to help manage\ngynecological conditions, which can lead to adverse health outcomes 56. In addition,\nmedical racism is a culprit in the increased preterm birth rates in non-white\nwomen in the US 57, and inequalities that can manifest in different forms\n(such as maternal stress and environmental exposure to toxins due to historical\nredlining) can contribute to preterm birth risk, as surveyed by\nepigenetic and gene-environment interaction studies 58. Thus, it is crucial to\nadopt an intersectional approach to studying women’s health conditions,\ntaking into account how cultural, socioeconomic, geographic, and racial\ndisparity factors influence patients’ outcomes and healthcare experiences,\nwhich can inform a more holistic understanding of disease and contribute to\nimproved approaches to care. A good first step would be to recruit larger,\nmore diverse cohorts for studies to represent more realistic patient populations. 
Studies of women’s reproductive health should not focus solely on a\nperson’s ability to have children or not but consider the individual holistically,\nincluding mental health and quality of life.\nChallenges going forward will not necessarily be generating sufficient\namounts of data for computational analyses but developing accurate phenotyping\nstrategies, refining the analytical methods to gain greater biological insights,\nexpanding on computational drug discovery opportunities for the\nadvancement of therapeutics, finding ways that large language models and\nother new technological developments can enable discoveries, and bringing\ncloser to reality the promise of precision medicine. Integrating and analyzing\ndifferent types of -omics data to study women’s health conditions can\nreveal causes of disease and targets for treatment 59. Multi-omics\napproaches have resulted in greater insights into biological signals\nassociated with term and preterm birth 60,61 and could be increasingly\nleveraged to better understand pregnancy and other women’s health conditions.\nMoreover, digital twins can provide a data-driven way of monitoring,\nmodeling, and managing conditions that can be tailored to an\nindividual’s specific needs by integrating real-time data from various sources\n(e.g., clinical records, sensors, mobile health tracking applications, wearable\ndevices) and artificial intelligence 62. Digital twin technology could offer a\ntransformative approach to women’s reproductive health, from identifying\npotential pregnancy complications early to managing endometriosis\nsymptoms, finding optimal drugs and doses for treatments, and more. It is\nimperative, however, that we ensure discoveries from future research and\ntechnologies developed for women’s reproductive health do not widen the\ngap between those who are well-represented and privileged and those from\nunder-represented and under-resourced backgrounds. 
Expanding on how\nwe leverage molecular, clinical, sociocultural, and other data combined with\nrobust computational integrative approaches for discoveries while we\nprioritize broader representation in studies will benefit not just women’s\nreproductive health but all areas of human health for everyone.\nReceived: 1 February 2024; Accepted: 20 April 2024;\nReferences\n1. Institute of Medicine, Board on Population Health and Public Health\nPractice, & Committee on Women’s Health Research. Women’s\nHealth Research: Progress, Pitfalls, and Promise (National Academies\nPress (US), Washington (DC), 2010).\n2. Institute of Medicine (US) Committee on Understanding the Biology of\nSex and Gender Differences. Exploring the Biological Contributions to\nHuman Health: Does Sex Matter? (National Academies Press (US),\nWashington (DC), 2001).\n3. Office of Research on Women’s Health. History of Women’s\nParticipation in Clinical Research. https://orwh.od.nih.gov/toolkit/recruitment/history (2019).\n4. Institute of Medicine (US) Committee on Women’s Health Research.\nIntroduction. In Women’s Health Research: Progress, Pitfalls, and\nPromise. (ed. Grossblatt, N.) (National Academies Press (US),\nWashington, DC, 2010).\n5. Smith, K. Women’s Health Research Lacks Funding — these Charts\nShow How. https://www.nature.com/immersive/d41586-023-01475-2/index.html (2023).\n6. Mirin, A. A. Gender disparity in the funding of diseases by the U.S.\nNational Institutes of Health. J. Womens Health 2002 30,\n956–963 (2021).\n7. Fisk, N. & Atun, R. Systematic analysis of research underfunding in\nmaternal and perinatal health. BJOG Int. J. Obstet. Gynaecol. 116,\n347–356 (2009).\n8. Rice, L. W. et al. Increasing NIH funding for academic departments of\nobstetrics and gynecology: a call to action. Am. J. Obstet. Gynecol.\n223, 79.e1–79.e8 (2020).\n9. Giudice, L. C. Clinical practice. 
Endometriosis. N. Engl. J. Med. 362,\n2389–2398 (2010).\n10. Bunis, D. G. et al. Whole-tissue deconvolution and scRNAseq analysis\nidentify altered endometrial cellular compositions and functionality\nassociated with endometriosis. Front. Immunol. 12, 788315 (2022).\n11. Oskotsky, T. T. et al. Identifying therapeutic candidates for\nendometriosis through a transcriptomics-based drug repositioning\napproach. iScience 109388 https://doi.org/10.1016/j.isci.2024.109388 (2024).\n12. Blencowe, H. et al. National, regional, and worldwide estimates of\npreterm birth rates in the year 2010 with time trends since 1990 for\nselected countries: a systematic analysis and implications. Lancet\nLond. Engl. 379, 2162–2172 (2012).\n13. Vora, B. et al. Meta-analysis of maternal and fetal transcriptomic data\nelucidates the role of adaptive and innate immunity in preterm birth.\nFront. Immunol. 9, 993 (2018).\n14. Le, B. L., Iwatani, S., Wong, R. J., Stevenson, D. K. & Sirota, M.\nComputational discovery of therapeutic candidates for preventing\npreterm birth. JCI Insight 5, e133761, 133761 (2020).\n15. Zhang, G. et al. Genetic associations with gestational duration and\nspontaneous preterm birth. N. Engl. J. Med. 377, 1156–1167 (2017).\n16. Panagopoulos Abrahamsson, D. et al. A comprehensive non-targeted\nanalysis study of the prenatal exposome. Environ. Sci. Technol. 55,\n10542–10557 (2021).\n17. Knijnenburg, T. A. et al. Genomic and molecular characterization of\npreterm birth. Proc. Natl Acad. Sci. USA 116, 5819–5827 (2019).\n18. Kosti, I., Lyalina, S., Pollard, K. S., Butte, A. J. & Sirota, M.\nMeta-analysis of vaginal microbiome data provides new insights into\npreterm birth. Front. Microbiol. 11, 476 (2020).\n19. Huang, C. et al. Meta-analysis reveals the vaginal microbiome is a better\npredictor of earlier than later preterm birth. BMC Biol. 21, 199 (2023).\n20. Minot, S. S. et al. 
MaLiAmPi enables generalizable and taxonomy-independent\nmicrobiome features from technically diverse 16S-based\nmicrobiome studies. Cell Rep. Methods 3, 100639 (2023).\n21. Golob, J. L. et al. Microbiome preterm birth DREAM challenge:\ncrowdsourcing machine learning approaches to advance preterm\nbirth research. Cell Rep. Med. 101350 https://doi.org/10.1016/j.xcrm.2023.101350 (2023).\n22. DiGiulio, D. B. et al. Temporal and spatial variation of the human\nmicrobiota during pregnancy. Proc. Natl Acad. Sci. USA 112,\n11060–11065 (2015).\n23. Corwin, E. J. et al. Protocol for the Emory University African American\nvaginal, oral, and gut microbiome in pregnancy Cohort study. BMC\nPregnancy Childbirth 17, 161 (2017).\n24. Ye, C. et al. The periodontopathic bacteria in placenta, saliva and\nsubgingival plaque of threatened preterm labor and preterm low birth\nweight cases: a longitudinal study in Japanese pregnant women. Clin.\nOral Investig. 24, 4261–4270 (2020).\n25. Liao, J. et al. Microdiversity of the vaginal microbiome is associated\nwith preterm birth. Nat. Commun. 14, 4997 (2023).\n26. Rana, S., Lemoine, E., Granger, J. P. & Karumanchi, S. A.\nPreeclampsia: pathophysiology, challenges, and perspectives. Circ.\nRes. 124, 1094–1112 (2019).\n27. Leavey, K. et al. Unsupervised placental gene expression profiling\nidentifies clinically relevant subclasses of human preeclampsia.\nHypertension Dallas, TX 1979 68, 137–147 (2016).\n28. Broekhuizen, M. et al. The placental innate immune system is altered\nin early-onset preeclampsia, but not in late-onset preeclampsia.\nFront. Immunol. 12, 780043 (2021).\n29. Callahan, T. J. et al. Knowledge-driven mechanistic enrichment of the\npreeclampsia ignorome. In Biocomputing 2023 (eds Altman, R. B.\net al.) 371–382 (World Scientific, 2022).\n30. Admati, I. et al. Two distinct molecular faces of preeclampsia revealed\nby single-cell transcriptomics. Medicine 4, 687–709.e7 (2023).\n31. 
The White House Office of the Press Secretary to President George W.\nBush. A New Generation of American Innovation.\nhttps://georgewbush-whitehouse.archives.gov/infocus/technology/economic_policy200404/chap3.html\n(2004).\n32. Adler-Milstein, J. & Jha, A. K. Sharing clinical data electronically: a\ncritical challenge for fixing the health care system. JAMA 307,\n1695–1696 (2012).\n33. All of Us Research Program NIH. All of Us Seeks Input on Broadening\nParticipants’ Electronic Health Record Data.\nhttps://allofus.nih.gov/news-events/announcements/all-us-seeks-input-broadening-participants-electronic-health-record-data\n(2022).\n34. Christ, J. P. et al. Incidence, prevalence, and trends in endometriosis\ndiagnosis: a United States population-based study from 2006 to\n2015. Am. J. Obstet. Gynecol. 225, 500.e1–500.e9 (2021).\n35. Shafrir, A. L. et al. Validity of self-reported endometriosis: a\ncomparison across four cohorts. Hum. Reprod. 36, 1268–1278\n(2021).\n36. Burton, C. et al. Pointers to earlier diagnosis of endometriosis: a\nnested case-control study using primary care electronic health\nrecords. Br. J. Gen. Pract. 67, e816–e823 (2017).\n37. Hsu, A. L. et al. Coronavirus disease 2019 (COVID-19) disease\nseverity: pregnant vs. nonpregnant women at 82 facilities. Clin. Infect.\nDis. 74, 467–471 (2022).\n38. Molina, R. L. et al. Comparison of pregnancy and birth outcomes\nbefore vs. during the COVID-19 pandemic. JAMA Netw. Open 5,\ne2226531 (2022).\n39. Miller, M. J. et al. Impact of COVID-19 on cervical cancer screening\nrates among women aged 21–65 years in a large integrated health\ncare system—Southern California, January 1–September 30, 2019,\nand January 1–September 30, 2020. Morb. Mortal. Wkly. Rep. 70,\n109–113 (2021).\n40. Amit, G. et al. Antidepressant use during pregnancy and the risk of\npreterm birth – a cohort study. npj Womens Health 2, 1–7 (2024).\n41. Ross, L. E. et al. 
Selected pregnancy and delivery outcomes after\nexposure to antidepressant medication: a systematic review and\nmeta-analysis. JAMA Psychiatry 70, 436–443 (2013).\n42. Eke, A. C., Saccone, G. & Berghella, V. Selective serotonin reuptake\ninhibitor (SSRI) use during pregnancy and risk of preterm birth: a\nsystematic review and meta-analysis. BJOG Int. J. Obstet. Gynaecol.\n123, 1900–1907 (2016).\n43. Abraham, A. et al. Dense phenotyping from electronic health records\nenables machine learning-based prediction of preterm birth. BMC\nMed. 20, 333 (2022).\n44. Huang, H. et al. Investigation of association between environmental\nand socioeconomic factors and preterm birth in California. Environ.\nInt. 121, 1066–1078 (2018).\n45. Oh, S. S. et al. Diversity in clinical and biomedical research: a promise\nyet to be fulfilled. PLoS Med. 12, e1001918 (2015).\n46. Ibrahim, H., Liu, X., Zariffa, N., Morris, A. D. & Denniston, A. K. Health\ndata poverty: an assailable barrier to equitable digital health care.\nLancet Digit. Health 3, e260–e265 (2021).\n47. Kons, K. M. et al. Exclusion of reproductive-aged women in COVID-19\nvaccination and clinical trials. Women’s Health Issues 32,\n557–563 (2022).\n48. Oskotsky, T. et al. Nurturing diversity and inclusion in AI in\nBiomedicine through a virtual summer program for high school\nstudents. PLoS Comput. Biol. 18, e1009719 (2022).\n49. Rothman, K. J. Epidemiology: An Introduction (Oxford University\nPress, 2012).\n50. Innes, G. K. et al. The measurement error elephant in the room:\nchallenges and solutions to measurement error in epidemiology.\nEpidemiol. Rev. 43, 94–105 (2022).\n51. Greenland, S. & Morgenstern, H. Confounding in health research.\nAnnu. Rev. Public Health 22, 189–212 (2001).\n52. Mahajan, R. et al. 
Standardized Protocol Items Recommendations for\nObservational Studies (SPIROS) for observational study protocol\nreporting guidelines: protocol for a Delphi Study. JMIR Res. Protoc. 9,\ne17864 (2020).\n53. Ehrenstein, V. et al. Helping everyone do better: a call for validation\nstudies of routinely recorded health data. Clin. Epidemiol. 8,\n49–51 (2016).\n54. Bolignano, D. et al. The quality of reporting in clinical research: the\nCONSORT and STROBE initiatives. Aging Clin. Exp. Res. 25,\n9–15 (2013).\n55. Tonzani, S. & Fiorani, S. The STAR methods way towards\nreproducibility and open science. iScience 24, 102137 (2021).\n56. Fourquet, J. et al. Disparities in healthcare services in women with\nendometriosis with public vs private health insurance. Am. J. Obstet.\nGynecol. 221, 623.e1–623.e11 (2019).\n57. Balascio, P. et al. Measures of racism and discrimination in preterm\nbirth studies. Obstet. Gynecol. 141, 69–83 (2023).\n58. Hong, X., Bartell, T. R. & Wang, X. Gaining a deeper understanding of\nsocial determinants of preterm birth by integrating multi-omics data.\nPediatr. Res. 89, 336–343 (2021).\n59. Hasin, Y., Seldin, M. & Lusis, A. Multi-omics approaches to disease.\nGenome Biol. 18, 83 (2017).\n60. Ghaemi, M. S. et al. Multiomics modeling of the immunome,\ntranscriptome, microbiome, proteome and metabolome adaptations\nduring human pregnancy. Bioinformatics 35, 95–103 (2019).\n61. Espinosa, C. A. et al. Multiomic signals associated with maternal\nepidemiological factors contributing to preterm birth in low- and\nmiddle-income countries. Sci. Adv. 9, eade7692 (2023).\n62. Sun, T., He, X. & Li, Z. Digital twin in healthcare: recent updates and\nchallenges. Digit. Health 9, 20552076221149651 (2023).\nAcknowledgements\nThe authors would like to thank Jean Costello, Claire Dubin, and Boris\nOskotsky for their helpful discussion and advice. 
This work was funded by\nthe National Institutes of Health (NIH) Eunice Kennedy Shriver National\nInstitute of Child Health and Human Development (NICHD) [P01 HD106414-01,\nP01 HD106414-02, R01 HD105256] and the March of Dimes Prematurity\nResearch Center at UCSF [60982053-50185]. The funders played no role in\nthe study design, data collection, analysis and interpretation of data, or the\nwriting of this manuscript.\nAuthor contributions\nT.T.O., O.Y., U.K., L.A. and M.S. wrote the main manuscript text, and T.T.O.\nand M.S. prepared Fig. 1. All authors reviewed the manuscript.\nCompeting interests\nThe authors declare no competing interests.\nAdditional information\nCorrespondence and requests for materials should be addressed to\nTomiko T. Oskotsky or Marina Sirota.\nReprints and permissions information is available at\nhttp://www.nature.com/reprints\nPublisher’s note Springer Nature remains neutral with regard to jurisdictional\nclaims in published maps and institutional affiliations.\nOpen Access This article is licensed under a Creative Commons\nAttribution 4.0 International License, which permits use, sharing,\nadaptation, distribution and reproduction in any medium or format, as long\nas you give appropriate credit to the original author(s) and the source,\nprovide a link to the Creative Commons licence, and indicate if changes\nwere made. The images or other third party material in this article are\nincluded in the article’s Creative Commons licence, unless indicated\notherwise in a credit line to the material. If material is not included in the\narticle’s Creative Commons licence and your intended use is not permitted\nby statutory regulation or exceeds the permitted use, you will need to\nobtain permission directly from the copyright holder. 
To view a copy of this\nlicence, visit http://creativecommons.org/licenses/by/4.0/.\n© The Author(s) 2024\nhttps://doi.org/10.1038/s44294-024-00019-x\nnpj Women's Health (2024) 2:14","source_license":"CC-BY-4.0","license_restricted":false}