Gaps in the usage and reporting of multiple imputation for incomplete data: Findings from a scoping review of observational studies addressing causal questions

doi:10.21203/rs.3.rs-4452118/v1

2024 · doi:10.21203/rs.3.rs-4452118/v1

preprint OA: closed


Research Article
Rheanna M Mainzer, Margarita Moreno-Betancur, Cattram D Nguyen, and 3 more
This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-4452118/v1 This work is licensed under a CC BY 4.0 License. Status: Published. Journal publication published 04 Sep, 2024, in BMC Medical Research Methodology. Version 1 posted 10
Abstract
Background Missing data are common in observational studies and often occur in several of the variables required when estimating a causal effect, i.e. the exposure, outcome and/or variables used to control for confounding. 
Analyses involving multiple incomplete variables are not as straightforward as analyses with a single incomplete variable. For example, in the context of multivariable missingness, the standard missing data assumptions (“missing completely at random”, “missing at random” [MAR], “missing not at random”) are difficult to interpret and assess. It is not clear how the complexities that arise due to multivariable missingness are being addressed in practice. The aim of this study was to review how missing data are managed and reported in observational studies that use multiple imputation (MI) for causal effect estimation, with a particular focus on missing data summaries, missing data assumptions, primary and sensitivity analyses, and MI implementation. Methods We searched five top general epidemiology journals for observational studies that aimed to answer a causal research question and used MI, published between January 2019 and December 2021. Article screening and data extraction were performed systematically. Results Of the 130 studies included in this review, 108 (83%) derived an analysis sample by excluding individuals with missing data in specific variables (e.g., outcome) and 114 (88%) had multivariable missingness within the analysis sample. Forty-four (34%) studies provided a statement about missing data assumptions, 35 of which stated the MAR assumption, but only 11/44 (25%) studies provided a justification for these assumptions. The number of imputations, MI method and MI software were generally well-reported (71%, 75% and 88% of studies, respectively), while aspects of the imputation model specification were not clear for more than half of the studies. A secondary analysis that used a different approach to handle the missing data was conducted in 69/130 (53%) studies. Of these 69 studies, 68 (99%) lacked a clear justification for the secondary analysis. 
Conclusion Effort is needed to clarify the rationale for and improve the reporting of MI for estimation of causal effects from observational data. We encourage greater transparency in making and reporting analytical decisions related to missing data.
Keywords: Missing data, causal inference, missingness mechanism
Background
Observational studies in medical and health-related research often aim to answer causal questions, which are sharply defined as the estimation of an average causal effect (ACE) of an exposure on an outcome in a population of interest.( 1 , 2 ) Missing data in observational studies often occur in multiple variables required for the estimation of ACEs, such as the exposure, the outcome and/or the covariates used to control for confounding. Applying standard methods for ACE estimation (e.g., outcome regression with covariate adjustment) using data from complete records only (“complete case analysis” [CCA]) may lead to selection bias.( 3 ) Therefore, missing data need to be carefully considered and addressed to minimise the potential for selection bias. To date, most reviews of the handling of missing data have been carried out in the context of trials (see ( 4 ) and references therein). In contrast, little attention has been given to how missing data are handled in observational studies, a context in which multivariable missingness is often encountered. One flexible and widely recommended approach for estimation in the presence of multivariable missingness is multiple imputation (MI).( 5 – 7 ) In the first stage of MI, missing data are imputed multiple times with random draws from the predictive distribution of the missing values given the observed data and a specified imputation model. 
In the second stage, the statistical analysis of interest (e.g., outcome regression with covariate adjustment) is applied to each imputed dataset and the results are combined to obtain a single estimate with associated standard error.( 5 ) Mackinnon (2010) and Hayati Rezvan et al. (2015) reviewed the implementation and documentation of MI in both trials and observational studies,( 8 , 9 ) and Karahalios et al. (2012) reviewed how missing exposure data are reported in large cohort studies with one or more waves of follow-up.( 10 ) However, none of these reviews focussed on complexities that arise due to multivariable missingness. The aim of the current study was to review the handling of missing data in observational studies that address causal questions using MI. A scoping review was conducted to systematically benchmark the current state of practice, focussing on four key areas: missing data summaries, missing data assumptions, primary and sensitivity analyses, and MI implementation. In the next section we describe considerations for transparent reporting within each of these four areas to provide context for our review. We then describe our scoping review methodology and present our results. We end with a discussion of our findings and key messages. Considerations for reporting ACE estimation with MI from incomplete observational data Missing data summaries Describing the amount of missing data is an important first step for transparent reporting as the potential for selection bias will generally increase with larger proportions of missing data. When data are missing in a single variable, the number (%) of completely observed values for that variable also summarises the number (%) of complete cases. 
In contrast, when multiple variables required for analysis are incompletely observed, the number (%) of observed values for each variable may vastly differ from the number (%) with complete cases because of the pattern of missing data, that is, the way in which the variables are jointly missing. In the latter context, a complete description of the missing data would include summaries of the missing data for each variable, as well as summaries of each missing data pattern. Such summaries can be easily obtained in statistical software.
Missing data assumptions
Describing the relationship between the missing and observed data, i.e., the “missing data mechanism”, is important because the performance of any estimation method depends critically on the missing data mechanism. Sometimes this will be known (e.g., a machine used for measurement temporarily stopped working), but in most cases the missing data mechanism will be unknown and assumptions about the mechanism, along with a justification for these assumptions, are required. Missingness assumptions are often expressed using the classification of missing data patterns as “missing completely at random” (MCAR), “missing at random” (MAR) or “missing not at random” (MNAR).( 11 , 12 ) 
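As an illustration of the per-variable and per-pattern summaries described under Missing data summaries, the following minimal Python sketch tabulates both for a small hypothetical dataset. The function name `summarise_missingness`, the variable names and the toy records are assumptions made for illustration and are not taken from any reviewed study. The sketch also computes the upper bound on the percentage of complete cases that is implied when only per-variable summaries are available (100% minus the largest per-variable percentage missing).

```python
from collections import Counter


def summarise_missingness(records, variables):
    """Return per-variable missing %, pattern counts, complete-case % and its upper bound."""
    n = len(records)
    # Percentage of records with a missing value, separately for each variable.
    per_variable = {
        v: 100.0 * sum(r[v] is None for r in records) / n for v in variables
    }
    # A pattern is the tuple of variables that are jointly missing in a record;
    # the empty tuple () is the complete-case pattern.
    patterns = Counter(
        tuple(v for v in variables if r[v] is None) for r in records
    )
    complete_pct = 100.0 * patterns[()] / n
    # With per-variable summaries alone, the complete-case % can only be
    # bounded above by 100 minus the largest per-variable missing %.
    upper_bound = 100.0 - max(per_variable.values())
    return per_variable, patterns, complete_pct, upper_bound


# Hypothetical toy data: None marks a missing value.
toy = [
    {"exposure": 1, "outcome": 0, "confounder": 2.5},
    {"exposure": None, "outcome": 1, "confounder": 1.0},
    {"exposure": 0, "outcome": None, "confounder": 3.1},
    {"exposure": 1, "outcome": 1, "confounder": None},
    {"exposure": 0, "outcome": 0, "confounder": 0.7},
]
per_variable, patterns, complete_pct, upper_bound = summarise_missingness(
    toy, ["exposure", "outcome", "confounder"]
)
```

In this toy example each variable is 80% observed, yet only 40% of records are complete, which is why per-variable summaries alone do not convey the extent of multivariable missingness.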
However, assessing the plausibility of the MCAR/MAR/MNAR assumptions in the context of multivariable missingness is difficult, partly due to the existence of several different, often imprecise, definitions of MCAR, MAR and MNAR in the literature and the difficulty in interpreting these definitions,( 13 ) and partly because assessment involves making a judgement about the dependence (or lack thereof) of the distribution of the missing data pattern on the observed and missing data.( 11 , 14 ) An attractive alternative to using the MCAR/MAR/MNAR assumptions is to view missing data as a causal problem and consider causes of missingness for each incompletely observed variable using missingness directed acyclic graphs (m-DAGs).( 3 , 15 ) m-DAGs are an extension to standard causal diagrams (DAGs) that include nodes to represent missingness in each incomplete variable, thereby allowing for the clear and transparent specification of assumptions about the causes of missing data, as well as the causal relationships amongst the main variables of interest. Although developing a realistic m-DAG can be challenging and time-consuming, m-DAGs lead to assumptions that are more transparent and easier to assess than assumptions expressed using the MCAR/MAR/MNAR framework. Uncertainty about the assumptions depicted in the m-DAG can be assessed using a sensitivity analysis (see next section). Primary and sensitivity analyses The next important area for reporting is to justify and describe an appropriate primary method for estimation of the ACE, given the missingness assumptions. 
Here, standard MI refers to an implementation of MI that does not incorporate an assumption, external to the data, about a difference between the distributions of the observed and missing data. It is well known that both a CCA and standard MI can provide consistent estimation of the ACE when data are MCAR, that standard MI can provide consistent estimation when data are MAR, and that both approaches may provide biased estimation when data are MNAR. However, in the context of multivariable missingness, a CCA can also provide consistent estimation under missingness mechanisms that could be classified as MAR, and both CCA and MI have been shown in theory and simulations to provide unbiased or approximately unbiased estimation of ACEs across a range of missingness mechanisms that could be classified as MNAR.( 16 ) Therefore, it is not straightforward to justify an estimation approach even if it is believed that data are MAR or MNAR. In contrast, for a given m-DAG, graph theory can be used to establish whether the ACE is recoverable (that is, whether it can be estimated without bias from the observed data). If the ACE is recoverable, the process of establishing recoverability can aid in determining whether a CCA and/or standard MI would be appropriate for estimation (see, e.g., the worked example provided by Lee et al.( 11 )). If the ACE is not recoverable, neither standard MI nor a CCA can be used for unbiased estimation, and a more sophisticated approach that incorporates an assumption about a difference in distribution between the missing and observed values is needed. 
For example, the not-at-random-fully-conditional-specification (NARFCS) procedure extends standard MI to incorporate such assumptions through the inclusion of a sensitivity parameter “delta”, elicited from external information, that represents the difference between the distributions of the observed and missing values.( 17 ) The assumptions made about the missing data and how this justifies the choice of analytic method for the primary analysis should be carefully described. Sensitivity analyses to reflect uncertainty due to assumptions made about the missing data for the primary analysis are strongly recommended.( 18 , 19 ) There are two types of missing data sensitivity analyses to consider; the first is to examine the sensitivity of estimates to the assumptions made about the causes of missing data, e.g. the existence or strength of arrows in the m-DAG. The second type of missing data sensitivity analysis is to examine the sensitivity of estimates to assumptions made for modelling the missing data, e.g., the form of the imputation model. As with the primary analysis, sensitivity analyses should be justified and described in enough detail that the analysis could be reproduced. 
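To make the role of a sensitivity parameter concrete, the following Python sketch applies a simple delta-adjustment to a single incomplete continuous variable. It is a deliberately simplified stand-in and not the NARFCS procedure: imputations are drawn from a normal distribution fitted to the observed values (ignoring covariates and parameter uncertainty) and then shifted by a hypothetical `delta`. The function name `delta_adjusted_mean`, the toy data and the delta values are assumptions for illustration only.

```python
import random
import statistics


def delta_adjusted_mean(values, delta, n_imputations=20, seed=0):
    """Estimate a mean when missing values are assumed to differ from observed ones by delta."""
    rng = random.Random(seed)
    observed = [v for v in values if v is not None]
    n_missing = len(values) - len(observed)
    mu = statistics.mean(observed)
    sd = statistics.stdev(observed)
    estimates = []
    for _ in range(n_imputations):
        # Draw imputations from a normal fitted to the observed values, then
        # shift them by delta; delta = 0 corresponds to no assumed difference.
        imputed = [rng.gauss(mu, sd) + delta for _ in range(n_missing)]
        estimates.append(statistics.mean(observed + imputed))
    # The pooled point estimate is the average over the imputed datasets.
    return statistics.mean(estimates)


# Hypothetical toy data: None marks a missing value (3 of 10 are missing).
toy = [4.1, 5.0, None, 6.2, None, 5.5, 4.8, None, 5.9, 5.1]
base = delta_adjusted_mean(toy, delta=0.0)
shifted = delta_adjusted_mean(toy, delta=-1.0)
```

Varying `delta` over a plausible range, ideally elicited from external information, shows how far the estimate moves as the assumed departure between missing and observed values grows. With a fixed seed, the estimated mean shifts by `delta` multiplied by the proportion of missing values.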
MI implementation When using standard MI for estimation, quantities that need to be described to ensure that the analysis could be reproduced include, but are not limited to: the imputation method, e.g., multivariate normal imputation or multivariate imputation by chained equations; the imputation model, e.g., which variables are included and in what form; if using multivariate imputation by chained equations, the models/methods that are used to impute each incomplete variable, e.g., linear or logistic regression; the number of imputations conducted; the analysis model that is fitted to obtain estimates within each imputed dataset; and the method for combining estimates across imputed datasets.( 12 ) If using an approach that incorporates an assumption about a difference in distribution between the missing and observed values (e.g., a NARFCS procedure), then, in addition to the above quantities, it is important to describe how the assumption is incorporated in the models used for the estimation procedure. Methods The protocol for this scoping review has been published previously.( 4 ) Briefly, we included observational studies that aimed to answer at least one causal research question using MI, published in International Journal of Epidemiology, American Journal of Epidemiology, European Journal of Epidemiology, Journal of Clinical Epidemiology and Epidemiology between January 2019 and December 2021. These journals were chosen as they are high ranking, general journals in epidemiology that should capture current best practices in the use of MI for estimating ACEs from observational data. A full text search for the term “multiple imputation” was conducted on the journal websites, following the methodology of Hayati Rezvan et al .( 8 ) Causal questions were identified if the study authors explicitly stated that they were estimating an ACE or if the study authors estimated an effect that was given, at least implicitly, a causal interpretation. 
Studies were excluded from the review if they met any of the following criteria: the study did not aim to answer a causal question, a clear research goal could not be identified, the primary purpose of the article was methodological development, the analysis was based on aggregated data, the article reported qualitative research, the study exposure was assigned to participants by investigators (i.e., a trial), or the study was retracted. The most recent search was performed on 10 June 2022. A random sample of 10 articles was independently screened and reviewed by two reviewers (RM and KL) to develop the data collection instrument. One reviewer (RM) screened and reviewed all articles. Double data extraction was independently completed for 10% of articles (RM and KL). In addition, a second reviewer (CN or KL) screened articles when there was uncertainty about the inclusion criteria and reviewed articles when there was uncertainty about the information being extracted. Disagreements between reviewers were resolved via discussion with a third reviewer. A summary of the data extraction items and a copy of the data extraction questionnaire are provided in Table 1 and the Supplementary Material, respectively, of Mainzer et al.( 4 ) Briefly, for each study included in the review, data were extracted on the following: study characteristics; the quantity of missing data; the missing data assumptions made and whether these assumptions were justified; details of the primary analysis and whether or not the primary analysis was justified based on missing data assumptions; details of any secondary/sensitivity analysis conducted that handled the missing data differently from the primary analysis and its justification; and details of the MI implementation. 
For each study, we defined the “inception sample” as the set of participants who met eligibility criteria for inclusion in the study to answer the research question of interest, where eligibility criteria do not include any requirement for variables to be complete, and the “analysis sample” as the participants who were included in the analysis to answer the research question of interest. Extracted items were summarised using descriptive statistics. Data cleaning and analysis was performed in R.( 20 ) Reporting follows the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews checklist.( 21 ) Results Screening process Figure 1 presents a flow diagram of the article screening process. Of the 304 papers that met the inclusion criteria, 130 papers were included in this review.( 22 – 151 ) There were 14 articles that were screened by a second reviewer due to uncertainty about inclusion criteria. Double data extraction was completed for a further 14 articles. All disagreements were resolved via discussion. Minor changes were made to the review protocol to accommodate unanticipated challenges in data extraction (described in Additional file 1). [Figure 1 about here.] Study characteristics Study characteristics are summarised in Table 1 . Most papers included in this review were published in American Journal of Epidemiology (38%) or International Journal of Epidemiology (26%). The most common study design was a prospective longitudinal study (65%), followed by a retrospective analysis of routinely collected data (12%). The most common outcomes used for analyses were binary (35%) and time to event (38%). Few studies made their causal aim explicit (25%) or presented a DAG to depict causal assumptions (31%). 
However, most studies identified a set of variables to control for confounding (82%) and almost all studies estimated an effect using a regression model (or a more sophisticated causal effect estimation method such as g-computation) with adjustment for a set of covariates, implicitly or explicitly assumed to be confounders (99%).
Table 1 Summary of study characteristics for the 130 included papers.
Characteristic | n (%)
Publication year
  2019 | 47 (36%)
  2020 | 45 (35%)
  2021 | 38 (29%)
Journal¹
  American Journal of Epidemiology | 50 (38%)
  Epidemiology | 24 (18%)
  European Journal of Epidemiology | 21 (16%)
  International Journal of Epidemiology | 34 (26%)
  Journal of Clinical Epidemiology | 1 (1%)
Study design
  Prospective longitudinal study | 85 (65%)
  Retrospective analysis of routinely collected data | 15 (12%)
  Pooled cohort analysis | 9 (7%)
  Case-control study | 7 (5%)
  Cross-sectional study | 5 (4%)
  Case-cohort study | 2 (2%)
  Other² | 7 (5%)
Type of outcome used for analysis
  Binary | 45 (35%)
  Categorical (excluding binary) | 3 (2%)
  Continuous | 33 (25%)
  Time to event | 49 (38%)
Causal question inclusion criteria³
  Explicitly stated interest in a causal effect | 33 (25%)
  Estimate was given a causal interpretation | 130 (100%)
Typical signals of a causal analysis³
  A directed acyclic graph was used to depict causal assumptions | 40 (31%)
  A set of variables were identified to control for confounding | 106 (82%)
  Effect was estimated using a regression model with adjustment for a set of covariates⁴ | 129 (99%)
1 Number of papers published between January 2019 and December 2021, based on a PubMed search for (("2019/01/01"[Date - Publication] : "2021/12/31"[Date - Publication])) AND ("Journal name"[Journal]): American Journal of Epidemiology, 876; Epidemiology, 496; European Journal of Epidemiology, 370; International Journal of Epidemiology, 814; Journal of Clinical Epidemiology, 996. 
2 Secondary analysis of trial data (n = 2); prospective follow-up of cohort recruited for trial (n = 2); pooled analysis of data from case-control and cohort studies (n = 1); pooled analysis of data from case-control studies (n = 1); transportability study using data from 4 clinical trials and 1 observational cohort (n = 1). 3 Categories are not mutually exclusive. 4 One study used structural equation modelling seemingly without adjustment for covariates, although a causal conclusion was made. [Table 1 about here] Missing data summaries The reported quantity of missing data is summarised in Table 2 . The size of the inception sample could not be established in 38% of studies, and 83% of studies derived an analysis sample by excluding individuals with missing data in specific variables. The percentage of complete cases could be established in just 33/130 (25%) studies (median, 25th – 75th percentiles: 85%, 75% – 92%), although an upper bound on the percentage of complete cases that was tighter than 100% (indicating the maximum possible percentage of complete cases given the missing data summaries provided) could be established for another 81/130 (62%) studies (median upper bound, 25th -75th percentiles: 81%, 64% – 91%). Almost all studies (88%) incurred missing data in multiple variables in the analysis sample (despite most studies already arriving at an analysis sample by excluding individuals with missing data in specific variables). Table 2 Amount of missing data. Summaries are n (%) unless stated otherwise. 
Characteristic | Summary
Able to establish the size of the inception sample¹
  Yes | 81 (62%)
  No² | 49 (38%)
Analysis sample was defined by excluding individuals with missing data in specific variables
  Yes | 108 (83%)
  No | 22 (17%)
Complete cases
  Able to establish the % of complete cases | 33 (25%)
    % of complete cases, median (25th – 75th percentiles) | 85% (75% – 92%)
  Only able to establish an upper bound on the % of complete cases | 81 (62%)
    Upper bound on % of complete cases, median (25th – 75th percentiles) | 81% (64% – 91%)
  Not able to establish the percentage of complete cases | 16 (12%)
Missing values in the exposure
  Yes, and able to establish the % of missing values | 39 (30%)
    % of missing values, median (25th – 75th percentiles) | 11% (3% – 16%)
  Yes, but only able to establish a lower bound on the % of missing values | 4 (3%)
    Lower bound on % of missing values, median (25th – 75th percentiles) | 31% (5% – 67%)
  Yes, but unable to establish the % or a lower bound on the % | 7 (5%)
  No | 70 (54%)
  Unclear | 10 (8%)
Missing values in the outcome³
  Yes, and able to establish the % of missing values | 19 (15%)
    % of missing values, median (25th – 75th percentiles) | 9% (5% – 28%)
  Yes, but only able to establish a lower bound on the % of missing values | 7 (5%)
    Lower bound on % of missing values, median (25th – 75th percentiles) | 6% (3% – 23%)
  Yes, but unable to establish the % or a lower bound on the % | 5 (4%)
  No | 91 (70%)
  Unclear | 8 (6%)
Missing values in the covariates
  Yes, in 2 or more | 109 (84%)
  Yes, in 1 covariate only | 7 (5%)
  No | 7 (5%)
  Unable to establish | 7 (5%)
Multivariable missingness within analysis sample
  Yes | 114 (88%)
  No | 8 (6%)
  Unable to establish | 8 (6%)
1 Inception sample defined as the participants who met eligibility criteria for inclusion in the study to answer the research question of interest, where eligibility criteria do not include any requirement for variables to be complete. 
2 Includes 5 studies where analyses were conducted separately by sub-groups (e.g., sex), but the inception sample for the sub-group could not be identified even though the inception sample for the entire group may have been provided.
3 Time-to-event outcomes were not considered to be missing data (we did not treat censored data as missing) except for in two studies where authors explicitly stated that the outcome was imputed.
[Table 2 about here]
Missing data assumptions
Missing data assumptions are described in Table 3. Most studies (66%) omitted a statement about missing data assumptions entirely. Of the 44 studies that did provide an explicit or indirect statement about missing data assumptions, 35/44 (80%) stated the MAR assumption, 2/44 (5%) stated the MCAR assumption and 6/44 (14%) alluded to data being “not MCAR” but did not distinguish between MAR and MNAR. Eleven of the 44 (25%) studies that provided a statement about missing data assumptions provided a justification for their missing data assumptions (described in Table 3, footnote 3). Of the 130 studies in the review, 31 (24%) linked the justification for the primary analysis to the missing data assumptions.
Table 3 Assumptions about the missing data mechanism.
Characteristic | Summary
Missing data assumptions¹
  No statement of missing data assumptions was provided | 86 (66%)
  Data were assumed to be MAR | 35 (27%)
  Data were assumed to be not MCAR | 6 (5%)
  Data were assumed to be MCAR | 2 (2%)
  A comprehensive description of missing data assumptions was provided, e.g., using an m-DAG | 0 (0%)
  Other² | 1 (1%)
Justification provided for missing data assumptions (as % of papers that made a statement about missing data assumptions, n = 44)
  Yes³ | 11 (25%)
  No | 33 (75%)
Justified the primary analysis using missing data assumptions
  Yes | 31 (24%)
  No | 98 (75%)
  Other⁴ | 1 (1%)
Abbreviations: MAR, missing at random; MCAR, missing completely at random; MNAR, missing not at random; m-DAG, missingness directed acyclic graph. 
1 The assumption may have been stated explicitly or made indirectly. For example, explicit statements of the MAR assumption include: “We assumed the missing at random assumption held and is reasonable”,( 107 ) and “We imputed data using multiple imputation by chained equations under the assumption that data were missing at random”.( 135 ) Indirect statements of the MAR assumption include “This multiple imputation approach assumes missing at random”,( 88 ) and “We first imputed missing values using multiple imputation by chained equations, which assumes the data are missing at random conditional on the variables in the imputation model”.( 115 ) 2 Data assumed to be “MCAR, conditional on age and ethnicity” (n = 1). 3 Two studies justified assuming that data were MCAR; justifications included adding the questionnaire to the study after the study began (n = 1) and a lack in data registration (n = 1). Three studies justified assuming that data were not MCAR; justifications included clinicians ordering tests according to glucose level (n = 1), and describing characteristics associated with missingness (n = 2). Six studies justified assuming that data were MAR; justifications included describing characteristics associated with missingness and/or conducting formal hypothesis tests (n = 4), examining the missingness pattern (n = 1) and because children moved homes and/or were impossible to locate (n = 1). 4 Justified MI to improve efficiency in the estimators. [Table 3 about here] Primary and sensitivity analyses Details of the primary and secondary/sensitivity analyses are described in Table 4 . Most studies (79%) used MI as the primary analysis method and approximately half (69/130, 53%) of the studies conducted a secondary analysis that handled the missing data differently. 
Of the 69 studies that conducted a secondary analysis, 70% either provided no justification for conducting the secondary analysis or justified it as a sensitivity analysis without describing which aspect of the primary analysis was being assessed. A further 17/69 (25%) studies provided a vague justification for the secondary analysis, including to examine the influence of missing data (6%), to examine the impact of the missing data method (10%), and to address possible selection bias (9%). Of the studies that conducted a secondary analysis, 88% performed both a CCA and an MI analysis; of these, only 3 studies (5%) observed a substantial difference between CCA and MI estimates. One study (1%) conducted an “extreme case” analysis that involved single imputation of the outcome under two extreme scenarios, thereby incorporating an external assumption about a difference in distribution between the missing and observed outcome data. However, no studies used a model-based approach such as a NARFCS procedure or elicited external information from subject-matter experts about the difference in distribution between the missing and observed data. Table 4 Primary and secondary analyses. 
Characteristic | n (%)
Method used for the primary analysis
  Standard MI | 80 (62%)
  Standard MI, combined with weighting¹ | 23 (18%)
  CCA | 21 (16%)
  Other² | 6 (5%)
Secondary analysis conducted that handled the missing data differently
  Yes | 69 (53%)
  No | 61 (47%)
Method used for the secondary analysis (as % of papers that conducted a secondary analysis, n = 69)
  Standard MI | 27 (39%)
  Standard MI, combined with weighting¹ | 2 (3%)
  CCA | 26 (38%)
  CCA, combined with weighting¹ | 6 (9%)
  Conducted more than two secondary analyses³ | 8 (12%)
Justification for the secondary analysis (as % of papers that conducted a secondary analysis, n = 69)
  Not provided | 25 (36%)
  As a sensitivity analysis (without further justification) | 23 (33%)
  To examine the influence of missing data | 4 (6%)
  To examine the impact of the missing data method | 7 (10%)
  To address possible selection bias | 6 (9%)
  To examine robustness to parametric modelling assumptions | 1 (1%)
  To examine robustness to causal assumptions about the missing data mechanism | 0 (0%)
  Other⁴ | 3 (4%)
Conducted both a CCA and an MI analysis, regardless of whether weighting was used or not (as % of papers that conducted a secondary analysis, n = 69)
  Yes | 61 (88%)
  No⁵ | 8 (12%)
Observed a substantial difference between MI and CCA estimates (as % of papers that conducted both a CCA and an MI analysis, n = 61)
  Yes | 3 (5%)
  No | 53 (95%)
1 Weights that were used to address selection bias due to loss to follow-up or censoring. Excludes weights that were used to address confounding bias.
2 Treated “missing” as an additional category, with or without weighting (n = 3); single median imputation to obtain exposure (n = 1); single mean imputation for variables with > 25% missing values (n = 1); MI for one covariate with > 25% missing values and single (median/mode) imputation for variables with less than 5% missing values (n = 1).
3 Described in Additional file 1, Supplementary Table 1. 
4 As a sensitivity analysis to examine the robustness of findings to statistical assumptions without stating which statistical assumptions (n = 1); as a sensitivity analysis to address possible selection bias and to exploit information in incomplete record participants (n = 1); estimates were presented from both MI and a CCA after fitting two different models (one weighted and one unweighted). No justification was provided for conducting both MI and CCA analyses, but fitting two models was justified by seeing whether the choice of model impacted results and fitting models with and without weights was conducted to see how weighting affected the results (n = 1).
5 Standard MI, with and without weighting (n = 4); standard MI, with weighting, with and without imputation of exposure (n = 1); standard MI, with and without inclusion of the outcome in the imputation model (n = 1); three versions of single imputation and standard MI (n = 1); missing treated as an additional category, last value carried forward and standard MI (n = 1).
[Table 4 about here]
MI implementation
The details of the MI implementation are described in Table 5. Most studies (71%) reported the number of imputations (median, range: 20, 3–100). Multivariate imputation by chained equations was the most used imputation method (67% of studies), but the imputation method was unclear for a further 25% of studies. MI was most often conducted in Stata or R. In more than half of the studies it was unclear whether all analysis variables were included in the imputation procedure (58%), whether auxiliary variables were used in the imputation procedure (55%), and whether interactions were included in the imputation model (57%). Of the 87 studies that reported using multivariate imputation by chained equations, 18 (21%) reported the type of models that were used in the imputation procedure. 
In approximately two-thirds (65%) of studies, the method that was used to obtain a final MI estimate and its standard error was not stated and could not be deduced from the description in the paper.

Table 5 Multiple imputation implementation. Summaries are n (%) unless stated otherwise.

Characteristic | Summary

Reported number of imputations
  Yes | 92 (71%)
  No | 38 (29%)
Number of imputations, median (range) | 20 (3–100)
Multiple imputation method
  Multivariate imputation by chained equations | 87 (67%)
  Multivariate normal imputation | 6 (4.6%)
  Other¹ | 4 (3.1%)
  Unclear | 33 (25%)
Software package used for conducting the multiple imputation analysis
  Stata | 40 (31%)
  R | 33 (25%)
  SAS | 26 (20%)
  SPSS | 1 (0.8%)
  Other² | 14 (11%)
  Unclear | 16 (12%)
All analysis variables included in imputation model
  Yes | 35 (27%)
  No | 20 (15%)
  Unclear | 75 (58%)
Auxiliary variables included in imputation model
  Yes | 42 (32%)
  No | 16 (12%)
  Unclear | 72 (55%)
Interactions included in imputation model
  Yes | 2 (1.5%)
  No | 54 (42%)
  Unclear | 74 (57%)
Reported type of models used for imputation (as % of papers that used multivariate imputation by chained equations, n = 87)
  Yes | 18 (21%)
  No | 69 (79%)
Stated how a final estimate and standard error were obtained
  Either stated, provided code or method could be deduced from software description | 45 (35%)
  Not stated | 85 (65%)

1 Imputation performed using a bootstrapping-based algorithm for panel data in R package Amelia II (n = 1), imputation performed in the pan package mitml for multilevel data (n = 1), referenced a paper where the MI methods are described rather than providing a description (n = 1), used a multiple imputation analysis for exposure and covariates without stating what the analysis was, and used Kaplan-Meier multiple imputation for the outcome as part of a sensitivity analysis (n = 1).
2 Study used two software packages for analysis but it was not clear which package was used for MI (n = 13), NORM software (n = 1).
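For readers unfamiliar with how a final MI estimate and its standard error are usually obtained, the conventional approach is Rubin's rules (see Rubin, Multiple imputation for nonresponse in surveys, in the reference list). The following is a minimal Python sketch of those rules, not the method of any reviewed study and not taken from the authors' GitHub code. The function name `pool_rubin` and the input values are hypothetical and chosen only for illustration.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool m completed-data estimates and standard errors with Rubin's rules.

    Returns (pooled_estimate, pooled_se, degrees_of_freedom).
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need at least two imputations and one SE per estimate")

    q_bar = mean(estimates)                        # pooled point estimate
    within = mean([se ** 2 for se in std_errors])  # average within-imputation variance
    between = variance(estimates)                  # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between         # total variance

    # Rubin's large-sample degrees of freedom for the t reference distribution.
    if between == 0:
        df = math.inf
    else:
        df = (m - 1) * (1 + within / ((1 + 1 / m) * between)) ** 2
    return q_bar, math.sqrt(total), df


# Made-up estimates and standard errors from m = 3 imputed datasets.
estimate, se, df = pool_rubin([1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
print(f"pooled estimate = {estimate:.3f}, SE = {se:.3f}, df = {df:.3f}")
# pooled estimate = 2.000, SE = 1.528, df = 6.125
```

Standard MI routines in statistical packages typically apply this pooling step automatically, which is one way the pooling method can be deduced from a software description. Stating the pooling method explicitly, or sharing code, removes the need for that deduction.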
[Table 5 about here]

Discussion

We systematically reviewed the literature to assess the current state of practice in using MI for estimation of causal effects from incompletely observed observational data. We focussed on four key areas: missing data summaries, missing data assumptions, primary and sensitivity analyses, and MI implementation. Overall, we found that most studies did not report missing data, or missing-data-related assumptions, decisions, or analyses, with sufficient clarity.

An unexpected although perhaps unsurprising finding from this review was that the analytical sample is often arrived at by excluding individuals with missing data in specific variables, for example, by using eligibility criteria that require key variables to be completely observed. This means the full extent of missing data is difficult to quantify because the inception sample is difficult to identify. Therefore, for the purposes of reporting the amount of missing data in this review, we considered the amount of missing data within the analysis sample only. However, identifying the exact amount of missing data within the well-defined analysis sample was also often difficult because summaries were frequently reported per variable without describing missing data patterns.

Details of the assumptions made about the missing data mechanism were often lacking and, when provided, not justified appropriately. A statement of assumptions about the missingness mechanism was provided for just one-third (33%) of studies. This is, however, an improvement over what was found in the reviews conducted by Mackinnon (2010), where 8/50 (16%) observational studies provided a statement that data were MAR,(9) and Rezvan et al.
(2015), where 7/30 (23%) observational studies stated or described the assumed missing data mechanism.(8) When a statement about the missing data mechanism was provided, most studies said they assumed data were MAR, but justifications for missingness assumptions were provided in just 11 studies. The most common justification for the MAR assumption was that participant characteristics differed between those with and without complete data, as determined by an investigation of summary statistics or by formal hypothesis tests. However, it is impossible to distinguish between MAR and MNAR using data-based assessments, so these justifications are not complete. As described in the Introduction, the MCAR/MAR/MNAR assumptions are difficult to interpret and assess in the context of multivariable missingness, so it is not surprising that we found lacking or incomplete justifications for these assumptions. Of note, no study provided a comprehensive description of missing data assumptions, for example, using an m-DAG. Furthermore, the omission of a statement of missing data assumptions entirely from most studies suggests that the critical link between missing data assumptions and estimation methods is not generally appreciated. When missing data assumptions were used to guide the choice of MI as the primary analysis, the most common justification for using MI was that data were assumed to be MAR (without justifying the MAR assumption).

Most studies in this review used standard MI for the primary analysis. Approximately half of the studies conducted a secondary analysis that treated the missing data differently from the primary analysis, but the reason for doing so was almost always omitted or unclear. When studies did carry out two analyses that handled the missing data differently, it was common to conduct both a CCA and MI. Without justification, it is not clear why such an analysis is warranted.
It may be to examine the sensitivity of ACE estimates to causal assumptions made about the missing data mechanism for the primary analysis. We speculate that another motivation for such an analysis may be the misconception that a CCA is the “normal” approach to dealing with missing data, while standard MI provides a more sophisticated analysis that allows researchers to assess whether the missing data were really an issue or not. However, if under plausible missingness assumptions neither standard MI nor CCA can provide unbiased estimation, then it would be incorrect to conclude that the missing data “had little impact” on the results. In other words, when there is no unbiased estimate to compare against, the impact of the missing data remains unknown. Of the 61 studies that conducted both a CCA and MI analysis, only 3 (5%) studies observed a substantial difference between MI and CCA estimates. Just one study conducted an analysis that incorporated assumptions about a difference between the missing and observed data distributions. Although this is an area of recent methodological development, such analyses are rarely performed, a finding similar to those of previous reviews, see e.g. (8, 152).

MI is increasingly recognised as a method for estimation that needs to be carefully tailored to the target analysis.(7) However, the findings from the current review suggest that there is room for improvement in the reporting of MI implementation. For example, certain aspects of the imputation model form were unclear in more than half of the studies, despite being needed to judge the appropriateness of the MI model and ensure the analysis can be reproduced.

A strength of this review is that it documents the current practices in the use of MI for estimating ACEs from incomplete observational data. Our review followed a clear, pre-specified protocol,(4) and, by including articles in top general epidemiology journals, our review reflects current best practice.
Furthermore, the analysis conducted for the current study is entirely reproducible as all data and code are available on GitHub: github.com/rheanna-mainzer/MI-scoping-review.

This review has several limitations. Authors may have chosen not to provide details on all aspects of handling missing data that we examined, for example, due to strict journal word limits. However, all accompanying supplementary material was also reviewed and used for data extraction. Most of the data extraction was performed by a single reviewer (RM), with double data extraction performed for 10% of studies, so there may be some extraction errors. Also, it may have been useful to extract additional items or extract items in more detail to better capture the variety of analyses undertaken. However, additional notes on each paper were recorded and are available as part of the complete dataset on GitHub. Lastly, by limiting to five top general epidemiology journals, our results may not reflect papers published in other journals, but it seems unlikely that less highly regarded journals would exhibit higher standards in this area of practice.

Conclusion

The message from our review is clear: there is a need for greater clarity in the conduct and reporting of causal effect estimation using MI with incomplete observational data. Researchers are encouraged to move beyond the MCAR/MAR/MNAR framework and adopt a more transparent approach for outlining missing data assumptions, to use missing data assumptions to justify the estimation method, and to report their assumptions, methods and results systematically. The development of guidelines that journals can adopt is a key step needed to improve practice.
List of abbreviations

ACE: Average causal effect
CCA: Complete case analysis
DAG: Directed acyclic graph
MAR: Missing at random
MCAR: Missing completely at random
m-DAG: Missingness directed acyclic graph
MI: Multiple imputation
MNAR: Missing not at random
NARFCS: Not-at-random fully conditional specification

Declarations

Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.

Availability of data and materials
The datasets supporting the conclusions of this article are available from RM’s GitHub repository: github.com/rheanna-mainzer/MI-scoping-review.

Competing interests
The authors declare that they have no competing interests.

Funding
This work was supported by Australian National Health and Medical Research Council (NHMRC) Investigator Grant Leadership Level 1 grants (grant 1196068 awarded to JS and 2017498 to KL), an NHMRC Investigator Grant Emerging Leadership Level 2 (grant 2009572 awarded to MM-B) and an NHMRC Project Grant (grant 1166023). Research at the Murdoch Children’s Research Institute is supported by the Victorian Government’s Operational Infrastructure Support Program.

Authors’ contributions
RM: Conceptualization, Software, Validation, Formal analysis, Writing – Original Draft, Writing – Review & Editing. MM-B: Conceptualization, Methodology, Writing – Review & Editing. CN: Validation, Writing – Review & Editing. JS: Conceptualization, Methodology, Writing – Review & Editing. JC: Conceptualization, Methodology, Validation, Writing – Review & Editing. KL: Conceptualization, Methodology, Validation, Supervision, Writing – Review & Editing.

Acknowledgements
Not applicable.

References

Hernán MA. The C-word: Scientific euphemisms do not improve causal inference from observational data. Am J Public Health. 2018;108(5):616–9. Lederer DJ, Bell SC, Branson RD, Chalmers JD, Marshall R, Maslove DM, et al. Control of confounding and reporting of results in causal inference studies.
Guidance for authors from editors of respiratory, sleep, and critical care journals. Annals Am Thorac Soc. 2019;16(1):22–8. Moreno-Betancur M, Lee KJ, Leacy FP, White IR, Simpson JA, Carlin JB. Canonical causal diagrams to guide the treatment of missing data in epidemiologic studies. Am J Epidemiol. 2018;187(12):2705–15. Mainzer R, Moreno-Betancur M, Nguyen C, Simpson J, Carlin J, Lee K. Handling of missing data with multiple imputation in observational studies that address causal questions: protocol for a scoping review. BMJ Open. 2023;13(2):e065576. Rubin DB. Multiple imputation for nonresponse in surveys. Wiley; 2004. Van Buuren S. Flexible imputation of missing data. CRC; 2018. Meng X-L. Multiple-imputation inferences with uncongenial sources of input. Stat Sci. 1994:538–58. Hayati Rezvan P, Lee KJ, Simpson JA. The rise of multiple imputation: a review of the reporting and implementation of the method in medical research. BMC Med Res Methodol. 2015;15:1–14. Mackinnon A. The use and reporting of multiple imputation in medical research–a review. J Intern Med. 2010;268(6):586–93. Karahalios A, Baglietto L, Carlin JB, English DR, Simpson JA. A review of the reporting and handling of missing data in cohort studies with repeated assessment of exposure measures. BMC Med Res Methodol. 2012;12(1):96. Lee KJ, Carlin JB, Simpson JA, Moreno-Betancur M. Assumptions and analysis planning in studies with missing data in multiple variables: moving beyond the MCAR/MAR/MNAR classification. Int J Epidemiol. 2023;52(4):1268–75. Sterne JA, White IR, Carlin JB, Spratt M, Royston P, Kenward MG et al. Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls. BMJ. 2009;338. Seaman S, Galati J, Jackson D, Carlin J. What is meant by missing at random? Stat Sci. 2013;28(2). Doretti M, Geneletti S, Stanghellini E. Missing data: a unified taxonomy guided by conditional independence. Int Stat Rev. 2018;86(2):189–204. Mohan K, Pearl J. 
Graphical models for processing missing data. J Am Stat Assoc. 2021;116(534):1023–37. Zhang J, Dashti SG, Carlin JB, Lee KJ, Moreno-Betancur M. Recoverability and estimation of causal effects under typical multivariable missingness mechanisms. arXiv preprint arXiv:2301.06739. 2023. Tompsett DM, Leacy F, Moreno-Betancur M, Heron J, White IR. On the use of the not-at-random fully conditional specification (NARFCS) procedure in practice. Stat Med. 2018;37(15):2338–53. Little RJ, D'Agostino R, Cohen ML, Dickersin K, Emerson SS, Farrar JT, et al. The prevention and treatment of missing data in clinical trials. N Engl J Med. 2012;367(14):1355–60. Lee KJ, Tilling KM, Cornish RP, Little RJ, Bell ML, Goetghebeur E, et al. Framework for the treatment and reporting of missing data in observational studies: The Treatment And Reporting of Missing data in Observational Studies framework. J Clin Epidemiol. 2021;134:79–88. R Core Team. R: A language and environment for statistical computing. 2023. Tricco AC, Lillie E, Zarin W, O'Brien KK, Colquhoun H, Levac D, et al. PRISMA Extension for Scoping Reviews (PRISMA-ScR): Checklist and explanation. Ann Intern Med. 2018;169(7):467–73. Agier L, Basagaña X, Hernandez-Ferrer C, Maitre L, Tamayo Uria I, Urquiza J, et al. Association between the pregnancy exposome and fetal growth. Int J Epidemiol. 2020;49(2):572–86. Allison RM, Birken CS, Lebovic G, Howard AW, L’Abbe MR, Morency M-E, et al. Consumption of cow’s milk in early childhood and fracture risk: A prospective cohort study. Am J Epidemiol. 2020;189(2):146–55. Badon SE, Quesenberry CP, Xu F, Avalos LA, Hedderson MM. Gestational weight gain, birthweight and early-childhood obesity: between- and within-family comparisons. Int J Epidemiol. 2020;49(5):1682–90. Barul C, Richard H, Parent M-E. Night-shift work and risk of prostate cancer: Results from a Canadian case-control study, the Prostate Cancer and Environment Study. Am J Epidemiol. 2019;188(10):1801–11.
Bell GA, Perkins N, Buck Louis GM, Kannan K, Bell EM, Gao C et al. Exposure to persistent organic pollutants and birth characteristics: The Upstate KIDS Study. Epidemiology. 2019;30. Bell-Gorrod H, Fox MP, Boulle A, Prozesky H, Wood R, Tanser F, et al. The impact of delayed switch to second-line antiretroviral therapy on mortality, depending on definition of failure time and CD4 count at failure. Am J Epidemiol. 2020;189(8):811–9. Bernasconi DP, Antolini L, Rossi E, Blanco-Lopez JG, Galimberti S, Andersen PK, et al. A causal inference approach to compare leukaemia treatment outcome in the absence of randomization and with dependent censoring. Int J Epidemiol. 2022;51(1):314–23. Bhatta L, Cepelis A, Vikjord SA, Malmo V, Laugsand LE, Dalen H, et al. Bone mineral density and risk of cardiovascular disease in men and women: the HUNT study. Eur J Epidemiol. 2021;36(11):1169–77. Bijlsma MJ, Wilson B, Tarkiainen L, Myrskylä M, Martikainen P. The impact of unemployment on antidepressant purchasing: Adjusting for unobserved time-constant confounding in the g-formula. Epidemiology. 2019;30(3). Bjelland EK, Gran JM, Hofvind S, Eskild A. The association of birthweight with age at natural menopause: a population study of women in Norway. Int J Epidemiol. 2020;49(2):528–36. Blouin B, Casapia M, Kaufman JS, Joseph L, Larson C, Gyorkos TW. Bayesian methods for exposure misclassification adjustment in a mediation analysis: Hemoglobin and malnutrition in the association between Ascaris and IQ. Epidemiology. 2019;30(5). Borch KB, Weiderpass E, Braaten T, Hansen MS, Licaj I. Risk of lung cancer and physical activity by smoking status and body mass index, the Norwegian Women and Cancer Study. Eur J Epidemiol. 2019;34(5):489–98. Cepelis A, Brumpton BM, Laugsand LE, Dalen H, Langhammer A, Janszky I, et al. Asthma, asthma control and risk of acute myocardial infarction: HUNT study. Eur J Epidemiol. 2019;34(10):967–77. Chasekwa B, Ntozini R, Church JA, Majo FD, Tavengwa N, Mutasa B et al.
Prevalence, risk factors and short-term consequences of adverse birth outcomes in Zimbabwean pregnant women: a secondary analysis of a cluster-randomized trial. Int J Epidemiol. 2021:dyab248. Chen J, van der Duin D, Campos-Obando N, Ikram MA, Nijsten TEC, Uitterlinden AG, et al. Serum 25-hydroxyvitamin D3 is associated with advanced glycation end products (AGEs) measured as skin autofluorescence: The Rotterdam Study. Eur J Epidemiol. 2019;34(1):67–77. Chen R, Tedroff K, Villamor E, Lu D, Cnattingius S. Risk of intellectual disability in children born appropriate-for-gestational-age at term or post-term: impact of birth weight for gestational age and gestational age. Eur J Epidemiol. 2020;35(3):273–82. Chen Y, Kim ES, VanderWeele TJ. Religious-service attendance and subsequent health and well-being throughout adulthood: evidence from three prospective cohorts. Int J Epidemiol. 2020;49(6):2030–40. Chen Z, Glisic M, Song M, Aliahmad HA, Zhang X, Moumdjian AC, et al. Dietary protein intake and all-cause and cause-specific mortality: results from the Rotterdam Study and a meta-analysis of prospective cohort studies. Eur J Epidemiol. 2020;35(5):411–29. Chigogora S, Pearce A, Law C, Viner R, Chittleborough C, Griffiths LJ et al. Could greater physical activity reduce population prevalence and socioeconomic inequalities in children’s mental health problems? A policy simulation. Epidemiology. 2020;31(1). Cohen JM, Wood ME, Hernández-Díaz S, Ystrom E, Nordeng H. Paternal antidepressant use as a negative control for maternal use: assessing familial confounding on gestational length and anxiety traits in offspring. Int J Epidemiol. 2019;48(5):1665–72. Colen CG, Pinchak NP, Barnett KS. Racial disparities in health among college-educated African Americans: Can attendance at historically black colleges or universities reduce the risk of metabolic syndrome in midlife? Am J Epidemiol. 2021;190(4):553–61. Coulombe J, Moodie EEM, Shortreed SM, Renoux C.
Can the risk of severe depression-related outcomes be reduced by tailoring the antidepressant therapy to patient characteristics? Am J Epidemiol. 2021;190(7):1210–9. Crump C, Friberg D, Li X, Sundquist J, Sundquist K. Preterm birth and risk of sleep-disordered breathing from childhood into mid-adulthood. Int J Epidemiol. 2019;48(6):2039–49. Dam V, van der Schouw YT, Onland-Moret NC, Groenwold RHH, Peters SAE, Burgess S, et al. Association of menopausal characteristics and risk of coronary heart disease: a pan-European case–cohort analysis. Int J Epidemiol. 2019;48(4):1275–85. Debras C, Chazelas E, Srour B, Julia C, Kesse-Guyot E, Zelek L, et al. Glycaemic index, glycaemic load and cancer risk: results from the prospective NutriNet-Santé cohort. Int J Epidemiol. 2022;51(1):250–64. Dekhtyar S, Vetrano DL, Marengoni A, Wang H-X, Pan K-Y, Fratiglioni L, et al. Association between speed of multimorbidity accumulation in old age and life experiences: A cohort study. Am J Epidemiol. 2019;188(9):1627–36. Delaney JA, Nance RM, Whitney BM, Crane HM, Williams-Nguyen J, Feinstein MJ et al. Cumulative human immunodeficiency viremia, antiretroviral therapy, and incident myocardial infarction. Epidemiology. 2019;30(1). Enthoven CA, Tideman JWL, Polling JR, Tedja MS, Raat H, Iglesias AI, et al. Interaction between lifestyle and genetic susceptibility in myopia: the Generation R study. Eur J Epidemiol. 2019;34(8):777–84. Ferraro AA, Barbieri MA, da Silva AAM, Goldani MZ, Fernandes MTB, Cardoso VC, et al. Cesarean delivery and hypertension in early adulthood. Am J Epidemiol. 2019;188(7):1296–303. Flannagan KS, Mumford SL, Sjaarda LA, Radoc JG, Perkins NJ, Andriessen VC et al. Is opioid use safe in women trying to conceive? Epidemiology. 2020;31(6). Fraser GE, Jaceldo-Siegl K, Orlich M, Mashchak A, Sirirat R, Knutsen S. Dairy, soy, and risk of breast cancer: those confounded milks. Int J Epidemiol. 2020;49(5):1526–37. Freedman LS, Agay N, Farmer R, Murad H, Olmer L, Dankner R.
Metformin treatment among men with diabetes and the risk of prostate cancer: A population-based historical cohort study. Am J Epidemiol. 2022;191(4):626–35. Garcia-Saenz A, de Miguel AS, Espinosa A, Costas L, Aragonés N, Tonne C et al. Association between outdoor light-at-night exposure and colorectal cancer in Spain. Epidemiology. 2020;31(5). George KM, Lutsey PL, Kucharska-Newton A, Palta P, Heiss G, Osypuk T, et al. Life-course individual and neighborhood socioeconomic status and risk of dementia in the Atherosclerosis Risk in Communities Neurocognitive Study. Am J Epidemiol. 2020;189(10):1134–42. Gerlovin H, Posner DC, Ho Y-L, Rentsch CT, Tate JP, King JT Jr., et al. Pharmacoepidemiology, machine learning, and COVID-19: An intent-to-treat analysis of hydroxychloroquine, with or without azithromycin, and COVID-19 outcomes among hospitalized US veterans. Am J Epidemiol. 2021;190(11):2405–19. Gero K, Hikichi H, Aida J, Kondo K, Kawachi I. Associations between community social capital and preservation of functional capacity in the aftermath of a major disaster. Am J Epidemiol. 2020;189(11):1369–78. Gialamas A, Haag DG, Mittinty MN, Lynch J. Which time investments in the first 5 years of life matter most for children’s language and behavioural outcomes at school entry? Int J Epidemiol. 2020;49(2):548–58. Giorgianni F, Ernst P, Dell’Aniello S, Suissa S, Renoux C. β2-agonists and the incidence of Parkinson disease. Am J Epidemiol. 2020;189(8):801–10. Goin DE, Izano MA, Eick SM, Padula AM, DeMicco E, Woodruff TJ et al. Maternal experience of multiple hardships and fetal growth: extending environmental mixtures methodology to social exposures. Epidemiology. 2021;32(1). Goin DE, Pearson RM, Craske MG, Stein A, Pettifor A, Lippman SA, et al. Depression and incident HIV in adolescent girls and young women in HIV prevention trials network 068: targets for prevention and mediating factors. Am J Epidemiol. 2020;189(5):422–32. 
Gram IT, Park S-Y, Maskarinec G, Wilkens LR, Haiman CA, Le Marchand L. Smoking and breast cancer risk by race/ethnicity and oestrogen and progesterone receptor status: the Multiethnic Cohort (MEC) study. Int J Epidemiol. 2019;48(2):501–11. Hamad R, Batra A, Karasek D, LeWinn KZ, Bush NR, Davis RL, et al. The impact of the revised WIC food package on maternal nutrition during pregnancy and postpartum. Am J Epidemiol. 2019;188(8):1493–502. Harlow AF, Hatch EE, Wesselink AK, Rothman KJ, Wise LA. Electronic cigarettes and fecundability: Results from a prospective preconception cohort study. Am J Epidemiol. 2021;190(3):353–61. Harlow AF, Wesselink AK, Hatch EE, Rothman KJ, Wise LA. Male preconception marijuana use and spontaneous abortion: A prospective cohort study. Epidemiology. 2021;32(2). He J-R, Hirst JE, Tikellis G, Phillips GS, Ramakrishnan R, Paltiel O et al. Common maternal infections during pregnancy and childhood leukaemia in the offspring: findings from six international birth cohorts. Int J Epidemiol. 2021:dyab199. Hillreiner A, Baumeister SE, Sedlmeier AM, Finger JD, Schlitt HJ, Leitzmann MF. Association between cardiorespiratory fitness and colorectal cancer in the UK Biobank. Eur J Epidemiol. 2020;35(10):961–73. Hjorth S, Pottegård A, Broe A, Hemmingsen CH, Leinonen MK, Hargreave M et al. Prenatal exposure to nitrofurantoin and risk of childhood leukaemia: A registry-based cohort study in four Nordic countries. Int J Epidemiol. 2021:dyab219. Hlaváčová J, Flegr J, Řežábek K, Calda P, Kaňková Š. Male-to-female presumed transmission of toxoplasmosis between sexual partners. Am J Epidemiol. 2021;190(3):386–92. Holzhausen EA, Hagen EW, LeCaire T, Cadmus-Bertram L, Malecki KC, Peppard PE. A comparison of self- and proxy-reported subjective sleep durations with objective actigraphy measurements in a survey of Wisconsin children 6–17 years of age. Am J Epidemiol. 2021;190(5):755–65. Hu MD, Lawrence KG, Bodkin MR, Kwok RK, Engel LS, Sandler DP. 
Neighborhood deprivation, obesity, and diabetes in residents of the US Gulf Coast. Am J Epidemiol. 2021;190(2):295–304. India-Aldana S, Rundle AG, Zeleniuch-Jacquotte A, Quinn JW, Kim B, Afanasyeva Y et al. Neighborhood walkability and mortality in a prospective cohort of women. Epidemiology. 2021;32(6). Inoue K, Mayeda ER, Paul KC, Shih IF, Yan Q, Yu Y, et al. Mediation of the associations of physical activity with cardiovascular events and mortality by diabetes in older Mexican Americans. Am J Epidemiol. 2020;189(10):1124–33. Inoue K, Ritz B, Ernst A, Tseng W-L, Yuan Y, Meng Q, et al. Behavioral problems at age 11 years after prenatal and postnatal exposure to acetaminophen: Parent-reported and self-reported outcomes. Am J Epidemiol. 2021;190(6):1009–20. Ishii M, Seki T, Kaikita K, Sakamoto K, Nakai M, Sumita Y, et al. Short-term exposure to desert dust and the risk of acute myocardial infarction in Japan: a time-stratified case-crossover study. Eur J Epidemiol. 2020;35(5):455–64. Isumi A, Doi S, Ochi M, Kato T, Fujiwara T. Child maltreatment and mental health in middle childhood: A longitudinal study in Japan. Am J Epidemiol. 2022;191(4):655–64. Janki S, Dehghan A, van de Wetering J, Steyerberg EW, Klop KWJ, Kimenai HJAN, et al. Long-term prognosis after kidney donation: a propensity score matched comparison of living donors and non-donors from two population cohorts. Eur J Epidemiol. 2020;35(7):699–707. Kerschberger B, Boulle A, Kuwengwa R, Ciglenecki I, Schomaker M. The impact of same-day antiretroviral therapy initiation under the World Health Organization treat-all policy. Am J Epidemiol. 2021;190(8):1519–32. Kim K, Browne RW, Nobles CJ, Radin RG, Holland TL, Omosigho UR et al. Associations between preconception plasma fatty acids and pregnancy outcomes. Epidemiology. 2019;30. Lara M, Labrecque JA, van Lenthe FJ, Voortman T. 
Estimating reductions in ethnic inequalities in child adiposity from hypothetical diet, screen time, and sports participation interventions. Epidemiology. 2020;31(5). Leon ME, Schinasi LH, Lebailly P, Beane Freeman LE, Nordby K-C, Ferro G, et al. Pesticide use and risk of non-Hodgkin lymphoid malignancies in agricultural cohorts from France, Norway and the USA: a pooled analysis from the AGRICOH consortium. Int J Epidemiol. 2019;48(5):1519–35. Lepage B, Colineaux H, Kelly-Irving M, Vineis P, Delpierre C, Lang T. Comparison of smoking reduction with improvement of social conditions in early life: simulation in a British cohort. Int J Epidemiol. 2021;50(3):797–808. Lergenmuller S, Ghiasvand R, Robsahm TE, Green AC, Lund E, Rueegg CS, et al. Sunscreens with high versus low sun protection factor and cutaneous squamous cell carcinoma risk: A population-based cohort study. Am J Epidemiol. 2022;191(1):75–84. Lerro CC, Hofmann JN, Andreotti G, Koutros S, Parks CG, Blair A, et al. Dicamba use and cancer incidence in the agricultural health study: an updated analysis. Int J Epidemiol. 2020;49(4):1326–37. Louie P, Upenieks L, Siddiqi A, Williams DR, Takeuchi DT. Race, flourishing, and all-cause mortality in the United States, 1995–2016. Am J Epidemiol. 2021;190(9):1735–43. Love S-AM, North KE, Zeng D, Petruski-Ivleva N, Kucharska-Newton A, Palta P, et al. Nine-year ethanol intake trajectories and their association with 15-year cognitive decline among black and white adults: The Atherosclerosis Risk in Communities Neurocognitive Study. Am J Epidemiol. 2020;189(8):788–800. Lyall K, Windham GC, Snyder NW, Kuskovsky R, Xu P, Bostwick A, et al. Association between midpregnancy polyunsaturated fatty acid levels and offspring autism spectrum disorder in a California population-based case-control study. Am J Epidemiol. 2021;190(2):265–76. Magnus MC, Fraser A, Rich-Edwards JW, Magnus P, Lawlor DA, Håberg SE. Time-to-pregnancy and risk of cardiovascular disease among men and women. 
Eur J Epidemiol. 2021;36(4):383–91. Mårild K, Tapia G, Midttun Ø, Ueland PM, Magnus MC, Rewers M, et al. Smoking in pregnancy, cord blood cotinine and risk of celiac disease diagnosis in offspring. Eur J Epidemiol. 2019;34(7):637–49. Mitchell A, Fall T, Melhus H, Wolk A, Michaëlsson K, Byberg L. Is the effect of Mediterranean diet on hip fracture mediated through type 2 diabetes mellitus and body mass index? Int J Epidemiol. 2021;50(1):234–44. Mitha A, Chen R, Johansson S, Razaz N, Cnattingius S. Maternal body mass index in early pregnancy and severe asphyxia-related complications in preterm infants. Int J Epidemiol. 2020;49(5):1647–60. Mollan KR, Pence BW, Xu S, Edwards JK, Mathews WC, O’Cleirigh C, et al. Transportability from randomized trials to clinical care: On initial HIV treatment with efavirenz and suicidal thoughts or behaviors. Am J Epidemiol. 2021;190(10):2075–84. Mooldijk SS, Licher S, Vinke EJ, Vernooij MW, Ikram MK, Ikram MA. Season of birth and the risk of dementia in the population-based Rotterdam Study. Eur J Epidemiol. 2021;36(5):497–506. Naël V, Pérès K, Dartigues J-F, Letenneur L, Amieva H, Arleo A, et al. Vision loss and 12-year risk of dementia in older adults: the 3C cohort study. Eur J Epidemiol. 2019;34(2):141–52. Nøst TH, Alcala K, Urbarova I, Byrne KS, Guida F, Sandanger TM, et al. Systemic inflammation markers and cancer incidence in the UK Biobank. Eur J Epidemiol. 2021;36(8):841–8. O’Brien KM, D’Aloisio AA, Shi M, Murphy JD, Sandler DP, Weinberg CR. Perineal talc use, douching, and the risk of uterine cancer. Epidemiology. 2019;30(6). Ong YY, Sadananthan SA, Aris IM, Tint MT, Yuan WL, Huang JY, et al. Mismatch between poor fetal growth and rapid postnatal weight gain in the first 2 years of life is associated with higher blood pressure and insulin resistance without increased adiposity in childhood: the GUSTO cohort study. Int J Epidemiol. 2020;49(5):1591–603. Oude Groeniger J, de Koster W, van der Waal J. 
Time-varying Effects of Screen Media Exposure in the Relationship Between Socioeconomic Background and Childhood Obesity. Epidemiology. 2020;31(4). Pedersen KM, Çolak Y, Vedel-Krogh S, Kobylecki CJ, Bojesen SE, Nordestgaard BG. Risk of ulcerative colitis and Crohn’s disease in smokers lacks causal evidence. Eur J Epidemiol. 2021. Pinto Pereira SM, De Stavola BL, Rogers NT, Hardy R, Cooper R, Power C. Adult obesity and mid-life physical functioning in two British birth cohorts: investigating the mediating role of physical inactivity. Int J Epidemiol. 2020;49(3):845–56. Pongiglione B, Kern ML, Carpentieri JD, Schwartz HA, Gupta N, Goodman A. Do children’s expectations about future physical activity predict their physical activity in adulthood? Int J Epidemiol. 2020;49(5):1749–58. Radojčić MR, Perera RS, Chen L, Spector TD, Hart DJ, Ferreira ML, et al. Specific body mass index trajectories were related to musculoskeletal pain and mortality: 19-year follow-up cohort. J Clin Epidemiol. 2022;141:54–63. Ranzani OT, Milà C, Sanchez M, Bhogadi S, Kulkarni B, Balakrishnan K, et al. Association between ambient and household air pollution with carotid intima-media thickness in peri-urban South India: CHAI-Project. Int J Epidemiol. 2020;49(1):69–79. Reese H, Routray P, Torondel B, Sinharoy SS, Mishra S, Freeman MC, et al. Assessing longer-term effectiveness of a combined household-level piped water and sanitation intervention on child diarrhoea, acute respiratory infection, soil-transmitted helminth infection and nutritional status: a matched cohort study in rural Odisha, India. Int J Epidemiol. 2019;48(6):1757–67. Reinhard E, Carrino L, Courtin E, van Lenthe FJ, Avendano M. Public transportation use and cognitive function in older age: A quasiexperimental evaluation of the Free Bus Pass Policy in the United Kingdom. Am J Epidemiol. 2019;188(10):1774–83. Rhee J, Loftfield E, Freedman ND, Liao LM, Sinha R, Purdue MP. 
Coffee consumption and risk of renal cell carcinoma in the NIH-AARP Diet and Health Study. Int J Epidemiol. 2021;50(5):1473–81. Richardson K, Mattishent K, Loke YK, Steel N, Fox C, Grossi CM, et al. History of benzodiazepine prescriptions and risk of dementia: Possible bias due to prevalent users and covariate measurement timing in a nested case-control study. Am J Epidemiol. 2019;188(7):1228–36. Riddell CA, Goin DE, Morello-Frosch R, Apte JS, Glymour MM, Torres JM, et al. Hyper-localized measures of air pollution and risk of preterm birth in Oakland and San Jose, California. Int J Epidemiol. 2021;50(6):1875–85. Rist PM, Buring JE, Rexrode KM, Cook NR, Rost NS. Prospectively collected lifestyle and health information as risk factors for white matter hyperintensity volume in stroke patients. Eur J Epidemiol. 2019;34(10):957–65. Robert A, Edmunds WJ, Watson CH, Henao-Restrepo AM, Gsell P-S, Williamson E, et al. Determinants of transmission risk during the late stage of the West African Ebola epidemic. Am J Epidemiol. 2019;188(7):1319–27. Rogers NT, Blodgett JM, Searle SD, Cooper R, Davis DHJ, Pinto Pereira SM. Early-life socioeconomic position and the accumulation of health-related deficits by midlife in the 1958 British Birth Cohort Study. Am J Epidemiol. 2021;190(8):1550–60. Rogers NT, Power C, Pinto Pereira SM. Birthweight, lifetime obesity and physical functioning in mid-adulthood: a nationwide birth cohort study. Int J Epidemiol. 2020;49(2):657–65. Rovio SP, Pihlman J, Pahkala K, Juonala M, Magnussen CG, Pitkänen N, et al. Childhood exposure to parental smoking and midlife cognitive function: The Young Finns Study. Am J Epidemiol. 2020;189(11):1280–91. Rudolph KE, Gimbrone C, Díaz I. Helped into harm: Mediation of a housing voucher intervention on mental health and substance use in boys. Epidemiology. 2021;32(3). Rudolph KE, Levy J, Schmidt NM, Stuart EA, Ahern J. 
Using transportability to understand differences in mediation mechanisms across trial sites of a housing voucher experiment. Epidemiology. 2020;31(4):523–33. Salmon C, Song L, Muir K, Pashayan N, Dunning AM, Batra J, et al. Marital status and prostate cancer incidence: a pooled analysis of 12 case–control studies from the PRACTICAL consortium. Eur J Epidemiol. 2021;36(9):913–25. Sangaramoorthy M, Hines LM, Torres-Mejía G, Phipps AI, Baumgartner KB, Wu AH, et al. A pooled analysis of breastfeeding and breast cancer risk by hormone receptor status in parous Hispanic women. Epidemiology. 2019;30(3). Sato K, Amemiya A, Haseda M, Takagi D, Kanamori M, Kondo K, et al. Postdisaster changes in social capital and mental health: A natural experiment from the 2016 Kumamoto earthquake. Am J Epidemiol. 2020;189(9):910–21. Schliep KC, Mumford SL, Silver RM, Wilcox B, Radin RG, Perkins NJ, et al. Preconception perceived stress is associated with reproductive hormone levels and longer time to pregnancy. Epidemiology. 2019;30. Schuch HS, Nascimento GG, Peres KG, Mittinty MN, Demarco FF, Correa MB, et al. The controlled direct effect of early-life socioeconomic position on periodontitis in a birth cohort. Am J Epidemiol. 2019;188(6):1101–8. Schwartz GL, Leifheit KM, Berkman LF, Chen JT, Arcaya MC. Health selection into eviction: Adverse birth outcomes and children’s risk of eviction through age 5 years. Am J Epidemiol. 2021;190(7):1260–9. Sellers R, Warne N, Rice F, Langley K, Maughan B, Pickles A, et al. Using a cross-cohort comparison design to test the role of maternal smoking in pregnancy in child mental health and learning: evidence from two UK cohorts born four decades apart. Int J Epidemiol. 2020;49(2):390–9. Shiba K, Hanazato M, Aida J, Kondo K, Arcaya M, James P, et al. Cardiometabolic profiles and change in neighborhood food and built environment among older adults: A natural experiment. Epidemiology. 2020;31(6). Shiba K, Hikichi H, Aida J, Kondo K, Kawachi I. 
Long-term associations between disaster experiences and cardiometabolic risk: A natural experiment from the 2011 Great East Japan Earthquake and Tsunami. Am J Epidemiol. 2019;188(6):1109–19. Shiba K, Torres JM, Daoud A, Inoue K, Kanamori S, Tsuji T, et al. Estimating the impact of sustained social participation on depressive symptoms in older adults. Epidemiology. 2021;32(6). Stacy SL, Buchanich JM, Ma Z-q, Mair C, Robertson L, Sharma RK, et al. Maternal obesity, birth size, and risk of childhood cancer development. Am J Epidemiol. 2019;188(8):1503–11. Su Y, D’Arcy C, Meng X. Social support and positive coping skills as mediators buffering the impact of childhood maltreatment on psychological distress and positive mental health in adulthood: Analysis of a national population-based sample. Am J Epidemiol. 2020;189(5):394–402. Sudharsanan N, Ho JY. Rural–urban differences in adult life expectancy in Indonesia: A parametric g-formula–based decomposition approach. Epidemiology. 2020;31(3). Tefft BC, Arnold LS. Estimating cannabis involvement in fatal crashes in Washington State before and after the legalization of recreational cannabis consumption using multiple imputation of missing values. Am J Epidemiol. 2021;190(12):2582–91. Torres JM, Rudolph KE, Sofrygin O, Wong R, Walter LC, Glymour MM. Having an adult child in the United States, physical functioning, and unmet needs for care among older Mexican adults. Epidemiology. 2019;30(4). Torres JM, Sofrygin O, Rudolph KE, Haan MN, Wong R, Glymour MM. US migration status of adult children and cognitive decline among older parents who remain in Mexico. Am J Epidemiol. 2020;189(8):761–9. Tsarna E, Reedijk M, Birks LE, Guxens M, Ballester F, Ha M, et al. Associations of maternal cell-phone use during pregnancy with pregnancy duration and fetal growth in 4 birth cohorts. Am J Epidemiol. 2019;188(7):1270–80. Vable AM, Duarte Cd, Cohen AK, Glymour MM, Ream RK, Yen IH. 
Does the type and timing of educational attainment influence physical health? A novel application of sequence analysis. Am J Epidemiol. 2020;189(11):1389–401. van der Schaft N, Schoufour JD, Nano J, Kiefte-de Jong JC, Muka T, Sijbrands EJG, et al. Dietary antioxidant capacity and risk of type 2 diabetes mellitus, prediabetes and insulin resistance: the Rotterdam Study. Eur J Epidemiol. 2019;34(9):853–61. van Gennip ACE, Sedaghat S, Carnethon MR, Allen NB, Klein BEK, Cotch MF, et al. Retinal microvascular caliber and incident depressive symptoms: The Multi-Ethnic Study of Atherosclerosis. Am J Epidemiol. 2022;191(5):843–55. van Lee L, Crozier SR, Aris IM, Tint MT, Sadananthan SA, Michael N, et al. Prospective associations of maternal choline status with offspring body composition in the first 5 years of life in two large mother–offspring cohorts: the Southampton Women’s Survey cohort and the Growing Up in Singapore Towards healthy Outcomes cohort. Int J Epidemiol. 2019;48(2):433–44. Wagner M, Grodstein F, Proust-Lima C, Samieri C. Long-term trajectories of body weight, diet, and physical activity from midlife through late life and subsequent cognitive decline in women. Am J Epidemiol. 2020;189(4):305–13. Walsemann KM, Ailshire JA. Early educational experiences and trajectories of cognitive functioning among US adults in midlife and later. Am J Epidemiol. 2020;189(5):403–11. Wang C-R, Hu T-Y, Hao F-B, Chen N, Peng Y, Wu J-J, et al. Type 2 diabetes–prevention diet and all-cause and cause-specific mortality: A prospective study. Am J Epidemiol. 2022;191(3):472–86. Wang H, László KD, Gissler M, Li F, Zhang J, Yu Y, et al. Maternal hypertensive disorders and neurodevelopmental disorders in offspring: a population-based cohort in two Nordic countries. Eur J Epidemiol. 2021;36(5):519–30. Wesselink AK, Bresnick KA, Hatch EE, Rothman KJ, Mikkelsen EM, Wang TR, et al. Association between male use of pain medication and fecundability. Am J Epidemiol. 2020;189(11):1348–59. 
Wesselink AK, Claus Henn B, Fruh V, Orta OR, Weuve J, Hauser R, et al. A prospective ultrasound study of plasma polychlorinated biphenyl concentrations and incidence of uterine leiomyomata. Epidemiology. 2021;32(2). White J, Fluharty M, de Groot R, Bell S, Batty GD. Mortality among rough sleepers, squatters, residents of homeless shelters or hotels and sofa-surfers: a pooled analysis of UK birth cohorts. Int J Epidemiol. 2021:dyab253. Williams-Nguyen J, Hawes SE, Nance RM, Lindström S, Heckbert SR, Kim HN, et al. Association between chronic hepatitis C virus infection and myocardial infarction among people living with HIV in the United States. Am J Epidemiol. 2020;189(6):554–63. Xiao J, Gao Y, Yu Y, Toft G, Zhang Y, Luo J, et al. Associations of parental birth characteristics with autism spectrum disorder (ASD) risk in their offspring: a population-based multigenerational cohort study in Denmark. Int J Epidemiol. 2021;50(2):485–95. Xuan Y, Bobak M, Anusruti A, Jansen EHJM, Pająk A, Tamosiunas A, et al. Association of serum markers of oxidative stress with myocardial infarction and stroke: pooled results from four large European cohort studies. Eur J Epidemiol. 2019;34(5):471–81. Yisahak SF, Hinkle SN, Mumford SL, Li M, Andriessen VC, Grantz KL, et al. Vegetarian diets during pregnancy, and maternal and neonatal outcomes. Int J Epidemiol. 2021;50(1):165–78. Youssim I, Gorfine M, Calderon-Margalit R, Manor O, Paltiel O, Siscovick DS, et al. Holocaust experience and mortality patterns: 4-decade follow-up in a population-based cohort. Am J Epidemiol. 2021;190(8):1541–9. Yu D-W, Li Q-J, Cheng L, Yang P-F, Sun W-P, Peng Y, et al. Dietary vitamin K intake and the risk of pancreatic cancer: A prospective study of 101,695 American adults. Am J Epidemiol. 2021;190(10):2029–41. Yuan J, Hu YJ, Zheng J, Kim JH, Sumerlin T, Chen Y, et al. Long-term use of antibiotics and risk of type 2 diabetes in women: a prospective cohort study. Int J Epidemiol. 2020;49(5):1572–81. 
Zhou Z, Lin C, Ma J, Towne SD, Han Y, Fang Y. The association of social isolation with the risk of stroke among middle-aged and older adults in China. Am J Epidemiol. 2019;188(8):1456–65. Bell ML, Fiero M, Horton NJ, Hsu C-H. Handling missing data in RCTs; a review of the top medical journals. BMC Med Res Methodol. 2014;14(1):1–8. Additional Declarations No competing interests reported. Supplementary Files Additionalfile124may24.docx PRISMAScRFillableChecklist21may24.docx Status: Published Journal Publication published 04 Sep, 2024 Read the published version in BMC Medical Research Methodology → Version 1 posted Editorial decision: Revision requested 25 Jun, 2024 Reviews received at journal 22 Jun, 2024 Reviews received at journal 15 Jun, 2024 Reviewers agreed at journal 31 May, 2024 Reviewers agreed at journal 28 May, 2024 Reviewers invited by journal 27 May, 2024 Editor invited by journal 27 May, 2024 Submission checks completed at journal 27 May, 2024 Editor assigned by journal 27 May, 2024 First submitted to journal 21 May, 2024 
{"props":{"pageProps":{"initialData":{"identity":"rs-4452118","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":308844631,"identity":"3dcea8da-c01a-4dfa-bdc0-c9d55731ceee","order_by":0,"name":"Rheanna M Mainzer","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAABKUlEQVRIiWNgGAWjYDACCTiLsQFEykEYBQeA7ATitBhDGAZEaYGARLBGfFrMZzcf/Fy4w45Bt/1w28MfFffSN9xubt3wweAOAz97jgHDzzYMLTJ3jiVLzzyTzGB2JrHdQOJMce6GOwfbbs4weMYg2fPGgLEXU4uERI6BNG8bM4PZgcQ2CcO2hNwNNxLbbvMYHGYwuAG0hReblvzPv3nb6hnMzj9sk0j8l5BuANLyB6jFHqiF8S9WW9iAthxmMAOqlDjYkJAA1sIAsgXoAGastqSZWfO2Hecxu/GwTbLhWILhTKCWmz0Gz3gkzjwrOCxzDouW5Me3eduq5czOpz+T/FGTIM93I/3ZjR8Vd+T425M3PnxThjWgQYAHu8gBnBpGwSgYBaNgFOADAFqFcedxfOA/AAAAAElFTkSuQmCC","orcid":"","institution":"The University of Melbourne","correspondingAuthor":true,"prefix":"","firstName":"Rheanna","middleName":"M","lastName":"Mainzer","suffix":""},{"id":308844633,"identity":"23a728b6-4736-4dce-999c-b8103efaeb6f","order_by":1,"name":"Margarita Moreno-Betancur","email":"","orcid":"","institution":"Murdoch Children's Research Institute","correspondingAuthor":false,"prefix":"","firstName":"Margarita","middleName":"","lastName":"Moreno-Betancur","suffix":""},{"id":308844634,"identity":"07fda0ab-d236-4422-a990-bed429d6b82d","order_by":2,"name":"Cattram D Nguyen","email":"","orcid":"","institution":"Murdoch Children's Research 
Institute","correspondingAuthor":false,"prefix":"","firstName":"Cattram","middleName":"D","lastName":"Nguyen","suffix":""},{"id":308844635,"identity":"5b365428-36a4-46ef-a273-731f3c414932","order_by":3,"name":"Julie A Simpson","email":"","orcid":"","institution":"The University of Melbourne","correspondingAuthor":false,"prefix":"","firstName":"Julie","middleName":"A","lastName":"Simpson","suffix":""},{"id":308844636,"identity":"b723df6f-9aee-43f5-9122-b82d945160dc","order_by":4,"name":"John B. Carlin","email":"","orcid":"","institution":"Murdoch Children's Research Institute","correspondingAuthor":false,"prefix":"","firstName":"John","middleName":"B.","lastName":"Carlin","suffix":""},{"id":308844637,"identity":"5310558b-6a44-471a-a940-4d30a748ff2e","order_by":5,"name":"Katherine J Lee","email":"","orcid":"","institution":"Murdoch Children's Research Institute","correspondingAuthor":false,"prefix":"","firstName":"Katherine","middleName":"J","lastName":"Lee","suffix":""}],"badges":[],"createdAt":"2024-05-21 04:26:09","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-4452118/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-4452118/v1","draftVersion":[],"editorialEvents":[{"content":"https://doi.org/10.1186/s12874-024-02302-6","type":"published","date":"2024-09-04T16:05:15+00:00"}],"editorialNote":"","failedWorkflow":false,"files":[{"id":57518193,"identity":"aa51d705-88d2-45a9-ac73-5e0c4894b441","added_by":"auto","created_at":"2024-05-31 20:29:42","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":181319,"visible":true,"origin":"","legend":"\u003cp\u003eArticle screening process.\u003c/p\u003e","description":"","filename":"1.png","url":"https://assets-eu.researchsquare.com/files/rs-4452118/v1/6908b902b9cc3e819dc5e0a3.png"},{"id":64185708,"identity":"97b18d1e-5590-4761-b4ea-05b28e5f3f32","added_by":"auto","created_at":"2024-09-09 
16:21:10","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":967044,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-4452118/v1/96afb62f-0abc-46d3-83b5-101c08664c25.pdf"},{"id":57518192,"identity":"4b8ed4fb-53d5-4c65-aad7-96f309af0e03","added_by":"auto","created_at":"2024-05-31 20:29:41","extension":"docx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":19196,"visible":true,"origin":"","legend":"","description":"","filename":"Additionalfile124may24.docx","url":"https://assets-eu.researchsquare.com/files/rs-4452118/v1/ffad4fc20d9b4fdeefa732de.docx"},{"id":57518191,"identity":"8be65a21-ab3c-4a65-92c3-49e71d0d6224","added_by":"auto","created_at":"2024-05-31 20:29:41","extension":"docx","order_by":2,"title":"","display":"","copyAsset":false,"role":"supplement","size":86799,"visible":true,"origin":"","legend":"","description":"","filename":"PRISMAScRFillableChecklist21may24.docx","url":"https://assets-eu.researchsquare.com/files/rs-4452118/v1/5c81e7d4166c5f3e368796e9.docx"}],"financialInterests":"No competing interests reported.","formattedTitle":"Gaps in the usage and reporting of multiple imputation for incomplete data: Findings from a scoping review of observational studies addressing causal questions","fulltext":[{"header":"Background","content":"\u003cp\u003eObservational studies in medical and health-related research often aim to answer causal questions, which are sharply defined as the estimation of an average causal effect (ACE) of an exposure on an outcome in a population of interest.(\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e, \u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e) Missing data in observational studies often occurs in multiple variables required for the estimation of ACEs, such as the exposure, the outcome and/or the 
covariates used to control for confounding. Applying standard methods for ACE estimation (e.g., outcome regression with covariate adjustment) using data from complete records (\u0026ldquo;complete cases analysis\u0026rdquo; [CCA]) may lead to selection bias.(\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e) Therefore, missing data need to be carefully considered and addressed to minimise the potential for selection bias.\u003c/p\u003e \u003cp\u003eTo date most reviews of the handling of missing data have been carried out in the context of trials (see (\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e) and references therein). In contrast, there has been little attention given to how missing data are handled in observational studies, a context in which multivariable missingness is often encountered. One flexible and widely recommended approach for estimation in the presence of multivariable missingness is multiple imputation (MI).(\u003cspan additionalcitationids=\"CR6\" citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e) In the first stage of MI, missing data are imputed multiple times with random draws from the predictive distribution of the missing values given the observed data and a specified imputation model. In the second stage, the statistical analysis of interest (e.g., outcome regression with covariate adjustment) is applied to each imputed dataset and the results are combined to obtain a single estimate with associated standard error.(\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e) Mackinnon (2010) and Hayati Rezvan et al. 
(2015) reviewed the implementation and documentation of MI in both trials and observational studies,(\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e, \u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e) and Karahalios et al. (2012) reviewed how missing exposure data are reported in large cohort studies with one or more waves of follow-up.(\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e) However, none of these reviews focussed on complexities that arise due to multivariable missingness.\u003c/p\u003e \u003cp\u003eThe aim of the current study was to review the handling of missing data in observational studies that address causal questions using MI. A scoping review was conducted to systematically benchmark the current state of practice, focussing on four key areas: missing data summaries, missing data assumptions, primary and sensitivity analyses, and MI implementation. In the next section we describe considerations for transparent reporting within each of these four areas to provide context for our review. We then describe our scoping review methodology and present our results. We end with a discussion of our findings and key messages.\u003c/p\u003e\n\u003ch3\u003eConsiderations for reporting ACE estimation with MI from incomplete observational data\u003c/h3\u003e\n\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003eMissing data summaries\u003c/h2\u003e \u003cp\u003eDescribing the amount of missing data is an important first step for transparent reporting as the potential for selection bias will generally increase with larger proportions of missing data. When data are missing in a single variable, the number (%) of completely observed values for that variable also summarises the number (%) of complete cases. 
In contrast, when multiple variables required for analysis are incompletely observed, the number (%) of observed values for each variable may vastly differ from the number (%) with complete cases because of the pattern of missing data, that is, the way in which the variables are jointly missing. In the latter context, a complete description of the missing data would include summaries of the missing data for each variable, as well as summaries of each missing data pattern. Such summaries can be easily obtained in statistical software.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec4\" class=\"Section2\"\u003e \u003ch2\u003eMissing data assumptions\u003c/h2\u003e \u003cp\u003eDescribing the relationship between the missing and observed data, i.e., the \u0026ldquo;missing data mechanism\u0026rdquo;, is important because the performance of any estimation method depends critically on the missing data mechanism. Sometimes this will be known (e.g. a machine used for measurement temporarily stopped working), but in most cases the missing data mechanism will be unknown and assumptions about the mechanism, along with a justification for these assumptions, are required. Missingness assumptions are often expressed using the classification of missing data patterns as \u0026ldquo;missing completely at random\u0026rdquo; (MCAR), \u0026ldquo;missing at random\u0026rdquo; (MAR) or \u0026ldquo;missing not at random\u0026rdquo; (MNAR).(\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e, \u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e). 
However, assessing the plausibility of the MCAR/MAR/MNAR assumptions in the context of multivariable missingness is difficult, partly due to the existence of several different, often imprecise, definitions of MCAR, MAR and MNAR in the literature and the difficulty in interpreting these definitions,(\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e) and partly because assessment involves making a judgement about the dependence (or lack thereof) of the distribution of the missing data pattern on the observed and missing data.(\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e, \u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e) An attractive alternative to using the MCAR/MAR/MNAR assumptions is to view missing data as a causal problem and consider causes of missingness for each incompletely observed variable using missingness directed acyclic graphs (m-DAGs).(\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e, \u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e) m-DAGs are an extension to standard causal diagrams (DAGs) that include nodes to represent missingness in each incomplete variable, thereby allowing for the clear and transparent specification of assumptions about the causes of missing data, as well as the causal relationships amongst the main variables of interest. Although developing a realistic m-DAG can be challenging and time-consuming, m-DAGs lead to assumptions that are more transparent and easier to assess than assumptions expressed using the MCAR/MAR/MNAR framework. 
Uncertainty about the assumptions depicted in the m-DAG can be assessed using a sensitivity analysis (see next section).\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec5\" class=\"Section2\"\u003e \u003ch2\u003ePrimary and sensitivity analyses\u003c/h2\u003e \u003cp\u003eThe next important area for reporting is to justify and describe an appropriate primary method for estimation of the ACE, given the missingness assumptions. It is well known that both a CCA and standard MI (an implementation of MI that does not incorporate an external (to the data) assumption about a difference between the distribution of the observed and missing data) can provide consistent estimation of the ACE when data are MCAR, that standard MI can provide consistent estimation when data are MAR, and that both approaches may provide biased estimation when data are MNAR. However, in the context of multivariable missingness, a CCA can also provide consistent estimation under missingness mechanisms that could be classified as MAR, and both CCA and MI have been shown in theory and simulations to provide unbiased or approximately unbiased estimation of ACEs across a range of missingness mechanisms that could be classified as MNAR.(\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e) Therefore, it is not straightforward to justify an estimation approach even if it is believed that data are MAR or MNAR. In contrast, for a given m-DAG, graph theory can be used to establish whether the ACE is \u003cem\u003erecoverable\u003c/em\u003e (that is, whether it can be estimated unbiasedly from the observed data). If the ACE is recoverable, the process of establishing recoverability can aid in determining whether a CCA and/or standard MI would be appropriate for estimation (see, e.g., the worked example provided by Lee \u003cem\u003eet al.\u003c/em\u003e (\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e)). 
If the ACE is not recoverable, neither standard MI nor a CCA can be used for unbiased estimation, and a more sophisticated approach that incorporates an assumption about a difference in distribution between the missing and observed values is needed. For example, the not-at-random-fully-conditional-specification (NARFCS) procedure extends standard MI to incorporate such assumptions through the inclusion of a sensitivity parameter \u0026ldquo;delta\u0026rdquo;, elicited from external information, that represents the difference between the distributions of the observed and missing values.(\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e) The assumptions made about the missing data and how this justifies the choice of analytic method for the primary analysis should be carefully described.\u003c/p\u003e \u003cp\u003eSensitivity analyses to reflect uncertainty due to assumptions made about the missing data for the primary analysis are strongly recommended.(\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e, \u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e) There are two types of missing data sensitivity analyses to consider; the first is to examine the sensitivity of estimates to the assumptions made about the causes of missing data, e.g. the existence or strength of arrows in the m-DAG. The second type of missing data sensitivity analysis is to examine the sensitivity of estimates to assumptions made for modelling the missing data, e.g., the form of the imputation model. 
As with the primary analysis, sensitivity analyses should be justified and described in enough detail that the analysis could be reproduced.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec6\" class=\"Section2\"\u003e \u003ch2\u003eMI implementation\u003c/h2\u003e \u003cp\u003eWhen using standard MI for estimation, quantities that need to be described to ensure that the analysis could be reproduced include, but are not limited to: the imputation method, e.g., multivariate normal imputation or multivariate imputation by chained equations; the imputation model, e.g., which variables are included and in what form; if using multivariate imputation by chained equations, the models/methods that are used to impute each incomplete variable, e.g., linear or logistic regression; the number of imputations conducted; the analysis model that is fitted to obtain estimates within each imputed dataset; and the method for combining estimates across imputed datasets.(\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e) If using an approach that incorporates an assumption about a difference in distribution between the missing and observed values (e.g., a NARFCS procedure), then, in addition to the above quantities, it is important to describe how the assumption is incorporated in the models used for the estimation procedure.\u003c/p\u003e \u003c/div\u003e"},{"header":"Methods","content":"\u003cp\u003eThe protocol for this scoping review has been published previously.(\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e) Briefly, we included observational studies that aimed to answer at least one causal research question using MI, published in \u003cem\u003eInternational Journal of Epidemiology, American Journal of Epidemiology, European Journal of Epidemiology, Journal of Clinical Epidemiology\u003c/em\u003e and \u003cem\u003eEpidemiology\u003c/em\u003e between January 2019 and December 2021. 
These journals were chosen as they are high ranking, general journals in epidemiology that should capture current best practices in the use of MI for estimating ACEs from observational data. A full text search for the term \u0026ldquo;multiple imputation\u0026rdquo; was conducted on the journal websites, following the methodology of Hayati Rezvan \u003cem\u003eet al\u003c/em\u003e.(\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e) Causal questions were identified if the study authors explicitly stated that they were estimating an ACE or if the study authors estimated an effect that was given, at least implicitly, a causal interpretation. Studies were excluded from the review if they met any of the following criteria: the study did not aim to answer a causal question, a clear research goal could not be identified, the primary purpose of the article was methodological development, the analysis was based on aggregated data, the article reported qualitative research, the study exposure was assigned to participants by investigators (i.e. a trial), or the study was retracted. The most recent search was performed on 10th June 2022.\u003c/p\u003e \u003cp\u003eA random sample of 10 articles were independently screened and reviewed by two reviewers (RM and KL) to develop the data collection instrument. One reviewer (RM) screened and reviewed all articles. Double data extraction was independently completed for 10% of articles (RM and KL). In addition, a second reviewer (CN or KL) screened articles when there was uncertainty about the inclusion criteria and reviewed articles when there was uncertainty about the information being extracted. 
Disagreements between reviewers were resolved via discussion with a third reviewer.\u003c/p\u003e \u003cp\u003eA summary of the data extraction items and a copy of the data extraction questionnaire are provided in Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e and the Supplementary Material, respectively, of Mainzer \u003cem\u003eet al\u003c/em\u003e.(\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e) Briefly, for each study included in the review, data were extracted on the following: study characteristics; the quantity of missing data; the missing data assumptions made and whether these assumptions were justified; details of the primary analysis and whether or not the primary analysis was justified based on missing data assumptions; details of any secondary/sensitivity analysis conducted that handled the missing data differently from the primary analysis and its justification; and details of the MI implementation. For each study, we defined the \u0026ldquo;inception sample\u0026rdquo; as the set of participants who met eligibility criteria for inclusion in the study to answer the research question of interest, where eligibility criteria do not include any requirement for variables to be complete, and the \u0026ldquo;analysis sample\u0026rdquo; as the participants who were included in the analysis to answer the research question of interest. Extracted items were summarised using descriptive statistics. 
Data cleaning and analysis were performed in R.(\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e) Reporting follows the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews checklist.(\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e)\u003c/p\u003e "},{"header":"Results","content":"\u003cdiv id=\"Sec9\" class=\"Section2\"\u003e \u003ch2\u003eScreening process\u003c/h2\u003e \u003cp\u003eFigure \u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e presents a flow diagram of the article screening process. Of the 304 papers that met the inclusion criteria, 130 papers were included in this review.(\u003cspan additionalcitationids=\"CR23 CR24 CR25 CR26 CR27 CR28 CR29 CR30 CR31 CR32 CR33 CR34 CR35 CR36 CR37 CR38 CR39 CR40 CR41 CR42 CR43 CR44 CR45 CR46 CR47 CR48 CR49 CR50 CR51 CR52 CR53 CR54 CR55 CR56 CR57 CR58 CR59 CR60 CR61 CR62 CR63 CR64 CR65 CR66 CR67 CR68 CR69 CR70 CR71 CR72 CR73 CR74 CR75 CR76 CR77 CR78 CR79 CR80 CR81 CR82 CR83 CR84 CR85 CR86 CR87 CR88 CR89 CR90 CR91 CR92 CR93 CR94 CR95 CR96 CR97 CR98 CR99 CR100 CR101 CR102 CR103 CR104 CR105 CR106 CR107 CR108 CR109 CR110 CR111 CR112 CR113 CR114 CR115 CR116 CR117 CR118 CR119 CR120 CR121 CR122 CR123 CR124 CR125 CR126 CR127 CR128 CR129 CR130 CR131 CR132 CR133 CR134 CR135 CR136 CR137 CR138 CR139 CR140 CR141 CR142 CR143 CR144 CR145 CR146 CR147 CR148 CR149 CR150\" citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR151\" class=\"CitationRef\"\u003e151\u003c/span\u003e) Fourteen articles were screened by a second reviewer due to uncertainty about inclusion criteria. Double data extraction was completed for a further 14 articles. All disagreements were resolved via discussion. 
Minor changes were made to the review protocol to accommodate unanticipated challenges in data extraction (described in Additional file 1).\u003c/p\u003e \u003cp\u003e[Figure \u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e about here.]\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec10\" class=\"Section2\"\u003e \u003ch2\u003eStudy characteristics\u003c/h2\u003e \u003cp\u003eStudy characteristics are summarised in Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e. Most papers included in this review were published in \u003cem\u003eAmerican Journal of Epidemiology\u003c/em\u003e (38%) or \u003cem\u003eInternational Journal of Epidemiology\u003c/em\u003e (26%). The most common study design was a prospective longitudinal study (65%), followed by a retrospective analysis of routinely collected data (12%). The most common outcomes used for analyses were binary (35%) and time to event (38%). Few studies made their causal aim explicit (25%) or presented a DAG to depict causal assumptions (31%). 
However, most studies identified a set of variables to control for confounding (82%) and almost all studies estimated an effect using a regression model (or a more sophisticated causal effect estimation method such as g-computation) with adjustment for a set of covariates, implicitly or explicitly assumed to be confounders (99%).\u003c/p\u003e \n\u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eSummary of study characteristics for the 130 included papers.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"2\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCharacteristic\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003en (%)\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003ePublication year\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e2019\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e47 (36%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e2020\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e45 (35%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e2021\u003c/p\u003e \u003c/td\u003e \u003ctd 
align=\"left\" colname=\"c2\"\u003e \u003cp\u003e38 (29%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eJournal\u003csup\u003e1\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eAmerican Journal of Epidemiology\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e50 (38%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eEpidemiology\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e24 (18%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eEuropean Journal of Epidemiology\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e21 (16%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eInternational Journal of Epidemiology\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e34 (26%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eJournal of Clinical Epidemiology\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e1 (1%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eStudy design\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eProspective longitudinal study\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e85 (65%)\u003c/p\u003e \u003c/td\u003e 
\u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRetrospective analysis of routinely collected data\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e15 (12%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003ePooled cohort analysis\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e9 (7%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCase-control study\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e7 (5%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCross-sectional study\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e5 (4%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCase-cohort study\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e2 (2%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eOther\u003csup\u003e2\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e7 (5%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eType of outcome used for analysis\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eBinary\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e45 (35%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e 
\u003cp\u003eCategorical (excluding binary)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e3 (2%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eContinuous\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e33 (25%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eTime to event\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e49 (38%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCausal question inclusion criteria\u003csup\u003e3\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eExplicitly stated interest in a causal effect\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e33 (25%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eEstimate was given a causal interpretation\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e130 (100%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eTypical signals of a causal analysis\u003csup\u003e3\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eA directed acyclic graph was used to depict causal assumptions\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e40 (31%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" 
colname=\"c1\"\u003e \u003cp\u003eA set of variables was identified to control for confounding\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e106 (82%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eEffect was estimated using a regression model with adjustment for a set of covariates\u003csup\u003e4\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e129 (99%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003ctfoot\u003e \u003ctr\u003e\u003ctd colspan=\"2\"\u003e\u003csup\u003e1\u003c/sup\u003eNumber of papers published between January 2019 and December 2021 based on a PubMed search for ((\"2019/01/01\"[Date - Publication] : \"2021/12/31\"[Date - Publication])) AND (\"\u003cem\u003eJournal name\u003c/em\u003e\"[Journal]): American Journal of Epidemiology, 876; Epidemiology, 496; European Journal of Epidemiology, 370; International Journal of Epidemiology, 814; Journal of Clinical Epidemiology, 996.\u003c/td\u003e\u003c/tr\u003e \u003ctr\u003e\u003ctd colspan=\"2\"\u003e\u003csup\u003e2\u003c/sup\u003eSecondary analysis of trial data (n\u0026thinsp;=\u0026thinsp;2); prospective follow-up of cohort recruited for trial (n\u0026thinsp;=\u0026thinsp;2); pooled analysis of data from case-control and cohort studies (n\u0026thinsp;=\u0026thinsp;1); pooled analysis of data from case-control studies (n\u0026thinsp;=\u0026thinsp;1); transportability study using data from 4 clinical trials and 1 observational cohort (n\u0026thinsp;=\u0026thinsp;1).\u003c/td\u003e\u003c/tr\u003e \u003ctr\u003e\u003ctd colspan=\"2\"\u003e\u003csup\u003e3\u003c/sup\u003eCategories are not mutually exclusive.\u003c/td\u003e\u003c/tr\u003e \u003ctr\u003e\u003ctd colspan=\"2\"\u003e\u003csup\u003e4\u003c/sup\u003eOne study used structural equation modelling seemingly without adjustment for 
covariates, although a causal conclusion was made.\u003c/td\u003e\u003c/tr\u003e \u003c/tfoot\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e\n\u003cp\u003e[Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e about here]\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec11\" class=\"Section2\"\u003e \u003ch2\u003eMissing data summaries\u003c/h2\u003e \u003cp\u003eThe reported quantity of missing data is summarised in Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e. The size of the inception sample could not be established in 38% of studies, and 83% of studies derived an analysis sample by excluding individuals with missing data in specific variables. The percentage of complete cases could be established in only 33/130 (25%) studies (median, 25th \u0026ndash; 75th percentiles: 85%, 75% \u0026ndash; 92%). For another 81/130 (62%) studies, only an upper bound on the percentage of complete cases could be established, i.e. a bound tighter than 100% indicating the maximum possible percentage of complete cases given the missing data summaries provided (median upper bound, 25th \u0026ndash; 75th percentiles: 81%, 64% \u0026ndash; 91%). Most studies (88%) incurred missing data in multiple variables in the analysis sample (despite most studies already arriving at an analysis sample by excluding individuals with missing data in specific variables).\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eAmount of missing data. 
Summaries are n (%) unless stated otherwise.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"2\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCharacteristic\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eSummary\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eAble to establish the size of the inception sample\u003csup\u003e1\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eYes\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e81 (62%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNo\u003csup\u003e2\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e49 (38%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eAnalysis sample was defined by excluding individuals with missing data in specific variables\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eYes\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e108 (83%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNo\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e 
\u003cp\u003e22 (17%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eComplete cases\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eAble to establish the % of complete cases\u003c/p\u003e \u003cp\u003e% of complete cases, median (25th \u0026ndash; 75th percentiles)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e33 (25%)\u003c/p\u003e \u003cp\u003e85% (75% \u0026ndash; 92%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eOnly able to establish an upper bound on the % of complete cases\u003c/p\u003e \u003cp\u003eUpper bound on % of complete cases, median (25th \u0026ndash; 75th percentiles)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e81 (62%)\u003c/p\u003e \u003cp\u003e81% (64% \u0026ndash; 91%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNot able to establish the percentage of complete cases\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e16 (12%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMissing values in the exposure\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eYes, and able to establish the % of missing values\u003c/p\u003e \u003cp\u003e% of missing values, median (25th \u0026ndash; 75th percentiles)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e39 (30%)\u003c/p\u003e 
\u003cp\u003e11% (3% \u0026ndash; 16%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eYes, but only able to establish a lower bound on the % of missing values\u003c/p\u003e \u003cp\u003eLower bound on % of missing values, median (25th \u0026ndash; 75th percentiles)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e4 (3%)\u003c/p\u003e \u003cp\u003e31% (5% \u0026ndash; 67%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eYes, but unable to establish the % or a lower bound on the %\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e7 (5%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNo\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e70 (54%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eUnclear\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e10 (8%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMissing values in the outcome\u003csup\u003e3\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eYes, and able to establish the % of missing values\u003c/p\u003e \u003cp\u003e% of missing values, median (25th \u0026ndash; 75th percentiles)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e19 (15%)\u003c/p\u003e \u003cp\u003e9% (5% \u0026ndash; 28%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e 
\u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eYes, but only able to establish a lower bound on the % of missing values\u003c/p\u003e \u003cp\u003eLower bound on % of missing values, median (25th \u0026ndash; 75th percentiles)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e7 (5%)\u003c/p\u003e \u003cp\u003e6% (3% \u0026ndash; 23%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eYes, but unable to establish the % or a lower bound on the %\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e5 (4%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNo\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e91 (70%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eUnclear\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e8 (6%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMissing values in the covariates\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eYes, in 2 or more\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e109 (84%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eYes, in 1 covariate only\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e7 (5%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNo\u003c/p\u003e \u003c/td\u003e 
\u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e7 (5%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eUnable to establish\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e7 (5%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMultivariable missingness within analysis sample\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eYes\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e114 (88%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNo\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e8 (6%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eUnable to establish\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e8 (6%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003ctfoot\u003e \u003ctr\u003e\u003ctd colspan=\"2\"\u003e\u003csup\u003e1\u003c/sup\u003eInception sample defined as the participants who met eligibility criteria for inclusion in the study to answer the research question of interest, where eligibility criteria do not include any requirement for variables to be complete.\u003c/td\u003e\u003c/tr\u003e \u003ctr\u003e\u003ctd colspan=\"2\"\u003e\u003csup\u003e2\u003c/sup\u003eIncludes 5 studies where analyses were conducted separately by sub-groups (e.g., sex), but the inception sample for the sub-group could not be identified even though the inception sample for the entire group may have been provided.\u003c/td\u003e\u003c/tr\u003e 
\u003ctr\u003e\u003ctd colspan=\"2\"\u003e\u003csup\u003e3\u003c/sup\u003eTime-to-event outcomes were not considered to be missing data (we did not treat censored data as missing) except for in two studies where authors explicitly stated that the outcome was imputed.\u003c/td\u003e\u003c/tr\u003e \u003c/tfoot\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e[Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e about here]\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec12\" class=\"Section2\"\u003e \u003ch2\u003eMissing data assumptions\u003c/h2\u003e \u003cp\u003eMissing data assumptions are described in Table\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e. Most studies (66%) omitted a statement about missing data assumptions entirely. Of the 44 studies that did provide an explicit or indirect statement about missing data assumptions, 35/44 (80%) stated the MAR assumption, 2/44 (5%) stated the MCAR assumption and 6/44 (14%) alluded to data being \u0026ldquo;not MCAR\u0026rdquo; but did not distinguish between MAR and MNAR. Eleven of the 44 (25%) studies that provided a statement about missing data assumptions provided a justification for their missing data assumptions (described in Table\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e, footnote 3). 
Of the 130 studies in the review, 31 (24%) linked the justification for the primary analysis to the missing data assumptions.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab3\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 3\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eAssumptions about the missing data mechanism.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"2\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCharacteristic\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eSummary\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMissing data assumptions\u003csup\u003e1\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNo statement of missing data assumptions was provided\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e86 (66%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eData were assumed to be MAR\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e35 (27%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eData were assumed to be not MCAR\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e6 (5%)\u003c/p\u003e \u003c/td\u003e 
\u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eData were assumed to be MCAR\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e2 (2%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eA comprehensive description of missing data assumptions was provided, e.g., using an m-DAG\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e0 (0%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eOther\u003csup\u003e2\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e1 (1%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eJustification provided for missing data assumptions (as % of papers that made a statement about missing data assumptions, n\u0026thinsp;=\u0026thinsp;44)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eYes\u003csup\u003e3\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e11 (25%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNo\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e33 (75%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eJustified the primary analysis using missing data assumptions\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eYes\u003c/p\u003e \u003c/td\u003e \u003ctd 
align=\"left\" colname=\"c2\"\u003e \u003cp\u003e31 (24%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNo\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e98 (75%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eOther\u003csup\u003e4\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e1 (1%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003ctfoot\u003e \u003ctr\u003e\u003ctd colspan=\"2\"\u003eAbbreviations: MAR, missing at random; MCAR, missing completely at random; MNAR, missing not at random; m-DAG, missingness directed acyclic graph.\u003c/td\u003e\u003c/tr\u003e \u003ctr\u003e\u003ctd colspan=\"2\"\u003e\u003csup\u003e1\u003c/sup\u003eThe assumption may have been stated explicitly or made indirectly. For example, explicit statements of the MAR assumption include: \u0026ldquo;We assumed the missing at random assumption held and is reasonable\u0026rdquo;,(\u003cspan citationid=\"CR107\" class=\"CitationRef\"\u003e107\u003c/span\u003e) and \u0026ldquo;We imputed data using multiple imputation by chained equations under the assumption that data were missing at random\u0026rdquo;.(\u003cspan citationid=\"CR135\" class=\"CitationRef\"\u003e135\u003c/span\u003e) Indirect statements of the MAR assumption include \u0026ldquo;This multiple imputation approach assumes missing at random\u0026rdquo;,(\u003cspan citationid=\"CR88\" class=\"CitationRef\"\u003e88\u003c/span\u003e) and \u0026ldquo;We first imputed missing values using multiple imputation by chained equations, which assumes the data are missing at random conditional on the variables in the imputation model\u0026rdquo;.(\u003cspan citationid=\"CR115\" class=\"CitationRef\"\u003e115\u003c/span\u003e)\u003c/td\u003e\u003c/tr\u003e 
\u003ctr\u003e\u003ctd colspan=\"2\"\u003e\u003csup\u003e2\u003c/sup\u003eData assumed to be \u0026ldquo;MCAR, conditional on age and ethnicity\u0026rdquo; (n\u0026thinsp;=\u0026thinsp;1).\u003c/td\u003e\u003c/tr\u003e \u003ctr\u003e\u003ctd colspan=\"2\"\u003e\u003csup\u003e3\u003c/sup\u003eTwo studies justified assuming that data were MCAR; justifications included adding the questionnaire to the study after the study began (n\u0026thinsp;=\u0026thinsp;1) and a lack in data registration (n\u0026thinsp;=\u0026thinsp;1). Three studies justified assuming that data were not MCAR; justifications included clinicians ordering tests according to glucose level (n\u0026thinsp;=\u0026thinsp;1), and describing characteristics associated with missingness (n\u0026thinsp;=\u0026thinsp;2). Six studies justified assuming that data were MAR; justifications included describing characteristics associated with missingness and/or conducting formal hypothesis tests (n\u0026thinsp;=\u0026thinsp;4), examining the missingness pattern (n\u0026thinsp;=\u0026thinsp;1) and because children moved homes and/or were impossible to locate (n\u0026thinsp;=\u0026thinsp;1).\u003c/td\u003e\u003c/tr\u003e \u003ctr\u003e\u003ctd colspan=\"2\"\u003e\u003csup\u003e4\u003c/sup\u003eJustified MI to improve efficiency in the estimators.\u003c/td\u003e\u003c/tr\u003e \u003c/tfoot\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e[Table\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e about here]\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec13\" class=\"Section2\"\u003e \u003ch2\u003ePrimary and sensitivity analyses\u003c/h2\u003e \u003cp\u003eDetails of the primary and secondary/sensitivity analyses are described in Table\u0026nbsp;\u003cspan refid=\"Tab4\" class=\"InternalRef\"\u003e4\u003c/span\u003e. 
Most studies (79%) used MI as the primary analysis method and approximately half (69/130, 53%) of the studies conducted a secondary analysis that handled the missing data differently. Of the 69 studies that conducted a secondary analysis, 70% either provided no justification for conducting the secondary analysis or justified it as a sensitivity analysis without describing which aspect of their primary analysis they were assessing sensitivity to. A further 17/69 (25%) studies provided a vague justification for the secondary analysis, including to examine the influence of missing data (6%), to examine the impact of the missing data method (10%), and to address possible selection bias (9%). Of the studies that conducted a secondary analysis, 88% performed both a CCA and an MI analysis; of these, only 3 studies (5%) observed a substantial difference between CCA and MI estimates. One study (1%) conducted an \u0026ldquo;extreme case\u0026rdquo; analysis that involved single imputation of the outcome under two extreme scenarios, thereby incorporating an external assumption about a difference in distribution between the missing and observed outcome data. 
However, no studies used a model-based approach such as a NARFCS procedure or elicited external information from subject-matter experts about the difference in distribution between the missing and observed data.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab4\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 4\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003ePrimary and secondary analyses.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"2\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCharacteristic\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003en (%)\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMethod used for the primary analysis\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eStandard MI\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e80 (62%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eStandard MI, combined with weighting\u003csup\u003e1\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e23 (18%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCCA\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e21 (16%)\u003c/p\u003e 
\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eOther\u003csup\u003e2\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e6 (5%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSecondary analysis conducted that handled the missing data differently\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eYes\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e69 (53%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNo\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e61 (47%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMethod used for the secondary analysis (as % of papers that conducted a secondary analysis, n\u0026thinsp;=\u0026thinsp;69)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eStandard MI\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e27 (39%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eStandard MI, combined with weighting\u003csup\u003e1\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e2 (3%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCCA\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e26 (38%)\u003c/p\u003e 
\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCCA, combined with weighting\u003csup\u003e1\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e6 (9%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eConducted more than two secondary analyses\u003csup\u003e3\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e8 (12%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eJustification for the secondary analysis, (as % of papers that conducted a secondary analysis, n\u0026thinsp;=\u0026thinsp;69)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNot provided\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e25 (36%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eAs a sensitivity analysis (without further justification)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e23 (33%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eTo examine the influence of missing data\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e4 (6%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eTo examine the impact of the missing data method\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e7 (10%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e 
\u003cp\u003eTo address possible selection bias\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e6 (9%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eTo examine robustness to parametric modelling assumptions\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e1 (1%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eTo examine robustness to causal assumptions about the missing data mechanism\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e0 (0%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eOther\u003csup\u003e4\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e3 (4%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eConducted both a CCA and an MI analysis, regardless of whether weighting was used or not (as % of papers that conducted a secondary analysis, n\u0026thinsp;=\u0026thinsp;69)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eYes\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e61 (88%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNo\u003csup\u003e5\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e8 (12%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eObserved a substantial difference between MI and CCA estimates, (as % of papers that 
conducted both a CCA and an MI analysis, n\u0026thinsp;=\u0026thinsp;61)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eYes\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e3 (5%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNo\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e53 (95%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003ctfoot\u003e \u003ctr\u003e\u003ctd colspan=\"2\"\u003e\u003csup\u003e1\u003c/sup\u003eWeights used to address selection bias due to loss to follow-up or censoring; weights used to address confounding bias are excluded.\u003c/td\u003e\u003c/tr\u003e \u003ctr\u003e\u003ctd colspan=\"2\"\u003e\u003csup\u003e2\u003c/sup\u003eTreated \u0026ldquo;missing\u0026rdquo; as an additional category, with or without weighting (n\u0026thinsp;=\u0026thinsp;3); Single median imputation to obtain exposure (n\u0026thinsp;=\u0026thinsp;1); Single mean imputation for variables with \u0026gt;\u0026thinsp;25% missing values (n\u0026thinsp;=\u0026thinsp;1); MI for one covariate with \u0026gt;\u0026thinsp;25% missing values and single (median/mode) imputation for variables with less than 5% missing values (n\u0026thinsp;=\u0026thinsp;1).\u003c/td\u003e\u003c/tr\u003e \u003ctr\u003e\u003ctd colspan=\"2\"\u003e\u003csup\u003e3\u003c/sup\u003eDescribed in Additional file 1, Supplementary Table\u0026nbsp;1.\u003c/td\u003e\u003c/tr\u003e \u003ctr\u003e\u003ctd colspan=\"2\"\u003e\u003csup\u003e4\u003c/sup\u003eAs a sensitivity analysis to examine the robustness of findings to statistical assumptions without stating which statistical assumptions (n\u0026thinsp;=\u0026thinsp;1); as a sensitivity analysis to address possible 
selection bias and to exploit information in incomplete record participants (n\u0026thinsp;=\u0026thinsp;1); Estimates were presented from both MI and a CCA after fitting two different models (one weighted and one unweighted). No justification was provided for conducting both MI and CCA analyses, but fitting two models was justified by seeing whether the choice of model impacted results and fitting models with and without weights was conducted to see how weighting affected the results (n\u0026thinsp;=\u0026thinsp;1).\u003c/td\u003e\u003c/tr\u003e \u003ctr\u003e\u003ctd colspan=\"2\"\u003e\u003csup\u003e5\u003c/sup\u003eStandard MI, with and without weighting (n\u0026thinsp;=\u0026thinsp;4); Standard MI, with weighting, with and without imputation of exposure (n\u0026thinsp;=\u0026thinsp;1); Standard MI, with and without inclusion of the outcome in the imputation model (n\u0026thinsp;=\u0026thinsp;1); Three versions of single imputation and standard MI (n\u0026thinsp;=\u0026thinsp;1); Missing treated as an additional category, last value carried forward and standard MI (n\u0026thinsp;=\u0026thinsp;1).\u003c/td\u003e\u003c/tr\u003e \u003c/tfoot\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e[Table\u0026nbsp;\u003cspan refid=\"Tab4\" class=\"InternalRef\"\u003e4\u003c/span\u003e about here]\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec14\" class=\"Section2\"\u003e \u003ch2\u003eMI implementation\u003c/h2\u003e \u003cp\u003eThe details of the MI implementation are described in Table\u0026nbsp;\u003cspan refid=\"Tab5\" class=\"InternalRef\"\u003e5\u003c/span\u003e. Most studies (71%) reported the number of imputations (median, range: 20, 3\u0026ndash;100). Multivariate imputation by chained equations was the most commonly used imputation method (67% of studies), but the imputation method was unclear for a further 25% of studies. MI was most often conducted in Stata or R. 
In more than half of the studies it was unclear whether all analysis variables were included in the imputation procedure (58%), whether auxiliary variables were used in the imputation procedure (55%), and whether interactions were included in the imputation model (57%). Of the 87 studies that reported using multivariate imputation by chained equations, 18 (21%) reported the type of models that were used in the imputation procedure. In approximately two-thirds (65%) of studies, the method that was used to obtain a final MI estimate and its standard error was not stated and could not be deduced from the description in the paper.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab5\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 5\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eMultiple imputation implementation. Summaries are n (%) unless stated otherwise.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"2\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCharacteristic\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eSummary\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eReported number of imputations\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eYes\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e92 (71%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e 
\u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNo\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e38 (29%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNumber of imputations, median (range)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e20 (3\u0026ndash;100)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMultiple imputation method\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMultivariate imputation by chained equations\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e87 (67%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMultivariate normal imputation\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e6 (4.6%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eOther\u003csup\u003e1\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd 
align=\"left\" colname=\"c2\"\u003e \u003cp\u003e4 (3.1%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eUnclear\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e33 (25%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSoftware package used for conducting the multiple imputation analysis\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eStata\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e40 (31%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eR\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e33 (25%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSAS\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e26 (20%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSPSS\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e1 (0.8%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eOther\u003csup\u003e2\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e14 (11%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eUnclear\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e16 (12%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" 
colname=\"c1\"\u003e \u003cp\u003eAll analysis variables included in imputation model\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eYes\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e35 (27%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNo\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e20 (15%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eUnclear\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e75 (58%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eAuxiliary variables included in imputation model\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eYes\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e42 (32%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNo\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e16 (12%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eUnclear\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e72 (55%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eInteractions included in imputation model\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e 
\u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eYes\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e2 (1.5%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNo\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e54 (42%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eUnclear\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e74 (57%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eReported type of models used for imputation (as % of papers that used multivariate imputation by chained equations, n\u0026thinsp;=\u0026thinsp;87)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eYes\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e18 (21%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNo\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e69 (79%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eStated how a final estimate and standard error were obtained\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eEither stated, provided code or method could be deduced from software description\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e45 (35%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e 
\u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNot stated\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e85 (65%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003ctfoot\u003e \u003ctr\u003e\u003ctd colspan=\"2\"\u003e\u003csup\u003e1\u003c/sup\u003eImputation performed using a bootstrapping-based algorithm for panel data in R package Amelia II (n\u0026thinsp;=\u0026thinsp;1), imputation performed in the pan package mitml for multilevel data (n\u0026thinsp;=\u0026thinsp;1), referenced a paper where the MI methods are described rather than providing a description (n\u0026thinsp;=\u0026thinsp;1), used a multiple imputation analysis for exposure and covariates without stating what the analysis was, and used Kaplan-Meier multiple imputation for the outcome as part of a sensitivity analysis (n\u0026thinsp;=\u0026thinsp;1).\u003c/td\u003e\u003c/tr\u003e \u003ctr\u003e\u003ctd colspan=\"2\"\u003e\u003csup\u003e2\u003c/sup\u003eStudy used two software packages for analysis but it was not clear which package was used for MI (n\u0026thinsp;=\u0026thinsp;13), NORM software (n\u0026thinsp;=\u0026thinsp;1).\u003c/td\u003e\u003c/tr\u003e \u003c/tfoot\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e[Table\u0026nbsp;\u003cspan refid=\"Tab5\" class=\"InternalRef\"\u003e5\u003c/span\u003e about here]\u003c/p\u003e \u003c/div\u003e"},{"header":"Discussion","content":"\u003cp\u003eWe systematically reviewed the literature to assess the current state of practice in using MI for estimation of causal effects from incompletely observed observational data. We focussed on four key areas: missing data summaries, missing data assumptions, primary and sensitivity analyses, and MI implementation. 
Overall, we found that most studies did not report missing data, or missing-data-related assumptions, decisions and analyses, with sufficient clarity.\u003c/p\u003e \u003cp\u003eAn unanticipated, although perhaps unsurprising, finding from this review was that the analytical sample is often arrived at by excluding individuals with missing data in specific variables, for example, by using eligibility criteria that require key variables to be completely observed. This makes the full extent of missing data difficult to quantify, because the inception sample is hard to identify. Therefore, for the purposes of reporting the amount of missing data in this review, we considered the amount of missing data within the analysis sample only. However, identifying the exact amount of missing data within the well-defined analysis sample was also often difficult because summaries were frequently reported per variable without describing missing data patterns.\u003c/p\u003e \u003cp\u003eDetails of the assumptions made about the missing data mechanism were often lacking and, when provided, not justified appropriately. A statement of assumptions about the missingness mechanism was provided for just one-third (33%) of studies. This is, however, an improvement over what was found in the reviews conducted by Mackinnon (2010), where 8/50 (16%) observational studies provided a statement that data were MAR,(\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e) and Rezvan et al. (2015), where 7/30 (23%) observational studies stated or described the assumed missing data mechanism.(\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e) When a statement about the missing data mechanism was provided, most studies said they assumed data were MAR, but justifications for missingness assumptions were provided in just 11 studies. 
The most common justification for the MAR assumption was that participant characteristics differed between those with and without complete data, as determined by an investigation of summary statistics or by formal hypothesis tests. However, it is impossible to distinguish between MAR and MNAR using data-based assessments, so these justifications are incomplete. As described in the Introduction, the MCAR/MAR/MNAR assumptions are difficult to interpret and assess in the context of multivariable missingness, so it is not surprising that we found lacking or incomplete justifications for these assumptions. Of note, no study provided a comprehensive description of missing data assumptions, for example, using an m-DAG. Furthermore, the complete omission of a statement of missing data assumptions from most studies suggests that the critical link between missing data assumptions and estimation methods is not generally appreciated. When missing data assumptions were used to guide the choice of MI as the primary analysis, the most common justification for using MI was that data were assumed to be MAR (without justifying the MAR assumption).\u003c/p\u003e \u003cp\u003eMost studies in this review used standard MI for the primary analysis. Approximately half of the studies conducted a secondary analysis that treated the missing data differently from the primary analysis, but the reason for doing so was almost always omitted or unclear. When studies did carry out two analyses that handled the missing data differently, it was common to conduct both a CCA and MI. Without justification, it is not clear why such an analysis is warranted. It may be to examine the sensitivity of ACE estimates to causal assumptions made about the missing data mechanism for the primary analysis. 
We speculate that another motivation for such an analysis may be the misconception that a CCA is the \u0026ldquo;normal\u0026rdquo; approach to dealing with missing data while standard MI provides a more sophisticated analysis that allows one to assess whether the missing data were really an issue or not. However, if under plausible missingness assumptions neither standard MI nor CCA can provide unbiased estimation, then it would be incorrect to conclude that the missing data \u0026ldquo;had little impact\u0026rdquo; on the results. In other words, when there is no unbiased estimate to compare against, the impact of the missing data remains unknown. Of the 61 studies that conducted both a CCA and MI analysis, only 3 (5%) studies observed a substantial difference between MI and CCA estimates. Just one study conducted an analysis that incorporated assumptions about a difference between the missing and observed data distributions. Although this is an area of recent methodological development, our finding that such analyses are rarely performed is similar to findings from previous reviews, see e.g. (\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e, \u003cspan citationid=\"CR152\" class=\"CitationRef\"\u003e152\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eMI is increasingly recognised as a method for estimation that needs to be carefully tailored to the target analysis.(\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e) However, the findings from the current review suggest that there is room for improvement in the reporting of MI implementation. 
For example, certain aspects of the imputation model form were reported just over half of the time despite being needed to judge the appropriateness of the MI model and ensure the analysis can be reproduced.\u003c/p\u003e \u003cp\u003eThe strengths of this review are that it documents the current practices in the use of MI for estimating ACEs from incomplete observational data. Our review followed a clear, pre-specified protocol,(\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e) and, by including articles in top general epidemiology journals, our review reflects current best practice. Furthermore, the analysis conducted for the current study is entirely reproducible as all data and code are available on GitHub: github.com/rheanna-mainzer/MI-scoping-review. This review has several limitations. Authors may have chosen not to provide details on all aspects of handling missing data that we examined, for example, due to strict journal word limits. However, all accompanying supplementary material was also reviewed and used for data extraction. Most of the data extraction was performed by a single reviewer (RM), with double data extraction performed for 10% of studies, so there may be some extraction errors. Also, it may have been useful to extract additional items or extract items in more detail to better capture the variety of analyses undertaken. However, additional notes on each paper were recorded and are available as part of the complete dataset on GitHub. Lastly, by limiting to five top general epidemiology journals, our results may not reflect papers published in other journals, but it seems unlikely that less highly regarded journals would exhibit higher standards in this area of practice.\u003c/p\u003e"},{"header":"Conclusion","content":"\u003cp\u003eThe message from our review is clear: there is a need for greater clarity in the conduct and reporting of causal effect estimation using MI with incomplete observational data. 
Researchers are encouraged to move beyond the MCAR/MAR/MNAR framework and adopt a more transparent approach for outlining missing data assumptions, to use missing data assumptions to justify the estimation method, and to report their assumptions, methods and results systematically. The development of guidelines that journals can adopt is a key step needed to improve practice.\u003c/p\u003e"},{"header":"List Of Abbreviations","content":"\u003cdiv class=\"DefinitionList\"\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eACE\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eAverage causal effect\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eCCA\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eComplete case analysis\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eDAG\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eDirected acyclic graph\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eMAR\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eMissing at random\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eMCAR\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eMissing completely at random\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003em-DAG\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eMissingness directed acyclic graph\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eMI\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eMultiple imputation\u003c/p\u003e 
\u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eMNAR\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eMissing not at random\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eNARFCS\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eNot-at-random fully conditional specification\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003c/div\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cem\u003eEthics approval and consent to participate\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eConsent for publication\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eAvailability of data and materials\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eThe datasets supporting the conclusions of this article are available from RM\u0026rsquo;s GitHub repository: github.com/rheanna-mainzer/MI-scoping-review.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eCompeting interests\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare that they have no competing interests.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eFunding\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eThis work was supported by Australian National Health and Medical Research Council (NHMRC) Investigator Grant Leadership Level 1 grants (grant 1196068 awarded to JS and 2017498 to KL), an NHMRC Investigator Grant Emerging Leadership Level 2 (grant 2009572 awarded to MM-B) and an NHMRC Project Grant (grant 1166023). 
Research at the Murdoch Children\u0026rsquo;s Research Institute is supported by the Victorian Government\u0026rsquo;s Operational Infrastructure Support Program.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eAuthors\u0026rsquo; contributions\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eRM: Conceptualization, Software, Validation, Formal analysis, Writing \u0026ndash; Original Draft, Writing \u0026ndash; Review \u0026amp; Editing.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eMM-B: Conceptualization, Methodology, Writing \u0026ndash; Review \u0026amp; Editing.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eCN: Validation, Writing \u0026ndash; Review \u0026amp; Editing.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eJS: Conceptualization, Methodology, Writing \u0026ndash; Review \u0026amp; Editing.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eJC: Conceptualization, Methodology, Validation, Writing \u0026ndash; Review \u0026amp; Editing.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eKL: Conceptualization, Methodology, Validation, Supervision, Writing \u0026ndash; Review \u0026amp; Editing.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eAcknowledgements\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eHern\u0026aacute;n MA. The C-word: Scientific euphemisms do not improve causal inference from observational data. Am J Public Health. 2018;108(5):616\u0026ndash;9.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLederer DJ, Bell SC, Branson RD, Chalmers JD, Marshall R, Maslove DM, et al. Control of confounding and reporting of results in causal inference studies. Guidance for authors from editors of respiratory, sleep, and critical care journals. Annals Am Thorac Soc. 2019;16(1):22\u0026ndash;8.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMoreno-Betancur M, Lee KJ, Leacy FP, White IR, Simpson JA, Carlin JB. 
Canonical causal diagrams to guide the treatment of missing data in epidemiologic studies. Am J Epidemiol. 2018;187(12):2705\u0026ndash;15.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMainzer R, Moreno-Betancur M, Nguyen C, Simpson J, Carlin J, Lee K. Handling of missing data with multiple imputation in observational studies that address causal questions: protocol for a scoping review. BMJ Open. 2023;13(2):e065576.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRubin DB. Multiple imputation for nonresponse in surveys. Wiley; 2004.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eVan Buuren S. Flexible imputation of missing data. CRC; 2018.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMeng X-L. Multiple-imputation inferences with uncongenial sources of input. Stat Sci. 1994:538\u0026ndash;58.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHayati Rezvan P, Lee KJ, Simpson JA. The rise of multiple imputation: a review of the reporting and implementation of the method in medical research. BMC Med Res Methodol. 2015;15:1\u0026ndash;14.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMackinnon A. The use and reporting of multiple imputation in medical research\u0026ndash;a review. J Intern Med. 2010;268(6):586\u0026ndash;93.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKarahalios A, Baglietto L, Carlin JB, English DR, Simpson JA. A review of the reporting and handling of missing data in cohort studies with repeated assessment of exposure measures. BMC Med Res Methodol. 2012;12(1):96.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLee KJ, Carlin JB, Simpson JA, Moreno-Betancur M. Assumptions and analysis planning in studies with missing data in multiple variables: moving beyond the MCAR/MAR/MNAR classification. Int J Epidemiol. 
2023;52(4):1268\u0026ndash;75.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSterne JA, White IR, Carlin JB, Spratt M, Royston P, Kenward MG et al. Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls. BMJ. 2009;338.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSeaman S, Galati J, Jackson D, Carlin J. What is meant by missing at random? Stat Sci. 2013;28(2).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDoretti M, Geneletti S, Stanghellini E. Missing data: a unified taxonomy guided by conditional independence. Int Stat Rev. 2018;86(2):189\u0026ndash;204.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMohan K, Pearl J. Graphical models for processing missing data. J Am Stat Assoc. 2021;116(534):1023\u0026ndash;37.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhang J, Dashti SG, Carlin JB, Lee KJ, Moreno-Betancur M. Recoverability and estimation of causal effects under typical multivariable missingness mechanisms. arXiv preprint arXiv:230106739. 2023.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTompsett DM, Leacy F, Moreno-Betancur M, Heron J, White IR. On the use of the not-at-random fully conditional specification (NARFCS) procedure in practice. Stat Med. 2018;37(15):2338\u0026ndash;53.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLittle RJ, D'Agostino R, Cohen ML, Dickersin K, Emerson SS, Farrar JT, et al. The prevention and treatment of missing data in clinical trials. N Engl J Med. 2012;367(14):1355\u0026ndash;60.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLee KJ, Tilling KM, Cornish RP, Little RJ, Bell ML, Goetghebeur E, et al. Framework for the treatment and reporting of missing data in observational studies: The Treatment And Reporting of Missing data in Observational Studies framework. J Clin Epidemiol. 
2021;134:79\u0026ndash;88.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eR Core Team. R: A language and environment for statistical computing. 2023.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTricco AC, Lillie E, Zarin W, O'Brien KK, Colquhoun H, Levac D, et al. PRISMA Extension for Scoping Reviews (PRISMA-ScR): Checklist and explanation. Ann Intern Med. 2018;169(7):467\u0026ndash;73.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAgier L, Basaga\u0026ntilde;a X, Hernandez-Ferrer C, Maitre L, Tamayo Uria I, Urquiza J, et al. Association between the pregnancy exposome and fetal growth. Int J Epidemiol. 2020;49(2):572\u0026ndash;86.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAllison RM, Birken CS, Lebovic G, Howard AW, L\u0026rsquo;Abbe MR, Morency M-E, et al. Consumption of cow\u0026rsquo;s milk in early childhood and fracture risk: A prospective cohort study. Am J Epidemiol. 2020;189(2):146\u0026ndash;55.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBadon SE, Quesenberry CP, Xu F, Avalos LA, Hedderson MM. Gestational weight gain, birthweight and early-childhood obesity: between- and within-family comparisons. Int J Epidemiol. 2020;49(5):1682\u0026ndash;90.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBarul C, Richard H, Parent M-E. Night-shift work and risk of prostate cancer: Results from a Canadian case-control study, the Prostate Cancer and Environment Study. Am J Epidemiol. 2019;188(10):1801\u0026ndash;11.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBell GA, Perkins N, Buck Louis GM, Kannan K, Bell EM, Gao C et al. Exposure to persistent organic pollutants and birth characteristics: The Upstate KIDS Study. Epidemiology. 2019;30.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBell-Gorrod H, Fox MP, Boulle A, Prozesky H, Wood R, Tanser F, et al. 
The impact of delayed switch to second-line antiretroviral therapy on mortality, depending on definition of failure time and CD4 count at failure. Am J Epidemiol. 2020;189(8):811\u0026ndash;9.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBernasconi DP, Antolini L, Rossi E, Blanco-Lopez JG, Galimberti S, Andersen PK, et al. A causal inference approach to compare leukaemia treatment outcome in the absence of randomization and with dependent censoring. Int J Epidemiol. 2022;51(1):314\u0026ndash;23.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBhatta L, Cepelis A, Vikjord SA, Malmo V, Laugsand LE, Dalen H, et al. Bone mineral density and risk of cardiovascular disease in men and women: the HUNT study. Eur J Epidemiol. 2021;36(11):1169\u0026ndash;77.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBijlsma MJ, Wilson B, Tarkiainen L, Myrskyl\u0026auml; M, Martikainen P. The impact of unemployment on antidepressant purchasing: Adjusting for unobserved time-constant confounding in the g-formula. Epidemiology. 2019;30(3).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBjelland EK, Gran JM, Hofvind S, Eskild A. The association of birthweight with age at natural menopause: a population study of women in Norway. Int J Epidemiol. 2020;49(2):528\u0026ndash;36.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBlouin B, Casapia M, Kaufman JS, Joseph L, Larson C, Gyorkos TW. Bayesian methods for exposure misclassification adjustment in a mediation analysis: Hemoglobin and malnutrition in the association between Ascaris and IQ. Epidemiology. 2019;30(5).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBorch KB, Weiderpass E, Braaten T, Hansen MS, Licaj I. Risk of lung cancer and physical activity by smoking status and body mass index, the Norwegian Women and Cancer Study. Eur J Epidemiol. 
2019;34(5):489\u0026ndash;98.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCepelis A, Brumpton BM, Laugsand LE, Dalen H, Langhammer A, Janszky I, et al. Asthma, asthma control and risk of acute myocardial infarction: HUNT study. Eur J Epidemiol. 2019;34(10):967\u0026ndash;77.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChasekwa B, Ntozini R, Church JA, Majo FD, Tavengwa N, Mutasa B et al. Prevalence, risk factors and short-term consequences of adverse birth outcomes in Zimbabwean pregnant women: a secondary analysis of a cluster-randomized trial. Int J Epidemiol. 2021:dyab248.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChen J, van der Duin D, Campos-Obando N, Ikram MA, Nijsten TEC, Uitterlinden AG, et al. Serum 25-hydroxyvitamin D3 is associated with advanced glycation end products (AGEs) measured as skin autofluorescence: The Rotterdam Study. Eur J Epidemiol. 2019;34(1):67\u0026ndash;77.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChen R, Tedroff K, Villamor E, Lu D, Cnattingius S. Risk of intellectual disability in children born appropriate-for-gestational-age at term or post-term: impact of birth weight for gestational age and gestational age. Eur J Epidemiol. 2020;35(3):273\u0026ndash;82.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChen Y, Kim ES, VanderWeele TJ. Religious-service attendance and subsequent health and well-being throughout adulthood: evidence from three prospective cohorts. Int J Epidemiol. 2020;49(6):2030\u0026ndash;40.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChen Z, Glisic M, Song M, Aliahmad HA, Zhang X, Moumdjian AC, et al. Dietary protein intake and all-cause and cause-specific mortality: results from the Rotterdam Study and a meta-analysis of prospective cohort studies. Eur J Epidemiol. 
2020;35(5):411\u0026ndash;29.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChigogora S, Pearce A, Law C, Viner R, Chittleborough C, Griffiths LJ et al. Could greater physical activity reduce population prevalence and socioeconomic inequalities in children\u0026rsquo;s mental health problems? A policy simulation. Epidemiology. 2020;31(1).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCohen JM, Wood ME, Hern\u0026aacute;ndez-D\u0026iacute;az S, Ystrom E, Nordeng H. Paternal antidepressant use as a negative control for maternal use: assessing familial confounding on gestational length and anxiety traits in offspring. Int J Epidemiol. 2019;48(5):1665\u0026ndash;72.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eColen CG, Pinchak NP, Barnett KS. Racial disparities in health among college-educated African Americans: Can attendance at historically black colleges or universities reduce the risk of metabolic syndrome in midlife? Am J Epidemiol. 2021;190(4):553\u0026ndash;61.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCoulombe J, Moodie EEM, Shortreed SM, Renoux C. Can the risk of severe depression-related outcomes be reduced by tailoring the antidepressant therapy to patient characteristics? Am J Epidemiol. 2021;190(7):1210\u0026ndash;9.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCrump C, Friberg D, Li X, Sundquist J, Sundquist K. Preterm birth and risk of sleep-disordered breathing from childhood into mid-adulthood. Int J Epidemiol. 2019;48(6):2039\u0026ndash;49.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDam V, van der Schouw YT, Onland-Moret NC, Groenwold RHH, Peters SAE, Burgess S, et al. Association of menopausal characteristics and risk of coronary heart disease: a pan-European case\u0026ndash;cohort analysis. Int J Epidemiol. 
2019;48(4):1275\u0026ndash;85.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDebras C, Chazelas E, Srour B, Julia C, Kesse-Guyot E, Zelek L, et al. Glycaemic index, glycaemic load and cancer risk: results from the prospective NutriNet-Sant\u0026eacute; cohort. Int J Epidemiol. 2022;51(1):250\u0026ndash;64.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDekhtyar S, Vetrano DL, Marengoni A, Wang H-X, Pan K-Y, Fratiglioni L, et al. Association between speed of multimorbidity accumulation in old age and life experiences: A cohort study. Am J Epidemiol. 2019;188(9):1627\u0026ndash;36.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDelaney JA, Nance RM, Whitney BM, Crane HM, Williams-Nguyen J, Feinstein MJ et al. Cumulative human immunodeficiency viremia, antiretroviral therapy, and incident myocardial infarction. Epidemiology. 2019;30(1).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eEnthoven CA, Tideman JWL, Polling JR, Tedja MS, Raat H, Iglesias AI, et al. Interaction between lifestyle and genetic susceptibility in myopia: the Generation R study. Eur J Epidemiol. 2019;34(8):777\u0026ndash;84.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFerraro AA, Barbieri MA, da Silva AAM, Goldani MZ, Fernandes MTB, Cardoso VC, et al. Cesarean delivery and hypertension in early adulthood. Am J Epidemiol. 2019;188(7):1296\u0026ndash;303.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFlannagan KS, Mumford SL, Sjaarda LA, Radoc JG, Perkins NJ, Andriessen VC et al. Is opioid use safe in women trying to conceive? Epidemiology. 2020;31(6).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFraser GE, Jaceldo-Siegl K, Orlich M, Mashchak A, Sirirat R, Knutsen S. Dairy, soy, and risk of breast cancer: those confounded milks. Int J Epidemiol. 
2020;49(5):1526\u0026ndash;37.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFreedman LS, Agay N, Farmer R, Murad H, Olmer L, Dankner R. Metformin treatment among men with diabetes and the risk of prostate cancer: A population-based historical cohort study. Am J Epidemiol. 2022;191(4):626\u0026ndash;35.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGarcia-Saenz A, de Miguel AS, Espinosa A, Costas L, Aragon\u0026eacute;s N, Tonne C et al. Association between outdoor light-at-night exposure and colorectal cancer in Spain. Epidemiology. 2020;31(5).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGeorge KM, Lutsey PL, Kucharska-Newton A, Palta P, Heiss G, Osypuk T, et al. Life-course individual and neighborhood socioeconomic status and risk of dementia in the Atherosclerosis Risk in Communities Neurocognitive Study. Am J Epidemiol. 2020;189(10):1134\u0026ndash;42.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGerlovin H, Posner DC, Ho Y-L, Rentsch CT, Tate JP, King JT Jr., et al. Pharmacoepidemiology, machine learning, and COVID-19: An intent-to-treat analysis of hydroxychloroquine, with or without azithromycin, and COVID-19 outcomes among hospitalized US veterans. Am J Epidemiol. 2021;190(11):2405\u0026ndash;19.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGero K, Hikichi H, Aida J, Kondo K, Kawachi I. Associations between community social capital and preservation of functional capacity in the aftermath of a major disaster. Am J Epidemiol. 2020;189(11):1369\u0026ndash;78.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGialamas A, Haag DG, Mittinty MN, Lynch J. Which time investments in the first 5 years of life matter most for children\u0026rsquo;s language and behavioural outcomes at school entry? Int J Epidemiol. 2020;49(2):548\u0026ndash;58.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGiorgianni F, Ernst P, Dell\u0026rsquo;Aniello S, Suissa S, Renoux C. 
β2-agonists and the incidence of Parkinson disease. Am J Epidemiol. 2020;189(8):801\u0026ndash;10.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGoin DE, Izano MA, Eick SM, Padula AM, DeMicco E, Woodruff TJ et al. Maternal experience of multiple hardships and fetal growth: extending environmental mixtures methodology to social exposures. Epidemiology. 2021;32(1).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGoin DE, Pearson RM, Craske MG, Stein A, Pettifor A, Lippman SA, et al. Depression and incident HIV in adolescent girls and young women in HIV prevention trials network 068: targets for prevention and mediating factors. Am J Epidemiol. 2020;189(5):422\u0026ndash;32.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGram IT, Park S-Y, Maskarinec G, Wilkens LR, Haiman CA, Le Marchand L. Smoking and breast cancer risk by race/ethnicity and oestrogen and progesterone receptor status: the Multiethnic Cohort (MEC) study. Int J Epidemiol. 2019;48(2):501\u0026ndash;11.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHamad R, Batra A, Karasek D, LeWinn KZ, Bush NR, Davis RL, et al. The impact of the revised WIC food package on maternal nutrition during pregnancy and postpartum. Am J Epidemiol. 2019;188(8):1493\u0026ndash;502.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHarlow AF, Hatch EE, Wesselink AK, Rothman KJ, Wise LA. Electronic cigarettes and fecundability: Results from a prospective preconception cohort study. Am J Epidemiol. 2021;190(3):353\u0026ndash;61.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHarlow AF, Wesselink AK, Hatch EE, Rothman KJ, Wise LA. Male preconception marijuana use and spontaneous abortion: A prospective cohort study. Epidemiology. 2021;32(2).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHe J-R, Hirst JE, Tikellis G, Phillips GS, Ramakrishnan R, Paltiel O et al. 
Common maternal infections during pregnancy and childhood leukaemia in the offspring: findings from six international birth cohorts. Int J Epidemiol. 2021:dyab199.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHillreiner A, Baumeister SE, Sedlmeier AM, Finger JD, Schlitt HJ, Leitzmann MF. Association between cardiorespiratory fitness and colorectal cancer in the UK Biobank. Eur J Epidemiol. 2020;35(10):961\u0026ndash;73.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHjorth S, Potteg\u0026aring;rd A, Broe A, Hemmingsen CH, Leinonen MK, Hargreave M et al. Prenatal exposure to nitrofurantoin and risk of childhood leukaemia: A registry-based cohort study in four Nordic countries. Int J Epidemiol. 2021:dyab219.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHlav\u0026aacute;čov\u0026aacute; J, Flegr J, Řež\u0026aacute;bek K, Calda P, Kaňkov\u0026aacute; Š. Male-to-female presumed transmission of toxoplasmosis between sexual partners. Am J Epidemiol. 2021;190(3):386\u0026ndash;92.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHolzhausen EA, Hagen EW, LeCaire T, Cadmus-Bertram L, Malecki KC, Peppard PE. A comparison of self- and proxy-reported subjective sleep durations with objective actigraphy measurements in a survey of Wisconsin children 6\u0026ndash;17 years of age. Am J Epidemiol. 2021;190(5):755\u0026ndash;65.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHu MD, Lawrence KG, Bodkin MR, Kwok RK, Engel LS, Sandler DP. Neighborhood deprivation, obesity, and diabetes in residents of the US Gulf Coast. Am J Epidemiol. 2021;190(2):295\u0026ndash;304.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eIndia-Aldana S, Rundle AG, Zeleniuch-Jacquotte A, Quinn JW, Kim B, Afanasyeva Y et al. Neighborhood walkability and mortality in a prospective cohort of women. Epidemiology. 
2021;32(6).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eInoue K, Mayeda ER, Paul KC, Shih IF, Yan Q, Yu Y, et al. Mediation of the associations of physical activity with cardiovascular events and mortality by diabetes in older Mexican Americans. Am J Epidemiol. 2020;189(10):1124\u0026ndash;33.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eInoue K, Ritz B, Ernst A, Tseng W-L, Yuan Y, Meng Q, et al. Behavioral problems at age 11 years after prenatal and postnatal exposure to acetaminophen: Parent-reported and self-reported outcomes. Am J Epidemiol. 2021;190(6):1009\u0026ndash;20.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eIshii M, Seki T, Kaikita K, Sakamoto K, Nakai M, Sumita Y, et al. Short-term exposure to desert dust and the risk of acute myocardial infarction in Japan: a time-stratified case-crossover study. Eur J Epidemiol. 2020;35(5):455\u0026ndash;64.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eIsumi A, Doi S, Ochi M, Kato T, Fujiwara T. Child maltreatment and mental health in middle childhood: A longitudinal study in Japan. Am J Epidemiol. 2022;191(4):655\u0026ndash;64.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eJanki S, Dehghan A, van de Wetering J, Steyerberg EW, Klop KWJ, Kimenai HJAN, et al. Long-term prognosis after kidney donation: a propensity score matched comparison of living donors and non-donors from two population cohorts. Eur J Epidemiol. 2020;35(7):699\u0026ndash;707.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKerschberger B, Boulle A, Kuwengwa R, Ciglenecki I, Schomaker M. The impact of same-day antiretroviral therapy initiation under the World Health Organization treat-all policy. Am J Epidemiol. 2021;190(8):1519\u0026ndash;32.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKim K, Browne RW, Nobles CJ, Radin RG, Holland TL, Omosigho UR et al. Associations between preconception plasma fatty acids and pregnancy outcomes. 
Epidemiology. 2019;30.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLara M, Labrecque JA, van Lenthe FJ, Voortman T. Estimating reductions in ethnic inequalities in child adiposity from hypothetical diet, screen time, and sports participation interventions. Epidemiology. 2020;31(5).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLeon ME, Schinasi LH, Lebailly P, Beane Freeman LE, Nordby K-C, Ferro G, et al. Pesticide use and risk of non-Hodgkin lymphoid malignancies in agricultural cohorts from France, Norway and the USA: a pooled analysis from the AGRICOH consortium. Int J Epidemiol. 2019;48(5):1519\u0026ndash;35.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLepage B, Colineaux H, Kelly-Irving M, Vineis P, Delpierre C, Lang T. Comparison of smoking reduction with improvement of social conditions in early life: simulation in a British cohort. Int J Epidemiol. 2021;50(3):797\u0026ndash;808.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLergenmuller S, Ghiasvand R, Robsahm TE, Green AC, Lund E, Rueegg CS, et al. Sunscreens with high versus low sun protection factor and cutaneous squamous cell carcinoma risk: A population-based cohort study. Am J Epidemiol. 2022;191(1):75\u0026ndash;84.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLerro CC, Hofmann JN, Andreotti G, Koutros S, Parks CG, Blair A, et al. Dicamba use and cancer incidence in the agricultural health study: an updated analysis. Int J Epidemiol. 2020;49(4):1326\u0026ndash;37.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLouie P, Upenieks L, Siddiqi A, Williams DR, Takeuchi DT. Race, flourishing, and all-cause mortality in the United States, 1995\u0026ndash;2016. Am J Epidemiol. 2021;190(9):1735\u0026ndash;43.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLove S-AM, North KE, Zeng D, Petruski-Ivleva N, Kucharska-Newton A, Palta P, et al. 
Nine-year ethanol intake trajectories and their association with 15-year cognitive decline among black and white adults: The Atherosclerosis Risk in Communities Neurocognitive Study. Am J Epidemiol. 2020;189(8):788\u0026ndash;800.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLyall K, Windham GC, Snyder NW, Kuskovsky R, Xu P, Bostwick A, et al. Association between midpregnancy polyunsaturated fatty acid levels and offspring autism spectrum disorder in a California population-based case-control study. Am J Epidemiol. 2021;190(2):265\u0026ndash;76.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMagnus MC, Fraser A, Rich-Edwards JW, Magnus P, Lawlor DA, H\u0026aring;berg SE. Time-to-pregnancy and risk of cardiovascular disease among men and women. Eur J Epidemiol. 2021;36(4):383\u0026ndash;91.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eM\u0026aring;rild K, Tapia G, Midttun \u0026Oslash;, Ueland PM, Magnus MC, Rewers M, et al. Smoking in pregnancy, cord blood cotinine and risk of celiac disease diagnosis in offspring. Eur J Epidemiol. 2019;34(7):637\u0026ndash;49.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMitchell A, Fall T, Melhus H, Wolk A, Micha\u0026euml;lsson K, Byberg L. Is the effect of Mediterranean diet on hip fracture mediated through type 2 diabetes mellitus and body mass index? Int J Epidemiol. 2021;50(1):234\u0026ndash;44.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMitha A, Chen R, Johansson S, Razaz N, Cnattingius S. Maternal body mass index in early pregnancy and severe asphyxia-related complications in preterm infants. Int J Epidemiol. 2020;49(5):1647\u0026ndash;60.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMollan KR, Pence BW, Xu S, Edwards JK, Mathews WC, O\u0026rsquo;Cleirigh C, et al. Transportability from randomized trials to clinical care: On initial HIV treatment with efavirenz and suicidal thoughts or behaviors. Am J Epidemiol. 
2021;190(10):2075\u0026ndash;84.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMooldijk SS, Licher S, Vinke EJ, Vernooij MW, Ikram MK, Ikram MA. Season of birth and the risk of dementia in the population-based Rotterdam Study. Eur J Epidemiol. 2021;36(5):497\u0026ndash;506.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eNa\u0026euml;l V, P\u0026eacute;r\u0026egrave;s K, Dartigues J-F, Letenneur L, Amieva H, Arleo A, et al. Vision loss and 12-year risk of dementia in older adults: the 3C cohort study. Eur J Epidemiol. 2019;34(2):141\u0026ndash;52.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eN\u0026oslash;st TH, Alcala K, Urbarova I, Byrne KS, Guida F, Sandanger TM, et al. Systemic inflammation markers and cancer incidence in the UK Biobank. Eur J Epidemiol. 2021;36(8):841\u0026ndash;8.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eO\u0026rsquo;Brien KM, D\u0026rsquo;Aloisio AA, Shi M, Murphy JD, Sandler DP, Weinberg CR. Perineal talc use, douching, and the risk of uterine cancer. Epidemiology. 2019;30(6).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eOng YY, Sadananthan SA, Aris IM, Tint MT, Yuan WL, Huang JY, et al. Mismatch between poor fetal growth and rapid postnatal weight gain in the first 2 years of life is associated with higher blood pressure and insulin resistance without increased adiposity in childhood: the GUSTO cohort study. Int J Epidemiol. 2020;49(5):1591\u0026ndash;603.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eOude Groeniger J, de Koster W, van der Waal J. Time-varying Effects of Screen Media Exposure in the Relationship Between Socioeconomic Background and Childhood Obesity. Epidemiology. 2020;31(4).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePedersen KM, \u0026Ccedil;olak Y, Vedel-Krogh S, Kobylecki CJ, Bojesen SE, Nordestgaard BG. Risk of ulcerative colitis and Crohn\u0026rsquo;s disease in smokers lacks causal evidence. 
Eur J Epidemiol. 2021.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePinto Pereira SM, De Stavola BL, Rogers NT, Hardy R, Cooper R, Power C. Adult obesity and mid-life physical functioning in two British birth cohorts: investigating the mediating role of physical inactivity. Int J Epidemiol. 2020;49(3):845\u0026ndash;56.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePongiglione B, Kern ML, Carpentieri JD, Schwartz HA, Gupta N, Goodman A. Do children\u0026rsquo;s expectations about future physical activity predict their physical activity in adulthood? Int J Epidemiol. 2020;49(5):1749\u0026ndash;58.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRadojčić MR, Perera RS, Chen L, Spector TD, Hart DJ, Ferreira ML, et al. Specific body mass index trajectories were related to musculoskeletal pain and mortality: 19-year follow-up cohort. J Clin Epidemiol. 2022;141:54\u0026ndash;63.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRanzani OT, Mil\u0026agrave; C, Sanchez M, Bhogadi S, Kulkarni B, Balakrishnan K, et al. Association between ambient and household air pollution with carotid intima-media thickness in peri-urban South India: CHAI-Project. Int J Epidemiol. 2020;49(1):69\u0026ndash;79.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eReese H, Routray P, Torondel B, Sinharoy SS, Mishra S, Freeman MC, et al. Assessing longer-term effectiveness of a combined household-level piped water and sanitation intervention on child diarrhoea, acute respiratory infection, soil-transmitted helminth infection and nutritional status: a matched cohort study in rural Odisha, India. Int J Epidemiol. 2019;48(6):1757\u0026ndash;67.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eReinhard E, Carrino L, Courtin E, van Lenthe FJ, Avendano M. Public transportation use and cognitive function in older age: A quasiexperimental evaluation of the Free Bus Pass Policy in the United Kingdom. Am J Epidemiol. 
2019;188(10):1774\u0026ndash;83.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRhee J, Loftfield E, Freedman ND, Liao LM, Sinha R, Purdue MP. Coffee consumption and risk of renal cell carcinoma in the NIH-AARP Diet and Health Study. Int J Epidemiol. 2021;50(5):1473\u0026ndash;81.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRichardson K, Mattishent K, Loke YK, Steel N, Fox C, Grossi CM, et al. History of benzodiazepine prescriptions and risk of dementia: Possible bias due to prevalent users and covariate measurement timing in a nested case-control study. Am J Epidemiol. 2019;188(7):1228\u0026ndash;36.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRiddell CA, Goin DE, Morello-Frosch R, Apte JS, Glymour MM, Torres JM, et al. Hyper-localized measures of air pollution and risk of preterm birth in Oakland and San Jose, California. Int J Epidemiol. 2021;50(6):1875\u0026ndash;85.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRist PM, Buring JE, Rexrode KM, Cook NR, Rost NS. Prospectively collected lifestyle and health information as risk factors for white matter hyperintensity volume in stroke patients. Eur J Epidemiol. 2019;34(10):957\u0026ndash;65.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRobert A, Edmunds WJ, Watson CH, Henao-Restrepo AM, Gsell P-S, Williamson E, et al. Determinants of transmission risk during the late stage of the West African ebola epidemic. Am J Epidemiol. 2019;188(7):1319\u0026ndash;27.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRogers NT, Blodgett JM, Searle SD, Cooper R, Davis DHJ, Pinto Pereira SM. Early-life socioeconomic position and the accumulation of health-related deficits by midlife in the 1958 British Birth Cohort Study. Am J Epidemiol. 2021;190(8):1550\u0026ndash;60.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRogers NT, Power C, Pinto Pereira SM. 
Birthweight, lifetime obesity and physical functioning in mid-adulthood: a nationwide birth cohort study. Int J Epidemiol. 2020;49(2):657\u0026ndash;65.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRovio SP, Pihlman J, Pahkala K, Juonala M, Magnussen CG, Pitk\u0026auml;nen N, et al. Childhood exposure to parental smoking and midlife cognitive function: The Young Finns Study. Am J Epidemiol. 2020;189(11):1280\u0026ndash;91.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRudolph KE, Gimbrone C, D\u0026iacute;az I. Helped into harm: Mediation of a housing voucher intervention on mental health and substance use in boys. Epidemiology. 2021;32(3).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRudolph KE, Levy J, Schmidt NM, Stuart EA, Ahern J. Using transportability to understand differences in mediation mechanisms across trial sites of a housing voucher experiment. Epidemiology. 2020;31(4):523\u0026ndash;33.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSalmon C, Song L, Muir K, Pashayan N, Dunning AM, Batra J, et al. Marital status and prostate cancer incidence: a pooled analysis of 12 case\u0026ndash;control studies from the PRACTICAL consortium. Eur J Epidemiol. 2021;36(9):913\u0026ndash;25.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSangaramoorthy M, Hines LM, Torres-Mej\u0026iacute;a G, Phipps AI, Baumgartner KB, Wu AH et al. A pooled analysis of breastfeeding and breast cancer risk by hormone receptor status in parous Hispanic women. Epidemiology. 2019;30(3).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSato K, Amemiya A, Haseda M, Takagi D, Kanamori M, Kondo K, et al. Postdisaster changes in social capital and mental health: A natural experiment from the 2016 Kumamoto earthquake. Am J Epidemiol. 2020;189(9):910\u0026ndash;21.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSchliep KC, Mumford SL, Silver RM, Wilcox B, Radin RG, Perkins NJ et al. 
Preconception perceived stress is associated with reproductive hormone levels and longer time to pregnancy. Epidemiology. 2019;30.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSchuch HS, Nascimento GG, Peres KG, Mittinty MN, Demarco FF, Correa MB, et al. The controlled direct effect of early-life socioeconomic position on periodontitis in a birth cohort. Am J Epidemiol. 2019;188(6):1101\u0026ndash;8.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSchwartz GL, Leifheit KM, Berkman LF, Chen JT, Arcaya MC. Health selection into eviction: Adverse birth outcomes and children\u0026rsquo;s risk of eviction through age 5 years. Am J Epidemiol. 2021;190(7):1260\u0026ndash;9.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSellers R, Warne N, Rice F, Langley K, Maughan B, Pickles A, et al. Using a cross-cohort comparison design to test the role of maternal smoking in pregnancy in child mental health and learning: evidence from two UK cohorts born four decades apart. Int J Epidemiol. 2020;49(2):390\u0026ndash;9.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eShiba K, Hanazato M, Aida J, Kondo K, Arcaya M, James P et al. Cardiometabolic profiles and change in neighborhood food and built environment among older adults: A natural experiment. Epidemiology. 2020;31(6).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eShiba K, Hikichi H, Aida J, Kondo K, Kawachi I. Long-term associations between disaster experiences and cardiometabolic risk: A natural experiment from the 2011 great east Japan earthquake and tsunami. Am J Epidemiol. 2019;188(6):1109\u0026ndash;19.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eShiba K, Torres JM, Daoud A, Inoue K, Kanamori S, Tsuji T et al. Estimating the impact of sustained social participation on depressive symptoms in older adults. Epidemiology. 
2021;32(6).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eStacy SL, Buchanich JM, Ma Z-q, Mair C, Robertson L, Sharma RK, et al. Maternal obesity, birth size, and risk of childhood cancer development. Am J Epidemiol. 2019;188(8):1503\u0026ndash;11.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSu Y, D\u0026rsquo;Arcy C, Meng X. Social support and positive coping skills as mediators buffering the impact of childhood maltreatment on psychological distress and positive mental health in adulthood: Analysis of a national population-based sample. Am J Epidemiol. 2020;189(5):394\u0026ndash;402.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSudharsanan N, Ho JY. Rural\u0026ndash;urban differences in adult life expectancy in Indonesia: A parametric g-formula\u0026ndash;based decomposition approach. Epidemiology. 2020;31(3).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTefft BC, Arnold LS. Estimating cannabis involvement in fatal crashes in Washington State before and after the legalization of recreational cannabis consumption using multiple imputation of missing values. Am J Epidemiol. 2021;190(12):2582\u0026ndash;91.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTorres JM, Rudolph KE, Sofrygin O, Wong R, Walter LC, Glymour MM. Having an adult child in the United States, physical functioning, and unmet needs for care among older Mexican adults. Epidemiology. 2019;30(4).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTorres JM, Sofrygin O, Rudolph KE, Haan MN, Wong R, Glymour MM. US migration status of adult children and cognitive decline among older parents who remain in Mexico. Am J Epidemiol. 2020;189(8):761\u0026ndash;9.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTsarna E, Reedijk M, Birks LE, Guxens M, Ballester F, Ha M, et al. Associations of maternal cell-phone use during pregnancy with pregnancy duration and fetal growth in 4 birth cohorts. 
Am J Epidemiol. 2019;188(7):1270\u0026ndash;80.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eVable AM, Duarte Cd, Cohen AK, Glymour MM, Ream RK, Yen IH. Does the type and timing of educational attainment influence physical health? A novel application of sequence analysis. Am J Epidemiol. 2020;189(11):1389\u0026ndash;401.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003evan der Schaft N, Schoufour JD, Nano J, Kiefte-de Jong JC, Muka T, Sijbrands EJG, et al. Dietary antioxidant capacity and risk of type 2 diabetes mellitus, prediabetes and insulin resistance: the Rotterdam Study. Eur J Epidemiol. 2019;34(9):853\u0026ndash;61.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003evan Gennip ACE, Sedaghat S, Carnethon MR, Allen NB, Klein BEK, Cotch MF, et al. Retinal microvascular caliber and incident depressive symptoms: The Multi-Ethnic Study of Atherosclerosis. Am J Epidemiol. 2022;191(5):843\u0026ndash;55.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003evan Lee L, Crozier SR, Aris IM, Tint MT, Sadananthan SA, Michael N, et al. Prospective associations of maternal choline status with offspring body composition in the first 5 years of life in two large mother\u0026ndash;offspring cohorts: the Southampton Women\u0026rsquo;s Survey cohort and the Growing Up in Singapore Towards healthy Outcomes cohort. Int J Epidemiol. 2019;48(2):433\u0026ndash;44.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWagner M, Grodstein F, Proust-Lima C, Samieri C. Long-term trajectories of body weight, diet, and physical activity from midlife through late life and subsequent cognitive decline in women. Am J Epidemiol. 2020;189(4):305\u0026ndash;13.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWalsemann KM, Ailshire JA. Early educational experiences and trajectories of cognitive functioning among US adults in midlife and later. Am J Epidemiol. 
2020;189(5):403\u0026ndash;11.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWang C-R, Hu T-Y, Hao F-B, Chen N, Peng Y, Wu J-J, et al. Type 2 diabetes\u0026ndash;prevention diet and all-cause and cause-specific mortality: A prospective study. Am J Epidemiol. 2022;191(3):472\u0026ndash;86.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWang H, L\u0026aacute;szl\u0026oacute; KD, Gissler M, Li F, Zhang J, Yu Y, et al. Maternal hypertensive disorders and neurodevelopmental disorders in offspring: a population-based cohort in two Nordic countries. Eur J Epidemiol. 2021;36(5):519\u0026ndash;30.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWesselink AK, Bresnick KA, Hatch EE, Rothman KJ, Mikkelsen EM, Wang TR, et al. Association between male use of pain medication and fecundability. Am J Epidemiol. 2020;189(11):1348\u0026ndash;59.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWesselink AK, Claus Henn B, Fruh V, Orta OR, Weuve J, Hauser R et al. A prospective ultrasound study of plasma polychlorinated biphenyl concentrations and incidence of uterine leiomyomata. Epidemiology. 2021;32(2).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWhite J, Fluharty M, de Groot R, Bell S, Batty GD. Mortality among rough sleepers, squatters, residents of homeless shelters or hotels and sofa-surfers: a pooled analysis of UK birth cohorts. Int J Epidemiol. 2021:dyab253.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWilliams-Nguyen J, Hawes SE, Nance RM, Lindstr\u0026ouml;m S, Heckbert SR, Kim HN, et al. Association between chronic hepatitis C virus infection and myocardial infarction among people living with HIV in the United States. Am J Epidemiol. 2020;189(6):554\u0026ndash;63.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eXiao J, Gao Y, Yu Y, Toft G, Zhang Y, Luo J, et al. 
Associations of parental birth characteristics with autism spectrum disorder (ASD) risk in their offspring: a population-based multigenerational cohort study in Denmark. Int J Epidemiol. 2021;50(2):485\u0026ndash;95.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eXuan Y, Bobak M, Anusruti A, Jansen EHJM, Pająk A, Tamosiunas A, et al. Association of serum markers of oxidative stress with myocardial infarction and stroke: pooled results from four large European cohort studies. Eur J Epidemiol. 2019;34(5):471\u0026ndash;81.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYisahak SF, Hinkle SN, Mumford SL, Li M, Andriessen VC, Grantz KL, et al. Vegetarian diets during pregnancy, and maternal and neonatal outcomes. Int J Epidemiol. 2021;50(1):165\u0026ndash;78.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYoussim I, Gorfine M, Calderon-Margalit R, Manor O, Paltiel O, Siscovick DS, et al. Holocaust experience and mortality patterns: 4-decade follow-up in a population-based cohort. Am J Epidemiol. 2021;190(8):1541\u0026ndash;9.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYu D-W, Li Q-J, Cheng L, Yang P-F, Sun W-P, Peng Y, et al. Dietary vitamin K intake and the risk of pancreatic cancer: A prospective study of 101,695 American adults. Am J Epidemiol. 2021;190(10):2029\u0026ndash;41.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYuan J, Hu YJ, Zheng J, Kim JH, Sumerlin T, Chen Y, et al. Long-term use of antibiotics and risk of type 2 diabetes in women: a prospective cohort study. Int J Epidemiol. 2020;49(5):1572\u0026ndash;81.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhou Z, Lin C, Ma J, Towne SD, Han Y, Fang Y. The association of social isolation with the risk of stroke among middle-aged and older adults in China. Am J Epidemiol. 2019;188(8):1456\u0026ndash;65.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBell ML, Fiero M, Horton NJ, Hsu C-H. 
Handling missing data in RCTs; a review of the top medical journals. BMC Med Res Methodol. 2014;14(1):1\u0026ndash;8.\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":true,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":true,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"bmc-medical-research-methodology","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"bmrm","sideBox":"Learn more about [BMC Medical Research Methodology](http://bmcmedresmethodol.biomedcentral.com/)","snPcode":"","submissionUrl":"https://www.editorialmanager.com/bmrm/default.aspx","title":"BMC Medical Research Methodology","twitterHandle":"BMC_series","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"em","reportingPortfolio":"BMC Series","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"Missing data, causal inference, missingness mechanism","lastPublishedDoi":"10.21203/rs.3.rs-4452118/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-4452118/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003ch2\u003eBackground\u003c/h2\u003e \u003cp\u003eMissing data are common in observational studies and often occur in several of the variables required when estimating a causal effect, i.e. the exposure, outcome and/or variables used to control for confounding. Analyses involving multiple incomplete variables are not as straightforward as analyses with a single incomplete variable. 
For example, in the context of multivariable missingness, the standard missing data assumptions (\u0026ldquo;missing completely at random\u0026rdquo;, \u0026ldquo;missing at random\u0026rdquo; [MAR], \u0026ldquo;missing not at random\u0026rdquo;) are difficult to interpret and assess. It is not clear how the complexities that arise due to multivariable missingness are being addressed in practice. The aim of this study was to review how missing data are managed and reported in observational studies that use multiple imputation (MI) for causal effect estimation, with a particular focus on missing data summaries, missing data assumptions, primary and sensitivity analyses, and MI implementation.\u003c/p\u003e\u003ch2\u003eMethods\u003c/h2\u003e \u003cp\u003eWe searched five top general epidemiology journals for observational studies that aimed to answer a causal research question and used MI, published between January 2019 and December 2021. Article screening and data extraction were performed systematically.\u003c/p\u003e\u003ch2\u003eResults\u003c/h2\u003e \u003cp\u003eOf the 130 studies included in this review, 108 (83%) derived an analysis sample by excluding individuals with missing data in specific variables (e.g., outcome) and 114 (88%) had multivariable missingness within the analysis sample. Forty-four (34%) studies provided a statement about missing data assumptions, 35 of which stated the MAR assumption, but only 11/44 (25%) studies provided a justification for these assumptions. The number of imputations, MI method and MI software were generally well-reported (71%, 75% and 88% of studies, respectively), while aspects of the imputation model specification were not clear for more than half of the studies. A secondary analysis that used a different approach to handle the missing data was conducted in 69/130 (53%) studies. 
Of these 69 studies, 68 (99%) lacked a clear justification for the secondary analysis.\u003c/p\u003e\u003ch2\u003eConclusion\u003c/h2\u003e \u003cp\u003eEffort is needed to clarify the rationale for and improve the reporting of MI for estimation of causal effects from observational data. We encourage greater transparency in making and reporting analytical decisions related to missing data.\u003c/p\u003e","manuscriptTitle":"Gaps in the usage and reporting of multiple imputation for incomplete data: Findings from a scoping review of observational studies addressing causal questions","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2024-05-31 20:29:36","doi":"10.21203/rs.3.rs-4452118/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"decision","content":"Revision requested","date":"2024-06-25T07:29:17+00:00","index":"","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2024-06-22T13:25:20+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2024-06-15T10:34:29+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"182977075592393571379369409136029865247","date":"2024-05-31T07:50:58+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"87677486062978304285497142300840294134","date":"2024-05-28T06:05:58+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2024-05-27T19:40:20+00:00","index":"","fulltext":""},{"type":"editorInvited","content":"","date":"2024-05-27T19:24:07+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2024-05-27T19:22:09+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2024-05-27T19:22:09+00:00","index":"","fulltext":""},{"type":"submitted","content":"BMC Medical Research Methodology","date":"2024-05-21T04:24:50+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"[email 
protected]","identity":"bmc-medical-research-methodology","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"bmrm","sideBox":"Learn more about [BMC Medical Research Methodology](http://bmcmedresmethodol.biomedcentral.com/)","snPcode":"","submissionUrl":"https://www.editorialmanager.com/bmrm/default.aspx","title":"BMC Medical Research Methodology","twitterHandle":"BMC_series","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"em","reportingPortfolio":"BMC Series","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"969ff979-b878-4965-8d07-7e539cfc18cc","owner":[],"postedDate":"May 31st, 2024","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"published-in-journal","subjectAreas":[],"tags":[],"updatedAt":"2024-09-09T16:11:07+00:00","versionOfRecord":{"articleIdentity":"rs-4452118","link":"https://doi.org/10.1186/s12874-024-02302-6","journal":{"identity":"bmc-medical-research-methodology","isVorOnly":false,"title":"BMC Medical Research Methodology"},"publishedOn":"2024-09-04 16:05:15","publishedOnDateReadable":"September 4th, 2024"},"versionCreatedAt":"2024-05-31 20:29:36","video":"","vorDoi":"10.1186/s12874-024-02302-6","vorDoiUrl":"https://doi.org/10.1186/s12874-024-02302-6","workflowStages":[]},"version":"v1","identity":"rs-4452118","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-4452118","identity":"rs-4452118","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}

