Fairness analysis of machine learning predictions of aggression in acute psychiatric care

Yifan Wang, Laura Sikstrom, Robert Xiao, Zoe Findlay, Juveria Zaheer, and 2 more

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-7781555/v1
This work is licensed under a CC BY 4.0 License
Status: Under Review
Version 1 posted 10

Abstract

Managing patient aggression is a major challenge in acute psychiatry, and machine learning (ML) applications are increasingly being developed to support individualized risk assessment and de-escalation. However, ML algorithms have been shown to exhibit unfair behavior based on protected characteristics, such as an individual’s sex or ethnicity. 
This is especially worrying in psychiatric contexts, as social and systemic inequities (such as disparities in access to psychiatric care or racial profiling in admissions to hospital by police) can become embedded in training datasets. Despite the potential for ML algorithms to replicate and amplify such inequities, the fairness of ML-based predictions of aggression in acute psychiatry has received limited investigation. To address this gap, we trained an ML algorithm to predict aggressive incidents from structured electronic health records corresponding to 17,703 patients receiving acute care at a large psychiatric hospital between January 2016 and May 2022 (n = 42,719 observation days). We analyzed predictions for fairness by assessing disparities in false positive rates [FPR] and true positive rates [TPR] (i.e., the equalized odds criterion), based on patient race/ethnicity, gender, admission mode, citizenship, and housing status, as well as intersections of race/ethnicity and gender. The random forest algorithm performed best (ROC-AUC = 0.812). Fairness analyses revealed significant disparities in FPR and TPR across subgroups, such that FPRs were higher for Middle Eastern and Black patients, men, those admitted into emergency care by the police, and those with unstable or supportive forms of housing. Middle Eastern men had the highest FPR of any intersectional group. Our analysis demonstrates the potential for ML algorithms to exhibit unfairness across multiple demographic and social groups in predictions of inpatient aggression, reflecting known social and structural inequities. To prevent the reinforcement and amplification of existing disparities, it will be critical to apply strategies to mitigate unfairness in this context. At the same time, evaluating and exploring unfair ML behavior can reveal unique insights into underlying inequities that might be impacting patient experiences and care. 
Health sciences/Health care; Biological sciences/Psychology; Social science/Psychology; Health sciences/Risk factors

Introduction

Patient aggression is a major concern in clinical settings such as acute psychiatry, and encompasses a range of behaviors including verbal abuse, sexual harassment, and physical violence. It has adverse effects on the quality of care, patient and staff safety, and public perceptions of mental health care 1–3. However, coercive interventions used to manage the risk of aggression can be similarly problematic: the use of medications (i.e., chemical restraints) and physical restraints to manage aggression has been shown to negatively impact a patient’s experience of care in a potentially traumatizing way 4, 5. Machine learning (ML) has increasingly been applied to predict risk of patient aggression in psychiatric and forensic settings, as it can leverage complex datasets to generate more individualized predictions, enabling earlier and more targeted de-escalation using non-coercive forms of intervention. Previous studies have trained ML algorithms on diverse datasets, from clinical data to neuroimaging scans, with these algorithms often exceeding the predictive performance of current clinical instruments 6. However, there has been limited investigation of the fairness of these models in acute psychiatry, that is, whether their predictions display any prejudice or favoritism towards an individual or group based on certain inherent or acquired characteristics, such as race or sex (i.e., a protected group) 7. Algorithmic fairness has been a growing area of focus in ML research, and widely used algorithms for criminal recidivism prediction and healthcare resource allocation have been shown to be unfair towards marginalized groups, such as Black or lower-income individuals 8–10, as well as the intersections of underserved groups, like Hispanic females 11. 
The potential for algorithmic unfairness is particularly concerning in the context of predicting aggression in acute psychiatry because of pervasive inequities embedded in the training data. Inequities can be structural (interpersonal and systemic processes which create inequities in power and resources 12) and social (disparities relating to an individual’s proximal social, political, and economic environment 12). These include racial profiling in police apprehensions for involuntary psychiatric admission, gendered biases in clinician perceptions of inpatient violence risk, and disparities in access to quality mental health care based on socioeconomic status 13–15. Inequities defined by the intersection of race, ethnicity and gender are also a significant concern, given the well-documented challenges Black men face in accessing and receiving mental health care 16–18. These inequities can be readily embedded in the data used to train ML algorithms. Evidence of performance disparities in ML-based predictions of violence against hospital providers has emerged in one prior study, suggesting less accurate predictions for Asian and Native Hawaiian patient groups, but the analysis only examined fairness stratified by patient race, and the ability to draw conclusions was limited by small sample sizes 19. There is a need to better assess how both proximal social indicators, like race/ethnicity and sex, and upstream factors, such as policing or housing, are related to the fairness of ML-based predictions of aggression. This analysis, which was part of a larger mixed-methods study dissecting the construction and use of predictive care tools, had three main objectives 20. First, we used demographic and clinical features to train a supervised ML algorithm to predict whether a patient would become aggressive on a given day. 
Second, we performed a fairness assessment, examining the algorithm's performance stratified by demographic characteristics of patients, focusing on gender, race/ethnicity, citizenship, admission mode, and housing status. Third, we characterized how the model performs for groups of patients defined by the intersection of both race/ethnicity and gender 11. Overall, by examining algorithmic unfairness in ML predictions of aggression, our findings highlight the importance of assessing fairness across diverse social and demographic factors during model development and evaluation, to prevent the deployment of ML models in acute psychiatry that can harm specific populations.

Methods

Study population

This analysis utilized electronic health records (EHRs) from ten inpatient care units at the Centre for Addiction and Mental Health (CAMH), a large mental health and addictions hospital in Toronto, Canada, between January 2016 and May 2022. Only patients who were admitted to inpatient units via the hospital’s emergency department (ED) were included, to enable consideration of admission mode (e.g., apprehension for admission by police). Patients were excluded from analysis if they were referred from a corrections facility or another hospital. However, we did not exclude patients with prior acute care visits, so the analysis included multiple inpatient hospitalizations from unique patients. Demographic data were obtained from patient-reported forms routinely collected at admission, and include age, gender, sexual orientation, citizenship, housing status, income, highest education level, language, ethnicity, and marital status. Clinical and contextual factors were also documented at patient intake, including primary psychiatric diagnosis assessed by ED psychiatrists via a brief diagnostic interview, presence of substance-induced symptoms, mode of admission, and inpatient unit location. 
Finally, risk assessment data included ratings on the Dynamic Appraisal of Situational Aggression (DASA), which is a clinically validated instrument assessing each patient’s risk of aggression over the next 24 hours. Assessment is based on seven dichotomous items which capture behavioural and interpersonal factors related to this risk (e.g., agitation, sensitivity to provocation, verbal threats) 21, 22. At CAMH, DASA scores are generated by nurses each morning that a patient is on the unit, based on their clinical observations and relevant information from a chart review of the past 24 hours. The outcome data included aggressive incidents involving patients, as documented by any attending staff (e.g., nurses, clinicians, security guards or program assistants) in CAMH’s reporting tool. Incidents were included as outcomes if they were categorized by staff as either “abuse/assault/violence” or “physical/sexual/verbal behaviors and assaults”. Any documented use of any combination of chemical restraints, physical restraints, or seclusion was also included as an outcome, since these interventions are only used when violence or aggression is deemed imminent 20, 23. Outcomes (i.e., aggressive incidents or restraints and seclusions) occurring during prior visits to acute care were treated as predictors (i.e., a binary variable indicating whether an incident occurred prior to admission). Because patients are assessed using the DASA for the risk of imminent aggression (i.e., within the next 24 hours), predictions were made on each day of the acute care stay. Most outcomes were expected to occur on the first three days that a patient was receiving acute care. For this reason, we included up to three days (prediction windows) for each visit. If one or more outcomes occurred during a given visit, we only included data collected until the first occurrence, since interventions used to manage the outcome may alter risk. 
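The construction of daily prediction windows described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the names `ObservationDay` and `build_windows` and the field layout are hypothetical, and the sketch assumes that the day on which the first outcome occurs is itself kept as the last (positive) window.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_WINDOWS = 3  # up to three daily prediction windows per visit, as in the text


@dataclass
class ObservationDay:
    """One prediction window (hypothetical record layout)."""
    visit_id: str
    day: int      # 1-based day of the acute care stay
    outcome: int  # 1 if an incident, restraint, or seclusion was recorded that day


def build_windows(visit_id: str, length_of_stay: int,
                  first_outcome_day: Optional[int]) -> List[ObservationDay]:
    """Return the observation days kept for one visit.

    Days are kept up to MAX_WINDOWS, and nothing after the first outcome is
    kept (assumption: the outcome day itself is retained as a positive case).
    """
    last_day = min(length_of_stay, MAX_WINDOWS)
    if first_outcome_day is not None:
        last_day = min(last_day, first_outcome_day)
    return [
        ObservationDay(visit_id, day, int(day == first_outcome_day))
        for day in range(1, last_day + 1)
    ]


# A five-day stay without incidents contributes three negative windows;
# a stay with a first incident on day 2 contributes days 1 and 2 only.
no_incident = build_windows("visit-a", length_of_stay=5, first_outcome_day=None)
early_incident = build_windows("visit-b", length_of_stay=5, first_outcome_day=2)
```

Truncating at the first outcome means each visit contributes at most one positive observation day.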
Since clinical, sociodemographic and admission data were only collected once for each visit, they were repeated across the three days for each visit and patient.

Data processing

The intake demographic questionnaire contained open-ended response options for variables that were categorized as ‘other’, which were all manually categorized into the existing categories, in consultation with CAMH acute care clinicians. Gender was grouped into three categories: male, female, and gender expansive. Race/ethnicity was grouped into Black, Asian, South Asian, Indigenous, Latin-American, Middle Eastern, Mixed, and White. Primary diagnoses were grouped by the study team, in consultation with clinicians, into ten diagnosis types, guided by the DSM-5 categories. Each of the seven DASA items was included as an individual dichotomous variable to retain information about specific aggression-related factors. An extensive description of data capture and processing can be found in the supplementary information. The final predictor dataset included 16 categorical variables. A 70%/30% train-test split was performed. Randomization for the train-test split was done by patient, as opposed to by observation day, to ensure that different inpatient days or multiple presentations to acute care for the same patient were not split between the two sets. No functions were fitted on the test set, which was withheld until the final performance and fairness evaluation. Variables with ≤ 20% missing values were imputed with the mode, while missing values for variables with > 20% missing were imputed with “missing” to preserve the potential informativeness of high missingness 23. Non-binary variables were one-hot-encoded in preparation for model training.

Addressing class imbalance

In the study population, the outcomes were imbalanced at a ratio of almost 33:1, with far more no-incident cases than incident cases. 
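To make the imbalance concrete, the short sketch below uses the observation-day counts reported in the Results (41447 no-incident days and 1272 incident days) and scores a trivial classifier that never predicts an incident. The helper `accuracy_and_f1` is a hypothetical name, and taking F1 as 0 when no positives are predicted is a common convention rather than something specified by the authors.

```python
def accuracy_and_f1(tp: int, fp: int, tn: int, fn: int):
    """Accuracy and F1 from confusion-matrix counts (F1 = 0 if undefined)."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return accuracy, 0.0
    return accuracy, 2 * precision * recall / (precision + recall)


NO_INCIDENT_DAYS = 41447  # reported in the Results
INCIDENT_DAYS = 1272      # reported in the Results

imbalance_ratio = NO_INCIDENT_DAYS / INCIDENT_DAYS  # roughly 32.6 to 1

# A classifier that always predicts "no incident": every negative is a true
# negative and every positive is a false negative.
always_negative = accuracy_and_f1(tp=0, fp=0, tn=NO_INCIDENT_DAYS, fn=INCIDENT_DAYS)
print(f"ratio={imbalance_ratio:.1f}, accuracy={always_negative[0]:.3f}, "
      f"F1={always_negative[1]:.3f}")
```

Such a classifier reaches an accuracy of about 0.97 while identifying no incidents at all, which is why accuracy alone is uninformative at this level of imbalance.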
To prevent the decision boundary from greatly favoring the majority class at the expense of the fidelity of minority class predictions (e.g., by making almost exclusively negative predictions), the F1 score was used as the primary evaluation metric. The F1 score is the harmonic mean of precision and recall and therefore offers a more reliable evaluation of classification on imbalanced data. Additionally, a range of resampling algorithms was tested as part of the model tuning process, where the training set was either undersampled by removing cases from the majority class or oversampled by adding synthetic cases to the minority class to balance the distribution of positive and negative cases.

Model training

Model selection and optimization were performed on the training set using 5-fold cross validation, optimizing for F1 score. Logistic regression, naïve Bayes, random forest, gradient boosting, support vector machine, decision tree, and simple neural network were evaluated as candidate models. Additionally, no resampling, random undersampling, NearMiss undersampling, and SMOTE oversampling were evaluated as candidate resampling methods in combination with all candidate models (Supplementary methods). The final model and sampler were refit on the entire training set, and performance was measured based on predictions on the hold-out test set. No resampling was performed on the test set (Fig. 1). Standard deviations and confidence intervals for all performance and fairness metrics were calculated by training the model five times using five different random seeds, then applying each to the test set 11. Feature importances were extracted using impurity-based importance as implemented in scikit-learn.

Fairness assessment

Fairness analysis was conducted using observational criteria based on post-hoc analysis of model outputs, true outcome, and sensitive attributes. 
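As a rough illustration of what such a post-hoc, observational analysis involves, the sketch below computes a false positive rate and a true positive rate for each subgroup from three aligned lists: true outcomes, binary predictions, and a sensitive attribute. It is a minimal sketch, not the authors' implementation; `group_rates` and the toy data are hypothetical, and a value of `None` stands in for an attribute that was missing or imputed, which the Methods say was excluded from the analysis of that attribute.

```python
from collections import defaultdict
from typing import Dict, Hashable, Optional, Sequence


def group_rates(y_true: Sequence[int], y_pred: Sequence[int],
                groups: Sequence[Optional[Hashable]]) -> Dict[Hashable, Dict[str, float]]:
    """Per-group FPR = FP / (FP + TN) and TPR = TP / (TP + FN)."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        if group is None:  # missing or imputed attribute: skip this record
            continue
        if truth:
            counts[group]["tp" if pred else "fn"] += 1
        else:
            counts[group]["fp" if pred else "tn"] += 1

    rates = {}
    for group, c in counts.items():
        negatives = c["fp"] + c["tn"]
        positives = c["tp"] + c["fn"]
        rates[group] = {
            "FPR": c["fp"] / negatives if negatives else float("nan"),
            "TPR": c["tp"] / positives if positives else float("nan"),
        }
    return rates


# Toy data (not from the study): group A has one false positive among four
# negatives; group B has none; the last record has no recorded attribute.
y_true = [0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0]
y_pred = [1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1]
groups = ["A"] * 6 + ["B"] * 4 + [None]
rates = group_rates(y_true, y_pred, groups)
```

Because group labels only need to be hashable, an intersectional analysis can reuse the same function by passing tuples such as `(race_ethnicity, gender)` as the group label.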
Disparate mistreatment as measured by false positive rate (FPR) was used as the primary metric by which to assess the fairness of the algorithm, where \(\mathrm{FPR} = \frac{FP}{FP + TN}\) and the denominator FP + TN is the number of actual negatives 25. This metric enables a focus on whether social and structural inequities may lead to higher rates of incorrectly flagging individuals from certain demographic groups as being at high risk for aggression. An algorithm that is fair with respect to disparate mistreatment must have the same FPR across the subgroups of a sensitive attribute. To understand the fairness behavior of the algorithm more thoroughly, group-specific true positive rate (TPR), F1-score, ROC curves, and calibration curves were also assessed. Attributes that were analyzed for fairness include race/ethnicity, gender, admission mode, citizenship, and housing status. Intersectional fairness analysis was performed for the intersection of gender and race/ethnicity. When performing fairness analysis for a given feature, individuals with imputed values for that feature were excluded from that analysis.

Results

Sample characteristics

Across all observation days, there were a total of 41447 “no incident” cases (i.e., observation days on which a violent or aggressive incident was not reported) and 1272 “incident” cases (i.e., days on which such an incident was reported), corresponding to 17703 unique patients. These were split into 29879 cases in the train set (n = 12398 unique patients) and 12840 in the test set (n = 5305 unique patients). Patients were relatively evenly distributed across age categories, but they were predominantly male, White, single, and of Canadian citizenship. The most common diagnoses were psychotic disorders, and patients were most commonly accompanied during ED admission by family or friends. 
Citizenship, Housing, Marital status, and Admission mode had proportions of missing observations above 20%, thereby requiring imputation with a “missing” label. All features differed significantly when comparing no incident vs incident populations (p < 0.001). Sample characteristics at the level of observation days in the overall data can be found in Table 1. Sample characteristics at the level of unique patients can be found in supplementary table 2, and characteristics by train/test set are reported in supplementary table 3.

Table 1 Sample characteristics at the level of observation days

Characteristic | Overall, n = 42719 | No incident, n = 41447 ¹ | Incident, n = 1272 ¹ | p-value ²
Age | | | | < 0.001
  16–24 | 8525 (20%) | 8239 (20%) | 286 (22%) |
  25–29 | 7306 (17%) | 7076 (17%) | 230 (18%) |
  30–34 | 6234 (15%) | 6026 (15%) | 208 (16%) |
  35–44 | 8327 (19%) | 8059 (19%) | 268 (21%) |
  45–64 | 10440 (24%) | 10199 (25%) | 241 (19%) |
  65+ | 1887 (4.4%) | 1848 (4.5%) | 39 (3.1%) |
Gender | | | | < 0.001
  Female | 16824 (39%) | 16444 (40%) | 380 (30%) |
  Male | 24977 (58%) | 24121 (58%) | 856 (67%) |
  Gender diverse | 440 (1.0%) | 429 (1.0%) | 11 (0.9%) |
  Missing | 478 (1.1%) | 453 (1.1%) | 25 (2.0%) |
Citizenship | | | | < 0.001
  Canadian | 29804 (70%) | 29050 (70%) | 754 (59%) |
  Indigenous | 32 (< 0.1%) | 32 (< 0.1%) | 0 (0%) |
  Not Canadian | 2328 (5.4%) | 2257 (5.4%) | 71 (5.6%) |
  Missing | 10555 (25%) | 10108 (24%) | 447 (35%) |
Housing status | | | | < 0.001
  Living with family | 6540 (15%) | 6370 (15%) | 170 (13%) |
  Own | 4304 (10%) | 4224 (10%) | 80 (6.3%) |
  Rent | 13112 (31%) | 12834 (31%) | 278 (22%) |
  Supportive housing | 3966 (9.3%) | 3846 (9.3%) | 120 (9.4%) |
  Unstable housing/unhoused | 5488 (13%) | 5243 (13%) | 245 (19%) |
  Missing | 9309 (22%) | 8930 (22%) | 379 (30%) |
Marital status | | | | < 0.001
  Partnered | 3231 (7.6%) | 3179 (7.7%) | 52 (4.1%) |
  Single | 26389 (62%) | 25669 (62%) | 720 (57%) |
  Missing | 13099 (31%) | 12599 (30%) | 500 (39%) |
Race/Ethnicity | | | | < 0.001
  White | 20470 (48%) | 19969 (48%) | 501 (39%) |
  Asian | 2711 (6.3%) | 2651 (6.4%) | 60 (4.7%) |
  Black | 5218 (12%) | 4968 (12%) | 250 (20%) |
  South Asian | 2270 (5.3%) | 2204 (5.3%) | 66 (5.2%) |
  Indigenous | 1059 (2.5%) | 1037 (2.5%) | 22 (1.7%) |
  Latin American | 1212 (2.8%) | 1180 (2.8%) | 32 (2.5%) |
  Middle Eastern | 1357 (3.2%) | 1296 (3.1%) | 61 (4.8%) |
  Mixed | 1099 (2.6%) | 1072 (2.6%) | 27 (2.1%) |
  Missing | 7323 (17%) | 7070 (17%) | 253 (20%) |
Admit Mode | | | | < 0.001
  Case Worker / Nurse | 2024 (4.7%) | 1980 (4.8%) | 44 (3.5%) |
  Friend / Family | 9917 (23%) | 9744 (24%) | 173 (14%) |
  Mobile Crisis | 134 (0.3%) | 123 (0.3%) | 11 (0.9%) |
  Other | 956 (2.2%) | 924 (2.2%) | 32 (2.5%) |
  Police | 8761 (21%) | 8269 (20%) | 492 (39%) |
  Self | 8300 (19%) | 8195 (20%) | 105 (8.3%) |
  Missing | 12627 (30%) | 12212 (29%) | 415 (33%) |
Primary Diagnosis | | | | < 0.001
  Adjustment disorder | 389 (0.9%) | 385 (0.9%) | 4 (0.3%) |
  Anxiety disorder | 1188 (2.8%) | 1182 (2.9%) | 6 (0.5%) |
  Bipolar mood disorder | 5785 (14%) | 5535 (13%) | 250 (20%) |
  Depressive disorder | 5161 (12%) | 5136 (12%) | 25 (2.0%) |
  Neurocognitive disorders | 442 (1.0%) | 426 (1.0%) | 16 (1.3%) |
  Neurodevelopmental disorders | 772 (1.8%) | 740 (1.8%) | 32 (2.5%) |
  Other | 922 (2.2%) | 901 (2.2%) | 21 (1.7%) |
  Personality disorder | 2622 (6.1%) | 2558 (6.2%) | 64 (5.0%) |
  Primary psychotic disorder | 18404 (43%) | 17710 (43%) | 694 (55%) |
  Substance-related disorder | 5488 (13%) | 5350 (13%) | 138 (11%) |
  Trauma and stressor related disorder | 1070 (2.5%) | 1056 (2.5%) | 14 (1.1%) |
  Missing | 476 (1.1%) | 468 (1.1%) | 8 (0.6%) |

¹ Data presented as n (% of respective group total); ² Chi-square test of independence

Model performance

The best-performing model on the train set was a 200-estimator random forest (RF) with no oversampling or undersampling. On the hold-out test set, the random forest obtained an ROC-AUC of 0.8120 ± 0.0016, an accuracy of 0.9323 ± 0.0004, and an F1 score of 0.2213 ± 0.0031 (Fig. 2A). The model had a sensitivity/TPR of 0.3265 ± 0.0057 and a specificity of 0.9507 ± 0.0005. Feature importance extracted from the RF revealed that the DASA items, especially irritability, as well as the presence of a violent/aggressive incident or restraint occurring prior to admission into acute care, are highly important for predictions. (Fig. 
2C)

Fairness assessment

Race/Ethnicity

Middle Eastern individuals had the highest FPR among all ethnic groups (FPR [standard deviation] = 0.0801 [0.0048]) followed by Black (0.0694 [0.002]), Indigenous (0.0552 [0.0037]), Mixed (0.0525 [0.0021]), White (0.0404 [0.0008]), South Asian (0.0356 [0.0019]), Asian (0.0322 [0.0028]), and Latin American (0.0313 [0.0028]) individuals (Fig. 3, Supplementary table 5). There was also significant variation in TPR: Latin American (TPR [standard deviation] = 0.3846 [0.0000]) and Middle Eastern (0.3778 [0.0222]) individuals had the highest TPR, whereas Asian (0.2381 [0.0000]) and South Asian (0.2571 [0.0350]) individuals had the lowest (Supplementary table 5). Predictive accuracy was highest in Middle Eastern individuals (F1 score [standard deviation] = 0.2372 [0.0158]). ROC curves reveal significant differences in the TPR-FPR trade-offs between groups, with Black individuals having considerably higher FPR for any TPR (Supplementary Fig. 2).

Gender

Men (0.0542 [0.0005]) had higher FPR than women (0.0426 [0.0009]) and gender expansive individuals (0.0418 [0.006]). TPR, F1 score and ROC-AUC were all lower in men than in women. At conservative prediction thresholds, men had higher FPR for any given TPR compared to women and gender expansive individuals.

Admission mode

Individuals who were admitted by police had significantly higher FPR than any other group (0.0941 [0.0019]), followed by other (0.0547 [0.002]), mobile crisis (0.0476 [0.0000]), self (0.0326 [0.0007]), case worker/nurse (0.0303 [0.0007]), and friend/family (0.0264 [0.0011]). Police admissions also had relatively high TPR (0.4174 [0.0165]) and F1 score (0.2405 [0.0158]).

Citizenship

Canadian citizens had higher FPR and TPR (FPR = 0.0427 [0.004], TPR = 0.3145 [0.0072]) than non-citizens (FPR = 0.0210 [0.0024], TPR = 0.2696 [0.0174]). 
There was a significant mismatch in the ROC curves between the two groups, with Canadian individuals having higher FPR for any given TPR at conservative prediction thresholds.

Housing

Those who were in unstable forms of housing or unhoused (0.0829 [0.0014]) or were living in supportive housing (0.0502 [0.0017]) had higher FPR than those who had more stable forms of housing, such as owning (0.0344 [0.002]), renting (0.0318 [0.0007]), and living with family (0.0273 [0.001]). Individuals living in supportive housing had considerably lower predictive accuracy than other groups, with the lowest TPR (0.2308 [0.0243]), F1 score (0.1356 [0.0123]) and ROC-AUC (0.7651 [0.0048]).

Intersectional analysis of disparate mistreatment

Intersectional analysis was performed for the intersection of ethnicity and gender (Fig. 4; Supplementary table 6). The “gender expansive” group was excluded due to low sample sizes (N < 15 observations) for all ethnicities except White. All other intersectional groups had more than 50 observations. Middle Eastern men had the highest FPR (FPR [standard deviation] = 0.0933 [0.0074]) and a highly pronounced gender-specific effect; Middle Eastern women had a significantly lower FPR (0.0372 [0.0047]). However, both genders were similar in terms of their TPR. Black men (0.0759 [0.0026]) and Indigenous men (0.0747 [0.0062]) also had a relatively high FPR, and their TPRs also tended to be higher than those of Black women (TPR [standard deviation] = 0.2353 [0.0372]) and Indigenous women (0.2500 [0.0000]). Across all races/ethnicities, men had an intersectional FPR equal to or greater than that of women.

Discussion

In this study, we assessed whether ML predictions of inpatient aggression in acute psychiatric care are unfair. To our knowledge, this is the most comprehensive fairness assessment of ML as related to this outcome, and it builds on previous work by Dobbins et al. 
19 by examining a wider range of social determinants and applying an intersectional approach. A random forest model was trained on a range of demographic, clinical, admission, and risk assessment data, yielding an ROC-AUC of 0.81. Although maximizing predictive performance was not an emphasis of this study, the model achieved comparable performance to ML algorithms reported in prior research trained on tabular data in clinically heterogeneous psychiatric populations (ROC-AUC obtained in Suchting et al. = 0.78 23, Menger et al. = 0.76 26, Wang et al. = 0.63 27). The fairness assessment revealed that the algorithm violates both disparate mistreatment and equalized odds: there were significant disparities in FPR, TPR, and ROC curves across race/ethnicity, gender, admission mode, citizenship, and housing status. Relative to other groups, FPR was elevated in individuals who are Middle Eastern or Black, who identify as male, who are admitted into emergency care by the police, who are Canadian citizens, or who have unstable or supportive forms of housing. Intersectional analyses revealed that Middle Eastern men had the highest FPR among all groups. There were significant differences in TPR and ROC curves in relation to the FPR of each group, suggesting the nature of algorithmic unfairness differs between groups. For example, in the case of patients who are Middle Eastern, in unstable or no housing, or admitted by police, FPR and TPR were both elevated relative to other groups, suggesting the model was calibrated to increase overall predictive accuracy at the expense of higher FPR. Conversely, for other groups like Black patients, the model had high FPR and low TPR, suggesting poor overall performance. Importantly, observational measures of unfairness such as TPR and FPR are merely outcome measures that do not explain how unfair predictions arise. 
Rather, these results must be understood in the context of underlying social and structural inequities that can give rise to unfair predictions in the first place, such as racial profiling in the criminal justice system, racial residential segregation, or barriers to accessing mental healthcare 28. We discuss some of these parallels in the section below. Black individuals are less likely to receive adequate outpatient psychiatric treatment, they are more likely to be involuntarily admitted into inpatient treatment, and they may also present with more severe psychotic symptoms, compared to White individuals 13, 14. Black men in particular face significant barriers in accessing mental health care, and they are more likely to be misdiagnosed with psychotic disorders, as compared to White men 16–18. Interpersonal bias is also a possibility, where structurally reinforced stereotypes may lead to higher risk perceptions for racially marginalized individuals on clinical risk instruments like the DASA, though research is largely inconclusive on whether these instruments are themselves biased. Both male gender and Black race have been found to be significantly associated with violence in psychiatric settings. Findings from our study suggest that these associations can become embedded in clinical datasets, which may lead to unfair treatment by ML algorithms, both via increased false positive predictions and poorer performance in identifying at-risk individuals 2, 29. Police apprehension for admission into the ED is also communicated among clinicians to be a relevant factor in risk assessment due to an increased likelihood of aggression in patients admitted involuntarily, and/or referred by the police 2, 30. It is therefore perhaps not surprising that this mode of admission was associated with a higher FPR than any other group in the fairness assessment. 
Patients apprehended by police for admission into emergency psychiatric care are indeed more likely to become violent or aggressive, which is likely to account for relatively high FPRs and TPRs for this group 31. At the same time, racially marginalized and Indigenous groups have increased rates of involuntary admissions into psychiatric care by police, likely due to various factors, such as barriers to accessing mental health care or racial profiling 30, 32–34. This tendency may in part explain the finding of higher FPRs among Black men, and potentially Middle Eastern and Indigenous individuals as well. The fairness assessment also highlights housing as a potential source of algorithmic unfairness, specifically for those with unstable or supportive forms of housing. On a social level, unstable housing has been associated with psychiatric conditions, such as trauma and substance use, as well as lower educational attainment and disrupted support networks 35, 36. Conditions of unstable housing may contribute to food or water insecurity, sleep deprivation, and hypervigilance, which can lead to the expression of behaviours that are rated as precursors of aggression on clinical instruments such as the DASA (e.g., irritability, sensitivity to provocation, and unwillingness to follow instructions). Structurally, current psychiatric care systems are not well-equipped to meet the constellation of needs of unhoused individuals, which may contribute to their increased ED use and higher false positive predictions for the risk of violence in inpatient care 36–40. Supportive housing services for people with severe mental illness offer more stability, but they are in high demand and extremely under-resourced, and are often unable to meet complex, individual needs 41. We also identified performance disparities that are not linked to well-researched inequities. 
For example, while qualitative analyses have shown a general distrust of biomedical mental health services among Middle Eastern individuals, there is a considerable research gap in characterizing how they interact with these systems 42. Although our analyses suggest that high FPR for Middle Eastern patients may be in part related to improved model TPR/sensitivity, social and structural determinants likely play a role in the way their risk of violence or aggression is perceived; these may be related to cultural communication barriers, or expressions of distrust manifesting as increased irritability or an unwillingness to follow instructions. However, the gender discrepancy in FPR (but not in TPR) for this group suggests this effect may only extend to men. Similarly, the algorithm displayed modest FPR differences based on citizenship, which is also not a well-documented demographic feature in the psychiatric literature. Nevertheless, citizenship may be an important factor to consider in future fairness assessments of ML models in healthcare, given its impact on access to community, social, and health services. These findings highlight the importance of thoughtful documentation and processing of demographic data, which is a strength of our study. Specifically, access to high-quality and diverse sociodemographic information is necessary for evaluating ML models for fairness, making it critical that these data are measured and are not lost during processing 43. Middle Eastern ethnicity, for example, does not appear to be commonly encoded as a unique racial or ethnic category in research datasets, which inevitably precludes the discovery of important trends in this population, such as those identified in our study 44. Demographics in our dataset were drawn from CAMH’s health equity form, which was designed to capture a range of rich features which are not frequently characterized, such as specific ethnic and gender minorities 45. 
Overall, our results suggest that if fairness is not properly considered, the deployment of ML algorithms to support the prediction of aggression in acute psychiatric care, and other clinical settings, has the potential to cause significant harms with respect to both disparate mistreatment and equalized odds in socially and structurally disadvantaged groups 44 . Bias in ML algorithms has already been shown to reduce clinician accuracy 46 ; in psychiatric risk assessment, the unwarranted use of interventions based on a false positive prediction can lead to unnecessary distress, disruption of trust in a therapeutic relationship or the health system, and may even precipitate violent or aggressive incidents when they otherwise would not have occurred 47 . Furthermore, there is extensive literature highlighting the cyclical nature of algorithmic unfairness: algorithms can reproduce and amplify existing inequalities, which can then become embedded in new datasets used to develop ML algorithms or inform care 44 , 48 . Even if an unfair recommendation is not followed, disagreement between providers and ML algorithms may lead providers to fear legal implications against them, which may negatively impact care 49 . Given these concerns, algorithmic unfairness is recognized by both patients and providers as a major barrier in the clinical implementation of predictive risk models 49 , 50 . There is a range of algorithmic methods to improve a model’s fairness, such as integrating fairness benchmarks into optimization criteria during model training, resampling the input data itself to improve fairness, or enforcing specific fairness criteria using group-specific prediction thresholds 51 . Several studies have now applied “debiasing” methods to clinical ML algorithms, demonstrating promising results 52 – 54 . Our findings highlight the need to properly assess fairness so that these measures can be applied as appropriate to predictive risk models before they are deployed.
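To make the assessment and the threshold-based mitigation option more concrete, the following is a minimal sketch in Python. It is not the pipeline used in this study: the function names (`group_rates`, `equalized_odds_gaps`, `threshold_for_fpr`), the groups "A" and "B", and all scores and labels are hypothetical and invented purely for illustration. The sketch computes TPR and FPR per group, summarizes the equalized odds gaps, and then selects a group-specific threshold that brings one group's FPR down to a target value.

```python
from collections import defaultdict


def group_rates(y_true, y_pred, groups):
    """Return {group: (TPR, FPR)} from binary labels and binary predictions."""
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
    for y, p, g in zip(y_true, y_pred, groups):
        key = ("tp" if p else "fn") if y else ("fp" if p else "tn")
        counts[g][key] += 1
    rates = {}
    for g, c in counts.items():
        pos = c["tp"] + c["fn"]
        neg = c["fp"] + c["tn"]
        tpr = c["tp"] / pos if pos else float("nan")
        fpr = c["fp"] / neg if neg else float("nan")
        rates[g] = (tpr, fpr)
    return rates


def equalized_odds_gaps(rates):
    """Largest between-group difference in TPR and in FPR."""
    tprs = [t for t, _ in rates.values()]
    fprs = [f for _, f in rates.values()]
    return max(tprs) - min(tprs), max(fprs) - min(fprs)


def threshold_for_fpr(scores, y_true, target_fpr):
    """Lowest observed score threshold whose FPR (score >= threshold) is <= target_fpr."""
    neg = [s for s, y in zip(scores, y_true) if not y]
    if not neg:
        raise ValueError("need at least one negative example")
    for t in sorted(set(scores)):  # FPR can only fall as the threshold rises
        if sum(s >= t for s in neg) / len(neg) <= target_fpr:
            return t
    return float("inf")  # no observed threshold is strict enough


# Hypothetical toy data: (risk score, observed outcome) for two groups.
data = {
    "A": [(0.9, 1), (0.6, 1), (0.4, 0), (0.3, 0), (0.2, 0), (0.1, 0)],
    "B": [(0.8, 1), (0.7, 0), (0.6, 0), (0.55, 1), (0.3, 0), (0.2, 0)],
}
scores = [s for g in data for s, _ in data[g]]
labels = [y for g in data for _, y in data[g]]
groups = [g for g in data for _ in data[g]]

# A single shared threshold of 0.5 for everyone.
shared = group_rates(labels, [s >= 0.5 for s in scores], groups)
print(shared)                       # {'A': (1.0, 0.0), 'B': (1.0, 0.5)}
print(equalized_odds_gaps(shared))  # (0.0, 0.5)

# A group-specific threshold for B that matches A's FPR of 0.
b_scores = [s for s, _ in data["B"]]
b_labels = [y for _, y in data["B"]]
t_b = threshold_for_fpr(b_scores, b_labels, target_fpr=0.0)
per_group = [s >= (t_b if g == "B" else 0.5) for s, g in zip(scores, groups)]
print(t_b, group_rates(labels, per_group, groups))
```

In this toy case, raising group B's threshold removes the FPR gap but lowers B's TPR from 1.0 to 0.5, so satisfying one half of equalized odds can worsen the other. Dedicated fairness toolkits offer more principled versions of this kind of post-processing.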
An important consideration, however, is that most debiasing methods use the ground truth outcome label as a benchmark to determine whether a model is fair 55 . In other words, most methods seek to faithfully replicate “the world as it is” – no more, but no less unfair than the input data. However, we have discussed how data relating to inpatient aggression, particularly the administration of coercive interventions, is deeply intertwined with societal inequities. As such, debiasing metrics and methods in this context must use some “true” notion of fairness that represents “the world as it should be”. Algorithmic interventions, therefore, do not constitute a complete solution. To enable algorithmic debiasing approaches, practitioners first must define how a fair and equitable ML algorithm should behave – this is a social question, not a technical one. Ultimately, ML systems do not operate in a vacuum, but rather as part of highly complex sociotechnical systems where algorithms and societal inequities interact in complex ways. We highlight that ML fairness assessments can identify inequities across large, complex datasets to help target further investigation. However, fairness analysis alone cannot deeply characterize these social and structural drivers of unfairness, nor the exact processes by which they ultimately result in unfair predictions. When seeking to understand algorithmic fairness, therefore, it is important to characterize and understand these biases and inequities on a social level, such as through qualitative approaches that reveal patient and provider experiences 56 , 57 . It is also important to note that there is no single optimal way to assess the fairness of ML algorithms. There are over 70 definitions of fairness, many of which conflict, making it impossible to simultaneously satisfy all possible definitions 58 .
We restricted our analysis to a single a priori perspective of what constitutes a fair ML model with a focus on disparate mistreatment and equalized odds, making it possible that our analysis missed other relevant fairness considerations or perspectives. For example, in contrast to the group notion of fairness used in this study, individual fairness postulates that similar individuals should receive similar ML predictions, drawing from philosophies of consistency and individual justice rather than anti-discrimination frameworks 48 , 59 . Individual fairness often relies on counterfactual or explanation-based ways to define fairness, neither of which was assessed in this study 59 . Additionally, there are limitations within the dataset used for this study. Our algorithm was trained using an urban Canadian population – although underlying inequities appear pervasive across populations, our findings may not generalize to other populations 60 . Moreover, the analysis relied on EHR data, which is known to vary in quality. For instance, it is possible that some aggressive incidents were not documented, or modes of admission were mislabelled. Following prior work 22 , we included restraints in the outcome under the assumption that they were only applied when aggressive incidents were imminent, which may not always hold. Additionally, segmentation of the dataset into subgroups reduced the sample size for the fairness assessment, especially with respect to minority and intersectional groups. For example, limited sample sizes required us to collapse granular descriptions of ethnic heritage into “Black” as a single broad category, which may mask additional disparities in ML fairness within this heterogeneous group 61 . Similar limitations were present with gender, as we grouped all genders that were not male or female into a single category, which still lacked sufficient size to perform intersectional analysis.
As such, we encourage future ML studies in this context to perform fairness assessments, particularly by leveraging rich dataset features such as granular ethnic breakdowns or larger sample sizes for intersectional groups. This will enable a more nuanced and thorough understanding of algorithmic fairness, and how it may differ across populations. In conclusion, ML predictions of aggression in acute psychiatric care and other clinical settings have the potential to be unfairly biased. However, this is not meant to be an argument against the use of ML in such contexts. Rather, we suggest that it is critical to be aware of fairness-related considerations prior to their implementation, and illustrate how performing such analyses can shed light on underlying inequities. To this end, we encourage future ML work in psychiatry to consider fairness as a critical element of evaluation, and to conduct further research to interrogate these identified inequities. Declarations Acknowledgements We thank members of CAMH's Data & Insights Team for their support with health record extraction and interpretation. Funding sources This work was supported by a Dalla Lana School of Public Health Interdisciplinary Data Science Seed Grant (S.L.H, no award/grant number), the Krembil Foundation (L.S. and M.M.M., no award/grant number), the Social Sciences and Humanities Research Council Insight Development Grant (L.S.: #430-2021-01166) and a Google Award for Inclusion Research (L.S. and S.L.H, no award/grant number). Ethics Declaration This study was approved by the CAMH research ethics board (REB #053-2021). All patient EHR data was processed in adherence with protocols reviewed by the CAMH Privacy Department and research ethics board. Informed consent declaration Direct patient consent was not required for this study, as approved by the CAMH research ethics board (REB #053-2021).
Competing Interests S.L.H, J.Z., L.S., and M.M.M report financial support from the University of Toronto Dalla Lana School of Public Health. S.H, J.Z., L.S., and M.M.M. report financial support from the Social Sciences and Humanities Research Council of Canada. L.S. and S.L.H. report financial support from Google Research. These funders had no role in study conceptualization, design, implementation or dissemination of findings. Data Availability The dataset for this study is restricted as it comprises confidential electronic health records. Inquiries about the data can be directed to the corresponding author. Contributions L.S., M.M.M, J.Z., and S.H. conceptualized the study and acquired funding for its completion. Y.W., Z.F., and R.Z. contributed to the interpretation and processing of data. Y.W., R.Z., M.M.M., L.S., and S.H. conceptualized the methods. Y.W. completed the analysis and visualizations and drafted the paper. All authors reviewed and contributed edits to the manuscript. References Itzhaki, M. et al. Exposure of mental health nurses to violence associated with job stress, life satisfaction, staff resilience, and post-traumatic growth. Int. J. Ment. Health Nurs. 24, 403–412 (2015). Iozzino, L., Ferrari, C., Large, M., Nielssen, O. & Girolamo, G. de. Prevalence and Risk Factors of Violence by Psychiatric Acute Inpatients: A Systematic Review and Meta-Analysis. PLOS ONE 10, e0128536 (2015). Pescosolido, B. A., Manago, B. & Monahan, J. Evolving Public Views On The Likelihood Of Violence From People With Mental Illness: Stigma And Its Consequences. Health Aff. (Millwood) 38, 1735–1743 (2019). Zaheer, J. Documenting Restraint: Minimizing Trauma. in Interrogating Psychiatric Narratives of Madness: Documented Lives (eds. Daley, A. & Pilling, M. D.) 111–135 (Springer International Publishing, Cham, 2021). doi: 10.1007/978-3-030-83692-4_5 . Lu, W., Mueser, K. T., Rosenberg, S. D., Yanos, P. T. & Mahmoud, N.
Posttraumatic Reactions to Psychosis: A Qualitative Analysis. Front. Psychiatry 8, 129 (2017). Parmigiani, G., Barchielli, B., Casale, S., Mancini, T. & Ferracuti, S. The impact of machine learning in predicting risk of violence: A systematic review. Front. Psychiatry 13, (2022). Mehrabi, N., Morstatter, F., Saxena, N., Lerman, K. & Galstyan, A. A Survey on Bias and Fairness in Machine Learning. Preprint at https://doi.org/10.48550/arXiv.1908.09635 (2022). Julia, A., Larson, J., Surya, M. & Lauren, K. Machine Bias. Machine Bias https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing (2016). Obermeyer, Z., Powers, B., Vogeli, C. & Mullainathan, S. Dissecting racial bias in an algorithm used to manage the health of populations. Science 366, 447–453 (2019). Chen, I. Y., Szolovits, P. & Ghassemi, M. Can AI Help Reduce Disparities in General Medical and Mental Health Care? AMA J. Ethics 21, E167-179 (2019). Seyyed-Kalantari, L., Zhang, H., McDermott, M. B. A., Chen, I. Y. & Ghassemi, M. Underdiagnosis bias of artificial intelligence algorithms applied to chest radiographs in under-served patient populations. Nat. Med. 27, 2176–2182 (2021). National Collaborating Centre for Determinants of Health. Glossary of essential health equity terms. (2022). Hairston, D. R., Gibbs, T. A., Wong, S. S. & Jordan, A. Clinician Bias in Diagnosis and Treatment. in Racism and Psychiatry: Contemporary Issues and Interventions (eds. Medlock, M. M., Shtasel, D., Trinh, N.-H. T. & Williams, D. R.) 105–137 (Springer International Publishing, Cham, 2019). doi: 10.1007/978-3-319-90197-8_7 . Smith, C. M. et al. Association of Black Race With Physical and Chemical Restraint Use Among Patients Undergoing Emergency Psychiatric Evaluation. Psychiatr. Serv. Wash. DC 73, 730–736 (2022). Kirkbride, J. B. et al. The social determinants of mental health and disorder: evidence, prevention and recommendations. World Psychiatry 23, 58–90 (2024). Motley, R. & Banks, A. 
Black Males, Trauma, and Mental Health Service Use: A Systematic Review. Perspect. Soc. Work J. Dr. Stud. Univ. Houst. Grad. Sch. Soc. Work 14, 4–19 (2018). Olbert, C. M., Nagendra, A. & Buck, B. Meta-analysis of Black vs. White racial disparity in schizophrenia diagnosis in the United States: Do structured assessments attenuate racial disparities? J. Abnorm. Psychol. 127, 104–115 (2018). Tegnerowicz, J. “Maybe It Was Something Wrong With Me”: On the Psychiatric Pathologization of Black Men. in Inequality, Crime, and Health Among African American Males vol. 20 73–94 (Emerald Publishing Limited, 2018). Dobbins, N. J. et al. Deep learning models can predict violence and threats against healthcare providers using clinical notes. Npj Ment. Health Res. 3, 61 (2024). Sikstrom, L. et al. Predictive care: a protocol for a computational ethnographic approach to building fair models of inpatient violence in emergency psychiatry. BMJ Open 13, e069255 (2023). Ogloff, J. R. P. & Daffern, M. The dynamic appraisal of situational aggression: an instrument to assess risk for imminent aggression in psychiatric inpatients. Behav. Sci. Law 24, 799–813 (2006). Lantta, T., Kontio, R., Daffern, M., Adams, C. E. & Välimäki, M. Using the Dynamic Appraisal of Situational Aggression with mental health inpatients: a feasibility study. Patient Prefer. Adherence 10, 691–701 (2016). Suchting, R., Green, C. E., Glazier, S. M. & Lane, S. D. A data science approach to predicting patient aggressive events in a psychiatric hospital. Psychiatry Res. 268, 217–222 (2018). Weltens, I. et al. Aggression on the psychiatric ward: Prevalence and risk factors. A systematic review of the literature. PLoS ONE 16, e0258346 (2021). Zafar, M. B., Valera, I., Rodriguez, M. G. & Gummadi, K. P. Fairness Beyond Disparate Treatment & Disparate Impact: Learning Classification without Disparate Mistreatment. in Proceedings of the 26th International Conference on World Wide Web 1171–1180 (2017). 
doi: 10.1145/3038912.3052660 . Menger, V., Spruit, M., van Est, R., Nap, E. & Scheepers, F. Machine Learning Approach to Inpatient Violence Risk Assessment Using Routinely Collected Clinical Notes in Electronic Health Records. JAMA Netw. Open 2, e196709 (2019). Wang, K. Z. et al. Prediction of physical violence in schizophrenia with machine learning algorithms. Psychiatry Res. 289, 112960 (2020). El-Azab, S. & Nong, P. Clinical algorithms, racism, and “fairness” in healthcare: A case of bounded justice. Big Data Soc. 10, 20539517231213820 (2023). Watts, D., Leese, M., Thomas, S., Atakan, Z. & Wykes, T. The Prediction of Violence in Acute Psychiatric Units. Int. J. Forensic Ment. Health 2, 173–180 (2003). Maharaj, R., Gillies, D., Andrew, S. & O’brien, L. Characteristics of patients referred by police to a psychiatric hospital. J. Psychiatr. Ment. Health Nurs. 18, 205–212 (2011). Dharma, C. et al. Examining Systemic and Interpersonal Bias in Violence Risk Assessments of Patients in Acute Psychiatric Care. Psychiatr. Serv. Wash. DC 76, 326–335 (2025). Meerai, S., Abdillahi, I. & Poole, J. An Introduction to Anti-Black Sanism. Intersect. Glob. J. Soc. Work Anal. Res. Polity Pract. 5, 18–35 (2016). Bhui, K. et al. Ethnic variations in pathways to and use of specialist mental health services in the UK. Systematic review. Br. J. Psychiatry J. Ment. Sci. 182, 105–116 (2003). Chow, J. C.-C., Jaffee, K. & Snowden, L. Racial/Ethnic Disparities in the Use of Mental Health Services in Poverty Areas. Am. J. Public Health 93, 792–797 (2003). Schreiter, S. et al. Housing situation and healthcare for patients in a psychiatric centre in Berlin, Germany: a cross-sectional patient survey. BMJ Open 9, e032576 (2019). Narendorf, S. C. Intersection of homelessness and mental health: A mixed methods study of young adults who accessed psychiatric emergency services. Child. Youth Serv. Rev. 81, 54–62 (2017). Amato, S., Nobay, F., Amato, D. P., Abar, B. & Adler, D. 
Sick and unsheltered: Homelessness as a major risk factor for emergency care utilization. Am. J. Emerg. Med. 37, 415–420 (2019). Kushel, M. B., Perry, S., Bangsberg, D., Clark, R. & Moss, A. R. Emergency Department Use Among the Homeless and Marginally Housed: Results From a Community-Based Study. Am. J. Public Health 92, 778–784 (2002). Serper, M. R. et al. Predictors of aggression on the psychiatric inpatient service. Compr. Psychiatry 46, 121–127 (2005). Mauri, M. C. et al. Aggressiveness and violence in psychiatric patients: a clinical or social paradigm? CNS Spectr. 24, 564–573 (2019). Sanford, S., Roche, B., Molina, I., Weston, N. A. & Sirotich, F. Toronto Supportive Housing Growth Plan: Needs Assessment. Tahir, R., Due, C., Ward, P. & Ziersch, A. Understanding mental health from the perception of Middle Eastern refugee women: A critical systematic review. SSM - Ment. Health 2, 100130 (2022). Andrus, M., Spitzer, E., Brown, J. & Xiang, A. What We Can’t Measure, We Can’t Understand: Challenges to Demographic Data Procurement in the Pursuit of Fairness. in Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency 249–260 (Association for Computing Machinery, New York, NY, USA, 2021). doi: 10.1145/3442188.3445888 . Soliman, L., Jain, A., Rozel, J. & Rachal, J. Safe Spaces: Mitigating Potential Aggression in Acute Care Psychiatry. FOCUS 21, 46–51 (2023). We Ask Because We Care. Jabbour, S. et al. Measuring the Impact of AI in the Diagnosis of Hospitalized Patients: A Randomized Clinical Vignette Survey Study. JAMA 330, 2275–2284 (2023). Ling, S., Cleverley, K. & Perivolaris, A. Understanding Mental Health Service User Experiences of Restraint Through Debriefing: A Qualitative Analysis. Can. J. Psychiatry Rev. Can. Psychiatr. 60, 386–392 (2015). Caton, S. & Haas, C. Fairness in Machine Learning: A Survey. ACM Comput. Surv. 3616865 (2023) doi: 10.1145/3616865 . Giddings, R. et al. 
Factors influencing clinician and patient interaction with machine learning-based risk prediction models: a systematic review. Lancet Digit. Health 6, e131–e144 (2024). Sax, D. R., Sturmer, L. R., Mark, D. G., Rana, J. S. & Reed, M. E. Barriers and Opportunities Regarding Implementation of a Machine Learning-Based Acute Heart Failure Risk Stratification Tool in the Emergency Department. Diagnostics 12, 2463 (2022). Feng, Q., Du, M., Zou, N. & Hu, X. Fair Machine Learning in Healthcare: A Review. Preprint at http://arxiv.org/abs/2206.14397 (2024). Zhu, Y. et al. M $ ^3 $ Fair: Mitigating Bias in Healthcare Data through Multi-Level and Multi-Sensitive-Attribute Reweighting Method. arXiv.org https://arxiv.org/abs/2306.04118v1 (2023). Yang, J., Soltan, A. A. S., Eyre, D. W., Yang, Y. & Clifton, D. A. An adversarial training framework for mitigating algorithmic biases in clinical machine learning. Npj Digit. Med. 6, 1–10 (2023). Li, F. et al. Evaluating and mitigating bias in machine learning models for cardiovascular disease prediction. J. Biomed. Inform. 138, 104294 (2023). Hellström, T., Dignum, V. & Bensch, S. Bias in Machine Learning -- What is it Good for? Preprint at https://doi.org/10.48550/arXiv.2004.00686 (2020). Chin, M. H. et al. Guiding Principles to Address the Impact of Algorithm Bias on Racial and Ethnic Disparities in Health and Health Care. JAMA Netw. Open 6, e2345050 (2023). Aquino, Y. S. J. et al. Practical, epistemic and normative implications of algorithmic bias in healthcare artificial intelligence: a qualitative study of multidisciplinary expert perspectives. J. Med. Ethics (2023) doi: 10.1136/jme-2022-108850 . Kleinberg, J., Mullainathan, S. & Raghavan, M. Inherent Trade-Offs in the Fair Determination of Risk Scores. Preprint at https://doi.org/10.48550/arXiv.1609.05807 (2016). Binns, R. On the Apparent Conflict Between Individual and Group Fairness. Preprint at http://arxiv.org/abs/1912.06883 (2019). Silva, M., Loureiro, A. & Cardoso, G. 
Social determinants of mental health: A review of the evidence. Eur. J. Psychiatry 30, 259–292 (2016). Movva, R. et al. Coarse race data conceals disparities in clinical risk score performance. in Proceedings of the 8th Machine Learning for Healthcare Conference 443–472 (PMLR, 2023). Supplementary Files Supplementary.docx Status: Under Review Version 1 posted Editorial decision: Revision requested 23 Dec, 2025 Reviews received at journal 17 Dec, 2025 Reviewers agreed at journal 12 Dec, 2025 Reviews received at journal 31 Oct, 2025 Reviewers agreed at journal 23 Oct, 2025 Reviewers agreed at journal 23 Oct, 2025 Reviewers invited by journal 21 Oct, 2025 Editor assigned by journal 20 Oct, 2025 Submission checks completed at journal 12 Oct, 2025 First submitted to journal 04 Oct, 2025
Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-7781555","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Article","associatedPublications":[],"authors":[{"id":537492802,"identity":"526096ae-ab1e-4539-90bc-537976477ee5","order_by":0,"name":"Yifan Wang","email":"","orcid":"","institution":"Centre for Addition and Mental Health","correspondingAuthor":false,"prefix":"","firstName":"Yifan","middleName":"","lastName":"Wang","suffix":""},{"id":537492803,"identity":"8cb3dc7c-2a4d-4912-8f78-135350b734b5","order_by":1,"name":"Laura Sikstrom","email":"","orcid":"","institution":"Centre for Addition and Mental Health","correspondingAuthor":false,"prefix":"","firstName":"Laura","middleName":"","lastName":"Sikstrom","suffix":""},{"id":537492804,"identity":"02f1ed98-bef1-47f3-92ef-637dc2c39286","order_by":2,"name":"Robert Xiao","email":"","orcid":"","institution":"Centre for Addition and Mental Health","correspondingAuthor":false,"prefix":"","firstName":"Robert","middleName":"","lastName":"Xiao","suffix":""},{"id":537492805,"identity":"c33f20c3-a935-4c3f-9d19-c749b0264a8a","order_by":3,"name":"Zoe Findlay","email":"","orcid":"","institution":"Centre for Addition and Mental Health","correspondingAuthor":false,"prefix":"","firstName":"Zoe","middleName":"","lastName":"Findlay","suffix":""},{"id":537492806,"identity":"29a07669-6513-4d5a-92a1-54be3c565c99","order_by":4,"name":"Juveria Zaheer","email":"","orcid":"","institution":"Centre for Addition and Mental 
Health","correspondingAuthor":false,"prefix":"","firstName":"Juveria","middleName":"","lastName":"Zaheer","suffix":""},{"id":537492807,"identity":"c495f7f8-a807-4af8-bdb6-364e6b9b9c34","order_by":5,"name":"Sean Hill","email":"","orcid":"","institution":"Centre for Addition and Mental Health","correspondingAuthor":false,"prefix":"","firstName":"Sean","middleName":"","lastName":"Hill","suffix":""},{"id":537492808,"identity":"bfbdba37-8a1a-453b-b99d-d00212750755","order_by":6,"name":"Marta M Maslej","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAA/ElEQVRIiWNgGAWjYBACgwNg6jADAzuQ8wEkQkiL5QEGxgYGhucMDMwMDIYziNFiD9HyH6yFmYcYLWbHzz5/8IPhtrx8M/OBYpsau2hzBuaHH/BqOZNu2NjDcNtww2G2BOOcY8m5OxvYjCXwajmQxtjAw3CbcQMzj4FxbgNz7oYDPAx4tRicf8bY+IfhsP38Zv4PxpYN9SAtzD/warmRxtjMw3A4seEwD4MxY8NhkBY2/LbceMY4W8bgcDLQLwaGPceO5wIZZhb4HZbG8PFNxWHb+e3Nzwx+1FTnbjje/PgGPi1QjWCSDUIxE1YPB8wPSFA8CkbBKBgFIwgAAEIxT+AQM2vPAAAAAElFTkSuQmCC","orcid":"","institution":"Centre for Addition and Mental Health","correspondingAuthor":true,"prefix":"","firstName":"Marta","middleName":"M","lastName":"Maslej","suffix":""}],"badges":[],"createdAt":"2025-10-04 18:53:14","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-7781555/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-7781555/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":94845122,"identity":"f1470b33-6689-4061-a0fc-636c3f93ddcf","added_by":"auto","created_at":"2025-10-31 10:07:24","extension":"docx","order_by":0,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":579914,"visible":true,"origin":"","legend":"","description":"","filename":"PaperNPJMenHealthRes.docx","url":"https://assets-eu.researchsquare.com/files/rs-7781555/v1/6a84ed89b31d7c12901fa2a1.docx"},{"id":94845117,"identity":"356547ce-a911-44a7-be80-74824e94d7b5","added_by":"auto","created_at":"2025-10-31 
10:07:24","extension":"json","order_by":1,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":9458,"visible":true,"origin":"","legend":"","description":"","filename":"67b43e4ff4c04220b636c89f9305a667.json","url":"https://assets-eu.researchsquare.com/files/rs-7781555/v1/a6ad58cec99940c76429dada.json"},{"id":94845128,"identity":"c82f59b6-d2d7-4111-8a5f-bb5973a4079b","added_by":"auto","created_at":"2025-10-31 10:07:24","extension":"docx","order_by":2,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":845043,"visible":true,"origin":"","legend":"","description":"","filename":"Supplementary.docx","url":"https://assets-eu.researchsquare.com/files/rs-7781555/v1/34b4d8389ef29c606c6de632.docx"},{"id":94845129,"identity":"ad93a5f4-da7f-45c2-8e01-2e807dc9f7b4","added_by":"auto","created_at":"2025-10-31 10:07:24","extension":"xml","order_by":3,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":149360,"visible":true,"origin":"","legend":"","description":"","filename":"67b43e4ff4c04220b636c89f9305a6671enriched.xml","url":"https://assets-eu.researchsquare.com/files/rs-7781555/v1/cddcbe553dc34894fcebc88e.xml"},{"id":94845120,"identity":"a9f0c5c5-42e0-459a-9c0d-bd8b88e7f320","added_by":"auto","created_at":"2025-10-31 10:07:24","extension":"png","order_by":8,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":25461,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage1.png","url":"https://assets-eu.researchsquare.com/files/rs-7781555/v1/f8747ef66008c93a0c3a3f0e.png"},{"id":94845126,"identity":"6f378384-f5d4-48ee-b9aa-7e4ca2121b04","added_by":"auto","created_at":"2025-10-31 
10:07:24","extension":"png","order_by":9,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":30073,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage2.png","url":"https://assets-eu.researchsquare.com/files/rs-7781555/v1/32b19d6bdc429e22ebec91f5.png"},{"id":94845125,"identity":"8da30a59-3745-4b5a-9ea4-4c6a158ffe86","added_by":"auto","created_at":"2025-10-31 10:07:24","extension":"png","order_by":10,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":29384,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage3.png","url":"https://assets-eu.researchsquare.com/files/rs-7781555/v1/1c512e6a0083ebdc11bd5159.png"},{"id":94984947,"identity":"a3f7bf32-ea5d-4bca-acfe-5d4394bcb71d","added_by":"auto","created_at":"2025-11-03 06:56:59","extension":"png","order_by":11,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":27242,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage4.png","url":"https://assets-eu.researchsquare.com/files/rs-7781555/v1/0aea4e977eda8a9b3e17501e.png"},{"id":94845131,"identity":"f99255eb-9545-4017-88f7-1eca6224d962","added_by":"auto","created_at":"2025-10-31 10:07:25","extension":"xml","order_by":12,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":149397,"visible":true,"origin":"","legend":"","description":"","filename":"67b43e4ff4c04220b636c89f9305a6671structuring.xml","url":"https://assets-eu.researchsquare.com/files/rs-7781555/v1/542c8dce7daf4f7972e1e87e.xml"},{"id":94845130,"identity":"0065c020-4045-438e-9867-09a8dc07d750","added_by":"auto","created_at":"2025-10-31 
10:07:25","extension":"html","order_by":13,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":160203,"visible":true,"origin":"","legend":"","description":"","filename":"earlyproof.html","url":"https://assets-eu.researchsquare.com/files/rs-7781555/v1/7a2ea5a09e733e7688efb142.html"},{"id":94845118,"identity":"f156c14a-84b5-4574-9cff-5b346112f345","added_by":"auto","created_at":"2025-10-31 10:07:24","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":80962,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eMethodology for model training and fairness analysis.\u003c/strong\u003e\u003c/p\u003e","description":"","filename":"floatimage1.png","url":"https://assets-eu.researchsquare.com/files/rs-7781555/v1/02f6e515e1325f8a51193033.png"},{"id":94985241,"identity":"d1224619-1d01-4ccb-864b-6a2557518961","added_by":"auto","created_at":"2025-11-03 06:57:45","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":113809,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eModel performance on test set, averaged across 5 training fits (A) Receiver operating curve – Area under curve (ROC-AUC) (B) Confusion matrix (C) Model feature importances, based on impurity-based importance.\u003c/strong\u003e\u003c/p\u003e","description":"","filename":"floatimage2.png","url":"https://assets-eu.researchsquare.com/files/rs-7781555/v1/202562384ec98999627e23d9.png"},{"id":94845119,"identity":"bdea4a6d-6759-45d7-877d-a9eb932b9320","added_by":"auto","created_at":"2025-10-31 10:07:24","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":116586,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eTrue positive rate (TPR) and false positive rate (FPR) by (A) ethnicity, (B) gender, (C) admission mode, (D) citizenship, and (E) 
housing.\u003c/strong\u003e\u003c/p\u003e","description":"","filename":"floatimage3.png","url":"https://assets-eu.researchsquare.com/files/rs-7781555/v1/79249a9bf65982029a820604.png"},{"id":94845127,"identity":"6c8442f1-ed41-4bd1-928f-1dbef5fbf8af","added_by":"auto","created_at":"2025-10-31 10:07:24","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":138074,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eIntersectional (A) true positive rate (TPR) and (B) false positive rate (FPR) by ethnicity and gender. All groups had \u003c/strong\u003e\u003cem\u003e\u003cstrong\u003en \u003c/strong\u003e\u003c/em\u003e\u003cstrong\u003e\u0026gt; 50 observations.\u003c/strong\u003e\u003c/p\u003e","description":"","filename":"floatimage4.png","url":"https://assets-eu.researchsquare.com/files/rs-7781555/v1/d06fd8433c88d75ae9961ed0.png"},{"id":94990364,"identity":"0a4ef0dd-b90b-484d-a406-d8ed874ede1d","added_by":"auto","created_at":"2025-11-03 07:16:37","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1552083,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7781555/v1/cc8daa43-d1cc-4cb2-91bd-5ad448941f2c.pdf"},{"id":94845123,"identity":"ad76cb76-2da8-411f-b6a6-c53e2cbc5f66","added_by":"auto","created_at":"2025-10-31 10:07:24","extension":"docx","order_by":0,"title":"","display":"","copyAsset":false,"role":"supplement","size":845043,"visible":true,"origin":"","legend":"","description":"","filename":"Supplementary.docx","url":"https://assets-eu.researchsquare.com/files/rs-7781555/v1/8837081760948c82b2b46d3d.docx"}],"financialInterests":"Competing interest reported. S.L.H, J.Z., L.S., and M.M.M report financial support from the University of Toronto Dalla Lana School of Public Health. S.H, J.Z., L.S., and M.M.M. 
report financial support from the Social Sciences and Humanities Research Council of Canada. L.S. and S.L.H. report financial support from Google Research. These funders had no role in study conceptualization, design, implementation or dissemination of findings.","formattedTitle":"Fairness analysis of machine learning predictions of aggression in acute psychiatric care","fulltext":[{"header":"Introduction","content":"\u003cp\u003ePatient aggression is a major concern in clinical settings such as acute psychiatry, and encompasses a range of behaviors including verbal abuse, sexual harassment, and physical violence. It has adverse effects on the quality of care, patient and staff safety, and public perceptions of mental health care\u003csup\u003e\u003cspan additionalcitationids=\"CR2\" citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e\u003c/sup\u003e. However, coercive interventions used to manage the risk of aggression can be similarly problematic: the administration of medications (i.e., chemical restraints) and physical restraints to manage aggression have been shown to negatively impact a patient\u0026rsquo;s experience of care in a potentially traumatizing way\u003csup\u003e\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e,\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e\u003c/sup\u003e. Machine learning (ML) has increasingly been applied to predict risk of patient aggression in psychiatric and forensic settings, as it can leverage complex datasets to generate more individualized predictions to enable earlier and more targeted de-escalation using non-coercive forms of intervention. 
Previous studies have trained ML algorithms on diverse datasets, from clinical data to neuroimaging scans, with these algorithms often exceeding the predictive performance of current clinical instruments\u003csup\u003e\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e\u003c/sup\u003e. However, there has been limited investigation of the \u003cem\u003efairness\u003c/em\u003e of these models in acute psychiatry \u0026ndash; that is, whether their predictions display any prejudice or favoritism towards an individual or group based on certain inherent or acquired characteristics, such as race or sex (i.e., characteristics that define a \u003cem\u003eprotected group\u003c/em\u003e)\u003csup\u003e\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\u003cp\u003eAlgorithmic fairness has been a growing area of focus in ML research, and widely used algorithms for criminal recidivism prediction and healthcare resource allocation have been shown to be unfair towards racially and socioeconomically marginalized groups, such as Black or lower-income individuals\u003csup\u003e\u003cspan additionalcitationids=\"CR9\" citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e\u003c/sup\u003e, as well as the intersections of underserved groups, like Hispanic females\u003csup\u003e\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e\u003c/sup\u003e. The potential for algorithmic unfairness is particularly concerning in the context of predicting aggression in acute psychiatry because of pervasive inequities embedded in the training data. 
Inequities can be structural (interpersonal and systemic processes which create inequities in power and resources\u003csup\u003e\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e\u003c/sup\u003e) and social (disparities relating to an individual\u0026rsquo;s proximal social, political, and economic environment\u003csup\u003e\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e\u003c/sup\u003e). These include racial profiling in police apprehensions for involuntary psychiatric admission, gendered biases in clinician perceptions of inpatient violence risk, and disparities in access to quality mental health care based on socioeconomic status\u003csup\u003e\u003cspan additionalcitationids=\"CR14\" citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e\u003c/sup\u003e. Inequities defined by the intersection of race, ethnicity and gender are also a significant concern, given the well-documented challenges Black men face in accessing and receiving mental health care\u003csup\u003e\u003cspan additionalcitationids=\"CR17\" citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e\u003c/sup\u003e. These inequities can be readily embedded in the data used to train ML algorithms. Evidence of performance disparities in ML-based predictions of violence against hospital providers has emerged in one prior study, suggesting less accurate predictions for Asian and Native Hawaiian patient groups, but the analysis only examined fairness stratified by patient race, and the ability to draw conclusions was limited by small sample sizes\u003csup\u003e\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e\u003c/sup\u003e. 
There is a need to better assess how proximal social indicators, like race/ethnicity and sex, as well as upstream factors, such as policing or housing, are related to the fairness of ML-based predictions of aggression.\u003c/p\u003e\u003cp\u003eThis analysis, which was part of a larger mixed-methods study dissecting the construction and use of predictive care tools, had three main objectives\u003csup\u003e\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e\u003c/sup\u003e. First, we used demographic and clinical features to train a supervised ML algorithm to predict whether a patient would become aggressive on a given day. Second, we performed a fairness assessment, examining the algorithm's performance stratified by demographic characteristics of patients, focusing on gender, race/ethnicity, citizenship, admission mode, and housing status. Third, we characterized how the model performs for groups of patients defined by the intersection of both race/ethnicity and gender\u003csup\u003e\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e\u003c/sup\u003e. Overall, by examining algorithmic unfairness in ML predictions of aggression, our findings highlight the importance of assessing fairness across diverse social and demographic factors during model development and evaluation, to prevent the deployment of ML models in acute psychiatry that can harm specific populations.\u003c/p\u003e"},{"header":"Methods","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e\u003ch2\u003eStudy population\u003c/h2\u003e\u003cp\u003eThis analysis utilized electronic health records (EHRs) from ten inpatient care units at the Centre for Addiction and Mental Health (CAMH), a large mental health and addictions hospital in Toronto, Canada, between January 2016 and May 2022. 
Only patients who were admitted to inpatient units via the hospital\u0026rsquo;s emergency department (ED) were included, to enable consideration of admission mode (e.g., apprehension for admission by police). Patients were excluded from analysis if they were referred from a corrections facility or another hospital. However, we did not exclude patients with prior acute care visits, so the analysis included multiple inpatient hospitalizations from unique patients.\u003c/p\u003e\u003cp\u003eDemographic data were obtained from patient-reported forms routinely collected at admission, and include age, gender, sexual orientation, citizenship, housing status, income, highest education level, language, ethnicity, and marital status. Clinical and contextual factors were also documented at patient intake, including primary psychiatric diagnosis assessed by ED psychiatrists via a brief diagnostic interview, presence of substance-induced symptoms, mode of admission, and inpatient unit location. Finally, risk assessment data included ratings on the Dynamic Appraisal of Situational Aggression (DASA), which is a clinically validated instrument assessing each patient\u0026rsquo;s risk of aggression over the next 24 hours. Assessment is based on seven dichotomous items which capture behavioural and interpersonal factors related to this risk (e.g., agitation, sensitivity to provocation, verbal threats)\u003csup\u003e\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e,\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e\u003c/sup\u003e. 
At CAMH, DASA scores are generated by nurses each morning that a patient is on the unit, based on their clinical observations and relevant information from a chart review of the past 24 hours.\u003c/p\u003e\u003cp\u003eThe outcome data included aggressive incidents involving patients, as documented by any attending staff (e.g., nurses, clinicians, security guards or program assistants) in CAMH\u0026rsquo;s reporting tool. Incidents were included as outcomes if they were categorized by staff as either \u0026ldquo;abuse/assault/violence\u0026rdquo; or \u0026ldquo;physical/sexual/verbal behaviors and assaults\u0026rdquo;. Documented use of any combination of chemical restraints, physical restraints, or seclusion was also included as an outcome, since these interventions are only used when violence or aggression is deemed imminent\u003csup\u003e\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e,\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e\u003c/sup\u003e. Outcomes (i.e., aggressive incidents or restraints and seclusions) occurring during prior visits to acute care were treated as predictors (i.e., a binary variable indicating whether an incident occurred prior to admission).\u003c/p\u003e\u003cp\u003eBecause patients are assessed using the DASA for the risk of imminent aggression (i.e., within the next 24 hours), predictions were made on each day of the acute care stay. Most outcomes were expected to occur on the first three days that a patient was receiving acute care. For this reason, we included up to three days, or prediction windows, for each visit. If one or more outcomes occurred during a given visit, we only included data collected until the first occurrence, since interventions used to manage the outcome may alter risk. 
Since clinical, sociodemographic and admission data were collected only once for each visit, they were repeated across the three days for each visit and patient.\u003c/p\u003e\u003c/div\u003e\n\u003ch3\u003eData processing\u003c/h3\u003e\n\u003cp\u003eThe intake demographic questionnaire allowed open-ended responses under an \u0026lsquo;other\u0026rsquo; option; these responses were all manually recoded into the existing categories, in consultation with CAMH acute care clinicians. Gender was grouped into three categories: male, female, and gender expansive. Race/ethnicity was grouped into Black, Asian, South Asian, Indigenous, Latin American, Middle Eastern, Mixed, and White. Primary diagnoses were grouped by the study team and in consultation with clinicians into ten diagnosis types, guided by the DSM-5 categories. Each of the seven DASA items was included as an individual dichotomous variable to retain information about specific aggression-related factors. An extensive description of data capture and processing can be found in the supplementary information. The final predictor dataset included 16 categorical variables.\u003c/p\u003e\u003cp\u003eA 70%/30% train-test split was performed. Randomization for the train-test split was done by patient, as opposed to by observation-day, to ensure that different inpatient days or multiple presentations to acute care for the same patient were not split between the two sets. No functions were fitted on the test set, which was withheld until the final performance and fairness evaluation. Variables with \u0026le;\u0026thinsp;20% missing were imputed by the mode, while missing values for variables with \u0026gt;\u0026thinsp;20% missing were imputed with \u0026ldquo;missing\u0026rdquo; to preserve the potential informativeness of high missingness\u003csup\u003e\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e\u003c/sup\u003e. 
Non-binary variables were one-hot-encoded in preparation for model training.\u003c/p\u003e\n\u003ch3\u003eAddressing class imbalance\u003c/h3\u003e\n\u003cp\u003eIn the study population, the outcomes were imbalanced at a ratio of almost 33:1, with far more no-incident cases than incident cases. To prevent the decision boundary from greatly favoring the majority class at the expense of the fidelity of minority class predictions (e.g., by making almost exclusively negative predictions), F1 score was used as the primary evaluation metric. F1 is the harmonic mean of precision and recall, and therefore offers a more reliable evaluation of classification on imbalanced data. Additionally, a range of resampling algorithms was tested as part of the model tuning process, where the training set was either undersampled by removing cases from the majority class or oversampled by adding synthetic cases to the minority class to balance the distribution of positive and negative cases.\u003c/p\u003e\n\u003ch3\u003eModel training\u003c/h3\u003e\n\u003cp\u003eModel selection and optimization were performed on the training set using 5-fold cross validation, optimizing for F1 score. Logistic regression, na\u0026iuml;ve Bayes, random forest, gradient boosting, support vector machine, decision tree, and simple neural network were evaluated as candidate models. Additionally, no resampling, random undersampling, NearMiss undersampling, and SMOTE oversampling were evaluated as candidate resampling methods in combination with all candidate models (Supplementary methods). The final model and sampler were refit on the entire training set, and performance was measured based on predictions on the hold-out test set. No resampling was performed on the test set (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e). 
Standard deviations and confidence intervals for all performance and fairness metrics were calculated by training the model five times using five different random seeds, then applying each to the test set\u003csup\u003e\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e\u003c/sup\u003e. Feature importances were extracted using impurity-based importance as implemented in scikit-learn.\u003c/p\u003e\n\u003ch3\u003eFairness assessment\u003c/h3\u003e\n\u003cp\u003eFairness analysis was conducted using observational criteria based on post-hoc analysis of model outputs, true outcome, and sensitive attributes. Disparate mistreatment as measured by false positive rate (FPR) was used as a primary metric by which to assess the fairness of the algorithm, where \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(FPR=\\frac{FP}{FP+TN}\\)\u003c/span\u003e\u003c/span\u003e and FP + TN is the number of actual negatives\u003csup\u003e25\u003c/sup\u003e. This metric enables a focus on whether social and structural inequities may lead to higher rates of incorrectly flagging individuals from certain demographic groups as being at high risk for aggression. An algorithm that is fair with respect to disparate mistreatment must have the same FPR across the subgroups of each sensitive attribute. To understand the fairness behavior of the algorithm more thoroughly, group-specific true positive rate (TPR), F1-score, ROC curves, and calibration curves were also assessed.\u003c/p\u003e\u003cp\u003eAttributes that were analyzed for fairness include race/ethnicity, gender, admission mode, citizenship, and housing status. Intersectional fairness analysis was performed for the intersection of gender and race/ethnicity. 
When performing fairness analysis for a given feature, individuals with imputed values for that feature were excluded from that analysis.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e"},{"header":"Results","content":"\u003cdiv id=\"Sec9\" class=\"Section2\"\u003e\u003ch2\u003eSample characteristics\u003c/h2\u003e\u003cp\u003eAcross all observation days, there were 41447 \u0026ldquo;no incident\u0026rdquo; cases (i.e., observation days on which a violent or aggressive incident was not reported) and 1272 \u0026ldquo;incident\u0026rdquo; cases (i.e., days on which such an incident was reported), corresponding to 17703 unique patients. These patients were split into 29879 cases in the train set (n\u0026thinsp;=\u0026thinsp;12398 unique patients) and 12840 in the test set (n\u0026thinsp;=\u0026thinsp;5305 unique patients).\u003c/p\u003e\u003cp\u003ePatients were relatively evenly distributed across age categories, but they were predominantly male, White, single, and of Canadian citizenship. The most common diagnoses were psychotic disorders, and patients were most commonly accompanied during ED admission by family or friends. Citizenship, housing status, marital status, and admission mode had proportions of missing observations above 20%, thereby requiring imputation with a \u0026ldquo;missing\u0026rdquo; label. All features differed significantly when comparing no incident vs incident populations (p\u0026thinsp;\u0026lt;\u0026thinsp;0.001). Sample characteristics at the level of observation days in the overall data can be found in Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e. 
Sample characteristics at the level of unique patients can be found in supplementary table 2, and characteristics by train/test set are reported in supplementary table 3.\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eSample characteristics at the level of observation days\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"5\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eCharacteristic\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eOverall,\u003c/p\u003e\u003cp\u003en\u0026thinsp;=\u0026thinsp;42719\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003eNo incident, n\u0026thinsp;=\u0026thinsp;41447\u003csup\u003e1\u003c/sup\u003e\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003eIncident,\u003c/p\u003e\u003cp\u003en\u0026thinsp;=\u0026thinsp;1272\u003csup\u003e1\u003c/sup\u003e\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003ep-value\u003csup\u003e\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e\u003c/sup\u003e\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003ctr\u003e\u003cth align=\"left\" 
colname=\"c1\"\u003e\u003cp\u003eAge\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003e\u0026lt;\u0026thinsp;0.001\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e16\u0026ndash;24\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e8525 (20%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e8239 (20%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e286 (22%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e25\u0026ndash;29\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e7306 (17%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e7076 (17%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e230 (18%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e30\u0026ndash;34\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e6234 (15%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e6026 (15%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e208 (16%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" 
colname=\"c1\"\u003e\u003cp\u003e35\u0026ndash;44\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e8327 (19%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e8059 (19%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e268 (21%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e45\u0026ndash;64\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e10440 (24%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e10199 (25%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e241 (19%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e65+\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e1887 (4.4%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e1848 (4.5%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e39 (3.1%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eGender\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e\u003cb\u003e\u0026lt;\u0026thinsp;0.001\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" 
colname=\"c1\"\u003e\u003cp\u003eFemale\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e16824 (39%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e16444 (40%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e380 (30%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eMale\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e24977 (58%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e24121 (58%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e856 (67%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eGender diverse\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e440 (1.0%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e429 (1.0%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e11 (0.9%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eMissing\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e478 (1.1%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e453 (1.1%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e25 (2.0%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" 
colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eCitizenship\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e\u003cb\u003e\u0026lt;\u0026thinsp;0.001\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eCanadian\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e29804 (70%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e29050 (70%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e754 (59%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eIndigenous\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e32 (\u0026lt;\u0026thinsp;0.1%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e32 (\u0026lt;\u0026thinsp;0.1%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0 (0%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eNot Canadian\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e2328 (5.4%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e2257 (5.4%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e71 (5.6%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd 
align=\"left\" colname=\"c1\"\u003e\u003cp\u003eMissing\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e10555 (25%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e10108 (24%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e447 (35%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eHousing status\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e\u003cb\u003e\u0026lt;\u0026thinsp;0.001\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eLiving with family\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e6540 (15%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e6370 (15%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e170 (13%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eOwn\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e4304 (10%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e4224 (10%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e80 (6.3%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" 
colname=\"c1\"\u003e\u003cp\u003eRent\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e13112 (31%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e12834 (31%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e278 (22%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eSupportive housing\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e3966 (9.3%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e3846 (9.3%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e120 (9.4%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eUnstable housing/unhoused\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e5488 (13%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e5243 (13%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e245 (19%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eMissing\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e9309 (22%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e8930 (22%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e379 (30%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" 
colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eMarital status\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e\u003cb\u003e\u0026lt;\u0026thinsp;0.001\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003ePartnered\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e3231 (7.6%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e3179 (7.7%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e52 (4.1%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eSingle\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e26389 (62%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e25669 (62%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e720 (57%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eMissing\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e13099 (31%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e12599 (30%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e500 (39%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" 
colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eRace/Ethnicity\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e\u003cb\u003e\u0026lt;\u0026thinsp;0.001\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eWhite\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e20470 (48%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e19969 (48%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e501 (39%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eAsian\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e2711 (6.3%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e2651 (6.4%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e60 (4.7%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eBlack\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e5218 (12%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e4968 (12%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e250 (20%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eSouth 
Asian\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e2270 (5.3%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e2204 (5.3%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e66 (5.2%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eIndigenous\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e1059 (2.5%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e1037 (2.5%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e22 (1.7%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eLatin American\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e1212 (2.8%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e1180 (2.8%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e32 (2.5%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eMiddle Eastern\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e1357 (3.2%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e1296 (3.1%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e61 (4.8%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eMixed\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c2\"\u003e\u003cp\u003e1099 (2.6%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e1072 (2.6%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e27 (2.1%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eMissing\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e7323 (17%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e7070 (17%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e253 (20%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eAdmit Mode\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e\u003cb\u003e\u0026lt;\u0026thinsp;0.001\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eCase Worker / Nurse\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e2024 (4.7%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e1980 (4.8%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e44 (3.5%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eFriend / Family\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c2\"\u003e\u003cp\u003e9917 (23%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e9744 (24%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e173 (14%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eMobile Crisis\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e134 (0.3%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e123 (0.3%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e11 (0.9%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eOther\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e956 (2.2%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e924 (2.2%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e32 (2.5%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003ePolice\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e8761 (21%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e8269 (20%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e492 (39%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eSelf\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e8300 (19%)\u003c/p\u003e\u003c/td\u003e\u003ctd 
align=\"left\" colname=\"c3\"\u003e\u003cp\u003e8195 (20%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e105 (8.3%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eMissing\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e12627 (30%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e12212 (29%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e415 (33%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003ePrimary Diagnosis\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e\u003cb\u003e\u0026lt;\u0026thinsp;0.001\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eAdjustment disorder\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e389 (0.9%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e385 (0.9%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e4 (0.3%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eAnxiety disorder\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e1188 (2.8%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c3\"\u003e\u003cp\u003e1182 (2.9%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e6 (0.5%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eBipolar mood disorder\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e5785 (14%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e5535 (13%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e250 (20%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eDepressive disorder\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e5161 (12%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e5136 (12%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e25 (2.0%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eNeurocognitive disorders\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e442 (1.0%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e426 (1.0%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e16 (1.3%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eNeurodevelopmental disorders\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e772 (1.8%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c3\"\u003e\u003cp\u003e740 (1.8%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e32 (2.5%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eOther\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e922 (2.2%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e901 (2.2%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e21 (1.7%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003ePersonality disorder\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e2622 (6.1%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e2558 (6.2%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e64 (5.0%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003ePrimary psychotic disorder\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e18404 (43%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e17710 (43%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e694 (55%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eSubstance-related disorder\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e5488 (13%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c3\"\u003e\u003cp\u003e5350 (13%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e138 (11%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eTrauma and stressor related disorder\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e1070 (2.5%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e1056 (2.5%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e14 (1.1%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eMissing\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e476 (1.1%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e468 (1.1%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e8 (0.6%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003ctfoot\u003e\u003ctr\u003e\u003ctd colspan=\"5\"\u003e\u003csup\u003e1\u003c/sup\u003eData presented as n, (% of respective group total); \u003csup\u003e2\u003c/sup\u003eChi-square test of independence\u003c/td\u003e\u003c/tr\u003e\u003c/tfoot\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003c/div\u003e\n\u003ch3\u003eModel performance\u003c/h3\u003e\n\u003cp\u003eThe best-performing model on the train set was a 200-estimator random forest (RF) with no oversampling or undersampling. 
On the hold-out test set, the random forest obtained an ROC-AUC of 0.8120\u0026thinsp;\u0026plusmn;\u0026thinsp;0.0016, accuracy of 0.9323\u0026thinsp;\u0026plusmn;\u0026thinsp;0.0004, and F1 score of 0.2213\u0026thinsp;\u0026plusmn;\u0026thinsp;0.0031 (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eA). The model had a sensitivity/TPR of 0.3265\u0026thinsp;\u0026plusmn;\u0026thinsp;0.0057 and a specificity of 0.9507\u0026thinsp;\u0026plusmn;\u0026thinsp;0.0005. Feature importance extracted from the RF revealed that the DASA items, especially irritability, as well as the presence of a violent/aggressive incident or restraint occurring prior to admission into acute care, are highly important for predictions (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eC).\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cdiv id=\"Sec11\" class=\"Section2\"\u003e\u003ch2\u003eFairness assessment\u003c/h2\u003e\u003cp\u003e\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec12\" class=\"Section2\"\u003e\u003ch2\u003eRace/Ethnicity\u003c/h2\u003e\u003cp\u003eMiddle Eastern individuals had the highest FPR among all ethnic groups (FPR [standard deviation]\u0026thinsp;=\u0026thinsp;0.0801 [0.0048]), followed by Black (0.0694 [0.002]), Indigenous (0.0552 [0.0037]), Mixed (0.0525 [0.0021]), White (0.0404 [0.0008]), South Asian (0.0356 [0.0019]), Asian (0.0322 [0.0028]), and Latin American (0.0313 [0.0028]) individuals \u003cb\u003e(\u003c/b\u003eFig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e, Supplementary table 5\u003cb\u003e).\u003c/b\u003e\u003c/p\u003e\u003cp\u003eThere was also significant variation in TPR: Latin American (TPR [standard deviation]\u0026thinsp;=\u0026thinsp;0.3846 [0.0000]) and Middle Eastern (0.3778 [0.0222]) individuals had the highest TPR, whereas Asian (0.2381 [0.0000]) and South Asian (0.2571 [0.0350]) individuals had the lowest (Supplementary table 5). 
Predictive accuracy was highest in Middle Eastern individuals (F1 score [standard deviation]\u0026thinsp;=\u0026thinsp;0.2372 [0.0158]). ROC curves reveal significant differences in the TPR-FPR trade-offs between groups, with Black individuals having considerably higher FPR for any TPR (Supplementary Fig.\u0026nbsp;2).\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec13\" class=\"Section2\"\u003e\u003ch2\u003eGender\u003c/h2\u003e\u003cp\u003eMen (0.0542 [0.0005]) had higher FPR than women (0.0426 [0.0009]) and gender expansive individuals (0.0418 [0.006]). TPR, F1 score and ROC-AUC are all lower in men compared to women. At conservative prediction thresholds, men have higher FPR for any given TPR compared to women and gender expansive individuals.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec14\" class=\"Section2\"\u003e\u003ch2\u003eAdmission mode\u003c/h2\u003e\u003cp\u003eIndividuals who were admitted by police had significantly higher FPR than any other group label (0.0941 [0.0019]), followed by other (0.0547 [0.002]), mobile crisis (0.0476 [0.0000]), self (0.0326 [0.0007]), case worker/nurse (0.0303 [0.0007]), and friend/family (0.0264 [0.0011]). Police admissions also had relatively high TPR (0.4174 [0.0165]) and F1 score (0.2405 [0.0158]).\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec15\" class=\"Section2\"\u003e\u003ch2\u003eCitizenship\u003c/h2\u003e\u003cp\u003eCanadian citizens had higher FPR and TPR (FPR\u0026thinsp;=\u0026thinsp;0.0427 [0.004], TPR\u0026thinsp;=\u0026thinsp;0.3145 [0.0072]) than non-citizens (FPR\u0026thinsp;=\u0026thinsp;0.0210 [0.0024], TPR\u0026thinsp;=\u0026thinsp;0.2696 [0.0174]). 
There is a significant mismatch in the ROC curves between the two groups, with Canadian citizens having a higher FPR for any given TPR at conservative prediction thresholds.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec16\" class=\"Section2\"\u003e\u003ch2\u003eHousing\u003c/h2\u003e\u003cp\u003eThose who were in unstable forms of housing or unhoused (0.0829 [0.0014]) or were living in supportive housing (0.0502 [0.0017]) had higher FPR than those who had more stable forms of housing, such as owning (0.0344 [0.002]), renting (0.0318 [0.0007]), and living with family (0.0273 [0.001]). Individuals living in supportive housing had considerably lower predictive accuracy than other groups, with the lowest TPR (0.2308 [0.0243]), F1 score (0.1356 [0.0123]), and ROC-AUC (0.7651 [0.0048]).\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec17\" class=\"Section2\"\u003e\u003ch2\u003eIntersectional analysis of disparate mistreatment\u003c/h2\u003e\u003cp\u003eAn intersectional analysis was performed for race/ethnicity and gender (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e; Supplementary table 6). The \u0026ldquo;gender expansive\u0026rdquo; group was excluded due to low sample sizes (N\u0026thinsp;\u0026lt;\u0026thinsp;15 observations) for all ethnicities except White. All other intersectional groups had more than 50 observations. Middle Eastern men had the highest FPR (FPR [standard deviation]\u0026thinsp;=\u0026thinsp;0.0933 [0.0074]) and a highly pronounced gender-specific effect; Middle Eastern women had a significantly lower FPR (0.0372 [0.0047]). However, both genders were similar in terms of their TPR. Black men (0.0759 [0.0026]) and Indigenous men (0.0747 [0.0062]) also had a relatively high FPR, and their TPRs tended to be higher than those of Black women (TPR [standard deviation]\u0026thinsp;=\u0026thinsp;0.2353 [0.0372]) and Indigenous women (0.2500 [0.0000]). 
Across all races/ethnicities, men had an intersectional FPR equal to or greater than that of women.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003c/div\u003e"},{"header":"Discussion","content":"\u003cp\u003eIn this study, we assessed whether ML predictions of inpatient aggression in acute psychiatric care are unfair. To our knowledge, this is the most comprehensive fairness assessment of ML as related to this outcome, and it builds on previous work by Dobbins \u003cem\u003eet al.\u003c/em\u003e\u003csup\u003e19\u003c/sup\u003e by examining a wider range of social determinants and applying an intersectional approach. A random forest model was trained on a range of demographic, clinical, admission, and risk assessment data, yielding an ROC-AUC of 0.81. Although maximizing predictive performance was not an emphasis of this study, the model achieved comparable performance to ML algorithms reported in prior research trained on tabular data in clinically heterogeneous psychiatric populations (ROC-AUC obtained in Suchting et al. = 0.78\u003csup\u003e23\u003c/sup\u003e, Menger et al. = 0.76\u003csup\u003e26\u003c/sup\u003e, Wang et al. = 0.63\u003csup\u003e27\u003c/sup\u003e). The fairness assessment revealed that the algorithm exhibits disparate mistreatment and violates equalized odds: there were significant disparities in FPR, TPR, and ROC curves across race/ethnicity, gender, admission mode, citizenship, and housing status. Relative to other groups, FPR was elevated in Middle Eastern and Black individuals, men, those admitted into emergency care by the police, Canadian citizens, and those with unstable or supportive forms of housing. Intersectional analyses revealed that Middle Eastern men had the highest FPR among all intersectional groups. There were significant differences in TPR and ROC curves in relation to the FPR of each group, suggesting the nature of algorithmic unfairness differs between groups. 
For example, in the case of patients who are Middle Eastern, in unstable or no housing, or admitted by police, FPR and TPR were both elevated relative to other groups, suggesting the model was calibrated to increase overall predictive accuracy at the expense of higher FPR. Conversely, for other groups like Black patients, models had high FPR and low TPR, suggesting poor overall performance.\u003c/p\u003e\u003cp\u003eImportantly, observational measures of unfairness such as TPR and FPR are merely outcome measures that do not explain how unfair predictions arise. Rather, these results must be understood in the context of underlying social and structural inequities that can give rise to unfair predictions in the first place, such as racial profiling in the criminal justice system, racial residential segregation, or barriers to accessing mental healthcare\u003csup\u003e\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e\u003c/sup\u003e. We discuss some of these parallels in the section below.\u003c/p\u003e\u003cp\u003eBlack individuals are less likely to receive adequate outpatient psychiatric treatment, they are more likely to be involuntarily admitted into inpatient treatment, and they may also present with more severe psychotic symptoms, compared to White individuals\u003csup\u003e\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e,\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e\u003c/sup\u003e. Black men in particular face significant barriers in accessing mental health care, and they are more likely to be misdiagnosed with psychotic disorders, as compared to White men\u003csup\u003e\u003cspan additionalcitationids=\"CR17\" citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e\u003c/sup\u003e. 
Interpersonal bias is also a possibility, where structurally reinforced stereotypes may lead to higher risk perceptions for racially marginalized individuals on clinical risk instruments like the DASA, though research is largely inconclusive on whether these instruments are themselves biased. Both male gender and Black race have been found to be significantly associated with violence in psychiatric settings. Findings from our study suggest that these associations can become embedded in clinical datasets, which may lead to unfair treatment by ML algorithms, both via increased false positive predictions and poorer performance in identifying at-risk individuals\u003csup\u003e\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e,\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\u003cp\u003ePolice apprehension for admission into the ED is also communicated among clinicians as a relevant factor in risk assessment due to an increased likelihood of aggression in patients admitted involuntarily and/or referred by the police\u003csup\u003e\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e,\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e\u003c/sup\u003e. It is therefore perhaps not surprising that this mode of admission was associated with the highest FPR of any group in the fairness assessment. Patients apprehended by police for admission into emergency psychiatric care are indeed more likely to become violent or aggressive, which is likely to account for relatively high FPRs and TPRs for this group\u003csup\u003e\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e\u003c/sup\u003e. 
At the same time, racially marginalized and Indigenous groups have increased rates of involuntary admissions into psychiatric care by police, likely due to various factors, such as barriers to accessing mental health care or racial profiling\u003csup\u003e\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e,\u003cspan additionalcitationids=\"CR33\" citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e\u003c/sup\u003e. This tendency may in part explain the finding of higher FPRs among Black men, and potentially Middle Eastern and Indigenous individuals as well.\u003c/p\u003e\u003cp\u003eThe fairness assessment also highlights housing as a potential source of algorithmic unfairness, specifically for those with unstable or supportive forms of housing. On a social level, unstable housing has been associated with psychiatric conditions, such as trauma and substance use, as well as lower educational attainment and disrupted support networks\u003csup\u003e\u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e,\u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e\u003c/sup\u003e. Conditions of unstable housing may contribute to food or water insecurity, sleep deprivation, and hypervigilance, which can lead to the expression of behaviours that are rated as precursors of aggression on clinical instruments such as the DASA (e.g., irritability, sensitivity to provocation, and unwillingness to follow instructions). 
Structurally, current psychiatric care systems are not well-equipped to meet the constellation of needs of unhoused individuals, which may contribute to their increased ED use and higher false positive predictions for the risk of violence in inpatient care\u003csup\u003e\u003cspan additionalcitationids=\"CR37 CR38 CR39\" citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e\u003c/sup\u003e. Supportive housing services for people with severe mental illness offer more stability, but they are in high demand and extremely under-resourced, and are often unable to meet complex, individual needs\u003csup\u003e\u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\u003cp\u003eWe also identified performance disparities that are not linked to well-researched inequities. For example, while qualitative analyses have shown a general distrust of biomedical mental health services among Middle Eastern individuals, there is a considerable research gap in characterizing how they interact with these systems\u003csup\u003e\u003cspan citationid=\"CR42\" class=\"CitationRef\"\u003e42\u003c/span\u003e\u003c/sup\u003e. Although our analyses suggest that high FPR for Middle Eastern patients may be in part related to improved model TPR/sensitivity, social and structural determinants likely play a role in the way their risk of violence or aggression is perceived; these may be related to cultural communication barriers, or expressions of distrust manifesting as increased irritability or an unwillingness to follow instructions. However, the gender discrepancy in FPR (but not in TPR) for this group suggests this effect may only extend to men. Similarly, the algorithm displayed modest FPR differences based on citizenship, which is also not a well-documented demographic feature in the psychiatric literature. 
Nevertheless, citizenship may be an important factor to consider in future fairness assessments of ML models in healthcare, given its impact on access to community, social, and health services.\u003c/p\u003e\u003cp\u003eThese findings highlight the importance of thoughtful documentation and processing of demographic data, which is a strength of our study. Specifically, access to high-quality and diverse sociodemographic information is necessary for evaluating ML models for fairness, making it critical that these data are measured and not lost during processing\u003csup\u003e\u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e43\u003c/span\u003e\u003c/sup\u003e. Middle Eastern ethnicity, for example, does not appear to be commonly encoded as a unique racial or ethnic category in research datasets, which inevitably precludes the discovery of important trends in this population such as those identified in our study\u003csup\u003e\u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e\u003c/sup\u003e. Demographics in our dataset were drawn from CAMH\u0026rsquo;s health equity form, which was designed to capture a range of rich features that are not frequently characterized, such as specific ethnic and gender minorities\u003csup\u003e\u003cspan citationid=\"CR45\" class=\"CitationRef\"\u003e45\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\u003cp\u003eOverall, our results suggest that if fairness is not properly considered, the deployment of ML algorithms to support the prediction of aggression in acute psychiatric care, and other clinical settings, has the potential to cause significant harms with respect to both disparate mistreatment and equalized odds in socially and structurally disadvantaged groups\u003csup\u003e\u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e\u003c/sup\u003e. 
Bias in ML algorithms has already been shown to reduce clinician accuracy\u003csup\u003e\u003cspan citationid=\"CR46\" class=\"CitationRef\"\u003e46\u003c/span\u003e\u003c/sup\u003e; in psychiatric risk assessment, the unwarranted use of interventions based on a false positive prediction can lead to unnecessary distress, disruption of trust in a therapeutic relationship or the health system, and may even precipitate violent or aggressive incidents when they otherwise would not have occurred\u003csup\u003e\u003cspan citationid=\"CR47\" class=\"CitationRef\"\u003e47\u003c/span\u003e\u003c/sup\u003e. Furthermore, there is extensive literature highlighting the cyclical nature of algorithmic unfairness: algorithms can reproduce and amplify existing inequalities, which can then become embedded in new datasets used to develop ML algorithms or inform care\u003csup\u003e\u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e,\u003cspan citationid=\"CR48\" class=\"CitationRef\"\u003e48\u003c/span\u003e\u003c/sup\u003e. Even if an unfair recommendation is not followed, disagreement between providers and ML algorithms may lead providers to fear legal implications against them, which may negatively impact care\u003csup\u003e\u003cspan citationid=\"CR49\" class=\"CitationRef\"\u003e49\u003c/span\u003e\u003c/sup\u003e. 
Given these concerns, algorithmic unfairness is recognized by both patients and providers as a major barrier in the clinical implementation of predictive risk models\u003csup\u003e\u003cspan citationid=\"CR49\" class=\"CitationRef\"\u003e49\u003c/span\u003e,\u003cspan citationid=\"CR50\" class=\"CitationRef\"\u003e50\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\u003cp\u003eThere exists a range of algorithmic methods to improve a model\u0026rsquo;s fairness, such as integrating fairness benchmarks into optimization criteria during model training, resampling the input data itself to improve fairness, or enforcing specific fairness criteria using group-specific prediction thresholds\u003csup\u003e\u003cspan citationid=\"CR51\" class=\"CitationRef\"\u003e51\u003c/span\u003e\u003c/sup\u003e. Several studies have now applied \u0026ldquo;debiasing\u0026rdquo; methods to clinical ML algorithms, demonstrating promising results\u003csup\u003e\u003cspan additionalcitationids=\"CR53\" citationid=\"CR52\" class=\"CitationRef\"\u003e52\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR54\" class=\"CitationRef\"\u003e54\u003c/span\u003e\u003c/sup\u003e. Our findings highlight the need to properly assess fairness so that these measures can be applied as appropriate to predictive risk models before they are deployed. An important consideration, however, is that most debiasing methods use the ground truth outcome label as a benchmark to determine whether a model is fair\u003csup\u003e\u003cspan citationid=\"CR55\" class=\"CitationRef\"\u003e55\u003c/span\u003e\u003c/sup\u003e. In other words, most methods seek to faithfully replicate \u0026ldquo;the world as it is\u0026rdquo; \u0026ndash; no more and no less unfair than the input data. However, we have discussed how data relating to inpatient aggression, particularly the administration of coercive interventions, is deeply intertwined with societal inequities. 
As such, debiasing metrics and methods in this context must use some \u0026ldquo;true\u0026rdquo; notion of fairness that represents \u0026ldquo;the world as it should be\u0026rdquo;. Algorithmic interventions, therefore, do not constitute a complete solution. To enable algorithmic debiasing approaches, practitioners first must define how a fair and equitable ML algorithm should behave \u0026ndash; this is a social question, not a technical one.\u003c/p\u003e\u003cp\u003eUltimately, ML systems do not operate in a vacuum, but rather as part of highly complex sociotechnical systems where algorithms and societal inequities interact in complex ways. We highlight that ML fairness assessments can identify inequities across large, complex datasets to help target further investigation. However, fairness analysis alone cannot deeply characterize these social and structural drivers of unfairness, nor the exact processes by which they ultimately result in unfair predictions. When seeking to understand algorithmic fairness, therefore, it is important to characterize and understand these biases and inequities on a social level, such as through qualitative approaches that reveal patient and provider experiences\u003csup\u003e\u003cspan citationid=\"CR56\" class=\"CitationRef\"\u003e56\u003c/span\u003e,\u003cspan citationid=\"CR57\" class=\"CitationRef\"\u003e57\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\u003cp\u003eIt is also important to note that there is no single optimal way to assess the fairness of ML algorithms. There are over 70 definitions of fairness, many of which conflict, making it impossible to simultaneously satisfy all possible definitions\u003csup\u003e\u003cspan citationid=\"CR58\" class=\"CitationRef\"\u003e58\u003c/span\u003e\u003c/sup\u003e. 
We restricted our analysis to a single a priori perspective of what constitutes a fair ML model with a focus on disparate mistreatment and equalized odds, making it possible that our analysis missed other relevant fairness considerations or perspectives. For example, in contrast to the group notion of fairness used in this study, individual fairness postulates that similar individuals should receive similar ML predictions, drawing from philosophies of consistency and individual justice rather than anti-discrimination frameworks\u003csup\u003e\u003cspan citationid=\"CR48\" class=\"CitationRef\"\u003e48\u003c/span\u003e,\u003cspan citationid=\"CR59\" class=\"CitationRef\"\u003e59\u003c/span\u003e\u003c/sup\u003e. Individual fairness often relies on counterfactual or explanation-based ways to define fairness, neither of which were assessed in this study\u003csup\u003e\u003cspan citationid=\"CR59\" class=\"CitationRef\"\u003e59\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\u003cp\u003eAdditionally, there are limitations within the dataset used for this study. Our algorithm was trained using an urban Canadian population \u0026ndash; although underlying inequities appear pervasive across populations, our findings may not generalize to other populations\u003csup\u003e\u003cspan citationid=\"CR60\" class=\"CitationRef\"\u003e60\u003c/span\u003e\u003c/sup\u003e. Moreover, the analysis relied on EHR data which is known to vary in quality. For instance, it is possible that some aggressive incidents were not documented, or modes of admission were mislabelled. Following prior work\u003csup\u003e\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e\u003c/sup\u003e, we included restraints in the outcome under the assumption that they were only applied when aggressive incidents were imminent, which may not always hold. 
Additionally, segmentation of the dataset into subgroups reduced the sample size for the fairness assessment, especially with respect to minority and intersectional groups. For example, limited sample sizes required us to collapse granular descriptions of ethnic heritage into \u0026ldquo;Black\u0026rdquo; as a big-bucket category, which may mask additional disparities in ML fairness within this heterogeneous group\u003csup\u003e\u003cspan citationid=\"CR61\" class=\"CitationRef\"\u003e61\u003c/span\u003e\u003c/sup\u003e. Similar limitations were present with gender, as we grouped all genders that were not male or female into a single category, which still lacked sufficient size to perform intersectional analysis. As such, we encourage future ML studies in this context to perform fairness assessments, particularly by leveraging rich dataset features such as granular ethnic breakdowns or larger sample sizes for intersectional groups. This will enable a more nuanced and thorough understanding of algorithmic fairness, and how it may differ across populations.\u003c/p\u003e\u003cp\u003eIn conclusion, ML predictions of aggression in acute psychiatric care and other clinical settings have the potential to be unfairly biased. However, this is not meant to be an argument against the use of ML in such contexts. Rather, we suggest that it is critical to be aware of fairness-related considerations prior to their implementation, and illustrate how performing such analyses can shed light on underlying inequities. 
To this end, we encourage future ML work in psychiatry to consider fairness as a critical element of evaluation, and to conduct further research to interrogate these identified inequities.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eAcknowledgements\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe thank members of CAMH's Data \u0026amp; Insights Team for their support with health record extraction and interpretation.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding sources\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis work was supported by a Dalla Lana School of Public Health Interdisciplinary Data Science Seed Grant (S.L.H, no award/grant number), the Krembil Foundation (L.S. and M.M.M., no award/grant number), the Social Sciences and Humanities Research Council Insight Development Grant (L.S.: #430-2021-01166) and a Google Award for Inclusion Research (L.S. and S.L.H, no award/grant number).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eEthics Declaration\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis study was approved by the CAMH research ethics board (REB #053-2021). All patient EHR data was processed in adherence with protocols reviewed by the CAMH Privacy Department and research ethics board.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eInformed consent declaration\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eDirect patient consent was not required for this study, as was approved by the CAMH research ethics board (REB #053-2021).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting Interests\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eS.L.H, J.Z., L.S., and M.M.M report financial support from the University of Toronto Dalla Lana School of Public Health. S.H, J.Z., L.S., and M.M.M. report financial support from the Social Sciences and Humanities Research Council of Canada. L.S. and S.L.H. report financial support from Google Research. 
These funders had no role in study conceptualization, design, implementation or dissemination of findings.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eData Availability\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe dataset for this study is restricted as it comprised confidential electronic health records. Inquiries about the data can be directed to the corresponding author. \u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eContributions\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eL.S., M.M.M, J.Z., and S.H. conceptualized the study and acquired funding for its completion. Y.W., Z.F., and R.Z. contributed to the interpretation and processing of data. Y.W., R.Z., M.M.M., L.S., and S.H. \u0026nbsp; conceptualized the methods. Y.W. completed the analysis and visualizations and drafted the paper. \u0026nbsp;All authors reviewed and contributed edits to the manuscript.\u0026nbsp;\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eItzhaki, M. \u003cem\u003eet al.\u003c/em\u003e Exposure of mental health nurses to violence associated with job stress, life satisfaction, staff resilience, and post-traumatic growth. \u003cem\u003eInt. J. Ment. Health Nurs.\u003c/em\u003e 24, 403\u0026ndash;412 (2015).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eIozzino, L., Ferrari, C., Large, M., Nielssen, O. \u0026amp; Girolamo, G. de. Prevalence and Risk Factors of Violence by Psychiatric Acute Inpatients: A Systematic Review and Meta-Analysis. \u003cem\u003ePLOS ONE\u003c/em\u003e 10, e0128536 (2015).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePescosolido, B. A., Manago, B. \u0026amp; Monahan, J. Evolving Public Views On The Likelihood Of Violence From People With Mental Illness: Stigma And Its Consequences. \u003cem\u003eHealth Aff. (Millwood)\u003c/em\u003e 38, 1735\u0026ndash;1743 (2019).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eZaheer, J. 
Documenting Restraint: Minimizing Trauma. in \u003cem\u003eInterrogating Psychiatric Narratives of Madness: Documented Lives\u003c/em\u003e (eds. Daley, A. \u0026amp; Pilling, M. D.) 111\u0026ndash;135 (Springer International Publishing, Cham, 2021). doi:\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1007/978-3-030-83692-4_5\u003c/span\u003e\u003cspan address=\"10.1007/978-3-030-83692-4_5\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLu, W., Mueser, K. T., Rosenberg, S. D., Yanos, P. T. \u0026amp; Mahmoud, N. Posttraumatic Reactions to Psychosis: A Qualitative Analysis. \u003cem\u003eFront. Psychiatry\u003c/em\u003e 8, 129 (2017).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eParmigiani, G., Barchielli, B., Casale, S., Mancini, T. \u0026amp; Ferracuti, S. The impact of machine learning in predicting risk of violence: A systematic review. \u003cem\u003eFront. Psychiatry\u003c/em\u003e 13, (2022).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMehrabi, N., Morstatter, F., Saxena, N., Lerman, K. \u0026amp; Galstyan, A. A Survey on Bias and Fairness in Machine Learning. Preprint at \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.48550/arXiv.1908.09635\u003c/span\u003e\u003cspan address=\"10.48550/arXiv.1908.09635\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e (2022).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eAngwin, J., Larson, J., Mattu, S. \u0026amp; Kirchner, L. Machine Bias. 
\u003cem\u003eMachine Bias\u003c/em\u003e \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing\u003c/span\u003e\u003cspan address=\"https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e (2016).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eObermeyer, Z., Powers, B., Vogeli, C. \u0026amp; Mullainathan, S. Dissecting racial bias in an algorithm used to manage the health of populations. \u003cem\u003eScience\u003c/em\u003e 366, 447\u0026ndash;453 (2019).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eChen, I. Y., Szolovits, P. \u0026amp; Ghassemi, M. Can AI Help Reduce Disparities in General Medical and Mental Health Care? \u003cem\u003eAMA J. Ethics\u003c/em\u003e 21, E167-179 (2019).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSeyyed-Kalantari, L., Zhang, H., McDermott, M. B. A., Chen, I. Y. \u0026amp; Ghassemi, M. Underdiagnosis bias of artificial intelligence algorithms applied to chest radiographs in under-served patient populations. \u003cem\u003eNat. Med.\u003c/em\u003e 27, 2176\u0026ndash;2182 (2021).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eNational Collaborating Centre for Determinants of Health. Glossary of essential health equity terms. (2022).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eHairston, D. R., Gibbs, T. A., Wong, S. S. \u0026amp; Jordan, A. Clinician Bias in Diagnosis and Treatment. in \u003cem\u003eRacism and Psychiatry: Contemporary Issues and Interventions\u003c/em\u003e (eds. Medlock, M. M., Shtasel, D., Trinh, N.-H. T. \u0026amp; Williams, D. R.) 105\u0026ndash;137 (Springer International Publishing, Cham, 2019). 
doi:\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1007/978-3-319-90197-8_7\u003c/span\u003e\u003cspan address=\"10.1007/978-3-319-90197-8_7\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSmith, C. M. \u003cem\u003eet al.\u003c/em\u003e Association of Black Race With Physical and Chemical Restraint Use Among Patients Undergoing Emergency Psychiatric Evaluation. \u003cem\u003ePsychiatr. Serv. Wash. DC\u003c/em\u003e 73, 730\u0026ndash;736 (2022).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKirkbride, J. B. \u003cem\u003eet al.\u003c/em\u003e The social determinants of mental health and disorder: evidence, prevention and recommendations. \u003cem\u003eWorld Psychiatry\u003c/em\u003e 23, 58\u0026ndash;90 (2024).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMotley, R. \u0026amp; Banks, A. Black Males, Trauma, and Mental Health Service Use: A Systematic Review. \u003cem\u003ePerspect. Soc. Work J. Dr. Stud. Univ. Houst. Grad. Sch. Soc. Work\u003c/em\u003e 14, 4\u0026ndash;19 (2018).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eOlbert, C. M., Nagendra, A. \u0026amp; Buck, B. Meta-analysis of Black vs. White racial disparity in schizophrenia diagnosis in the United States: Do structured assessments attenuate racial disparities? \u003cem\u003eJ. Abnorm. Psychol.\u003c/em\u003e 127, 104\u0026ndash;115 (2018).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eTegnerowicz, J. \u0026ldquo;Maybe It Was Something Wrong With Me\u0026rdquo;: On the Psychiatric Pathologization of Black Men. in \u003cem\u003eInequality, Crime, and Health Among African American Males\u003c/em\u003e vol. 20 73\u0026ndash;94 (Emerald Publishing Limited, 2018).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDobbins, N. J. 
\u003cem\u003eet al.\u003c/em\u003e Deep learning models can predict violence and threats against healthcare providers using clinical notes. \u003cem\u003eNpj Ment. Health Res.\u003c/em\u003e 3, 61 (2024).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSikstrom, L. \u003cem\u003eet al.\u003c/em\u003e Predictive care: a protocol for a computational ethnographic approach to building fair models of inpatient violence in emergency psychiatry. \u003cem\u003eBMJ Open\u003c/em\u003e 13, e069255 (2023).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eOgloff, J. R. P. \u0026amp; Daffern, M. The dynamic appraisal of situational aggression: an instrument to assess risk for imminent aggression in psychiatric inpatients. \u003cem\u003eBehav. Sci. Law\u003c/em\u003e 24, 799\u0026ndash;813 (2006).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLantta, T., Kontio, R., Daffern, M., Adams, C. E. \u0026amp; V\u0026auml;lim\u0026auml;ki, M. Using the Dynamic Appraisal of Situational Aggression with mental health inpatients: a feasibility study. \u003cem\u003ePatient Prefer. Adherence\u003c/em\u003e 10, 691\u0026ndash;701 (2016).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSuchting, R., Green, C. E., Glazier, S. M. \u0026amp; Lane, S. D. A data science approach to predicting patient aggressive events in a psychiatric hospital. \u003cem\u003ePsychiatry Res.\u003c/em\u003e 268, 217\u0026ndash;222 (2018).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWeltens, I. \u003cem\u003eet al.\u003c/em\u003e Aggression on the psychiatric ward: Prevalence and risk factors. A systematic review of the literature. \u003cem\u003ePLoS ONE\u003c/em\u003e 16, e0258346 (2021).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eZafar, M. B., Valera, I., Rodriguez, M. G. \u0026amp; Gummadi, K. P. Fairness Beyond Disparate Treatment \u0026amp; Disparate Impact: Learning Classification without Disparate Mistreatment. 
in \u003cem\u003eProceedings of the 26th International Conference on World Wide Web\u003c/em\u003e 1171\u0026ndash;1180 (2017). doi:\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1145/3038912.3052660\u003c/span\u003e\u003cspan address=\"10.1145/3038912.3052660\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMenger, V., Spruit, M., van Est, R., Nap, E. \u0026amp; Scheepers, F. Machine Learning Approach to Inpatient Violence Risk Assessment Using Routinely Collected Clinical Notes in Electronic Health Records. \u003cem\u003eJAMA Netw. Open\u003c/em\u003e 2, e196709 (2019).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWang, K. Z. \u003cem\u003eet al.\u003c/em\u003e Prediction of physical violence in schizophrenia with machine learning algorithms. \u003cem\u003ePsychiatry Res.\u003c/em\u003e 289, 112960 (2020).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eEl-Azab, S. \u0026amp; Nong, P. Clinical algorithms, racism, and \u0026ldquo;fairness\u0026rdquo; in healthcare: A case of bounded justice. \u003cem\u003eBig Data Soc.\u003c/em\u003e 10, 20539517231213820 (2023).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWatts, D., Leese, M., Thomas, S., Atakan, Z. \u0026amp; Wykes, T. The Prediction of Violence in Acute Psychiatric Units. \u003cem\u003eInt. J. Forensic Ment. Health\u003c/em\u003e 2, 173\u0026ndash;180 (2003).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMaharaj, R., Gillies, D., Andrew, S. \u0026amp; O\u0026rsquo;brien, L. Characteristics of patients referred by police to a psychiatric hospital. \u003cem\u003eJ. Psychiatr. Ment. Health Nurs.\u003c/em\u003e 18, 205\u0026ndash;212 (2011).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDharma, C. 
\u003cem\u003eet al.\u003c/em\u003e Examining Systemic and Interpersonal Bias in Violence Risk Assessments of Patients in Acute Psychiatric Care. \u003cem\u003ePsychiatr. Serv. Wash. DC\u003c/em\u003e 76, 326\u0026ndash;335 (2025).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMeerai, S., Abdillahi, I. \u0026amp; Poole, J. An Introduction to Anti-Black Sanism. \u003cem\u003eIntersect. Glob. J. Soc. Work Anal. Res. Polity Pract.\u003c/em\u003e 5, 18\u0026ndash;35 (2016).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBhui, K. \u003cem\u003eet al.\u003c/em\u003e Ethnic variations in pathways to and use of specialist mental health services in the UK. Systematic review. \u003cem\u003eBr. J. Psychiatry J. Ment. Sci.\u003c/em\u003e 182, 105\u0026ndash;116 (2003).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eChow, J. C.-C., Jaffee, K. \u0026amp; Snowden, L. Racial/Ethnic Disparities in the Use of Mental Health Services in Poverty Areas. \u003cem\u003eAm. J. Public Health\u003c/em\u003e 93, 792\u0026ndash;797 (2003).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSchreiter, S. \u003cem\u003eet al.\u003c/em\u003e Housing situation and healthcare for patients in a psychiatric centre in Berlin, Germany: a cross-sectional patient survey. \u003cem\u003eBMJ Open\u003c/em\u003e 9, e032576 (2019).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eNarendorf, S. C. Intersection of homelessness and mental health: A mixed methods study of young adults who accessed psychiatric emergency services. \u003cem\u003eChild. Youth Serv. Rev.\u003c/em\u003e 81, 54\u0026ndash;62 (2017).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eAmato, S., Nobay, F., Amato, D. P., Abar, B. \u0026amp; Adler, D. Sick and unsheltered: Homelessness as a major risk factor for emergency care utilization. \u003cem\u003eAm. J. Emerg. 
Med.\u003c/em\u003e 37, 415\u0026ndash;420 (2019).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKushel, M. B., Perry, S., Bangsberg, D., Clark, R. \u0026amp; Moss, A. R. Emergency Department Use Among the Homeless and Marginally Housed: Results From a Community-Based Study. \u003cem\u003eAm. J. Public Health\u003c/em\u003e 92, 778\u0026ndash;784 (2002).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSerper, M. R. \u003cem\u003eet al.\u003c/em\u003e Predictors of aggression on the psychiatric inpatient service. \u003cem\u003eCompr. Psychiatry\u003c/em\u003e 46, 121\u0026ndash;127 (2005).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMauri, M. C. \u003cem\u003eet al.\u003c/em\u003e Aggressiveness and violence in psychiatric patients: a clinical or social paradigm? \u003cem\u003eCNS Spectr.\u003c/em\u003e 24, 564\u0026ndash;573 (2019).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSanford, S., Roche, B., Molina, I., Weston, N. A. \u0026amp; Sirotich, F. Toronto Supportive Housing Growth Plan: Needs Assessment.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eTahir, R., Due, C., Ward, P. \u0026amp; Ziersch, A. Understanding mental health from the perception of Middle Eastern refugee women: A critical systematic review. \u003cem\u003eSSM - Ment. Health\u003c/em\u003e 2, 100130 (2022).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eAndrus, M., Spitzer, E., Brown, J. \u0026amp; Xiang, A. What We Can\u0026rsquo;t Measure, We Can\u0026rsquo;t Understand: Challenges to Demographic Data Procurement in the Pursuit of Fairness. in \u003cem\u003eProceedings of the\u003c/em\u003e 2021 \u003cem\u003eACM Conference on Fairness, Accountability, and Transparency\u003c/em\u003e 249\u0026ndash;260 (Association for Computing Machinery, New York, NY, USA, 2021). 
doi:\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1145/3442188.3445888\u003c/span\u003e\u003cspan address=\"10.1145/3442188.3445888\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSoliman, L., Jain, A., Rozel, J. \u0026amp; Rachal, J. Safe Spaces: Mitigating Potential Aggression in Acute Care Psychiatry. \u003cem\u003eFOCUS\u003c/em\u003e 21, 46\u0026ndash;51 (2023).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWe Ask Because We Care.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eJabbour, S. \u003cem\u003eet al.\u003c/em\u003e Measuring the Impact of AI in the Diagnosis of Hospitalized Patients: A Randomized Clinical Vignette Survey Study. \u003cem\u003eJAMA\u003c/em\u003e 330, 2275\u0026ndash;2284 (2023).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLing, S., Cleverley, K. \u0026amp; Perivolaris, A. Understanding Mental Health Service User Experiences of Restraint Through Debriefing: A Qualitative Analysis. \u003cem\u003eCan. J. Psychiatry Rev. Can. Psychiatr.\u003c/em\u003e 60, 386\u0026ndash;392 (2015).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eCaton, S. \u0026amp; Haas, C. Fairness in Machine Learning: A Survey. \u003cem\u003eACM Comput. Surv.\u003c/em\u003e 3616865 (2023) doi:\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1145/3616865\u003c/span\u003e\u003cspan address=\"10.1145/3616865\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGiddings, R. \u003cem\u003eet al.\u003c/em\u003e Factors influencing clinician and patient interaction with machine learning-based risk prediction models: a systematic review. \u003cem\u003eLancet Digit. 
Health\u003c/em\u003e 6, e131\u0026ndash;e144 (2024).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSax, D. R., Sturmer, L. R., Mark, D. G., Rana, J. S. \u0026amp; Reed, M. E. Barriers and Opportunities Regarding Implementation of a Machine Learning-Based Acute Heart Failure Risk Stratification Tool in the Emergency Department. \u003cem\u003eDiagnostics\u003c/em\u003e 12, 2463 (2022).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eFeng, Q., Du, M., Zou, N. \u0026amp; Hu, X. Fair Machine Learning in Healthcare: A Review. Preprint at \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://arxiv.org/abs/2206.14397\u003c/span\u003e\u003cspan address=\"http://arxiv.org/abs/2206.14397\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e (2024).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eZhu, Y. \u003cem\u003eet al.\u003c/em\u003e M\u003cspan\u003e$\u003c/span\u003e^3\u003cspan\u003e$\u003c/span\u003eFair: Mitigating Bias in Healthcare Data through Multi-Level and Multi-Sensitive-Attribute Reweighting Method. \u003cem\u003earXiv.org\u003c/em\u003e \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://arxiv.org/abs/2306.04118v1\u003c/span\u003e\u003cspan address=\"https://arxiv.org/abs/2306.04118v1\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e (2023).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eYang, J., Soltan, A. A. S., Eyre, D. W., Yang, Y. \u0026amp; Clifton, D. A. An adversarial training framework for mitigating algorithmic biases in clinical machine learning. \u003cem\u003eNpj Digit. Med.\u003c/em\u003e 6, 1\u0026ndash;10 (2023).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLi, F. \u003cem\u003eet al.\u003c/em\u003e Evaluating and mitigating bias in machine learning models for cardiovascular disease prediction. \u003cem\u003eJ. Biomed. 
Inform.\u003c/em\u003e 138, 104294 (2023).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eHellstr\u0026ouml;m, T., Dignum, V. \u0026amp; Bensch, S. Bias in Machine Learning -- What is it Good for? Preprint at \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.48550/arXiv.2004.00686\u003c/span\u003e\u003cspan address=\"10.48550/arXiv.2004.00686\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e (2020).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eChin, M. H. \u003cem\u003eet al.\u003c/em\u003e Guiding Principles to Address the Impact of Algorithm Bias on Racial and Ethnic Disparities in Health and Health Care. \u003cem\u003eJAMA Netw. Open\u003c/em\u003e 6, e2345050 (2023).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eAquino, Y. S. J. \u003cem\u003eet al.\u003c/em\u003e Practical, epistemic and normative implications of algorithmic bias in healthcare artificial intelligence: a qualitative study of multidisciplinary expert perspectives. \u003cem\u003eJ. Med. Ethics\u003c/em\u003e (2023) doi:\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1136/jme-2022-108850\u003c/span\u003e\u003cspan address=\"10.1136/jme-2022-108850\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKleinberg, J., Mullainathan, S. \u0026amp; Raghavan, M. Inherent Trade-Offs in the Fair Determination of Risk Scores. Preprint at \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.48550/arXiv.1609.05807\u003c/span\u003e\u003cspan address=\"10.48550/arXiv.1609.05807\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e (2016).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBinns, R. On the Apparent Conflict Between Individual and Group Fairness. 
Preprint at \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://arxiv.org/abs/1912.06883\u003c/span\u003e\u003cspan address=\"http://arxiv.org/abs/1912.06883\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e (2019).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSilva, M., Loureiro, A. \u0026amp; Cardoso, G. Social determinants of mental health: A review of the evidence. \u003cem\u003eEur. J. Psychiatry\u003c/em\u003e 30, 259\u0026ndash;292 (2016).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMovva, R. \u003cem\u003eet al.\u003c/em\u003e Coarse race data conceals disparities in clinical risk score performance. in \u003cem\u003eProceedings of the 8th Machine Learning for Healthcare Conference\u003c/em\u003e 443\u0026ndash;472 (PMLR, 2023).\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":true,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"npj-mental-health-research","isNatureJournal":false,"hasQc":false,"allowDirectSubmit":false,"externalIdentity":"npjmentalhealth","sideBox":"Learn more about [npj Mental Health Research](https://www.nature.com/npjmentalhealth/)","snPcode":"44184","submissionUrl":"https://mts-npjmentalhealth.nature.com/cgi-bin/main.p...","title":"npj Mental Health Research","twitterHandle":"@npjmentalhealth\n","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"ejp","reportingPortfolio":"npj","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"","lastPublishedDoi":"10.21203/rs.3.rs-7781555/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-7781555/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eManaging patient aggression is a major challenge in acute psychiatry, and machine learning (ML) applications are increasingly being developed to support individualized risk assessment and de-escalation. However, ML algorithms have been shown to exhibit unfair behavior based on protected characteristics, such as an individual\u0026rsquo;s sex or ethnicity. This is especially worrying in psychiatric contexts as social and systemic inequities - such as disparities in access to psychiatric care or racial profiling in admissions to hospital by police - can become embedded in training datasets. Despite the potential for ML algorithms to replicate and amplify such inequities, the fairness of ML-based predictions of aggression in acute psychiatry has received limited investigation. To address this gap, we trained an ML algorithm to predict aggressive incidents from structured electronic health records corresponding to 17,703 patients receiving acute care at a large psychiatric hospital between January 2016 and May 2022 (\u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;42,719 observation days). 
We analyzed predictions for fairness by assessing disparities in false positive rates [FPR] and true positive rates [TPR] (i.e., the equalized odds criterion), based on patient race/ethnicity, gender, admission mode, citizenship, and housing status, as well as intersections of race/ethnicity and gender. The random forest algorithm performed best (ROC-AUC\u0026thinsp;=\u0026thinsp;0.812). Fairness analyses revealed significant disparities in FPR and TPR across subgroups, such that FPR were higher for Middle Eastern and Black patients, men, those admitted into emergency care by the police, and those with unstable or supportive forms of housing. Middle Eastern men had the highest FPR of any intersectional group. Our analysis demonstrates the potential for ML algorithms to exhibit unfairness across multiple demographic and social groups in predictions of inpatient aggression, reflecting known social and structural inequities. To prevent the reinforcement and amplification of existing disparities, it will be critical to apply strategies to mitigate unfairness in this context. 
At the same time, evaluating and exploring unfair ML behavior can reveal unique insights into underlying inequities that might be impacting patient experiences and care.\u003c/p\u003e","manuscriptTitle":"Fairness analysis of machine learning predictions of aggression in acute psychiatric care","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-10-31 10:07:20","doi":"10.21203/rs.3.rs-7781555/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"decision","content":"Revision requested","date":"2025-12-23T17:22:56+00:00","index":"","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-12-17T15:53:09+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"259898420059063839694496522337352888721","date":"2025-12-12T10:50:32+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-10-31T13:17:05+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"99358172332384640732314883423707989952","date":"2025-10-24T02:23:44+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"174576187158008227554097007709754721945","date":"2025-10-23T06:05:20+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2025-10-21T05:58:19+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2025-10-20T16:45:06+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2025-10-12T17:01:11+00:00","index":"","fulltext":""},{"type":"submitted","content":"npj Mental Health Research","date":"2025-10-04T18:42:02+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"npj-mental-health-research","isNatureJournal":false,"hasQc":false,"allowDirectSubmit":false,"externalIdentity":"npjmentalhealth","sideBox":"Learn more about [npj Mental Health Research](https://www.nature.com/npjmentalhealth/)","snPcode":"44184","submissionUrl":"https://mts-npjmentalhealth.nature.com/cgi-bin/main.p...","title":"npj Mental Health Research","twitterHandle":"@npjmentalhealth\n","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"ejp","reportingPortfolio":"npj","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"5f29f1da-68ee-4274-a051-5bd7e8bb27b8","owner":[],"postedDate":"October 31st, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"under-review","subjectAreas":[{"id":57162232,"name":"Health sciences/Health care"},{"id":57162233,"name":"Biological sciences/Psychology"},{"id":57162234,"name":"Social science/Psychology"},{"id":57162235,"name":"Health sciences/Risk factors"}],"tags":[],"updatedAt":"2026-02-06T22:23:44+00:00","versionOfRecord":[],"versionCreatedAt":"2025-10-31 10:07:20","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-7781555","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-7781555","identity":"rs-7781555","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}