An integrated explainable artificial intelligence framework for employee attrition prediction and retention strategy generation

doi:10.21203/rs.3.rs-8838292/v1

2026 · doi:10.21203/rs.3.rs-8838292/v1

preprint OA: closed

Research Article | Research Square

Jayashree Roul, Swetha Ghanta, Raheem Qudus, AVS Kamesh, Lalita Mohan Mohapatra, and 1 more

This is a preprint; it has not been peer reviewed by a journal.

https://doi.org/10.21203/rs.3.rs-8838292/v1

This work is licensed under a CC BY 4.0 License.

Status: Posted (Version 1)

Abstract

Employee attrition is a constant issue in competitive job markets. While predictive models can estimate attrition risk, turning those predictions into effective retention actions still poses challenges. This work aims to create a framework that links attrition prediction, explainable artificial intelligence (XAI), and generative AI to support data-driven and personalized retention strategies.
Using the IBM HR Analytics Attrition dataset, we built a machine learning model to predict employee attrition. We applied SHAP explainability techniques to identify the main factors affecting individual attrition risk. We introduced an Employee Value Scoring (EVS) system to highlight high-value employees at risk. To translate insights into action, we used generative AI (Gemini) to create personalized retention recommendations based on the most important SHAP-derived features. The framework successfully identified high-risk employees and offered targeted, easy-to-understand recommendations based on individual attrition drivers. The results show how combining predictive modeling, explainability, and generative AI can help HR teams move from predicting risk to taking meaningful action. This work presents a new, unified approach that connects attrition prediction and effective retention planning. By integrating machine learning, XAI, and generative AI, the framework provides personalized and context-specific recommendations, improving the practical use of HR analytics for proactive talent management.

Keywords: Explainable AI, SHAP, Generative AI, HR analytics, Attrition prediction, Personalized retention strategies

1. Introduction

Employee attrition is a persistent challenge for organizations, resulting in the loss of skilled personnel, reduced productivity, and substantial recruitment and training costs. High attrition disrupts team cohesion, weakens institutional knowledge, and negatively affects overall organizational performance (Hausknecht & Trevor, 2011). Employees leave organizations for several reasons, such as low job satisfaction, inadequate career growth, and personal or health-related concerns, and organizations face mounting pressure to proactively identify and address the root causes of attrition (Mitravinda, 2022; Sangita, 2019).
As the workforce becomes increasingly dynamic, accurately forecasting attrition and implementing effective retention strategies have become essential for preserving organizational stability and competitiveness. In this evolving landscape, Human Resource Analytics (HRA) has emerged as a transformative, data-driven approach to talent management. Advances in data processing and analytical tools are redefining traditional HR functions, shifting them from administrative operations to strategic, insight-driven practices (Xin et al., 2024). HRA leverages statistical techniques, predictive modeling, and interactive tools such as HR dashboards to derive actionable insights from employee data, supporting workforce planning, performance evaluation, and attrition analysis (Opatha, 2020; Pandya, 2023). Organizations increasingly recognize the strategic value of HRA; nearly 20% of HR leaders expect analytics to become a core component of HR strategy in the coming years (Fernandez & Gallardo-Gallardo, 2020).

Contemporary HR analytics increasingly relies on machine learning (ML) and artificial intelligence (AI) to process large, complex personnel datasets with speed and accuracy (Tambe et al., 2019). ML models have shown superior performance over traditional statistical approaches in predicting employee attrition (Fallucchi, 2020; Qutub, 2021), offering organizations the ability to detect hidden patterns related to job satisfaction, engagement, career progression, and work-life balance (Kakulapati, 2020; Avrahami, 2022). Despite their predictive power, a major limitation persists: many ML models operate as “black boxes”. They provide accurate predictions but offer little transparency regarding the factors influencing individual attrition risk. This lack of interpretability reduces trust and limits their utility for HR decision-making. To address this challenge, XAI techniques such as SHAP (SHapley Additive exPlanations) have gained importance.
SHAP enables feature-level interpretation of predictions, making it possible to understand why a model identifies an employee as being at high risk of attrition (Lundberg & Lee, 2017). In HR settings, this transparency is vital for building trust, providing actionable insights, and tailoring retention strategies to employee-specific drivers of attrition. However, even when explanations are available, HR professionals still face the challenge of converting analytical insights into meaningful and personalized retention actions. Traditional retention strategies tend to be generic and rule-based, and may not address the specific factors contributing to an employee’s disengagement. Recent advancements in generative AI (GenAI), including models such as Gemini and ChatGPT, offer a novel solution by transforming predictive insights into coherent, contextually relevant strategies tailored to individual employee needs (Dwivedi et al., 2023). These models can synthesize information from risk factors, performance indicators, and contextual variables to generate targeted recommendations, enabling more precise and effective HR interventions.

In response to these needs, this work proposes an integrated HR analytics framework that combines ML-based attrition prediction, SHAP, EVS, and GenAI-based intervention design. Using the IBM HR Analytics Employee Attrition dataset, the framework evaluates multiple ML models, selects the best-performing algorithm, interprets its predictions through SHAP, and incorporates an EVS to assess the strategic importance of each employee. For employees at high attrition risk, the framework uses the Gemini 2.5 Flash model to generate personalized retention strategies based on the top contributing factors identified by SHAP.
This research addresses key gaps in existing literature, including the scarcity of comparative ML studies for attrition prediction, limited integration of XAI in HR contexts, and the absence of systems that bridge predictive insights with actionable, employee-specific retention strategies (Tursunbayeva et al., 2018; Zhang et al., 2019; Kraus et al., 2018). By evaluating multiple ML models using industry-standard metrics such as F1-score, recall, precision, and accuracy, with particular emphasis on the minority class representing departing employees, this work not only advances the application of ML in HR analytics but also demonstrates a practical, interpretable, and action-oriented approach for managing employee attrition. Overall, this work highlights the potential of combining predictive analytics, explainability, and generative AI to create a holistic, data-driven framework that supports HR professionals in identifying at-risk employees, understanding the reasons behind their potential exit, and implementing tailored interventions to improve retention and organizational outcomes.

To further enhance practical applicability, this work also deploys the framework through a dedicated web platform that enables real-time use by HR teams, accessible at https://employeeretention.qudmeet.click/. The website allows HR users to load employee details and immediately obtain model predictions, SHAP-driven explanations, and Employee Value Scores. Moreover, the GenAI module produces not only high-level retention strategies but also fully articulated action plans that specify the action owner, timeline, next steps, and associated Key Performance Indicators (KPIs). For instance, for an employee with low Job Satisfaction and short role tenure, the system may recommend a manager-led meeting within two weeks to discuss career pathways, with a KPI such as increasing Job Satisfaction from 1 to at least 3 within six months.
This deployment component demonstrates the real-world readiness of the proposed framework and its capability to support data-driven retention interventions at a larger scale.

2. Literature Review

Effective employee management is a critical priority for modern organizations seeking to retain skilled employees and maintain competitive advantage. With HR processes becoming increasingly data-driven, researchers and experts are turning to advanced analytical methods including ML, XAI, and GenAI to enhance prediction accuracy, transparency, and decision quality in employee management (Arora et al., 2025). This section reviews key developments across these domains and identifies gaps addressed by the present study.

2.1 Machine Learning in HR Analytics

Machine learning has become a prominent tool for analyzing workforce behavior, forecasting employee attrition, and optimizing HR decision-making. ML models such as decision trees, random forests, gradient boosting, XGBoost, and logistic regression have been widely tested on HR datasets to predict attrition (Zhao et al., 2020). These models offer substantial improvements over traditional statistical techniques due to their ability to capture complex, nonlinear relationships in employee data. Several empirical studies demonstrate high predictive performance. For example, random forest models have achieved up to 85% accuracy and an AUC of 0.90 in attrition prediction (Ma, 2024), while XGBoost has reached 92.3% accuracy in identifying employees likely to leave (Nagalakshmi, 2025). ML has also been applied in succession planning, performance forecasting, and identifying high-potential employees (Sharma & Dhingra, 2024).

Despite strong performance, ML models face two major challenges:

a. Black-box nature: HR professionals often struggle to trust predictions without understanding the underlying drivers.
b. Generalizability: Model performance depends heavily on data quality and contextual factors, limiting universal adoption.
These limitations highlight the need for interpretability methods such as SHAP.

2.2 Explainable AI in HR Analytics

XAI aims to provide transparency into complex ML models, enabling HR managers to understand why a prediction was made. Among existing techniques, SHAP is widely recognized, and studies emphasize its value in HR settings for its theoretical consistency and ability to quantify each feature’s contribution to an individual prediction (Lundberg & Lee, 2017). SHAP helps identify key factors driving attrition, such as job satisfaction, work-life balance, and career progression (Varkiani et al., 2025). Its integration into decision-support systems enables real-time insights and personalized retention strategies (Nagalakshmi, 2025). XAI enhances trust, reduces bias concerns, and supports ethical AI adoption within HR (Toreini et al., 2020; Sadeghi et al., 2024). By making ML models interpretable, XAI facilitates data-driven decision-making and empowers HR professionals to translate predictions into practical workforce interventions. However, challenges persist, including the effort required to interpret XAI outputs and the need for HR teams to understand model behavior.

2.3 Generative AI in HR Analytics

GenAI, powered by large language models such as ChatGPT and Gemini, is transforming HR practices by automating content generation and providing personalized insights. Recent studies highlight GenAI’s role in HR analytics. GenAI has been used to predict employee engagement and to identify behavioral patterns related to dissatisfaction or burnout (Lenka & Chanda, 2024). Sentiment analysis and chatbots enable continuous feedback collection and enhanced communication between employees and HR (Zheng et al., 2023). The scope of GenAI in developing personalized retention plans, informed by historical attrition data, performance metrics, and motivational indicators, is briefly discussed by Aldulaimi et al. (2021).
GenAI-based career development support, including skill-matching, tailored growth recommendations, and individualized communication strategies, is described by Gupta et al. (2024). GenAI’s strength lies in its ability to translate analytical insights into actionable, employee-specific interventions, an area where traditional HR systems fall short. However, researchers also note critical challenges, particularly the need for HR professionals to develop strong AI literacy to effectively interpret and act on algorithmic insights (Arora & Damarla, 2025). This underscores the importance of implementing GenAI within frameworks that support understanding, transparency, and ease of use. In response, the proposed framework incorporates an intuitive web-based platform that reduces the technical burden on HR users by automatically generating interpretable insights, SHAP-based explanations, Employee Value Scores, and complete, actionable retention plans. By translating complex analytical outputs into clear, manager-ready recommendations, complete with timelines, KPIs, and designated action owners, the system helps bridge the AI literacy gap and enables HR professionals to confidently integrate data-driven decision-making into their retention practices.

Table 1 Summary of Research Gaps

| Topic | Applications | Benefits | Challenges |
| --- | --- | --- | --- |
| ML in HR Analytics | Attrition prediction, performance forecasting | High predictive accuracy, improved workforce insights | Black-box models, limited interpretability |
| XAI | Model interpretability, decision support | Transparency, trust, actionable insights | Interpretation complexity |
| Generative AI | Retention strategy generation, sentiment analysis | Personalized interventions, automation | Need for HR upskilling |

2.4 Identified Research Gaps

Despite meaningful advancements in HR analytics, several critical gaps persist:

a. Lack of unified frameworks that seamlessly integrate ML-based attrition prediction, SHAP-driven explainability, employee value assessment, and generative AI within a single, end-to-end system.
b. Limited research on translating predictive insights into personalized, actionable retention strategies that are explicitly tied to the employee’s top attrition drivers.
c. A predominant focus on model performance metrics, with fewer studies offering practical, user-friendly decision-support tools that HR teams can apply in real time.

This work addresses these gaps by introducing a comprehensive HR analytics framework that integrates ML-driven attrition prediction, SHAP-based interpretation, Employee Value Score (EVS) assessment, and generative AI to produce tailored retention action plans. The framework is further operationalized through a user-friendly web application, enabling HR practitioners to enter employee data and immediately receive interpretable insights, high-value employee prioritization, and detailed, KPI-based retention action plans designed for direct managerial implementation.

3. Research Objectives

The objectives of this work are:

a. To develop an ML-based model for predicting employee attrition using the IBM HR Analytics dataset and evaluate multiple algorithms using industry-standard metrics.
b. To enhance interpretability by applying SHAP to identify and explain the key factors contributing to each employee’s attrition risk.
c. To incorporate an EVS to assess the strategic importance of employees and prioritize intervention efforts.
d. To design a generative AI-based module that produces personalized retention strategies directly linked to top attrition drivers.
e. To translate predictive and interpretive insights into actionable HR decision support through a real-time, web-based platform that generates complete retention action plans including timelines, KPIs, and action owners.

4. Methodology

4.1 Data Source and Description

The IBM HR Analytics Employee Attrition dataset (Kaggle) is used as the primary data source. It contains information on employee demographics, job roles, performance ratings, work environment characteristics, and attrition status. The dataset includes records for 1,470 employees and comprises 35 feature columns. The prediction task is framed as a binary classification problem in which the target variable, Attrition, indicates whether an employee has left the organization (“Attrition: Yes”) or remained (“Attrition: No”).

4.2 Data Preprocessing

The dataset contains both categorical and numerical attributes. Since ML models require numerical input, three preprocessing steps are performed: encoding categorical variables, scaling numerical features, and handling class imbalance.

All categorical features (e.g., Department, Gender, JobRole) are encoded into numeric form using label encoding. This ensures that categorical attributes can be used effectively by the models while preserving their discrete nature.

To ensure that all features contribute equally to the model and to improve the convergence of distance-based algorithms, the numerical features are standardized using StandardScaler, which rescales each feature to a mean of 0 and a standard deviation of 1.

The dataset exhibits class imbalance, with significantly fewer “Attrition: Yes” cases than “Attrition: No” cases. In the training split, 986 instances belong to the “Attrition: No” class, while only 190 instances are labeled “Attrition: Yes”. This imbalance can bias models toward the majority class. To address this, the Synthetic Minority Oversampling Technique (SMOTE) is applied only to the training set, after the data are split into training and test sets. SMOTE generates synthetic samples for the minority class by interpolating between existing instances, thereby balancing the dataset.
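To make the scaling and balancing steps concrete, the following is a minimal pure-Python sketch of z-score standardization and of SMOTE-style interpolation between a minority sample and one of its nearest minority neighbours. It is an illustration only and not the authors' pipeline; the function names, the toy data, and the choice of k are assumptions made for this example.

```python
import math
import random


def standardize(column):
    """Rescale a list of numbers to mean 0 and (population) standard deviation 1."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((x - mean) ** 2 for x in column) / len(column))
    return [(x - mean) / std for x in column]


def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (SMOTE's core idea)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Training-split counts reported in the text.
n_majority, n_minority = 986, 190
n_needed = n_majority - n_minority  # synthetic samples required for a 1:1 balance

# Toy two-feature minority class (hypothetical values, not from the dataset).
toy_minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
toy_synthetic = smote_like(toy_minority, n_new=10)
scaled = standardize([2.0, 4.0, 6.0, 8.0])
```

With the reported counts, 986 - 190 = 796 synthetic “Attrition: Yes” samples are needed to reach a 986/986 balance. Because each synthetic point lies on a segment between two real minority samples, it stays within the region already occupied by the minority class.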
After applying SMOTE, the class distribution is balanced, with 986 instances in the “Attrition: No” class and 986 instances in the “Attrition: Yes” class. This balancing allows the model to learn equally from both attrition and non-attrition cases, improving generalization on minority class predictions. The class distribution before and after SMOTE is shown in Figure 1.

4.3 ML models

To predict employee attrition, a variety of ML algorithms are employed. The models are selected to represent both linear and non-linear approaches, as well as ensemble learning methods, to capture complex relationships in the dataset.

Logistic Regression (LR) is a baseline classification algorithm widely used in HR analytics due to its interpretability. It models the probability of attrition using a logistic (sigmoid) function, making it suitable for binary classification problems. It provides coefficients that can be directly interpreted as feature importance, offering transparency in decision-making.

Support Vector Machine (SVM) is effective in high-dimensional spaces and can handle non-linear decision boundaries using kernel functions. It finds the optimal hyperplane that maximizes the margin between attrition and non-attrition employees. It is robust to overfitting when appropriate kernels and regularization are applied.

Random Forest (RF) is an ensemble method that reduces overfitting and improves prediction accuracy compared to single decision trees. It constructs multiple decision trees on bootstrapped subsets of data and aggregates predictions via majority voting. It handles non-linear relationships and feature interactions well and provides feature importance rankings.

Extra Trees Classifier (ETC) is similar to RF but introduces greater randomness in feature splits, often improving generalization. Unlike RF, which searches for the best split, Extra Trees selects splits randomly and averages across many trees. It has faster training time and reduced variance compared to RF.
Extreme Gradient Boosting (XGBoost) is a gradient boosting algorithm known for state-of-the-art performance in structured/tabular data competitions. It builds decision trees sequentially, where each tree corrects the errors of the previous ones using gradient descent optimization. It handles imbalanced data well, offers regularization, and provides strong predictive power.

4.4 Model Selection

To identify the most suitable predictive model, all five algorithms are trained and evaluated on the IBM attrition dataset. Since the dataset was initially imbalanced, accuracy alone is insufficient to judge model performance. Therefore, the combination of metrics shown in Table 2 is used.

Table 2: ML model evaluation metrics

| Metric | Description |
| --- | --- |
| Accuracy | Measures the overall proportion of correct predictions made by the model. |
| Precision | Indicates the proportion of predicted “Attrition: Yes” cases that are actually correct. |
| Recall (Sensitivity) | Measures the model’s ability to correctly identify actual “Attrition: Yes” cases. |
| F1-Score | The harmonic mean of precision and recall, balancing false positives and false negatives. |
| ROC-AUC | Represents the area under the ROC curve, indicating how well the model distinguishes between the two classes. |

4.5 Explainability Using SHAP

SHAP (SHapley Additive exPlanations) is an XAI framework based on cooperative game theory. It assigns each feature a contribution value that represents how much it increases or decreases the prediction for an individual instance. The method uses Shapley values, originally from game theory, which fairly distribute the “payout” (in our case, the prediction) among all contributing features. The key characteristics of SHAP include local explanations, which identify why a specific prediction was made for a given employee, and global explanations, which aggregate feature contributions across all employees to highlight the most influential factors overall.
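As an illustration of how Shapley values distribute a prediction among features, the sketch below computes exact Shapley values for a toy three-feature scoring function by averaging each feature's marginal contribution over all feature orderings. The scoring function and its numbers are hypothetical, and this brute-force enumeration is not the algorithm the SHAP library uses for tree models; it only demonstrates the definition and the additivity property.

```python
from itertools import permutations


def shapley_values(value_fn, features):
    """Exact Shapley values: average marginal contribution of each feature
    over every ordering in which features can join the coalition."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        coalition = set()
        for f in order:
            before = value_fn(frozenset(coalition))
            coalition.add(f)
            after = value_fn(frozenset(coalition))
            totals[f] += after - before
    return {f: total / len(orderings) for f, total in totals.items()}


def toy_risk_score(present):
    """Hypothetical attrition score given which features are 'known'.
    With no features known, the score is a baseline of 0.10."""
    score = 0.10
    if "OverTime" in present:
        score += 0.30
    if "MonthlyIncome" in present:
        score -= 0.05
    if "WorkLifeBalance" in present:
        score += 0.10
        # Interaction: overtime combined with poor work-life balance adds risk.
        if "OverTime" in present:
            score += 0.10
    return score


features = ["OverTime", "MonthlyIncome", "WorkLifeBalance"]
phi = shapley_values(toy_risk_score, features)
baseline = toy_risk_score(frozenset())
full = toy_risk_score(frozenset(features))
```

In this toy case the contributions sum to the difference between the full score and the baseline, and the interaction term is split equally between the two features that produce it. This is the sense in which the “payout” is distributed fairly.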
SHAP works with a variety of ML models, including tree-based ensembles like Random Forest and XGBoost. In this work, SHAP is applied to the best-performing model to interpret its predictions of employee attrition. SHAP identified the features that had the most significant influence on employee attrition predictions. For example, factors such as OverTime, JobRole, MonthlyIncome, WorkLifeBalance, and YearsAtCompany emerged as critical in driving employee attrition. This SHAP-based explainability provides HR managers with actionable insights beyond a simple leave/stay prediction.

4.6 Employee Value Score Calculation

An EVS is calculated to quantify each employee’s organizational value by combining multiple weighted attributes: PerformanceRating, YearsAtCompany, YearsInCurrentRole, TrainingTimesLastYear, PercentSalaryHike, and NumCompaniesWorked. Each attribute i is assigned a weight w_i based on organizational priorities and expert judgment. The EVS is calculated as a weighted sum:

$$\mathrm{EVS} = \sum_{i=1}^{n} w_i \, a_i \tag{1}$$

where w_i represents the weight for attribute i, a_i is the standardized score of that attribute, and n is the number of attributes (six here).

4.7 Retention Strategy Recommendation

While predictive modeling using ML and explainability with SHAP provide insights into employee satisfaction and attrition risk, organizations ultimately require actionable strategies to address these risks. To bridge this gap, Google Gemini is integrated into the framework. Gemini, Google's large language model, is employed to generate personalized employee retention and development strategies. The inputs are:

a. the EVS, representing the organizational importance of an employee, derived from predefined features using equation (1);
b. the SHAP explanation, with the top contributing factors that influenced the model’s prediction of attrition for the employee;
c. the prediction outcome, that is, whether the employee is classified as “Attrition: Yes” or “Attrition: No”.

Using these inputs, the Gemini model is prompted to suggest tailored HR interventions.
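The way these three inputs come together can be sketched as follows. This is an assumption-laden illustration and not the authors' implementation: the weights are hypothetical (the paper does not list them), the employee values are made up, and the prompt wording is invented for the example. No call to the Gemini API is shown.

```python
# Hypothetical weights: the paper states that weights reflect organizational
# priorities and expert judgment but does not report their values.
EVS_WEIGHTS = {
    "PerformanceRating": 0.30,
    "YearsAtCompany": 0.20,
    "YearsInCurrentRole": 0.15,
    "TrainingTimesLastYear": 0.10,
    "PercentSalaryHike": 0.15,
    "NumCompaniesWorked": 0.10,
}


def employee_value_score(standardized, weights=EVS_WEIGHTS):
    """Weighted sum of standardized attribute scores, as in equation (1)."""
    return sum(weights[name] * standardized[name] for name in weights)


def top_factors(shap_contributions, n=3):
    """Return the n features with the largest absolute SHAP contribution."""
    ranked = sorted(shap_contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:n]


def build_prompt(evs, prediction, factors):
    """Assemble a plain-text prompt; the wording is illustrative only."""
    lines = [
        "You are assisting an HR manager with employee retention.",
        f"Predicted outcome: Attrition: {prediction}",
        f"Employee Value Score (EVS): {evs:.2f}",
        "Top factors influencing the prediction (SHAP):",
    ]
    for name, value in factors:
        direction = "increases" if value > 0 else "decreases"
        lines.append(f"- {name}: {direction} attrition risk ({value:+.2f})")
    lines.append(
        "Suggest three tailored interventions, each with an action owner, "
        "next step, timeline, and KPI."
    )
    return "\n".join(lines)


# Hypothetical employee: standardized attributes and SHAP contributions.
attributes = {
    "PerformanceRating": 1.0,
    "YearsAtCompany": 0.5,
    "YearsInCurrentRole": 0.0,
    "TrainingTimesLastYear": -1.0,
    "PercentSalaryHike": 2.0,
    "NumCompaniesWorked": 0.0,
}
shap_contributions = {
    "OverTime": 0.42,
    "JobSatisfaction": 0.31,
    "MonthlyIncome": -0.12,
    "YearsAtCompany": 0.05,
}

evs = employee_value_score(attributes)
factors = top_factors(shap_contributions)
prompt = build_prompt(evs, "Yes", factors)
```

Keeping the score, the factor selection, and the prompt assembly in separate functions means the weights or the number of SHAP factors can be changed without touching the prompt text.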
The prompt used to generate responses from the Gemini 2.5 Flash model is shown in Figure 3. For high-EVS employees at risk, Gemini suggested retention strategies such as competitive salary adjustments, flexible work arrangements, or career advancement opportunities. This helps ensure that highly valuable employees remain engaged and committed. For low-EVS employees identified as at risk, instead of overlooking their retention needs, Gemini recommended constructive, development-focused strategies such as targeted training, mentorship, or reassignment to roles better aligned with their skills. This enhances employee growth while benefiting organizational productivity. For satisfied employees, suggestions focused on sustaining engagement, such as recognition programs, wellness initiatives, and continued professional development.

With the Gemini integration, raw predictive outputs are converted into practical HR actions. The recommended retention strategies are not generic but tailored to each employee’s profile and organizational value. The suggested strategies balance employee growth and satisfaction with organizational productivity. By integrating the Gemini API with predictive modeling and SHAP explainability, the framework transitions from a predictive system to a prescriptive decision support tool, directly assisting HR managers in making informed and personalized interventions.

5. Results

5.1 Model Comparison

Each model is evaluated on the test set after SMOTE is applied to the training set. As reported in Table 3, XGBoost and Random Forest achieved the highest accuracy, while Logistic Regression and SVM achieved the highest F1-scores on the attrition class. Logistic Regression achieved the highest recall and F1-score but the lowest accuracy and precision, and as a linear model it struggles to capture non-linear patterns. SVM provided moderate results but required extensive tuning and is computationally more expensive.
Random Forest and Extra Trees delivered competitive accuracy (0.8469 and 0.8231) and offer interpretability via feature importance, although their recall on the attrition class was modest (0.2979 and 0.4043). XGBoost achieved the highest accuracy (0.8605), precision (0.6250), and ROC-AUC (0.8043). Its recall (0.3191), however, was lower than that of Logistic Regression, SVM, and Extra Trees, which matters because recall is critical for correctly identifying employees at risk of leaving.

5.2 Final Selection

Based on the comparative analysis in Table 3, XGBoost is selected as the best-performing model. Its ability to handle imbalanced data, capture complex non-linear relationships, and provide robust predictive performance made it the most suitable choice for the task. Furthermore, XGBoost integrates seamlessly with SHAP explainability, enabling detailed insights into the contribution of individual features toward attrition predictions.

Table 3 Performance Comparison of different ML models

| Model | Accuracy | Precision | Recall | F1-Score | ROC-AUC |
| --- | --- | --- | --- | --- | --- |
| XGBoost | 0.8605 | 0.6250 | 0.3191 | 0.4225 | 0.8043 |
| Random Forest | 0.8469 | 0.5385 | 0.2979 | 0.3836 | 0.7918 |
| Support Vector Machine | 0.8333 | 0.4762 | 0.4255 | 0.4494 | 0.7567 |
| Extra Trees Classifier | 0.8231 | 0.4419 | 0.4043 | 0.4222 | 0.7275 |
| Logistic Regression | 0.7585 | 0.3750 | 0.7660 | 0.5035 | 0.7949 |

5.3 SHAP results

The SHAP summary plot shown in Fig. 4 provides an overall view of how each feature contributes to the model’s predictions across all employees. It ranks features by their average absolute SHAP value, allowing quick identification of the most influential drivers of attrition. Each point in the plot represents an individual employee, with color indicating whether the feature value is high or low. Points positioned to the right contribute positively toward predicting “Attrition: Yes,” while points to the left push the prediction toward “Attrition: No.” This visualization reveals both the direction and magnitude of feature impact. In this work, the summary plot clearly highlights factors such as OverTime and JobRole as key contributors.
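The ranking used by the summary plot can be sketched in a few lines: for each feature, average the absolute SHAP values across employees and sort in descending order. The SHAP values below are made-up numbers for three hypothetical employees, used only to show the computation.

```python
def rank_by_mean_abs_shap(shap_rows):
    """shap_rows: list of {feature: shap_value} dicts, one per employee.
    Returns (feature, mean |SHAP|) pairs, most influential first."""
    features = list(shap_rows[0].keys())
    mean_abs = {
        f: sum(abs(row[f]) for row in shap_rows) / len(shap_rows) for f in features
    }
    return sorted(mean_abs.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical SHAP values (not taken from the paper's model).
toy_shap = [
    {"OverTime": 0.8, "JobRole": -0.3, "MonthlyIncome": 0.1},
    {"OverTime": -0.6, "JobRole": 0.5, "MonthlyIncome": -0.2},
    {"OverTime": 0.7, "JobRole": 0.4, "MonthlyIncome": 0.0},
]
ranking = rank_by_mean_abs_shap(toy_shap)
```

Taking absolute values before averaging keeps positive and negative contributions from cancelling, so a feature that pushes some employees toward leaving and others toward staying still ranks as influential.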
By presenting complex model behavior in an interpretable format, the SHAP summary plot supports transparent decision-making and helps HR professionals understand which factors most strongly influence attrition risk across the organization. The summary plot is shown in Fig. 4.

To better understand the factors influencing attrition predictions at the individual employee level, the details of employees with high attrition risk and low attrition risk are passed to the SHAP explainer. The corresponding force plots are shown in Figs. 5 and 6. These plots highlight the contribution of individual features toward the prediction, where red features increase the likelihood of attrition and blue features decrease it. For the employee with higher attrition likelihood (predicted score = 0.61), risk factors such as tenure with current manager, moderate monthly income levels, total working years, job role, and distance from home played a stronger role in driving attrition risk. Although protective features such as lower overtime, business travel, and high environment satisfaction are present, their influence is not sufficient to counterbalance the dominant risk factors. As a result, the overall prediction indicates a higher attrition risk for this employee. In contrast, the employee with very low attrition likelihood (predicted score = -6.52; the negative value suggests that the plotted scores are on the model's raw log-odds scale rather than probabilities) is mainly characterized by protective features such as absence of overtime, good job satisfaction, supportive manager tenure, and higher work-life balance. These insights highlight that attrition is driven by heterogeneous factors across employees, supporting the need for tailored retention strategies.

5.4 Gemini retention strategies

The top contributing factors identified by SHAP, along with the EVS, are given to the Gemini model to generate automated retention strategies.
Sample strategies, along with the complete action plans suggested by Gemini, are shown in Figs. 7 and 8.

For high-EVS employees identified as high-risk, the GenAI module produces detailed, multi-step retention action plans grounded in the specific SHAP-identified drivers of attrition. As illustrated in the generated example in Fig. 7, the system provides three prioritized interventions, each structured with a clear action owner, next step, timeline, and measurable KPI. The recommendations address key factors such as low Environment Satisfaction, reduced Job Satisfaction, and lack of recent career progression. For instance, the model suggests targeted managerial or HR-led actions including one-on-one discussions, development of improvement plans, identifying stretch assignments, and reviewing promotion pathways. Each intervention includes time-bound execution steps (e.g., within 2 weeks or 30 days) and quantifiable KPIs (e.g., improving satisfaction scores), ensuring the output is actionable, trackable, and aligned with HR best practices. This structured approach demonstrates the system’s ability to translate predictive insights into practical retention strategies tailored for high-value employees.

For low-EVS employees who exhibit elevated attrition risk, the GenAI module focuses on developmental and support-oriented interventions, ensuring that these employees are not overlooked but are instead guided toward improved engagement and stability. As illustrated in Fig. 8, the system generates a set of structured recommendations addressing early-tenure risks, low job involvement, and work-life balance challenges, which are factors frequently associated with attrition in lower-EVS groups. In this case, the model proposes conducting an early “stay interview” for new hires, identifying task-level engagement opportunities to improve job involvement, and monitoring overtime patterns to prevent burnout.
These suggestions are designed to provide targeted support, enhance early integration, and improve role alignment, demonstrating that even low-EVS employees receive meaningful, actionable development plans rather than being passively allowed to attrite. This reinforces the framework’s commitment to equitable, data-informed retention practices across all employee segments.

6. Discussion

The work demonstrates that integrating predictive modeling, XAI, and GenAI can strengthen organizational attrition management. Among the tested models, XGBoost achieved the highest performance, particularly in identifying the minority attrition class. Its superior recall and overall stability make it well-suited for HR contexts where missing high-risk employees can lead to substantial organizational loss.

The use of SHAP explainability provided clear insight into the model’s decisions. SHAP consistently highlighted the top factors contributing to attrition (e.g., Overtime, Job Role, Job Satisfaction, Monthly Income, Years at Company), enabling HR practitioners to understand not only which variables are most influential but also how they affect individual predictions. The SHAP force plots further simplified interpretation, offering intuitive visual explanations that help HR teams quickly diagnose employee-specific risk drivers without technical expertise.

The introduction of the EVS enhanced strategic decision-making by differentiating employees based not only on their attrition risk but also on their strategic importance. This dual perspective supports more efficient allocation of retention resources, allowing HR to focus on high-risk, high-value employees.

Finally, integrating these insights with Gemini-generated retention strategies transformed predictive outputs into actionable guidance. By combining SHAP factors, EVS, and risk levels, Gemini produced tailored recommendations that HR professionals can directly implement.
This approach ensures that interventions are both personalized and aligned with organizational priorities. Overall, the results confirm that the proposed hybrid framework not only improves prediction accuracy but also provides interpretability and decision support, enabling organizations to strengthen retention efforts, support employee well-being, and enhance long-term workforce stability.

7. Conclusion

This work demonstrates the value of a hybrid HR analytics framework that combines predictive modeling, explainable AI, and generative AI to improve employee attrition management. We compared different ML algorithms, and XGBoost proved to be the most effective model for predicting attrition. SHAP-based explainability added essential transparency, identifying the key factors driving attrition and providing intuitive, employee-level insights through force plot visualizations. The introduction of the EVS further enhanced decision-making by aligning attrition risk with strategic employee importance. By integrating these analytical outputs with Gemini-generated retention strategies, the framework translated predictions into personalized, actionable interventions. This closes the gap between data-driven insights and practical HR action. Overall, the proposed approach supports more targeted and effective retention management, contributing to both organizational stability and employee well-being. Further, these analytical components are operationalized through a real-time web-based application developed as part of this work, enabling HR practitioners to interact with predictions, explanations, and Gemini-generated personalized retention strategies in a user-friendly environment.

Future research could extend this framework by validating its effectiveness across multiple industries and larger, more heterogeneous employee datasets to ensure broader generalizability.
Further enhancements may include integrating advanced deep learning models or multimodal data sources such as text-based performance reviews, employee surveys, or behavioral system logs to capture more complex predictors of attrition.

Declarations

• Funding: The authors did not receive any funding for the submitted work.
• Conflict of Interest: On behalf of all authors, the corresponding author states that there is no conflict of interest.
• Ethics approval and consent to participate: Not Applicable
• Author contribution: Conceptualization: Jayashree Roul, Swetha Ghanta, Raheem Qudus; Methodology: Jayashree Roul, Swetha Ghanta, Raheem Qudus; Formal analysis and investigation: Jayashree Roul, Swetha Ghanta, Raheem Qudus; Writing - original draft preparation: Jayashree Roul, Swetha Ghanta, Raheem Qudus, AVS Kamesh, Lalita Mohan Mohapatra, Ashok Kumar Pradhan; Writing - review and editing: AVS Kamesh, Lalita Mohan Mohapatra, Ashok Kumar Pradhan; Supervision: AVS Kamesh, Lalita Mohan Mohapatra, Ashok Kumar Pradhan.
• Research Involving Human and/or Animals: Not Applicable
• Informed Consent: Not Applicable
• Consent to publish statement: Not Applicable
• Data Availability statement: The dataset used in this work is publicly available in the Kaggle repository and can be accessed at https://www.kaggle.com/datasets/pavansubhasht/ibm-hr-analytics-attrition-dataset

References

Aldulaimi, S.H., Abdeldayem, M.M., Mowafak, B.M. and Abdulaziz, M.M., 2021. Experimental perspective of artificial intelligence technology in human resources management. In Applications of Artificial Intelligence in Business, Education and Healthcare (pp. 487-511). Cham: Springer International Publishing.
Arora, R. and Damarla, R.B., 2025. A Review on Generative AI Powered Talent Management, Employee Engagement and Retention Strategies: Applications, Benefits, and Challenges. Procedia Computer Science, 260, pp.683-691.
Avrahami, D., Pessach, D., Singer, G. and Chalutz Ben-Gal, H., 2022. A human resources analytics and machine-learning examination of turnover: implications for theory and practice. International Journal of Manpower, 43(6), pp.1405-1424.
Banh, L. and Strobel, G., 2023. Generative artificial intelligence. Electronic Markets, 33(1), p.63.
Dwivedi, Y.K., 2025. Generative Artificial Intelligence (GenAI) in entrepreneurial education and practice: emerging insights, the GAIN Framework, and research agenda. International Entrepreneurship and Management Journal, 21(1), pp.1-21.
Fallucchi, F., Coladangelo, M., Giuliano, R. and William De Luca, E., 2020. Predicting employee attrition using machine learning techniques. Computers, 9(4), p.86.
Fernandez, V. and Gallardo-Gallardo, E., 2021. Tackling the HR digitalization challenge: key factors and barriers to HR analytics adoption. Competitiveness Review: An International Business Journal, 31(1), pp.162-187.
Gupta, P., Ding, B., Guan, C. and Ding, D., 2024. Generative AI: A systematic review using topic modelling techniques. Data and Information Management, 8(2), p.100066.
Hausknecht, J.P. and Trevor, C.O., 2011. Collective turnover at the group, unit, and organizational levels: Evidence, issues, and implications. Journal of Management, 37(1), pp.352-388.
Jain, R. and Nayyar, A., 2018, November. Predicting employee attrition using XGBoost machine learning approach. In 2018 International Conference on System Modeling & Advancement in Research Trends (SMART) (pp. 113-120). IEEE.
Kakulapati, V., Chaitanya, K.K., Chaitanya, K.V.G. and Akshay, P., 2020. Predictive analytics of HR-A machine learning approach. Journal of Statistics and Management Systems, 23(6), pp.959-969.
Kraus, M., Feuerriegel, S. and Oztekin, A., 2020. Deep learning in business analytics and operations research: Models, applications and managerial implications. European Journal of Operational Research, 281(3), pp.628-641.
Lenka, R. and Chanda, R., 2024, August. Generative AI for Predicting Employee Engagement in HR Analytics: A Bibliometric Analysis. In International Conference on ICT for Sustainable Development (pp. 201-209). Singapore: Springer Nature Singapore.
Lundberg, S.M. and Lee, S.I., 2017. A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems, 30.
Ma, X., 2024, December. Research and Application of Human Resources Demand Forecasting Model Based on Machine Learning. In Proceedings of the 2024 5th International Conference on Big Data Economy and Information Management (pp. 696-701).
Mitravinda, K.M. and Shetty, S., 2022, December. Employee attrition: Prediction, analysis of contributory factors and recommendations for employee retention. In 2022 IEEE International Conference for Women in Innovation, Technology & Entrepreneurship (ICWITE) (pp. 1-6). IEEE.
Nagalakshmi, V., Sriram, B., Wasib, S.A., Sadiq, S. and Samhitha, M.D., 2025, May. Improving Workforce Management with Machine Learning: A Novel Approach to Employee Classification. In 2025 Third International Conference on Augmented Intelligence and Sustainable Systems (ICAISS) (pp. 1411-1415). IEEE.
Opatha, H.H.D.P.J., 2020. HR analytics: A literature review and new conceptual model. International Journal of Scientific and Research Publications, 10(6), pp.130-141.
Pavansubhash, 2017. IBM HR Analytics Employee Attrition & Performance. Kaggle. https://www.kaggle.com/datasets/pavansubhasht/ibm-hr-analytics-attrition-dataset
Pratt, M., Boudhane, M. and Cakula, S., 2021. Employee attrition estimation using random forest algorithm. Baltic Journal of Modern Computing, 9(1), pp.49-66.
Qutub, A., Al-Mehmadi, A., Al-Hssan, M., Aljohani, R. and Alghamdi, H.S., 2021. Prediction of employee attrition using machine learning and ensemble methods. International Journal of Machine Learning and Computing, 11(2), pp.110-114.
Raza, A., Munir, K., Almutairi, M., Younas, F. and Fareed, M.M.S., 2022. Predicting employee attrition using machine learning approaches. Applied Sciences, 12(13), p.6424.
Roul, J., Mohapatra, L.M., Pradhan, A.K. and Kamesh, A.V.S., 2024. Analysing the role of modern information technologies in HRM: management perspective and future agenda. Kybernetes.
Sadeghi, K., Ojha, D., Kaur, P., Mahto, R.V. and Dhir, A., 2024. Explainable artificial intelligence and agile decision-making in supply chain cyber resilience. Decision Support Systems, 180, p.114194.
Sangita, U.G., 2019. A study of employee retention. Journal of Emerging Technologies and Innovative Research (JETIR), 6(6), pp.331-337.
Sharma, R. and Dhingra, L., 2024, August. Transforming HR with Machine Learning: Data-Driven Strategies for Talent Management. In 2024 4th Asian Conference on Innovation in Technology (ASIANCON) (pp. 1-5). IEEE.
Sharma, R., Jain, A. and Manwal, M., 2024, June. Enhancing human resource management through deep learning: a predictive analytics approach to employee retention success. In 2024 IEEE International Conference on Information Technology, Electronics and Intelligent Communication Systems (ICITEICS) (pp. 1-4). IEEE.
Singamsetty, S., Ghanta, S., Biswas, S. and Pradhan, A., 2024. Enhancing machine learning-based forecasting of chronic renal disease with explainable AI. PeerJ Computer Science, 10, p.e2291.
Tambe, P., Cappelli, P. and Yakubovich, V., 2019. Artificial intelligence in human resources management: Challenges and a path forward. California Management Review, 61(4), pp.15-42.
Tursunbayeva, A., Bunduchi, R., Franco, M. and Pagliari, C., 2017. Human resource information systems in health care: a systematic evidence review. Journal of the American Medical Informatics Association, 24(3), pp.633-654.
Varkiani, S.M., Pattarin, F., Fabbri, T. and Fantoni, G., 2025. Predicting employee attrition and explaining its determinants. Expert Systems with Applications, 272, p.126575.
Xin, J.L.J. and Mahadi, N., 2024. HR Analytics for Data-Driven Employee Attrition Management. International Journal of Academic Research in Business and Social Sciences, 14(12).
Zhang, L., Lin, J., Liu, B., Zhang, Z., Yan, X. and Wei, M., 2019. A review on deep learning applications in prognostics and health management. IEEE Access, 7, pp.162415-162438.
Zhao, Y., Hryniewicki, M.K., Cheng, F., Fu, B. and Zhu, X., 2018, September. Employee turnover prediction with machine learning: A reliable approach. In Proceedings of SAI Intelligent Systems Conference (pp. 737-758). Cham: Springer International Publishing.
Zheng, Z., Qiu, Z., Hu, X., Wu, L., Zhu, H. and Xiong, H., 2023. Generative job recommendations with large language model. arXiv preprint arXiv:2307.02157.

Additional Declarations

No competing interests reported.
A predominant focus on model performance metrics, with fewer studies offering practical, user-friendly decision-support tools that HR teams can apply in real time.\u003c/p\u003e\n \u003c/span\u003e\n \u003cp\u003eThis work addresses these gaps by introducing a comprehensive HR analytics framework that integrates ML-driven attrition prediction, SHAP-based interpretation, Employee Value Score (EVS) assessment, and generative AI to produce tailored retention action plans. The framework is further operationalized through a user-friendly web application, enabling HR practitioners to enter employee data and immediately receive interpretable insights, high-value employee prioritization, and detailed, KPI-based retention action plans designed for direct managerial implementation.\u003c/p\u003e\n\u003c/div\u003e"},{"header":"3. Research Objectives","content":"\u003cp\u003eThe objectives of this work are:\u003c/p\u003e\n\u003col style=\"list-style-type: lower-alpha;\"\u003e\n \u003cli\u003eTo develop an ML-based model for predicting employee attrition using the IBM HR Analytics dataset and evaluate multiple algorithms using industry-standard metrics.\u003c/li\u003e\n \u003cli\u003eTo enhance interpretability by applying SHAP to identify and explain the key factors contributing to each employee\u0026rsquo;s attrition risk.\u003c/li\u003e\n \u003cli\u003eTo incorporate an EVS to assess the strategic importance of employees and prioritize intervention efforts.\u003c/li\u003e\n \u003cli\u003eTo design a generative AI-based module that produces personalized retention strategies directly linked to top attrition drivers.\u003c/li\u003e\n \u003cli\u003eTo translate predictive and interpretive insights into actionable HR decision support through a real-time, web-based platform that generates complete retention action plans including timelines, KPIs, and action owners.\u003c/li\u003e\n\u003c/ol\u003e"},{"header":"4. 
Methodology","content":"\u003cp\u003e\u003cstrong\u003e4.1 Data Source and Description\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe IBM HR Analytics Employee Attrition dataset (Kaggle) is used as the primary data source. It contains information on employee demographics, job roles, performance ratings, work environment characteristics, and attrition status. The dataset includes records for 1,470 employees and comprises 35 feature columns. The prediction task is framed as a binary classification problem in which the target variable, \u003cem\u003eAttrition\u003c/em\u003e, indicates whether an employee has left the organization (\u0026ldquo;\u003cem\u003eAttrition\u003c/em\u003e:Yes\u0026rdquo;) or remained (\u0026ldquo;\u003cem\u003eAttrition\u003c/em\u003e:No\u0026rdquo;).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e4.2 Data Preprocessing\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe dataset contains both categorical and numerical attributes. Since ML models require numerical input, three preprocessing steps are performed: encoding categorical variables, scaling numerical features, and handling class imbalance. All categorical features (e.g., Department, Gender, JobRole) are encoded into numeric form using label encoding. This ensures that categorical attributes can be used by the models while preserving their discrete nature. To ensure that all features contribute equally to the model and to improve the convergence of distance-based algorithms, the numerical features are standardized using StandardScaler, which transforms each feature to have a mean of 0 and a standard deviation of 1.\u003c/p\u003e\n\u003cp\u003eThe dataset exhibits class imbalance, with significantly fewer \u0026ldquo;Attrition: Yes\u0026rdquo; cases than \u0026ldquo;Attrition: No\u0026rdquo; cases. 
In the training split, 986 instances belong to the \u0026ldquo;Attrition: No\u0026rdquo; class, while only 190 instances are labeled \u0026ldquo;Attrition: Yes\u0026rdquo;. This imbalance can bias models toward the majority class. To address this, the Synthetic Minority Oversampling Technique (SMOTE) is applied to the training set only, after the data have been split into training and testing sets. SMOTE generates synthetic samples for the minority class by interpolating between existing instances, thereby balancing the dataset. After applying SMOTE, the class distribution is balanced, with 986 instances in the \u0026ldquo;Attrition: No\u0026rdquo; class and 986 instances in the \u0026ldquo;Attrition: Yes\u0026rdquo; class. This balancing allows the model to learn equally from both attrition and non-attrition cases, with the aim of improving generalization on minority class predictions. The class distributions before and after SMOTE are shown in Figure 1.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e4.3 ML models\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eTo predict employee attrition, a variety of ML algorithms are employed. The models are selected to represent both linear and non-linear approaches, as well as ensemble learning methods to capture complex relationships in the dataset. Logistic Regression (LR) is a baseline classification algorithm widely used in HR analytics due to its interpretability. It models the probability of attrition using a logistic (sigmoid) function, making it suitable for binary classification problems. It provides coefficients that can be directly interpreted as feature importance, offering transparency in decision-making. Support Vector Machine (SVM) is effective in high-dimensional spaces and can handle non-linear decision boundaries using kernel functions. It finds the optimal hyperplane that maximizes the margin between attrition and non-attrition employees. 
It is robust to overfitting when appropriate kernels and regularization are applied.\u003c/p\u003e\n\u003cp\u003eRandom Forest (RF) is an ensemble method that reduces overfitting and improves prediction accuracy compared to single decision trees. It constructs multiple decision trees on bootstrapped subsets of data and aggregates predictions via majority voting. It handles non-linear relationships and feature interactions well and provides feature importance rankings. Extra Trees Classifier (ETC) is similar to RF but introduces greater randomness in feature splits, often improving generalization. Unlike RF, which searches for the best split, Extra Trees selects splits randomly and averages across many trees. It has faster training time and reduced variance compared to RF. Extreme Gradient Boosting (XGBoost) is a gradient boosting algorithm known for state-of-the-art performance in structured/tabular data competitions. It builds decision trees sequentially, where each tree corrects the errors of the previous ones using gradient descent optimization. It handles imbalanced data well, offers regularization, and provides strong predictive power.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e4.4 Model Selection\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eTo identify the most suitable predictive model, all five algorithms are trained and evaluated on the IBM attrition dataset. Since the dataset was initially imbalanced, accuracy alone is insufficient to judge model performance. 
Therefore, a combination of metrics as shown in Table 2 are used:\u003c/p\u003e\n\u003cp\u003eTable 2: ML model evaluation metrics\u003c/p\u003e\n\u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 25.1248%;\"\u003e\n \u003cp\u003e\u003cstrong\u003eMetric\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 74.8752%;\"\u003e\n \u003cp\u003e\u003cstrong\u003eDescription\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 25.1248%;\"\u003e\n \u003cp\u003e\u003cstrong\u003eAccuracy\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 74.8752%;\"\u003e\n \u003cp\u003eMeasures the overall proportion of correct predictions made by the model.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 25.1248%;\"\u003e\n \u003cp\u003e\u003cstrong\u003ePrecision\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 74.8752%;\"\u003e\n \u003cp\u003eIndicates the proportion of predicted \u0026ldquo;Attrition: Yes\u0026rdquo; cases that are actually correct.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 25.1248%;\"\u003e\n \u003cp\u003e\u003cstrong\u003eRecall (Sensitivity)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 74.8752%;\"\u003e\n \u003cp\u003eMeasures the model\u0026rsquo;s ability to correctly identify actual \u0026ldquo;Attrition: Yes\u0026rdquo; cases.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 25.1248%;\"\u003e\n \u003cp\u003e\u003cstrong\u003eF1-Score\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 74.8752%;\"\u003e\n \u003cp\u003eThe harmonic mean of precision and recall, balancing false positives and false negatives.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n 
\u003ctr\u003e\n \u003ctd style=\"width: 25.1248%;\"\u003e\n \u003cp\u003e\u003cstrong\u003eROC-AUC\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 74.8752%;\"\u003e\n \u003cp\u003eRepresents the area under the ROC curve, indicating how well the model distinguishes between the two classes.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e\n\u003cp\u003e\u003cstrong\u003e4.5 Explainability Using SHAP\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eSHAP (SHapley Additive exPlanations) is an XAI framework based on cooperative game theory. It assigns each feature a contribution value that represents how much it increases or decreases the prediction for an individual instance. The method uses Shapley values, originally from game theory, which fairly distribute the \u0026ldquo;payout\u0026rdquo; (in our case, the prediction) among all contributing features. The key characteristics of SHAP are local explanations, which identify why a specific prediction was made for a given employee, and global explanations, which aggregate feature contributions across all employees to highlight the most influential factors overall. SHAP works with a variety of ML models, including tree-based ensembles like Random Forest and XGBoost.\u003c/p\u003e\n\u003cp\u003eIn this work, SHAP is applied to the best-performing model to interpret its predictions of employee attrition. SHAP identified the features that had the most significant influence on employee attrition predictions. For example, factors such as OverTime, JobRole, MonthlyIncome, WorkLifeBalance, and YearsAtCompany emerged as critical in driving employee attrition. 
SHAP-based explainability provides HR managers with actionable insights beyond a simple leave/stay prediction.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e4.6 Employee Value Score Calculation\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eAn EVS is calculated to quantify each employee\u0026rsquo;s organizational value, combining multiple weighted attributes:\u003c/p\u003e\n\u003cul\u003e\n \u003cli\u003e\u0026apos;PerformanceRating\u0026apos;\u003c/li\u003e\n \u003cli\u003e\u0026apos;YearsAtCompany\u0026apos;\u003c/li\u003e\n \u003cli\u003e\u0026apos;YearsInCurrentRole\u0026apos;\u003c/li\u003e\n \u003cli\u003e\u0026apos;TrainingTimesLastYear\u0026apos;\u003c/li\u003e\n \u003cli\u003e\u0026apos;PercentSalaryHike\u0026apos;\u003c/li\u003e\n \u003cli\u003e\u0026apos;NumCompaniesWorked\u0026apos;\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003eEach attribute i is assigned a weight (w_i) based on organizational priorities and expert judgment. The EVS is calculated as a weighted sum:\u003c/p\u003e\n\u003cp\u003e\\( \\mathrm{EVS} = \\sum_{i} w_i \\, a_i \\)\u0026nbsp;\u0026nbsp;\u0026nbsp;(1)\u003c/p\u003e\n\u003cp\u003ewhere w_i represents the weight for attribute i, and a_i is the standardized score of that attribute.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e4.7 Retention Strategy Recommendation\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWhile predictive modeling using ML and explainability with SHAP provide insights into employee satisfaction and attrition risk, organizations ultimately require actionable strategies to address these risks. To bridge this gap, Google Gemini, a large language model, is integrated into the framework to generate personalized employee retention and development strategies. 
The inputs are: the EVS, which represents the organizational importance of the employee and is computed from the predefined features using equation (1); the SHAP explanation, comprising the top factors that influenced the model\u0026rsquo;s attrition prediction for the employee; and the prediction outcome, i.e., whether the employee is classified as \u0026ldquo;Attrition: Yes\u0026rdquo; or \u0026ldquo;Attrition: No\u0026rdquo;. Using these inputs, the Gemini model is prompted to suggest tailored HR interventions. The prompt used to generate responses from the Gemini 2.5 Flash model is shown in Figure 3.\u003c/p\u003e\n\u003cp\u003eFor high-EVS employees at risk, Gemini suggested retention strategies such as competitive salary adjustments, flexible work arrangements, or career advancement opportunities. These strategies are intended to keep highly valuable employees engaged and committed. 
For low-EVS employees identified as at risk, instead of overlooking their retention needs, Gemini recommended constructive, development-focused strategies such as targeted training, mentorship, or reassignment to roles better aligned with their skills. This enhances employee growth while benefiting organizational productivity. For satisfied employees, suggestions focused on sustaining engagement, such as recognition programs, wellness initiatives, and continued professional development.\u003c/p\u003e\n\u003cp\u003eWith Gemini integration, raw predictive outputs are converted into practical HR actions. The recommended retention strategies are not generic but tailored to each employee\u0026rsquo;s profile and organizational value. The suggested strategies balance employee growth and satisfaction with organizational productivity. By integrating the Gemini API with predictive modeling and SHAP explainability, the framework transitions from being a predictive system to a prescriptive decision support tool, directly assisting HR managers in making informed and personalized interventions.\u003c/p\u003e"},{"header":"5. Results","content":"\u003cdiv id=\"Sec17\" class=\"Section2\"\u003e \u003ch2\u003e5.1 Model Comparison\u003c/h2\u003e \u003cp\u003eEach model is evaluated on the test set after SMOTE is applied to the training set. As reported in Table 3, XGBoost and Random Forest achieved the highest accuracy and precision, whereas Logistic Regression achieved the highest recall and F1-score.\u003c/p\u003e \u003cp\u003eLogistic Regression had the lowest accuracy (0.7585) and precision (0.3750) but the highest recall (0.7660), and it cannot capture non-linear patterns without additional feature engineering. SVM provided moderate results but required extensive tuning and is computationally more expensive. 
Random Forest \u0026amp; Extra Trees delivered strong accuracy and offer interpretability via feature importance, although their recall was limited (0.2979 and 0.4043, respectively). 
XGBoost achieved the highest accuracy (0.8605), precision (0.6250), and ROC-AUC (0.8043). Its recall (0.3191), however, was the second lowest among the five models, which matters because recall is critical for correctly identifying employees at risk of leaving.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec18\" class=\"Section2\"\u003e \u003ch2\u003e5.2 Final Selection\u003c/h2\u003e \u003cp\u003eBased on the comparative analysis in Table\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e, XGBoost is selected as the best-performing model. Its ability to handle imbalanced data, capture complex non-linear relationships, and provide robust predictive performance made it the most suitable choice for the task. Furthermore, XGBoost integrates seamlessly with SHAP explainability, enabling detailed insights into the contribution of individual features toward employee attrition predictions.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab3\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 3\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003ePerformance Comparison of different ML models\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"6\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eModel\u003c/p\u003e \u003c/th\u003e 
\u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAccuracy\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003ePrecision\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eRecall\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eF1-Score\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c6\"\u003e \u003cp\u003eROC-AUC\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eXGBoost\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.8605\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.6250\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.3191\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.4225\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.8043\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRandom Forest\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.8469\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.5385\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.2979\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.3836\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.7918\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSupport Vector Machine\u003c/p\u003e \u003c/td\u003e \u003ctd 
align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.8333\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.4762\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.4255\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.4494\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.7567\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eExtra Trees Classifier\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.8231\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.4419\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.4043\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.4222\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.7275\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eLogistic Regression\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.7585\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.3750\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.7660\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.5035\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.7949\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv 
id=\"Sec19\" class=\"Section2\"\u003e \u003ch2\u003e5.3 SHAP results\u003c/h2\u003e \u003cp\u003eThe SHAP summary plot shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e provides an overall view of how each feature contributes to the model\u0026rsquo;s predictions across all employees. It ranks features by their average absolute SHAP value, allowing quick identification of the most influential drivers of attrition. Each point in the plot represents an individual employee, with color indicating whether the feature value is high or low. Points positioned to the right contribute positively toward predicting \u0026ldquo;Attrition: Yes,\u0026rdquo; while points to the left push the prediction toward \u0026ldquo;Attrition: No.\u0026rdquo; This visualization reveals both the direction and magnitude of feature impact. In this work, the summary plot clearly highlights factors such as OverTime and JobRole as key contributors. By presenting complex model behavior in an interpretable format, the SHAP summary plot supports transparent decision-making and helps HR professionals understand which factors most strongly influence attrition risk across the organization. The summary plot is shown below.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eTo better understand the factors influencing attrition predictions at the individual employee level, the details of employees with high attrition risk and low attrition risk are passed to the SHAP explainer. The corresponding force plots are shown below in Figs.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e and \u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e. 
These plots highlight the contribution of individual features toward the prediction, where red features increase the likelihood of attrition and blue features decrease it.\u003c/p\u003e \u003cp\u003eFor the employee with higher attrition likelihood (predicted score\u0026thinsp;=\u0026thinsp;0.61), risk factors such as tenure with current manager, moderate monthly income levels, total working years, job role, and distance from home played a stronger role in driving attrition risk. Although protective features such as lower overtime, business travel, and high environment satisfaction are present, their influence is not sufficient to counterbalance the dominant risk factors. As a result, the overall prediction indicates a higher attrition risk for this employee.\u003c/p\u003e \u003cp\u003eIn contrast, the employee with very low attrition likelihood (predicted score = -6.52) is mainly characterized by protective features such as absence of overtime, good job satisfaction, supportive manager tenure, and higher work-life balance. These insights highlight that attrition is driven by heterogeneous factors across employees, supporting the need for tailored retention strategies.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec20\" class=\"Section2\"\u003e \u003ch2\u003e5.4 Gemini retention strategies\u003c/h2\u003e \u003cp\u003eThe top contributing factors identified by SHAP, along with the EVS, are given to the Gemini model to generate automated retention strategies. Sample strategies, along with the complete action plans suggested by Gemini, are shown below:\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eFor high-EVS employees identified as high-risk, the GenAI module produces detailed, multi-step retention action plans grounded in the specific SHAP-identified drivers of attrition. 
As illustrated in the generated example (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003e), the system provides three prioritized interventions, each structured with a clear \u003cem\u003eaction owner\u003c/em\u003e, \u003cem\u003enext step\u003c/em\u003e, \u003cem\u003etimeline\u003c/em\u003e, and \u003cem\u003emeasurable KPI\u003c/em\u003e. The recommendations address key factors such as low Environment Satisfaction, reduced Job Satisfaction, and lack of recent career progression. For instance, the model suggests targeted managerial or HR-led actions including one-on-one discussions, development of improvement plans, identifying stretch assignments, and reviewing promotion pathways. Each intervention includes time-bound execution steps (e.g., within 2 weeks or 30 days) and quantifiable KPIs (e.g., improving satisfaction scores), so that the output is actionable and trackable. This structured approach demonstrates the system\u0026rsquo;s ability to translate predictive insights into practical retention strategies tailored for high-value employees.\u003c/p\u003e \u003cp\u003eFor low-EVS employees who exhibit elevated attrition risk, the GenAI module focuses on developmental and support-oriented interventions, ensuring that these employees are not overlooked but are instead guided toward improved engagement and stability. As illustrated in Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e8\u003c/span\u003e, the system generates a set of structured recommendations addressing early-tenure risks, low job involvement, and work-life balance challenges, factors frequently associated with attrition in lower-EVS groups. 
In this case, the model proposes conducting an early \u0026ldquo;stay interview\u0026rdquo; for new hires, identifying task-level engagement opportunities to improve job involvement, and monitoring overtime patterns to prevent burnout. 
These suggestions are designed to provide targeted support, enhance early integration, and improve role alignment, demonstrating that even low-EVS employees receive meaningful, actionable development plans rather than being passively allowed to attrite. This reinforces the framework\u0026rsquo;s commitment to equitable, data-informed retention practices across all employee segments.\u003c/p\u003e \u003c/div\u003e"},{"header":"6. Discussion","content":"\u003cp\u003eThe work demonstrates that integrating predictive modeling, XAI, and GenAI can significantly strengthen organizational attrition management. Among the tested models, XGBoost achieved the highest performance, particularly in identifying the minority attrition class. Its superior recall and overall stability make it well-suited for HR contexts where missing high-risk employees can lead to substantial organizational loss. The use of SHAP explainability provided clear insight into the model\u0026rsquo;s decisions. SHAP consistently highlighted the top factors contributing to attrition (e.g., Overtime, JobRole, Job Satisfaction, Monthly Income, Years at Company), enabling HR practitioners to understand not only which variables are most influential but also how they affected individual predictions. The SHAP force plots further simplified interpretation, offering intuitive visual explanations that help HR teams quickly diagnose employee-specific risk drivers without technical expertise.\u003c/p\u003e \u003cp\u003eThe introduction of the EVS enhanced strategic decision-making by differentiating employees based not only on their attrition risk but also their strategic importance. This dual perspective supports more efficient allocation of retention resources, allowing HR to focus on high-risk, high-value employees. Finally, integrating these insights with Gemini-generated retention strategies transformed predictive outputs into actionable guidance. 
By combining SHAP factors, EVS, and risk levels, Gemini produced tailored recommendations that HR professionals can directly implement. This approach ensures that interventions are both personalized and aligned with organizational priorities. Overall, the results confirm that the proposed hybrid framework not only improves prediction accuracy but also provides interpretability and decision support, enabling organizations to strengthen retention efforts, support employee well-being, and enhance long-term workforce stability.\u003c/p\u003e"},{"header":"7. Conclusion","content":"\u003cp\u003eThis work demonstrates the value of a hybrid HR analytics framework that combines predictive modeling, explainable AI, and generative AI to improve employee attrition management. We compared different ML algorithms, and XGBoost proved to be the most effective model for predicting attrition. SHAP-based explainability added essential transparency, identifying the key factors driving attrition and providing intuitive, employee-level insights through force plot visualizations. The introduction of the EVS further enhanced decision-making by aligning attrition risk with strategic employee importance. By integrating these analytical outputs with Gemini-generated retention strategies, the framework successfully translated predictions into personalized, actionable interventions. This closes the gap between data-driven insights and practical HR action. Overall, the proposed approach supports more targeted and effective retention management, contributing to both organizational stability and employee well-being. 
Further, these analytical components are operationalized through a real-time web-based application developed as part of this work, enabling HR practitioners to interact with predictions, explanations, and Gemini-generated personalized retention strategies in a user-friendly environment.\u003c/p\u003e \u003cp\u003eFuture research could extend this framework by validating its effectiveness across multiple industries and larger, more heterogeneous employee datasets to ensure broader generalizability. Further enhancements may include integrating advanced deep learning models or multimodal data sources such as text-based performance reviews, employee surveys, or behavioral system logs to capture more complex predictors of attrition.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003eFunding\u003c/p\u003e\n\u003cp\u003eThe authors did not receive any funding for the submitted work.\u003c/p\u003e\n\u003cp\u003e• Conflict of Interest\u003c/p\u003e\n\u003cp\u003eOn behalf of all authors, the corresponding author states that there is no conflict of interest.\u003c/p\u003e\n\u003cp\u003e• Ethics approval and consent to participate\u003c/p\u003e\n\u003cp\u003eNot Applicable\u003c/p\u003e\n\u003cp\u003e• Author contribution\u003c/p\u003e\n\u003cp\u003eConceptualization: Jayashree Roul, Swetha Ghanta, Raheem Qudus; Methodology: Jayashree Roul, Swetha Ghanta, Raheem Qudus; Formal analysis and investigation: Jayashree Roul, Swetha Ghanta, Raheem Qudus; Writing - original draft preparation: Jayashree Roul, Swetha Ghanta, Raheem Qudus, AVS Kamesh, Lalita Mohan Mohapatra, Ashok Kumar Pradhan; Writing - review and editing: AVS Kamesh, Lalita Mohan Mohapatra, Ashok Kumar Pradhan; Supervision: AVS Kamesh, Lalita Mohan Mohapatra, Ashok Kumar Pradhan.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e• Research Involving Human and/or Animals\u003c/p\u003e\n\u003cp\u003eNot Applicable\u003c/p\u003e\n\u003cp\u003e• Informed Consent\u003c/p\u003e\n\u003cp\u003eNot 
Applicable\u003c/p\u003e\n\u003cp\u003e• Consent to publish statement\u003c/p\u003e\n\u003cp\u003eNot Applicable\u003c/p\u003e\n\u003cp\u003e• Data Availability statement\u003c/p\u003e\n\u003cp\u003eThe dataset used in this work is publicly available on Kaggle repository and can be accessed at https://www.kaggle.com/datasets/pavansubhasht/ibm-hr-analytics-attrition-dataset\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eAldulaimi, S.H., Abdeldayem, M.M., Mowafak, B.M. and Abdulaziz, M.M., 2021. Experimental perspective of artificial intelligence technology in human resources management. In \u003cem\u003eApplications of artificial intelligence in business, education and healthcare\u003c/em\u003e (pp. 487-511). Cham: Springer International Publishing.\u003c/li\u003e\n\u003cli\u003eArora, R. and Damarla, R.B., 2025. A Review on Generative AI Powered Talent Management, Employee Engagement and Retention Strategies: Applications, Benefits, and Challenges. \u003cem\u003eProcedia Computer Science\u003c/em\u003e, \u003cem\u003e260\u003c/em\u003e, pp.683-691.\u003c/li\u003e\n\u003cli\u003eAvrahami, D., Pessach, D., Singer, G. and Chalutz Ben-Gal, H., 2022. A human resources analytics and machine-learning examination of turnover: implications for theory and practice. \u003cem\u003eInternational Journal of Manpower\u003c/em\u003e, \u003cem\u003e43\u003c/em\u003e(6), pp.1405-1424.\u003c/li\u003e\n\u003cli\u003eBanh, L. and Strobel, G., 2023. Generative artificial intelligence. \u003cem\u003eElectronic Markets\u003c/em\u003e, \u003cem\u003e33\u003c/em\u003e(1), p.63.\u003c/li\u003e\n\u003cli\u003eDwivedi, Y.K., 2025. Generative Artificial Intelligence (GenAI) in entrepreneurial education and practice: emerging insights, the GAIN Framework, and research agenda. 
\u003cem\u003eInternational Entrepreneurship and Management Journal\u003c/em\u003e, \u003cem\u003e21\u003c/em\u003e(1), pp.1-21.\u003c/li\u003e\n\u003cli\u003eFallucchi, F., Coladangelo, M., Giuliano, R. and William De Luca, E., 2020. Predicting employee attrition using machine learning techniques. \u003cem\u003eComputers\u003c/em\u003e, \u003cem\u003e9\u003c/em\u003e(4), p.86.\u003c/li\u003e\n\u003cli\u003eFernandez, V. and Gallardo-Gallardo, E., 2021. Tackling the HR digitalization challenge: key factors and barriers to HR analytics adoption. \u003cem\u003eCompetitiveness Review: An International Business Journal\u003c/em\u003e, \u003cem\u003e31\u003c/em\u003e(1), pp.162-187.\u003c/li\u003e\n\u003cli\u003eGupta, P., Ding, B., Guan, C. and Ding, D., 2024. Generative AI: A systematic review using topic modelling techniques. \u003cem\u003eData and Information Management\u003c/em\u003e, \u003cem\u003e8\u003c/em\u003e(2), p.100066.\u003c/li\u003e\n\u003cli\u003eHausknecht, J.P. and Trevor, C.O., 2011. Collective turnover at the group, unit, and organizational levels: Evidence, issues, and implications. \u003cem\u003eJournal of management\u003c/em\u003e, \u003cem\u003e37\u003c/em\u003e(1), pp.352-388.\u003c/li\u003e\n\u003cli\u003eJain, R. and Nayyar, A., 2018, November. Predicting employee attrition using xgboost machine learning approach. In \u003cem\u003e2018 international conference on system modeling \u0026amp; advancement in research trends (smart)\u003c/em\u003e (pp. 113-120). IEEE.\u003c/li\u003e\n\u003cli\u003eKakulapati, V., Chaitanya, K.K., Chaitanya, K.V.G. and Akshay, P., 2020. Predictive analytics of HR-A machine learning approach. \u003cem\u003eJournal of Statistics and Management Systems\u003c/em\u003e, \u003cem\u003e23\u003c/em\u003e(6), pp.959-969.\u003c/li\u003e\n\u003cli\u003eKraus, M., Feuerriegel, S. and Oztekin, A., 2020. Deep learning in business analytics and operations research: Models, applications and managerial implications. 
\u003cem\u003eEuropean Journal of Operational Research\u003c/em\u003e, \u003cem\u003e281\u003c/em\u003e(3), pp.628-641.\u003c/li\u003e\n\u003cli\u003eLenka, R. and Chanda, R., 2024, August. Generative AI for Predicting Employee Engagement in HR Analytics: A Bibliometric Analysis. In \u003cem\u003eInternational Conference on ICT for Sustainable Development\u003c/em\u003e (pp. 201-209). Singapore: Springer Nature Singapore.\u003c/li\u003e\n\u003cli\u003eLundberg, S.M. and Lee, S.I., 2017. A unified approach to interpreting model predictions. \u003cem\u003eAdvances in neural information processing systems\u003c/em\u003e, \u003cem\u003e30\u003c/em\u003e.\u003c/li\u003e\n\u003cli\u003eMa, X., 2024, December. Research and Application of Human Resources Demand Forecasting Model Based on Machine Learning. In \u003cem\u003eProceedings of the 2024 5th International Conference on Big Data Economy and Information Management\u003c/em\u003e (pp. 696-701).\u003c/li\u003e\n\u003cli\u003eMitravinda, K.M. and Shetty, S., 2022, December. Employee attrition: Prediction, analysis of contributory factors and recommendations for employee retention. In \u003cem\u003e2022 IEEE International conference for women in innovation, technology \u0026amp; entrepreneurship (ICWITE)\u003c/em\u003e (pp. 1-6). IEEE.\u003c/li\u003e\n\u003cli\u003eNagalakshmi, V., Sriram, B., Wasib, S.A., Sadiq, S. and Samhitha, M.D., 2025, May. Improving Workforce Management with Machine Learning: A Novel Approach to Employee Classification. In \u003cem\u003e2025 Third International Conference on Augmented Intelligence and Sustainable Systems (ICAISS)\u003c/em\u003e (pp. 1411-1415). IEEE.\u003c/li\u003e\n\u003cli\u003eOpatha, H.H.D.P.J., 2020. HR analytics: A literature review and new conceptual model. 
\u003cem\u003eInternational Journal of Scientific and Research Publications\u003c/em\u003e, \u003cem\u003e10\u003c/em\u003e(6), pp.130-141.\u003c/li\u003e\n\u003cli\u003ePavansubhash 2017 IBM HR Analytics Employee Attrition \u0026amp; Performance. Kaggle https://www.kaggle.com/datasets/pavansubhasht/ibm-hr-analytics-attrition-dataset.\u003c/li\u003e\n\u003cli\u003ePratt, M., Boudhane, M. and Cakula, S., 2021. Employee attrition estimation using random Forest algorithm. \u003cem\u003eBaltic Journal of Modern Computing\u003c/em\u003e, \u003cem\u003e9\u003c/em\u003e(1), pp.49-66.\u003c/li\u003e\n\u003cli\u003eQutub, A., Al-Mehmadi, A., Al-Hssan, M., Aljohani, R. and Alghamdi, H.S., 2021. Prediction of employee attrition using machine learning and ensemble methods. \u003cem\u003eInternational Journal of Machine Learning and Computing\u003c/em\u003e, \u003cem\u003e11\u003c/em\u003e(2), pp.110-114.\u003c/li\u003e\n\u003cli\u003eRaza, A., Munir, K., Almutairi, M., Younas, F. and Fareed, M.M.S., 2022. Predicting employee attrition using machine learning approaches. \u003cem\u003eApplied Sciences\u003c/em\u003e, \u003cem\u003e12\u003c/em\u003e(13), p.6424.\u003c/li\u003e\n\u003cli\u003eRoul, J., Mohapatra, L.M., Pradhan, A.K. and Kamesh, A.V.S., 2024. Analysing the role of modern information technologies in HRM: management perspective and future agenda. \u003cem\u003eKybernetes\u003c/em\u003e.\u003c/li\u003e\n\u003cli\u003eSadeghi, K., Ojha, D., Kaur, P., Mahto, R.V. and Dhir, A., 2024. Explainable artificial intelligence and agile decision-making in supply chain cyber resilience. \u003cem\u003eDecision Support Systems\u003c/em\u003e, \u003cem\u003e180\u003c/em\u003e, p.114194.\u003c/li\u003e\n\u003cli\u003eSangita, U.G., 2019. A study of employee retention. \u003cem\u003eJournal of Emerging Technologies and Innovative Research (JETIR) www. jetir. 
org\u003c/em\u003e, \u003cem\u003e6\u003c/em\u003e(6), pp.331-337.\u003c/li\u003e\n\u003cli\u003eSingamsetty, S., Ghanta, S., Biswas, S. and Pradhan, A., 2024. Enhancing machine learning-based forecasting of chronic renal disease with explainable AI. \u003cem\u003ePeerJ Computer Science\u003c/em\u003e, \u003cem\u003e10\u003c/em\u003e, p.e2291.\u003c/li\u003e\n\u003cli\u003eSharma, R. and Dhingra, L., 2024, August. Transforming HR with Machine Learning: Data-Driven Strategies for Talent Management. In \u003cem\u003e2024 4th Asian Conference on Innovation in Technology (ASIANCON)\u003c/em\u003e (pp. 1-5). IEEE.\u003c/li\u003e\n\u003cli\u003eSharma, R., Jain, A. and Manwal, M., 2024, June. Enhancing human resource management through deep learning: a predictive analytics approach to employee retention success. In \u003cem\u003e2024 IEEE International Conference on Information Technology, Electronics and Intelligent Communication Systems (ICITEICS)\u003c/em\u003e (pp. 1-4). IEEE.\u003c/li\u003e\n\u003cli\u003eTambe, P., Cappelli, P. and Yakubovich, V., 2019. Artificial intelligence in human resources management: Challenges and a path forward. \u003cem\u003eCalifornia management review\u003c/em\u003e, \u003cem\u003e61\u003c/em\u003e(4), pp.15-42.\u003c/li\u003e\n\u003cli\u003eTursunbayeva, A., Bunduchi, R., Franco, M. and Pagliari, C., 2017. Human resource information systems in health care: a systematic evidence review. \u003cem\u003eJournal of the American Medical Informatics Association\u003c/em\u003e, \u003cem\u003e24\u003c/em\u003e(3), pp.633-654.\u003c/li\u003e\n\u003cli\u003eVarkiani, S.M., Pattarin, F., Fabbri, T. and Fantoni, G., 2025. Predicting employee attrition and explaining its determinants. \u003cem\u003eExpert Systems with Applications\u003c/em\u003e, \u003cem\u003e272\u003c/em\u003e, p.126575.\u003c/li\u003e\n\u003cli\u003eXin, J.L.J. and Mahadi, N., 2024. HR Analytics for Data-Driven Employee Attrition Management. 
\u003cem\u003eINTERNATIONAL JOURNAL OF ACADEMIC RESEARCH IN BUSINESS AND SOCIAL SCIENCES\u003c/em\u003e, \u003cem\u003e14\u003c/em\u003e(12).\u003c/li\u003e\n\u003cli\u003eZhang, L., Lin, J., Liu, B., Zhang, Z., Yan, X. and Wei, M., 2019. A review on deep learning applications in prognostics and health management. \u003cem\u003eIeee Access\u003c/em\u003e, \u003cem\u003e7\u003c/em\u003e, pp.162415-162438.\u003c/li\u003e\n\u003cli\u003eZhao, Y., Hryniewicki, M.K., Cheng, F., Fu, B. and Zhu, X., 2018, September. Employee turnover prediction with machine learning: A reliable approach. In \u003cem\u003eProceedings of SAI intelligent systems conference\u003c/em\u003e (pp. 737-758). Cham: Springer International Publishing.\u003c/li\u003e\n\u003cli\u003eZheng, Z., Qiu, Z., Hu, X., Wu, L., Zhu, H. and Xiong, H., 2023. Generative job recommendations with large language model. \u003cem\u003earXiv preprint arXiv:2307.02157\u003c/em\u003e.\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"Explainable AI, SHAP, Generative AI, HR analytics, Attrition prediction, Personalized retention 
strategies","lastPublishedDoi":"10.21203/rs.3.rs-8838292/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-8838292/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eEmployee attrition is a constant issue in competitive job markets. While predictive models can estimate attrition risk, turning those predictions into effective retention actions still poses challenges. This work aims to create a framework that links attrition prediction, explainable artificial intelligence (XAI), and generative AI to support data-driven and personalized retention strategies. Using the IBM HR Analytics Attrition dataset, we built a machine learning model to predict employee attrition. We applied SHAP explainability techniques to identify the main factors affecting individual attrition risk. We introduced an Employee Value Scoring (EVS) system to highlight high-value employees at risk. To translate insights into action, we used generative AI (Gemini) to create personalized retention recommendations based on the most important SHAP-derived features. The framework successfully identified high-risk employees and offered targeted, easy-to-understand recommendations based on individual attrition drivers. The results show how combining predictive modeling, explainability, and generative AI can help HR teams move from predicting risk to taking meaningful action. This work presents a new, unified approach that connects attrition prediction and effective retention planning. 
By integrating machine learning, XAI, and generative AI, the framework provides personalized and context-specific recommendations, improving the practical use of HR analytics for proactive talent management.\u003c/p\u003e","manuscriptTitle":"An integrated explainable artificial intelligence framework for employee attrition prediction and retention strategy generation","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2026-03-05 16:47:21","doi":"10.21203/rs.3.rs-8838292/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"0400ad5a-8158-45e7-993f-59bd9099f946","owner":[],"postedDate":"March 5th, 2026","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[],"tags":[],"updatedAt":"2026-03-25T14:41:58+00:00","versionOfRecord":[],"versionCreatedAt":"2026-03-05 16:47:21","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-8838292","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-8838292","identity":"rs-8838292","version":["v1"]},"buildId":"XKTyCvWXoU3ODBz1xrDgd","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
