Random Forest Model for Predicting Claims for Outages in Telecommunications Operating Companies in Peru (2016–2024)

preprint OA: closed
AI-generated deep summary by claude@2026-06, 2026-06-24 · read from full text

This preprint describes the development and validation of a Random Forest machine-learning model to forecast the number of telecommunications service outage claims reported by Peruvian operators, using longitudinal complaint records from 2016–2024 (87,003 records) drawn from Peru’s National Open Data Platform and regulated by OSIPTEL. Using an applied, non-experimental quantitative approach with cross-validation, the authors report predictive performance of RMSE 42.80, MAE 33.38, and R² 0.72, and state that both the general and specific hypotheses were confirmed and that the model can support operational trend estimation. The main caveat is that the work is a preprint that has not been peer reviewed, and the model relies on aggregated complaint-log inputs rather than experimentally controlled or newly collected data. The paper does not explicitly discuss endometriosis or adenomyosis; it was included in the corpus via a keyword match in the upstream search index.

Read from the paper's body, not the abstract. Not a substitute for reading the paper. No clinical advice.

Full text 105,008 characters · extracted from preprint-html
Random Forest Model for Predicting Claims for Outages in Telecommunications Operating Companies in Peru (2016–2024)

Francis Homero Padilla Quispe, Erick Giovanny Flores Chacón

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-8341643/v1
This work is licensed under a CC BY 4.0 License
Status: Under Revision
Version 1 posted 11

Abstract

Telecommunications operators face numerous challenges in maintaining service continuity due to the increasing complexity of their networks, and demand must be operationally monitored. Within this context, this research studies the development of a predictive model using Random Forest to forecast service outage claims in Peruvian telecommunications companies from 2016 to 2024.
A quantitative, applied, non-experimental, and longitudinal design was used, drawing on 87,003 records from the National Open Data Platform. The model was validated using cross-validation, and its accuracy was evaluated using the RMSE, MAE, and R² metrics. The results show good performance of the predictive model, which achieved an RMSE of 42.80, an MAE of 33.38, and an R² of 0.72, reflecting its high accuracy. The general and specific hypotheses of the research were confirmed, indicating that Random Forest allows for reliable predictions of service outage claims. Furthermore, the work aligns with SDG 9, which underscores its value for service quality and infrastructure planning in Peru.

Physical sciences/Energy science and technology
Physical sciences/Engineering
Physical sciences/Mathematics and computing

Random Forest fault prediction telecommunications service complaints machine learning SDG 9

Introduction

Telecommunications services have become essential for economic activity, for the population, and for social connectivity. With the growth of network infrastructure and the increasing demands on services, operators continually face the challenge of maintaining stable operation and sufficient operational quality [23]. The high volume of complaints regarding service outages is a key indicator of service degradation, reflecting user dissatisfaction and underlying faults in fixed and mobile networks. Handling reported incidents often requires technical personnel to travel to customer premises, which increases the operational burden and affects contractual service level agreements [2].

In Peru, the regulator OSIPTEL oversees the quality of telecommunications services and makes information on service outage complaints available to the public through the National Open Data Platform.
These records cover many operators and service categories, making them a valuable source of temporal, operational, and behavioral information. However, despite this widespread availability of data, we have observed that most operational decisions are reactive, based on historical trends and manual analysis rather than predictive intelligence. Consequently, an outage is addressed only after a user has already filed a complaint about the interruption, which limits the capacity for both proactive maintenance and resource planning [12].

In recent years, machine learning techniques have gained importance for tackling complex classification and forecasting problems in the telecommunications field. Among these are ensemble models such as Random Forest, which have demonstrated good practical performance in handling nonlinear relationships, noisy information, and high-dimensional variables, making them suitable for predicting service failures and customer behavior. Previous research has applied these techniques to fault detection, anomaly classification, network performance forecasting, and outage prediction in multiple environments. However, within the regulatory and operational context of Peru, predictive models for fault claims remain largely unexplored [18].

This study seeks to address this gap by developing and evaluating a Random Forest predictive model based on complaint records from the period 2016–2024. The model predicts the number of outage complaints, estimates operational trends, and supports key decision-making by telecommunications operators and sector regulatory authorities. The research adopted a quantitative, applied, and non-experimental methodological approach, using real complaint data to construct the model.
This model is designed to be practical for planning maintenance and services in support of service quality [4].

Furthermore, this research aligns with the objectives of the telecommunications sector in Peru, which requires reliable analytical tools to support infrastructure expansion, service continuity, and user satisfaction [14].

Finally, an accurate predictive model contributes to both the scientific literature and the operational ecosystem of telecommunications services. The results provide a replicable approach applicable to other contexts and add to the evidence for using machine learning, in this case Random Forest, to manage service quality [5].

Literature Review

The increasing heterogeneity of telecommunications systems has spurred efforts to predict service interruptions and increase operational reliability. Research in this field has so far focused on machine learning techniques capable of modeling nonlinear behavior, handling large volumes of data, and aiding proactive decision-making. Among these techniques, Random Forest has gained particular prominence due to its efficiency, interpretability, and resistance to overfitting [19].

Early work on network reliability used statistical techniques to model service interruptions, but these proved insufficient for the dynamic nature of modern telecommunications. With the advent of data-driven methodologies, machine learning algorithms, particularly ensemble models, have shown far more satisfactory results in predicting service failures, detecting anomalies, and aiding network optimization. Random Forest is considered a robust approach because it combines multiple decision trees, whose aggregation reduces variance and aids generalization, in addition to providing interpretable importance scores for the input variables [19].

Numerous studies have explored the use of the Random Forest algorithm for predicting network failures and errors, with promising results. Examples include anomaly detection in mobile networks, classification of different types of service degradation events, prediction of service usage patterns, and optimization of resource allocation. These studies consistently report highly accurate predictions and show the algorithm's ability to adapt to diverse network conditions and data structures. Furthermore, Random Forest has outperformed traditional regression models and standalone trees in predictive tasks involving heterogeneous and noisy operational data, characteristics naturally found in fault complaint logs [20].

International literature also reports on the potential of machine learning to support service quality control, where predictive models can help regulatory bodies identify early indicators of service deterioration, improve existing control frameworks, and guide compliance assessments. In Latin America, however, the use of machine learning for regulatory support is quite limited, especially in the analysis of public data and in consumer protection systems [10].

In the Peruvian context, regulatory practice remains reactive to incidents and relies largely on manual complaint analysis [9]. Despite the availability of historical complaint datasets from the National Open Data Platform, empirical knowledge of how to apply these datasets to generate predictions for operational intelligence is limited. This gap presents an opportunity to apply machine learning-based prediction tools that can serve as a basis for future management by operators and regulators: anticipating service failures, allocating technical resources more effectively, and maintaining service continuity [15].

In sum, the literature indicates that Random Forest is a robust and reliable option for predicting outage claims. Its ability to handle time-dependent patterns, categorical and numerical variables, and complex interactions makes it a suitable alternative for the characteristics of telecommunications data in Peru. This study applies Random Forest to a large, longitudinal dataset, providing new evidence to encourage the use of predictive analytics in the Peruvian telecommunications sector [28].

Methods

Research design. This study employed a quantitative, applied, and non-experimental longitudinal research design. The objective was to develop a predictive model to estimate the number of service outage complaints reported by Peruvian telecommunications companies between 2016 and 2024. Variables were not manipulated and no new data were collected; instead, historical data compiled by the Peruvian government were analyzed. The proposed model takes these data as input to identify temporal patterns and train the machine learning model that generates the projections [16].

Data source. The database of complaints was obtained from Peru's National Open Data Platform, which provides datasets regulated by OSIPTEL, the national telecommunications supervisory body. This dataset contained 87,003 complaint records for service interruptions, operational failures, and technical malfunctions requiring on-site technical support at customer premises [13].

Population and Sample. The population consisted of all outage complaints reported nationwide by each telecommunications operator from January 2016 to June 2024. Because the database is exhaustive and publicly available, the entire database was used as the sample, making sampling techniques unnecessary [27].

Variables

The study included operational, temporal, and categorical variables that could be relevant to the evolution of claims:

- Date of claim (day, month, year)
- Operator (categorical)
- Type of service (fixed internet, mobile, cable, etc.)
- Geographic location
- Claim classification (events associated with breakdowns)
- Number of claims per period (aggregated for forecasting)

The dependent variable was the number of breakdown claims, which was modeled as a continuous outcome [27].

Data preprocessing

The data were cleaned and prepared prior to model training [21]:

- Handling missing values: records with NULL or inconsistent fields were corrected where possible and deleted where necessary.
- Temporal aggregation: claim counts were aggregated over the time interval that the exploratory data analysis indicated was appropriate.
- Categorical coding: operator names, service categories, and regions were coded through one-hot encoding.
- Feature engineering: variables such as month, year, and seasonality indicators were generated so that the features reflect temporal patterns in the data.
- Normalization: when necessary, continuous variables were normalized to improve model stability.

Model architecture: Random Forest Regressor

A random forest regressor was used for its adequate performance with heterogeneous and nonlinear datasets. The model was designed with:

- A set of several decision trees built on random subsets of the dataset.
- Bootstrap aggregation (bagging) to reduce variance.
- Random selection of features at each node to avoid overfitting.

Hyperparameters such as the number of trees, maximum depth, and minimum number of samples per split were optimized during the experimentation process [29]. Fig. 1 illustrates the proposed scheme.

Training and validation

The input data were divided into a training set (80%) and a test set (20%).
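As a rough illustration of the preprocessing and 80/20 split described above, the following sketch aggregates claim records into monthly counts, derives year and month features, one-hot encodes the operator, and holds out the last 20% of rows. It is only a sketch under stated assumptions: the paper does not publish its code, so the record fields (`date`, `operator`), the sample values, the monthly interval, the helper names, and the chronological (rather than shuffled) hold-out are all hypothetical choices made for illustration.

```python
from collections import Counter
from datetime import date

# Hypothetical sample records; the real dataset has 87,003 rows and more fields.
records = [
    {"date": date(2016, 1, 5), "operator": "A"},
    {"date": date(2016, 1, 20), "operator": "A"},
    {"date": date(2016, 1, 9), "operator": "B"},
    {"date": date(2016, 2, 3), "operator": "A"},
    {"date": date(2016, 2, 14), "operator": "B"},
    {"date": date(2016, 2, 27), "operator": "B"},
    {"date": date(2016, 3, 1), "operator": "A"},
    {"date": date(2016, 3, 8), "operator": "B"},
    {"date": date(2016, 4, 2), "operator": "A"},
    {"date": date(2016, 5, 11), "operator": "B"},
]


def aggregate_monthly(rows):
    """Temporal aggregation: count claims per (year, month, operator)."""
    counts = Counter((r["date"].year, r["date"].month, r["operator"]) for r in rows)
    return sorted(counts.items())  # sorted tuples give chronological order


def build_features(aggregated):
    """Feature engineering (year, month) plus one-hot encoding of the operator."""
    operators = sorted({key[2] for key, _ in aggregated})
    X, y = [], []
    for (year, month, operator), n_claims in aggregated:
        one_hot = [1 if operator == name else 0 for name in operators]
        X.append([year, month] + one_hot)
        y.append(n_claims)  # dependent variable: number of claims in the period
    columns = ["year", "month"] + [f"operator_{name}" for name in operators]
    return X, y, columns


def train_test_split_ordered(X, y, test_fraction=0.2):
    """Hold out the final fraction of rows (assumed chronological split)."""
    cut = len(X) - max(1, round(len(X) * test_fraction))
    return X[:cut], X[cut:], y[:cut], y[cut:]


aggregated = aggregate_monthly(records)
X, y, columns = build_features(aggregated)
X_train, X_test, y_train, y_test = train_test_split_ordered(X, y)
```

Feature rows built this way could then be passed to any random forest regressor. A chronological hold-out keeps future months out of the training data, whereas a shuffled 80/20 split would be the other common reading of the description.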
To ensure a robust evaluation, k-fold cross-validation was applied during the hyperparameter tuning phase [29]. The training pipeline is illustrated in Fig. 2 and included the following steps:

- Model initialization.
- Grid optimization of hyperparameters.
- Cross-validation scoring.
- Final training of the model on the complete training set.
- Testing on unseen data.

Table 1 complements the information described above.

Model             RMSE     MAE     R²     Performance level
Model 1           242.36   43.05   0.43   Low
Model 2           246.31   45.05   0.41   Low
Model 3 (final)   198.97   33.38   0.72   High

Table 1. Comparison of model performance

Evaluation metrics

The model's behavior was evaluated with the following regression metrics:

- Root Mean Square Error (RMSE): expresses the typical magnitude of the prediction error in the units of the target, penalizing large errors more heavily.
- Mean Absolute Error (MAE): reflects the average absolute deviation between predicted and observed values.
- Coefficient of Determination (R²): reflects the proportion of variance explained by the model, that is, how much the model explains beyond simply predicting the mean.

Thresholds for acceptable performance were defined based on the literature and in accordance with the project requirements [28].

Ethical considerations

This work used only publicly available, anonymized data provided by the Peruvian government. The data contain no personal identifiers or sensitive information. The research adheres to open data policies and ethical standards [8] for secondary data analysis [7].

Results

Model performance

The Random Forest model exhibited robust predictive performance across all evaluated metrics. In the test set, the model showed an RMSE of 42.80, an MAE of 33.38, and an R² of 0.72, suggesting that 72% of the variance in outage claims is explained by the model. All these results met the predefined acceptance criteria and align with performance levels reported in the telecommunications demand forecasting literature [30].

Comparison of expected values vs. predicted values

A visual comparison of predicted and observed claim counts showed a clear correlation throughout the analyzed time period. The model tracked the major peaks associated with seasonal increases in service outages and with regulatory events that have historically coincided with rises in claims. Smaller variations were also predicted, although a slight underestimation was noted in isolated months with high variance [21].

Relevance of the variables

The Random Forest model was used to estimate the relative contribution of each variable to the accuracy of the prediction. The most influential predictors identified were the following [22]:

- Year and month of the claim, capturing very marked temporal patterns.
- Type of service, reflecting the different breakdown frequencies observed between fixed and mobile services.
- Operator, reflecting the variability in service performance among providers.
- Geographic location, highlighting regional differences in infrastructure robustness.
- Indicators of seasonality, capturing evident climatic and operational cycles.

Temporal attributes were the most important, confirming the cyclical nature of claim patterns [22], as observed in Fig. 3.

Performance by analytical dimensions

Prediction of claim frequency (FR)

The results show robust performance in predicting the overall number of outage claims, with similar accuracy for fixed and mobile services, which supports the model's generalizability. Performance metrics fell within the expected range defined before the analysis began, confirming the main hypothesis [12].

Claims Severity Prediction (SR)

The categories linked to the severity of claims also showed strong predictive behavior, with RMSE and MAE values remaining within the pre-established acceptance limits, indicating that Random Forest can model not only the frequency but also the severity of operational problems [25].

Operational efficiency analysis (EO)

The model also shows an ability to estimate levels of operational efficiency, defined by the regularity and predictability of claims behavior on a monthly basis. Operators with good operational performance showed more predictable patterns, while those with highly variable behavior exhibited a greater prediction error, which was acceptable given the behavior of ensemble models [1].

Validation of the hypotheses

All of the hypotheses formulated in the research were empirically supported.

General hypothesis: The Random Forest model is able to correctly predict outage claims using historical telecommunications data (validated, R² = 0.72) [17].

Specific hypotheses:

- The model can predict the frequency of claims with acceptable accuracy.
- The model can predict the severity of claims within expected margins of error.
- The model can predict operational efficiency linked to claims with acceptable consistency.

These results support the statistical and operational feasibility of Random Forest for the proactive prediction of breakdown claims [3].

Table 2. Metric Results by Dimensions.

Metrics   FR               MR               EO
RMSE      42.80 = Low      58.64 = Medium   71.55 = Medium
MAE       33.38 = Medium   40.22 = Medium   46.10 = Medium
R2        0.72 = High      0.65 = High      0.65 = Medium
Ruler     TRUE             TRUE             TRUE

Discussion

The evidence from this research indicates that the Random Forest method is a robust and effective method for predicting service outage claims in the telecommunications sector in Peru. The model shows a low error rate, explaining more than 70% of the variance in claim behavior, and meets all pre-established accuracy thresholds. These results illustrate the capacity of machine learning methods to support operational decisions in complex, heterogeneous, and time-dependent data contexts [23].

The high importance of the time variables indicates that complaint patterns are strongly cyclical.
This aligns with previous studies that report seasonality in service interruptions due to weather conditions, infrastructure strain, and variability in user demand [12]. The importance of the service type, operator, and geolocation variables also coincides with existing literature, which indicates that outage behavior varies across technologies, providers, and regional contexts. These results support the suitability of the Random Forest method for capturing multilevel interactions that would be very difficult to assess using traditional statistical models [18].

The results offer new empirical evidence for the Latin American region, where the use of machine learning applied to regulatory data is limited. By demonstrating that forecasts can be generated from public complaint records, this research supports the incorporation of predictive analytics into service quality monitoring [10].

For telecommunications companies, the model presents an opportunity to improve the management of technical operations staff, anticipate problems, and increase customer satisfaction through preventive interventions. For regulatory agencies, the predictive findings can contribute to compliance monitoring and infrastructure forecasting and enable more effective follow-up actions [12].

The research offers practical, social, and theoretical contributions. Practically, the predictive model provides a useful operational tool, allowing for the anticipation of complaint spikes and the identification of high-risk service categories or geographic areas [5]. From a social perspective, the model can improve the user experience by enabling the early detection of service failures. Theoretically, the findings expand the existing body of knowledge on machine learning applications in telecommunications by demonstrating the advantages of using ensemble methods with real-world regulatory data [2].

Despite these strengths, some limitations should also be noted.
The model is based on historical complaint data, which does not necessarily reflect real-time network conditions or new service technologies [14]. Furthermore, the underreporting of complaints, quite common in telecommunications databases, can bias predictions [19]. The research was also unable to incorporate external variables (such as weather indicators, network capacity, and infrastructure quality) that could have increased the accuracy of the predictions [20]. Future research could employ hybrid models, use complementary operational data, or extend the predictions to classify specific categories of telecommunications failures [15].

In summary, this study demonstrates that Random Forest is a reliable and scalable foundation for predictive analytics in telecommunications and offers practical benefits for operators and regulatory authorities. The results reinforce the importance of applying data-driven approaches to telecommunications research and provide a solid basis for managing quality of service prediction in Peru [28].

Conclusions

This study confirms that a predictive model built using Random Forest effectively describes telecommunications industry outage claims in Peru, based on historical data from 2016–2024. The model demonstrates good performance with an RMSE of 42.80, an MAE of 33.38, and an R² of 0.72, as it successfully describes the temporal, operational, and regional variations in claim behavior. These results confirm the general hypothesis, as well as all the specific hypotheses proposed in the research. The predictive model also provides operational value to operators by anticipating increases in complaints, facilitating the allocation of technical resources, and helping to ensure service continuity.
For regulatory agencies like OSIPTEL, the model adds analytical capacity for improving service quality monitoring, guiding supervisory actions, and supporting infrastructure planning based on proactive insights rather than reactive reports [1].

The theoretical contributions include expanding the literature on the application of machine learning techniques in telecommunications and demonstrating the benefits of ensemble techniques in modeling heterogeneous and nonlinear regulatory datasets. From a social perspective, the model fosters a better user experience by enabling the early detection of service degradation; from an operational standpoint, it allows for more efficient technical management of interventions. Furthermore, the results are aligned with SDG 9: resilient infrastructure, innovation, and sustainable industrialization [22].

For future work, it is suggested that additional datasets (such as climate indicators, network capacity, or real-time operational metrics) be integrated to improve predictive capabilities. It is also suggested that hybrid techniques combining machine learning and deep learning models be explored to provide greater detail in the classification of specific failure categories. Finally, expanding this framework to other regions of Latin America could strengthen its applicability and contribute to modernizing regulation in the region [5].

In summary, this study corroborated that Random Forest constitutes a reliable, scalable, and interpretable basis for predictive analysis of service quality in telecommunications in Peru, with significant benefits for operators, regulators, and users alike [9].

Declarations

Conflicts of interest: The authors declare no conflicts of interest.

Funding: No external funding was received for this research.
Author Contribution

FHPQ and EGFC jointly developed the research concept and study design. FHPQ performed data preprocessing, model training limits, and statistical evaluation. EGFC contributed to the literature review, methodological structure, and discussion of results. Both authors participated in the drafting, review, and approval of the final manuscript. Both authors are designated as corresponding authors.

Acknowledgement

The authors wish to thank César Vallejo University for the continuous and valuable assistance provided during the development of this research. The authors also wish to thank the Peruvian National Open Data Platform and OSIPTEL for making public telecommunications data available, which, in turn, enabled the analytical and predictive components of this study. No external funding was received for this research.

Data Availability

The data used in this study were obtained from the Peruvian Government’s Open Data platform (https://www.datosabiertos.gob.pe), which collects historical records of complaints and outages in telecommunications services. From this source, the dataset used to train and evaluate the Random Forest prediction model was constructed.

References

Alice, A. ¿Qué es la eficiencia operativa? IBM, 26 de marzo de 2024. [En línea]. Disponible: https://www.ibm.com/es-es/topics/operational-efficiency. [Consultado: 27 de mayo de 2025].

Banco Interamericano de Desarrollo (BID). Al menos 77 millones de personas sin acceso a internet de calidad en áreas rurales, BID, 1 de julio de 2020. [En línea]. Disponible: https://www.iadb.org/es/noticias/al-menos-77-millones-de-personas-sin-acceso-internet-de-calidad-en-areas-rurales. [Consultado: 27 de mayo de 2025].

Carvalho, J. Innovación proactiva vs. reactiva: ¿Cuál es el mejor enfoque para las micro y pequeñas empresas? PODIUM, no. 45, pp. 163–176, 2024. [En línea]. Disponible: https://revistas.uees.edu.ec/index.php/Podium/article/view/1179

Castro et al. La investigación aplicada y el desarrollo experimental en el fortalecimiento de las competencias de la sociedad del siglo XXI, Tecnura, vol. 27, no. 75, pp. 140–174, 2023. [En línea]. Disponible: https://doi.org/10.14483/22487638.19171

CEPAL, Tecnologías digitales para un nuevo futuro, 2021. [En línea]. Disponible: https://repositorio.cepal.org/server/api/core/bitstreams/879779be-c0a0-4e11-8e08-cf80b41a4fd9/content. [Consultado: 27 de mayo de 2025].

Código de Ética en Investigación de la Universidad César Vallejo, Universidad César Vallejo, 2022. [En línea]. Disponible: https://webadminportal.ucv.edu.pe/uploads/files/backup/RCUN-470-2022-UCV-Aprueba-actualizacion-del-Codigo-de-Etica-en-Investigacion-V01.pdf [Consultado: 27 de mayo de 2025].

Directiva CONCYTEC Nº 004-2016-CONCYTEC-DEGC, (s.f.). [En línea]. Disponible: https://transparencia.concytec.gob.pe/images/transparencia/2016/R.P.087-2016-P.pdf [Consultado: 27 de mayo de 2025].

Decreto Supremo Nº 033-2023-PCM, Presidencia del Consejo de Ministros, 2023. [En línea]. Disponible: https://cdn.www.gob.pe/uploads/document/file/4324311/DS%20N%C2%B0%20033-2023-PCM.pdf.pdf?v=1679694212 [Consultado: 27 de mayo de 2025].

López et al. Application of a Data Mining Model to Predict Customer Defection. Case of a Telecommunications Company in Peru, Journal of Wireless Mobile Networks, Ubiquitous Computing, and Dependable Applications, vol. 14, no. 1, pp. 144–158, 2023. [En línea]. Disponible: https://doi.org/10.58346/JOWUA.2023.I1.012

Miranda, P. Estudio de predicción en la radicación de quejas y reclamos por parte de usuarios inconformes en empresa de sanidad colombiana, Trabajo de Maestría, Universidad Complutense de Madrid, Madrid, España, 2021. [En línea]. Disponible: https://www.proquest.com/openview/d09cc7a0dcf3ae44bca1e3796a49a682/1?pq-origsite=gscholar&cbl=2026366&diss=y [Consultado: 27 de mayo de 2025].

ORCA, G. R. C. 6 métricas imprescindibles para la evaluación de riesgos operacionales, 9 de septiembre de 2024. [En línea]. Disponible: https://blog.orcagrc.com/6-m%C3%A9tricas-evaluaci%C3%B3n-riesgos-operacionales. [Consultado: 27 de mayo de 2025].

OSIPTEL. Diferencias entre reclamos, quejas y denuncias, (s.f.). [En línea]. Disponible: https://www.osiptel.gob.pe/portal-del-usuario/lo-que-debes-saber/guia-de-informacion-y-orientacion/diferencias-entre-reclamos-quejas-y-denuncias?utm_source=chatgpt.com [Consultado: 27 de mayo de 2025].

OSIPTEL. OSIPTEL: te explicamos cuándo y cómo presentar un reclamo por problemas con la prestación de tus servicios de telecomunicaciones, 30 de junio de 2022. [En línea]. Disponible: https://www.osiptel.gob.pe/portal-del-usuario/noticias/osiptel-te-explicamos-cuando-y-como-presentar-un-reclamo-por-problemas-con-la-prestacion-de-tus-servicios-de-telecomunicaciones/ [Consultado: 27 de mayo de 2025].

Peng et al. RLclean: An unsupervised integrated data cleaning framework based on deep reinforcement learning, Information Sciences, vol. 682, p. 121281, 2024. [En línea]. Disponible: https://doi.org/10.1016/j.ins.2024.121281

Plataforma Nacional de Datos Abiertos. Obtener datos abiertos del gobierno, 3 de agosto de 2022. [En línea]. Disponible: https://www.gob.pe/pl/7381-obtener-datos-abiertos-del-gobierno. [Consultado: 27 de mayo de 2025].

Ramos, C. Diseños de investigación experimental, CienciAmérica, vol. 10, no. 1, pp. 1–7, 2021. [En línea]. Disponible: https://doi.org/10.33210/ca.v10i1.356

Roca y, C., Mullor, M. CONTRASTE O PRUEBA DE HIPÓTESIS E INTRODUCCIÓN AL ANÁLISIS DE REGRESIÓN LINEAL O AJUSTE DE MÍNIMOS CUADRADOS. NOTAS PARA DOCTORANDOS, Revista Ingeniería, Matemáticas Y Ciencias De La Información, vol. 11, no. 21, pp. 13–25, 2024. [En línea]. Disponible: https://ojs.urepublicana.edu.co/index.php/ingenieria/article/view/921

Salman et al. Random Forest Algorithm Overview, Babylonian Journal of Machine Learning, pp. 69–79, 2024. [En línea]. Disponible: https://doi.org/10.58496/BJML/2024/007

Severino, M. K. & Peng, Y. Machine learning algorithms for fraud prediction in property insurance: Empirical evidence using real-world microdata, Machine Learning with Applications, vol. 5, p. 100074, 2021. [En línea]. Disponible: https://doi.org/10.1016/j.mlwa.2021.100074

Singh, P. Harnessing machine learning for predictive troubleshooting in telecom networks, Australian Journal of Machine Learning Research & Applications, vol. 3, no. 2, 2023. [En línea]. Disponible: https://sydneyacademics.com/index.php/ajmlra/article/view/108

Sumithra et al. Improving brain tumor diagnosis: A self-calibrated 1D residual network with random forest integration, Brain Research, 2025. [En línea]. Disponible: https://doi.org/10.1016/j.brainres.2025.149704

Diamantopoulou et al. Evaluation of the random forest regression machine learning technique as an alternative to ecoregional based regression taper modelling. Comput. Electron. Agric. 239, 110964. https://doi.org/10.1016/j.compag.2025.110964 (Dec. 2025). Part A.

Valdivieso, D. & Cañar, J. Transformational Leadership in Decision Making at the Virgen del Cisne Savings and Loan Cooperative, Latacunga Canton, LATAM Revista Latinoamericana De Ciencias Sociales Y Humanidades, vol. 6, no. 1, pp. 888–906, 2025. [En línea]. Disponible: https://doi.org/10.56712/latam.v6i1.3388

Ventura et al. Predicting the success of banking telemarketing through the use of decision trees, Innovación y Software, vol. 4, no. 1, pp. 122–137, 2023. [En línea]. Disponible: https://www.redalyc.org/journal/6738/673874721009/673874721009.pdf

Viana et al. Detección de fraudes por reclamos engañosos de clientes en entidades bancarias a través de técnicas de minería de datos: una revisión sistemática, Revista Ibérica de Sistemas e Tecnologias de Informação, no. 43, pp. 276–286, 2021. [En línea]. Disponible: https://www.proquest.com/openview/d6e8abf6973ed45892d16d473917df88/1?pq-origsite=gscholar&cbl=1006393. [Consultado: 27 de mayo de 2025].

Villarreal et al. Las familia como unidad de análisis en la investigación científica en medicina familiar, Revista mexicana de medicina familiar, vol. 9, no. 1, pp. 31–34, 2022. [En línea]. Disponible: https://doi.org/10.24875/rmf.21000064

Vizcaíno et al. Metodología de la investigación científica: guía práctica, Ciencia Latina Revista Científica Multidisciplinar, vol. 7, no. 4, pp. 9723–9762, 2023. [En línea]. Disponible: https://doi.org/10.37811/cl_rcm.v7i4.7658

Wen et al. A hybrid-optimized Random Forest interpretable model for debris flow susceptibility by prior model-based negative sampling, Advances in Space Research, 2025. [En línea]. Disponible: https://doi.org/10.1016/j.asr.2025.04.055

Diamantopoulou et al. Evaluation of the random forest regression machine learning technique as an alternative to ecoregional based regression taper modelling. Comput. Electron. Agric. 239, 110964. https://doi.org/10.1016/j.compag.2025.110964 (Dec. 2025). Part A.

Fellah, M., Ouhaibi, S., Belouaggadia, N. & Mansouri, K. Energy consumption forecasting and thermal insulator selection with random forest regression. Sci. Afr. 29, e. https://doi.org/10.1016/j.sciaf.2025.e02870 (Sep. 2025).

Additional Declarations

No competing interests reported.
Supplementary Files
datasetreclamos.xlsx

Status: Under Revision
Version 1 posted
Editorial decision: Revision requested 04 Feb, 2026
Reviews received at journal 03 Feb, 2026
Reviewers agreed at journal 29 Jan, 2026
Reviews received at journal 03 Jan, 2026
Reviewers agreed at journal 22 Dec, 2025
Reviewers agreed at journal 22 Dec, 2025
Reviewers invited by journal 19 Dec, 2025
Editor invited by journal 16 Dec, 2025
Editor assigned by journal 15 Dec, 2025
Submission checks completed at journal 15 Dec, 2025
First submitted to journal 11 Dec, 2025
© Research Square 2026 | ISSN 2693-5015 (online)
{"props":{"pageProps":{"initialData":{"identity":"rs-8341643","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Article","associatedPublications":[],"authors":[{"id":563111523,"identity":"4cf044a1-265a-47c1-8aad-c7435e12e427","order_by":0,"name":"Francis Homero Padilla Quispe","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAABHklEQVRIie2PsWrDMBCGTxSaxel8IJe+goLBuKTFryITiBdDAlk8hGIwOEtp1+Yt7DcQBNxFDxCTxd0zGAqlU6nkNNCCStZC9YHE6cTHfwdgsfxBBhkA8nlwqFStLyIAzn5VHKEVhodKK466TyugFeRfnZMKzV+2LcOri2Y/eh0WwSzESSsgHUcZzYVRcWsvUION1rvEo8MCr+9xygTIOMrcmpuUELmvdyHlLvHJk0TmqKcgxSbKMGHGFIzftBKWjTwqcSfIh1ZmnVlJ+pSo3Do+dKlWEpWS9Snm9V250LtM1nK6wF6R+7ngdewVbm0ejK6q5j29uX143lQdZ3dssIqrtluOLx9p3hpj8FiI712uzjmCGbPy88tisVj+OZ+1o1+bmMig6wAAAABJRU5ErkJggg==","orcid":"","institution":"Universidad César Vallejo","correspondingAuthor":true,"prefix":"","firstName":"Francis","middleName":"Homero Padilla","lastName":"Quispe","suffix":""},{"id":563111524,"identity":"70c8a092-0da5-4d6c-9500-db3cbde3009c","order_by":1,"name":"Erick Giovanny Flores Chacón","email":"","orcid":"","institution":"Universidad César Vallejo","correspondingAuthor":false,"prefix":"","firstName":"Erick","middleName":"Giovanny Flores","lastName":"Chacón","suffix":""}],"badges":[],"createdAt":"2025-12-12 
04:23:47","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-8341643/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-8341643/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":99307841,"identity":"3b2146aa-28fd-4928-9a32-e13f1066c7ce","added_by":"auto","created_at":"2025-12-31 16:06:55","extension":"docx","order_by":0,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":131916,"visible":true,"origin":"","legend":"","description":"","filename":"PADILLAManuscriptScientificReports.docx","url":"https://assets-eu.researchsquare.com/files/rs-8341643/v1/bd917bccc766bf6c59896de4.docx"},{"id":98823746,"identity":"63dba948-4ac7-4c17-9366-b17a5661121b","added_by":"auto","created_at":"2025-12-22 18:05:23","extension":"json","order_by":1,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":4755,"visible":true,"origin":"","legend":"","description":"","filename":"04cb4b69237f482aa320f40bf6d380ce.json","url":"https://assets-eu.researchsquare.com/files/rs-8341643/v1/c5019259af710227ce9480ac.json"},{"id":98823752,"identity":"50b83242-9df0-4b63-a10c-8234bfd9d3a6","added_by":"auto","created_at":"2025-12-22 18:05:23","extension":"xlsx","order_by":2,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":3162505,"visible":true,"origin":"","legend":"","description":"","filename":"datasetreclamos.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-8341643/v1/24b039ba58f3d240da26b8ea.xlsx"},{"id":99308009,"identity":"3a719d4f-f11b-46b1-b842-c7e5a6b5944b","added_by":"auto","created_at":"2025-12-31 
16:07:23","extension":"xml","order_by":3,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":76228,"visible":true,"origin":"","legend":"","description":"","filename":"04cb4b69237f482aa320f40bf6d380ce1enriched.xml","url":"https://assets-eu.researchsquare.com/files/rs-8341643/v1/8726a8e512b2feee2ab8828b.xml"},{"id":99308063,"identity":"c277273a-05c7-44c6-bca7-4acc9b29a227","added_by":"auto","created_at":"2025-12-31 16:07:40","extension":"png","order_by":7,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":21717,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage1.png","url":"https://assets-eu.researchsquare.com/files/rs-8341643/v1/a05fe0ac475a75b5235b5e32.png"},{"id":99307893,"identity":"3f0a89f8-a536-4ab2-8362-d8d3f77590ef","added_by":"auto","created_at":"2025-12-31 16:06:59","extension":"png","order_by":8,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":14784,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage2.png","url":"https://assets-eu.researchsquare.com/files/rs-8341643/v1/995327033c292072b70c5ddf.png"},{"id":98823748,"identity":"a29b5e9a-8cca-4d98-b356-eed954eadae9","added_by":"auto","created_at":"2025-12-22 18:05:23","extension":"png","order_by":9,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":29965,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage3.png","url":"https://assets-eu.researchsquare.com/files/rs-8341643/v1/682dfb2815d22bd19cf74206.png"},{"id":98823753,"identity":"476caf7b-bf08-41a0-a1c0-80c23f9b9489","added_by":"auto","created_at":"2025-12-22 
18:05:23","extension":"xml","order_by":10,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":73505,"visible":true,"origin":"","legend":"","description":"","filename":"04cb4b69237f482aa320f40bf6d380ce1structuring.xml","url":"https://assets-eu.researchsquare.com/files/rs-8341643/v1/06aabf29e71e953707075fec.xml"},{"id":98823750,"identity":"5da54c4d-4b1a-4f4f-a4e9-334eb337ef54","added_by":"auto","created_at":"2025-12-22 18:05:23","extension":"html","order_by":11,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":85806,"visible":true,"origin":"","legend":"","description":"","filename":"earlyproof.html","url":"https://assets-eu.researchsquare.com/files/rs-8341643/v1/4ce0e87823392f18b885e37c.html"},{"id":98823742,"identity":"06322374-8549-463c-9213-3ed57e1cb289","added_by":"auto","created_at":"2025-12-22 18:05:23","extension":"jpeg","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":50185,"visible":true,"origin":"","legend":"\u003cp\u003eTechnological Architecture\u003c/p\u003e","description":"","filename":"floatimage1.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-8341643/v1/391c25041693a9a132ba4e84.jpeg"},{"id":98823751,"identity":"4442574b-7914-46a6-8714-3a369edb49da","added_by":"auto","created_at":"2025-12-22 18:05:23","extension":"jpeg","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":83465,"visible":true,"origin":"","legend":"\u003cp\u003eEvolution of the Random Forest model's performance per iteration\u003c/p\u003e","description":"","filename":"floatimage2.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-8341643/v1/a2846b36ee1a5a6257ecff74.jpeg"},{"id":98823743,"identity":"a2cb053d-ff25-46db-b74a-9e5a7ac43963","added_by":"auto","created_at":"2025-12-22 18:05:23","extension":"jpeg","order_by":3,"title":"Figure 
3","display":"","copyAsset":false,"role":"figure","size":136119,"visible":true,"origin":"","legend":"\u003cp\u003eImportance test of predictor variables\u003c/p\u003e","description":"","filename":"floatimage3.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-8341643/v1/5ad968a9ddb8b2ca46764339.jpeg"},{"id":99322367,"identity":"c883791b-11f1-4115-90bb-266ea15b4f68","added_by":"auto","created_at":"2025-12-31 16:43:29","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1059838,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-8341643/v1/4d2f8e17-5f16-4878-8aa0-c16075b1f104.pdf"},{"id":98823754,"identity":"5fb3b176-7524-4f90-a5d5-0d150a6cc65e","added_by":"auto","created_at":"2025-12-22 18:05:23","extension":"xlsx","order_by":0,"title":"","display":"","copyAsset":false,"role":"supplement","size":3162505,"visible":true,"origin":"","legend":"","description":"","filename":"datasetreclamos.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-8341643/v1/f189af21c1391d99057a0372.xlsx"}],"financialInterests":"No competing interests reported.","formattedTitle":"Random Forest Model for Predicting Claims for Outages in Telecommunications Operating Companies in Peru (2016–2024)","fulltext":[{"header":"Introduction","content":"\u003cp\u003eTelecommunications services have become essential for the development of economic activity, for the population, and for social connectivity. 
With the growth of network infrastructure and the increasing demands on services, operators continually face the challenge of maintaining stable operation and sufficient operational quality.\u003csup\u003e\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e\u003c/sup\u003e The high volume of complaints regarding service outages is a key indicator of service degradation, reflecting user dissatisfaction and underlying faults in fixed and mobile networks. Reporting incidents often requires technical personnel to travel to customer premises, thus increasing operational repercussions and impacting contractual service level agreements. \u003csup\u003e2\u003c/sup\u003e\u003c/p\u003e \u003cp\u003eIn Peru, the regulator OSIPTEL oversees the quality of telecommunications services and makes information on service outage complaints available to the public through the National Open Data Platform. It is also important to note that these records cover many operators and service categories, making them a valuable source of information in terms of their temporal, operational, and behavioral nature. However, despite this widespread availability of data, we\u003c/p\u003e \u003cp\u003ehave observed that most operational decisions are reactive, based on historical trends and manual analysis rather than predictive intelligence. Consequently, when a user files a complaint about a service interruption, the outage must be addressed after the complaint has already been submitted, limiting the capacity for both proactive maintenance and resource planning. \u003csup\u003e12\u003c/sup\u003e\u003c/p\u003e \u003cp\u003eIn recent years, machine learning techniques have gained importance for tackling complex classification and forecasting problems in the telecommunications field. 
Among these are ensemble models such as Random Forest, which have performed well in handling nonlinear relationships, noisy information, and high-dimensional variables, making them suitable for predicting service failures and customer behavior. Previous research has applied these techniques to fault detection, anomaly classification, network performance forecasting, and outage prediction in multiple environments. However, within the regulatory and operational context of Peru, predictive models for fault claims remain largely unexplored. \u003csup\u003e18\u003c/sup\u003e\u003c/p\u003e \u003cp\u003eThis study seeks to address this gap by developing and evaluating a Random Forest predictive model based on complaint records from the period 2016–2024. The model predicts the number of outage complaints, estimates operational trends, and supports key decision-making by telecommunications operators and sector regulatory authorities. The research adopted a quantitative, applied, and non-experimental methodological approach, using real complaint data to construct the model. The model is designed to be practical for planning maintenance and services in support of service quality.\u003csup\u003e\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e\u003c/sup\u003e Furthermore, this research aligns with the objectives of the telecommunications sector in Peru, which requires reliable analytical tools to support infrastructure expansion, service continuity, and user satisfaction.\u003csup\u003e\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e\u003c/sup\u003e\u003c/p\u003e \u003cp\u003eFinally, a highly accurate predictive model enriches the scientific literature and the operational ecosystem of telecommunications services. 
The results provide a replicable approach applicable to other contexts and add to the evidence for using machine learning, in this case Random Forest, to manage service quality. \u003csup\u003e5\u003c/sup\u003e\u003c/p\u003e\n\u003ch3\u003eLiterature Review\u003c/h3\u003e\n\u003cp\u003eThe increasing heterogeneity of telecommunications systems has spurred efforts to predict service interruptions and increase operational reliability. Research in this field has so far focused on machine learning techniques capable of modeling nonlinear behavior, handling large volumes of data, and aiding proactive decision-making. Among these techniques, Random Forest has gained the most prominence due to its efficiency, interpretability, and resistance to overfitting.\u003csup\u003e\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e\u003c/sup\u003e\u003c/p\u003e \u003cp\u003eEarly work on network reliability used statistical techniques to model service interruptions, but these techniques were deemed insufficient to address the dynamic nature of modern telecommunications. With the advent of data-driven analysis methodologies, machine learning algorithms, particularly ensemble models, have shown far more satisfactory results in predicting service failures, detecting anomalies, and aiding network optimization. Random Forest is considered a robust approach because it combines many decision trees, each trained on a random subset of the data, and averages their outputs, which reduces variance and aids generalization, in addition to providing importance scores for the input variables. \u003csup\u003e19\u003c/sup\u003e\u003c/p\u003e \u003cp\u003eNumerous studies have explored the use of the Random Forest algorithm for predicting network failures and errors, demonstrating promising results. 
Previous studies include examples such as anomaly detection in mobile networks, classification of different types of service degradation events, prediction of service usage patterns, and optimization of resource allocation. These studies consistently deliver highly accurate predictive results and demonstrate the algorithm's remarkable ability to adapt to diverse network conditions and data structures. Furthermore, Random Forest has outperformed traditional regression models and standalone trees in predictive tasks involving heterogeneous and noisy operational data, characteristics naturally found in fault complaint logs.\u003csup\u003e\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e\u003c/sup\u003e\u003c/p\u003e \u003cp\u003eHowever, international literature also reports on the potential of machine learning to support service quality control, where predictive models can help regulatory bodies identify early indicators of service deterioration, improve existing control frameworks, and guide compliance assessments. In Latin America, however, the use of machine learning for regulatory support is quite limited, especially in the analysis of public data and in consumer protection systems. \u003csup\u003e10\u003c/sup\u003e\u003c/p\u003e \u003cp\u003eIn the Peruvian context, the telecommunications regulatory environment developed in response to incidents and the analysis of complaints is largely based on manual complaint analysis.\u003csup\u003e\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e\u003c/sup\u003e Despite the availability of historical complaint datasets from the National Open Data Platform, empirical knowledge of how to apply these datasets to generate predictions for operational intelligence is limited. 
This lack of knowledge presents an opportunity to apply machine learning-based prediction tools that can serve as a basis for future management by operators and regulators, anticipating service failures, allocating technical resources more effectively, and maintaining service continuity.\u003csup\u003e\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e\u003c/sup\u003e\u003c/p\u003e \u003cp\u003eIn support of this, a review of the literature shows that Random Forest is a robust and reliable option for predicting outage claims. Its ability to handle time-dependent patterns, categorical and numerical variables, and complex interactions makes it well suited to the characteristics of telecommunications data in Peru. This study applies Random Forest to a large, longitudinal dataset, providing new evidence to encourage the application of predictive analytics in the Peruvian telecommunications sector. \u003csup\u003e28\u003c/sup\u003e\u003c/p\u003e "},{"header":"Methods","content":"\u003cp\u003e \u003cb\u003eResearch design.\u003c/b\u003e \u003c/p\u003e\u003cp\u003eThis study employed a quantitative, applied, and non-experimental longitudinal research design. The objective was to develop a predictive model to estimate the number of service outage complaints filed by Peruvian telecommunications companies between 2016 and 2024. Variables were not manipulated and no new data were collected; instead, historical records compiled by the Peruvian government were analyzed. The proposed model takes these records as input to identify temporal patterns and train the machine learning model to generate projections. \u003csup\u003e16\u003c/sup\u003e\u003c/p\u003e\u003cp\u003e \u003cb\u003eData source.\u003c/b\u003e \u003c/p\u003e\u003cp\u003eThe database of complaints was obtained from Peru's National Open Data Platform, which provides datasets regulated by OSIPTEL, the national telecommunications supervisory body. 
This dataset contained 87,003 complaint records for service interruptions, operational failures, and technical malfunctions requiring on-site technical support at customer premises. \u003csup\u003e13\u003c/sup\u003e\u003c/p\u003e\u003cp\u003e \u003cb\u003ePopulation and Sample.\u003c/b\u003e \u003c/p\u003e\u003cp\u003eThe population consisted of all outage complaints reported nationwide by each telecommunications operator from January 2016 to June 2024; as the database is exhaustive and publicly available, the entire database was used as a sample, making sampling techniques unnecessary. \u003csup\u003e27\u003c/sup\u003e\u003c/p\u003e\u003ch3\u003eVariables\u003c/h3\u003e\u003cp\u003eThe study included operational, temporal, and categorical variables that could be relevant to the evolution of claims, which comprised:\u003c/p\u003e\u003cul\u003e \u003cli\u003e \u003cp\u003eDate of claim (day, month, year)\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eOperator (categorical)\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eType of service (fixed internet, mobile, cable, etc.)\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eGeographic location\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eClaim classification (events associated with breakdowns)\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eNumber of claims per period (aggregated for forecasting)\u003c/p\u003e \u003c/li\u003e \u003c/ul\u003e\u003cp\u003eThe dependent variable was the number of breakdown claims, which was modeled as a continuous outcome phenomenon. 
\u003csup\u003e27\u003c/sup\u003e\u003c/p\u003e\n\u003ch3\u003eData preprocessing\u003c/h3\u003e\n\u003cp\u003eThe following data cleaning and preparation procedures were carried out prior to training the model \u003csup\u003e\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e\u003c/sup\u003e:\u003c/p\u003e \u003cp\u003e \u003col\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003e \u003cb\u003eHandling missing values\u003c/b\u003e: records with NULL or inconsistent fields were corrected where possible and deleted where necessary.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003e \u003cb\u003eTemporal aggregation\u003c/b\u003e: claim counts were aggregated at the time interval that the exploratory data analysis indicated was appropriate.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003e \u003cb\u003eCategorical encoding\u003c/b\u003e: operator names, service categories, and regions were encoded using one-hot encoding.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003e \u003cb\u003eFeature engineering\u003c/b\u003e: variables such as month, year, and seasonality indicators were generated to capture temporal patterns in the data.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003e \u003cb\u003eNormalization\u003c/b\u003e: when necessary, continuous variables were normalized to improve model stability.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003c/ol\u003e \u003c/p\u003e\n\u003ch3\u003eModel architecture: Random Forest Regressor\u003c/h3\u003e\n\u003cp\u003eA Random Forest regressor was used because it performs well with heterogeneous and nonlinear datasets. 
The model was designed with:\u003c/p\u003e \u003cp\u003e \u003cul\u003e \u003cli\u003e \u003cp\u003eA set of several decision trees built on random subsets of the dataset.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eBootstrap aggregation (bagging) to reduce variance.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eRandom selection of features for each node to avoid overfitting.\u003c/p\u003e \u003c/li\u003e \u003c/ul\u003e \u003c/p\u003e \u003cp\u003eHyperparameters such as the number of trees, maximum depth, and minimum number of samples per division were optimized during the experimentation process. \u003csup\u003e29\u003c/sup\u003e Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e illustrates the proposed scheme.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e\n\u003ch3\u003eTraining and validation\u003c/h3\u003e\n\u003cp\u003eThe input data is divided into a training set (80%) and a test set (20%). To ensure a robust evaluation, k-fold cross-validation was applied during the hyperparameter fitting phase. 
\u003csup\u003e29\u003c/sup\u003e The training pipeline is illustrated in Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e.\u003c/p\u003e \u003cp\u003e \u003cb\u003eThe training pipeline included\u003c/b\u003e:\u003c/p\u003e \u003cp\u003e \u003col\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eModel initialization.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eGrid search over hyperparameters.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eCross-validation scoring.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eFinal training of the model with the complete training set.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eEvaluation on unseen data.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003c/ol\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003ecomplements the information described above.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"5\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eModel\u003c/p\u003e \u003c/th\u003e 
\u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eRMSE\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eMAE\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eR\u0026sup2;\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003ePerformance level\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eModel 1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e242.36\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e43.05\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.43\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eLow\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eModel 2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e246.31\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e45.05\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.41\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eLow\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eModel 3 (final)\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e\u003cb\u003e198.97\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e\u003cb\u003e33.38\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e 
\u003cp\u003e\u003cb\u003e0.72\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e\u003cb\u003eHigh\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eTable\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e. Comparison of model performance\u003c/p\u003e \u003cdiv id=\"Sec8\" class=\"Section2\"\u003e \u003ch2\u003eEvaluation metrics\u003c/h2\u003e \u003cp\u003eThe model's performance was evaluated using the following regression metrics:\u003c/p\u003e \u003cp\u003e \u003cul\u003e \u003cli\u003e \u003cp\u003eRoot Mean Square Error (RMSE): measures the typical magnitude of the prediction error, penalizing large errors more heavily.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eMean Absolute Error (MAE): reflects the average absolute deviation between predicted and observed values.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eCoefficient of Determination (R\u0026sup2;): reflects the proportion of variance explained by the model, that is, how much the model explains beyond predicting the mean.\u003c/p\u003e \u003c/li\u003e \u003c/ul\u003e \u003c/p\u003e \u003cp\u003eThresholds for acceptable performance were defined based on the literature and in accordance with the project requirements. \u003csup\u003e28\u003c/sup\u003e\u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003eEthical considerations\u003c/h3\u003e\n\u003cp\u003eThis work used only publicly available, anonymized data provided by the Peruvian government. No personal identifiers or sensitive data were accessible. The research adheres to open data policies and ethical \u003csup\u003e\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e\u003c/sup\u003e standards for secondary data analysis. 
\u003csup\u003e7\u003c/sup\u003e\u003c/p\u003e"},{"header":"Results","content":"\u003cdiv id=\"Sec11\" class=\"Section2\"\u003e \u003ch2\u003eModel performance\u003c/h2\u003e \u003cp\u003eThe Random Forest model exhibited robust predictive performance across all evaluated metrics. In the test set, the model showed an RMSE of 42.80, an MAE of 33.38, and an R\u0026sup2; of 0.72, suggesting that 72% of the variance in outage claims is explained by the model. All these results met the predefined acceptance criteria and align with performance levels reported in the telecommunications demand forecasting literature. \u003csup\u003e30\u003c/sup\u003e\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec12\" class=\"Section2\"\u003e \u003ch2\u003eComparison of expected values vs. predicted values\u003c/h2\u003e \u003cp\u003eA visual comparison of predicted and observed claim counts showed a clear correlation throughout the analyzed time period. The model continued to link major peaks to seasonal increases in service outages and regulatory events that have historically shown increases in claims. Smaller variations were also predicted, although a slight underestimation was noted in isolated months with high variance. 
\u003csup\u003e21\u003c/sup\u003e\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec13\" class=\"Section2\"\u003e \u003ch2\u003eRelevance of the variables\u003c/h2\u003e \u003cp\u003eThe Random Forest model was used to estimate the relative contribution of each variable to prediction accuracy. The most influential predictors were the following: \u003csup\u003e22\u003c/sup\u003e\u003c/p\u003e \u003cp\u003e \u003cul\u003e \u003cli\u003e \u003cp\u003eYear and month of the claim, capturing pronounced temporal patterns.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eType of service, reflecting the different breakdown frequencies of fixed and mobile services.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eOperator, reflecting the variability in service performance among providers.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eGeographic location, highlighting regional differences in infrastructure robustness.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eSeasonality indicators, capturing climatic and operational cycles.\u003c/p\u003e \u003c/li\u003e \u003c/ul\u003e \u003c/p\u003e \u003cp\u003eTemporal attributes were the most important, confirming the cyclical nature of claim patterns.\u003csup\u003e22\u003c/sup\u003e\u003c/p\u003e \u003cp\u003eThe variable importances are shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec14\" class=\"Section2\"\u003e \u003ch2\u003ePerformance by analytical dimensions: prediction of claim frequency (FR)\u003c/h2\u003e \u003cp\u003eThe results demonstrate robust performance in predicting the overall number of outage claims, with similar accuracy for fixed and mobile services, suggesting that the model generalizes across service types. 
Performance metrics fell within the expected range defined before the analysis began, confirming the main hypothesis. \u003csup\u003e12\u003c/sup\u003e\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec15\" class=\"Section2\"\u003e \u003ch2\u003eClaims Severity Prediction (SR)\u003c/h2\u003e \u003cp\u003eThe categories linked to the severity of claims also showed strong predictive behavior, with RMSE and MAE values remaining within the pre-established acceptance limits, confirming that Random Forest can model not only the frequency but also the severity of operational problems.\u003csup\u003e25\u003c/sup\u003e\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec16\" class=\"Section2\"\u003e \u003ch2\u003eOperational efficiency analysis (EO)\u003c/h2\u003e \u003cp\u003eThe model can also estimate levels of operational efficiency, defined by the regularity and predictability of claims behavior on a monthly basis. Operators with good operational performance showed more predictable patterns, while those with highly variable behavior exhibited a greater prediction error, which is acceptable given the behavior of ensemble models. \u003csup\u003e1\u003c/sup\u003e\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec17\" class=\"Section2\"\u003e \u003ch2\u003eValidation of the hypotheses\u003c/h2\u003e \u003cp\u003eAll of the hypotheses formulated in the research have been empirically supported:\u003c/p\u003e \u003cp\u003e \u003cstrong\u003eGeneral hypothesis\u003c/strong\u003e \u003cp\u003eThe Random Forest model is able to correctly predict outage claims using historical telecommunications data (validated, R\u0026sup2; = 0.72). 
\u003csup\u003e17\u003c/sup\u003e\u003c/p\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec18\" class=\"Section2\"\u003e \u003ch2\u003eSpecific hypotheses:\u003c/h2\u003e \u003cp\u003e \u003cul\u003e \u003cli\u003e \u003cp\u003eThe model can predict the frequency of claims with acceptable accuracy;\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eThe model can predict the severity of claims within expected margins of error;\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eThe model can predict operational efficiency linked to claims with acceptable consistency.\u003c/p\u003e \u003c/li\u003e \u003c/ul\u003e \u003c/p\u003e \u003cp\u003eThese results validate the statistical and operational feasibility of Random Forest for the proactive prediction of breakdown claims. \u003csup\u003e3\u003c/sup\u003e\u003c/p\u003e\u003cp\u003e \u003cb\u003eTable\u0026nbsp;2.\u003c/b\u003e Metric Results by Dimensions.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"No\" id=\"Taba\" border=\"1\"\u003e \u003ccolgroup cols=\"4\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMetrics\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eFR\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eMR\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eEO\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRMSE\u003c/p\u003e \u003c/td\u003e 
\u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e42.80\u0026thinsp;=\u0026thinsp;Low\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e58.64\u0026thinsp;=\u0026thinsp;Medium\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e71.55\u0026thinsp;=\u0026thinsp;Medium\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMAE\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e33.38\u0026thinsp;=\u0026thinsp;Medium\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e40.22\u0026thinsp;=\u0026thinsp;Medium\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e46.10\u0026thinsp;=\u0026thinsp;Medium\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eR2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e0.72\u0026thinsp;=\u0026thinsp;High\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e0.65\u0026thinsp;=\u0026thinsp;High\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e0.65\u0026thinsp;=\u0026thinsp;Medium\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRuler\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eTRUE\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eTRUE\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eTRUE\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003c/div\u003e"},{"header":"Discussion","content":"\u003cp\u003eThe evidence from this research indicates 
that the Random Forest method is a robust and effective approach for predicting service outage claims in the telecommunications sector in Peru. The model shows low prediction error, explains 72% of the variance in claim behavior, and meets all pre-established accuracy thresholds. These results illustrate the capacity of machine learning methods to support operational decisions in complex, heterogeneous, and time-dependent data contexts. \u003csup\u003e23\u003c/sup\u003e\u003c/p\u003e \u003cp\u003eThe high importance of the temporal variables indicates strongly cyclical complaint patterns. This aligns with previous studies that report seasonality in service interruptions due to weather conditions, infrastructure strain, and variability in user demand.\u003csup\u003e\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e\u003c/sup\u003e The importance of service type, operator, and geolocation variables is also consistent with existing literature, which indicates that outage behavior varies across technologies, providers, and regional contexts. These results support the suitability of the Random Forest method for capturing multilevel interactions that would be very difficult to assess using traditional statistical models. \u003csup\u003e18\u003c/sup\u003e\u003c/p\u003e \u003cp\u003eThe results offer new empirical evidence in the Latin American region, where the use of machine learning applied to regulatory data is limited. By demonstrating that forecasts can be generated from public complaint records, this research supports incorporating predictive analytics into service quality monitoring. \u003csup\u003e10\u003c/sup\u003e For telecommunications companies, the model presents an opportunity to improve the management of technical staff, anticipate problems, and increase customer satisfaction through preventive interventions. 
For regulatory agencies, the predictive findings can support compliance monitoring, infrastructure forecasting, and more effective follow-up actions.\u003csup\u003e12\u003c/sup\u003e\u003c/p\u003e \u003cp\u003eThe research offers practical, social, and theoretical contributions. Practically, the predictive model provides a useful operational tool for anticipating complaint spikes and identifying high-risk service categories or geographic areas. \u003csup\u003e5\u003c/sup\u003e From a social perspective, the model can improve the user experience by enabling the early detection of service failures. Theoretically, the findings expand the existing body of knowledge on machine learning applications in telecommunications by demonstrating the advantages of using ensemble methods with real-world regulatory data. \u003csup\u003e2\u003c/sup\u003e\u003c/p\u003e \u003cp\u003eDespite these strengths, some limitations should also be noted. The model is based on historical complaint data, which does not necessarily reflect real-time network conditions or new service technologies. \u003csup\u003e14\u003c/sup\u003e Furthermore, the underreporting of complaints, quite common in telecommunications databases, can bias predictions. \u003csup\u003e19\u003c/sup\u003e The research was also unable to incorporate external variables (such as weather indicators, network capacity, and infrastructure quality) that could have increased the accuracy of the predictions. \u003csup\u003e20\u003c/sup\u003e Future research could employ hybrid models, use complementary operational data, or extend the predictions to classify specific categories of telecommunications failures. 
\u003csup\u003e15\u003c/sup\u003e\u003c/p\u003e \u003cp\u003eIn summary, this study demonstrates that Random Forest is a reliable and scalable foundation for predictive analytics in telecommunications and offers practical benefits for operators and regulatory authorities. The results reinforce the importance of applying data-driven approaches to telecommunications research and provide a solid basis for service quality prediction in Peru. \u003csup\u003e28\u003c/sup\u003e\u003c/p\u003e"},{"header":"Conclusions","content":"\u003cp\u003eThis study confirms that a predictive model built using Random Forest effectively describes telecommunications industry outage claims in Peru, based on historical data from 2016\u0026ndash;2024. The model demonstrates good performance, with an RMSE of 42.80, an MAE of 33.38, and an R\u0026sup2; of 0.72, and captures the temporal, operational, and regional variations in claim behavior. These results confirm the general hypothesis, as well as all the specific hypotheses proposed in the research.\u003c/p\u003e \u003cp\u003eThe predictive model also provides operational value by enabling operators to anticipate increases in complaints, which facilitates the allocation of technical resources and helps maintain service continuity. For regulatory agencies like OSIPTEL, the model provides analytical capacity for improving service quality monitoring, guiding supervisory actions, and supporting infrastructure planning based on proactive insights rather than reactive reports. \u003csup\u003e1\u003c/sup\u003e\u003c/p\u003e \u003cp\u003eThe theoretical contributions include expanding the literature on the application of machine learning techniques in telecommunications and demonstrating the benefits of ensemble techniques in modeling heterogeneous and nonlinear regulatory datasets. 
From a social perspective, the model fosters a better user experience by enabling the early detection of service degradation; from an operational standpoint, it allows for more efficient technical management of interventions. The results are also aligned with SDG 9: resilient infrastructure, innovation, and sustainable industrialization. \u003csup\u003e22\u003c/sup\u003e\u003c/p\u003e \u003cp\u003eFuture work should integrate additional datasets (such as climate indicators, network capacity, or real-time operational metrics) to improve predictive capabilities. Hybrid techniques combining machine learning and deep learning models should also be explored to provide greater detail in the classification of specific failure categories. Finally, expanding this framework to other regions of Latin America could strengthen its applicability and contribute to modernizing regulations in the region. \u003csup\u003e5\u003c/sup\u003e\u003c/p\u003e \u003cp\u003eIn summary, this study corroborated that Random Forest constitutes a reliable, scalable, and interpretable basis for predictive analysis of service quality in telecommunications in Peru, with significant benefits for operators, regulators, and users alike. \u003csup\u003e9\u003c/sup\u003e\u003c/p\u003e"},{"header":"Declarations","content":"\u003ch2\u003eConflicts of interest\u003c/h2\u003e \u003cp\u003eThe authors declare no conflicts of interest.\u003c/p\u003e\u003ch2\u003eFunding\u003c/h2\u003e \u003cp\u003eNo external funding was received for this research.\u003c/p\u003e\u003ch2\u003eAuthor Contribution\u003c/h2\u003e\u003cp\u003eFHPQ and EGFC jointly developed the research concept and study design. FHPQ performed data preprocessing, model training limits, and statistical evaluation. EGFC contributed to the literature review, methodological structure, and discussion of results. 
Both authors participated in the drafting, review, and approval of the final manuscript. Both authors are designated as corresponding authors.\u003c/p\u003e\u003ch2\u003eAcknowledgement\u003c/h2\u003e\u003cp\u003eThe authors wish to thank C\u0026eacute;sar Vallejo University for the continuous and valuable assistance provided during the development of this research. The authors also wish to thank the Peruvian National Open Data Platform and OSIPTEL for making public telecommunications data available, which enabled the analytical and predictive components of this study.\u003c/p\u003e\u003ch2\u003eData Availability\u003c/h2\u003e\u003cp\u003eThe data used in this study were obtained from the Peruvian Government\u0026rsquo;s Open Data platform (https://www.datosabiertos.gob.pe), which collects historical records of complaints and outages in telecommunications services. The dataset used to train and evaluate the Random Forest prediction model was constructed from this source.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eAlice, A. \u0026iquest;Qu\u0026eacute; es la eficiencia operativa? IBM, 26 de marzo de 2024. [En l\u0026iacute;nea]. Disponible: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.ibm.com/es-es/topics/operational-efficiency\u003c/span\u003e\u003cspan address=\"https://www.ibm.com/es-es/topics/operational-efficiency\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. [Consultado: 27 de mayo de 2025].\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBanco Interamericano de Desarrollo (BID). Al menos 77 millones de personas sin acceso a internet de calidad en \u0026aacute;reas rurales, BID, 1 de julio de 2020. [En l\u0026iacute;nea]. 
Disponible: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.iadb.org/es/noticias/al-menos-77-millones-de-personas-sin-acceso- internet-de-calidad-en-areas-rurales\u003c/span\u003e\u003cspan address=\"https://www.iadb.org/es/noticias/al-menos-77-millones-de-personas-sin-acceso- internet-de-calidad-en-areas-rurales\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. [Consultado: 27 de mayo de 2025].\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCarvalho, J. Innovaci\u0026oacute;n proactiva vs. reactiva: \u0026iquest;Cu\u0026aacute;l es el mejor enfoque para las micro y peque\u0026ntilde;as empresas? \u003cem\u003ePODIUM\u003c/em\u003e, no. 45, pp. 163\u0026ndash;176, [En l\u0026iacute;nea]. Disponible: (2024). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://revistas.uees.edu.ec/index.php/Podium/article/view/1179\u003c/span\u003e\u003cspan address=\"https://revistas.uees.edu.ec/index.php/Podium/article/view/1179\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCastro et al. La investigaci\u0026oacute;n aplicada y el desarrollo experimental en el fortalecimiento de las competencias de la sociedad del siglo XXI, Tecnura, vol. 27, no. 75, pp. 140\u0026ndash;174, [En l\u0026iacute;nea]. Disponible: (2023). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.14483/22487638.19171\u003c/span\u003e\u003cspan address=\"10.14483/22487638.19171\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCEPAL, Tecnolog\u0026iacute;as digitales para un nuevo futuro. [En l\u0026iacute;nea]. Disponible: (2021). 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://repositorio.cepal.org/server/api/core/bitstreams/879779be-c0a0-4e11-\u003c/span\u003e\u003cspan address=\"https://repositorio.cepal.org/server/api/core/bitstreams/879779be-c0a0-4e11-\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e 8e08-cf80b41a4fd9/content. [Consultado: 27 de mayo de 2025].\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eC\u0026oacute;digo de \u0026Eacute;tica en Investigaci\u0026oacute;n de la Universidad C\u0026eacute;sar Vallejo, Universidad C\u0026eacute;sar Vallejo. [En l\u0026iacute;nea]. Disponible: (2022). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://webadminportal.ucv.edu.pe/uploads/files/backup/RCUN-470-2022-UCV- Aprueba-actualizacion-del-Codigo-de-Etica-en-Investigacion-V01.pdf\u003c/span\u003e\u003cspan address=\"https://webadminportal.ucv.edu.pe/uploads/files/backup/RCUN-470-2022-UCV- Aprueba-actualizacion-del-Codigo-de-Etica-en-Investigacion-V01.pdf\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e [Consultado: 27 de mayo de 2025].\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDirectiva, C. O. N. C. Y. T. E. C. N\u0026ordm; 004-2016-CONCYTEC-DEGC, (s.f.). [En l\u0026iacute;nea]. Disponible: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://transparencia.concytec.gob.pe/images/transparencia/2016/R.P.087-2016-\u003c/span\u003e\u003cspan address=\"https://transparencia.concytec.gob.pe/images/transparencia/2016/R.P.087-2016-\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e P.pdf [Consultado: 27 de mayo de 2025].\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDecreto Supremo N\u0026ordm; 033-2023-PCM, Presidencia del Consejo de Ministros, 2023. [En l\u0026iacute;nea]. 
Disponible: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://cdn.www.gob.pe/uploads/document/file/4324311/DS%20N%C2%B0%20033- 2023-PCM.pdf.pdf?v\u0026thinsp;=\u0026thinsp;1679694212\u003c/span\u003e\u003cspan address=\"https://cdn.www.gob.pe/uploads/document/file/4324311/DS%20N%C2%B0%20033- 2023-PCM.pdf.pdf?v\u0026thinsp;=\u0026thinsp;1679694212\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e [Consultado: 27 de mayo de 2025].\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eL\u0026oacute;pez et al. Application of a Data Mining Model to Predict Customer Defection. Case of a Telecommunications Company in Peru, Journal of Wireless Mobile Networks, Ubiquitous Computing, and Dependable Applications, vol. 14, no. 1, pp. 144\u0026ndash;158, [En l\u0026iacute;nea]. Disponible: (2023). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.58346/JOWUA.2023.I1.012\u003c/span\u003e\u003cspan address=\"10.58346/JOWUA.2023.I1.012\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMiranda, P. Estudio de predicci\u0026oacute;n en la radicaci\u0026oacute;n de quejas y reclamos por parte de usuarios inconformes en empresa de sanidad colombiana, Trabajo de Maestr\u0026iacute;a, Universidad Complutense de Madrid, Madrid, Espa\u0026ntilde;a, 2021. [En l\u0026iacute;nea]. 
Disponible: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.proquest.com/openview/d09cc7a0dcf3ae44bca1e3796a49a682/1?pq\u003c/span\u003e\u003cspan address=\"https://www.proquest.com/openview/d09cc7a0dcf3ae44bca1e3796a49a682/1?pq\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e- origsite\u0026thinsp;=\u0026thinsp;gscholar\u0026amp;cbl\u0026thinsp;=\u0026thinsp;2026366\u0026amp;diss\u0026thinsp;=\u0026thinsp;y [Consultado: 27 de mayo de 2025].\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eORCA, G. R. C. 6 m\u0026eacute;tricas imprescindibles para la evaluaci\u0026oacute;n de riesgos operacionales, 9 de septiembre de 2024. [En l\u0026iacute;nea]. Disponible: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://blog.orcagrc.com/6- m%C3%A9tricas-evaluaci%C3%B3n-riesgos-operacionales\u003c/span\u003e\u003cspan address=\"https://blog.orcagrc.com/6- m%C3%A9tricas-evaluaci%C3%B3n-riesgos-operacionales\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. [Consultado: 27 de mayo de 2025].\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eOSIPTEL. Diferencias entre reclamos, quejas y denuncias, (s.f.). [En l\u0026iacute;nea]. Disponible: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.osiptel.gob.pe/portal-del-usuario/lo-que-debes-saber/guia-de- informacion-y-orientacion/diferencias-entre-reclamos-quejas-y-\u003c/span\u003e\u003cspan address=\"https://www.osiptel.gob.pe/portal-del-usuario/lo-que-debes-saber/guia-de- informacion-y-orientacion/diferencias-entre-reclamos-quejas-y-\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e denuncias?utm_source\u0026thinsp;=\u0026thinsp;chatgpt.com [Consultado: 27 de mayo de 2025].\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eOSIPTEL \u0026amp; OSIPTEL: te explicamos. 
cu\u0026aacute;ndo y c\u0026oacute;mo presentar un reclamo por problemas con la prestaci\u0026oacute;n de tus servicios de telecomunicaciones, 30 de junio de 2022. [En l\u0026iacute;nea]. Disponible: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.osiptel.gob.pe/portal-del-usuario/noticias/osiptel-te- explicamos-cuando-y-como-presentar-un-reclamo-por-problemas-con-la-prestacion- de-tus-servicios-de-telecomunicaciones/\u003c/span\u003e\u003cspan address=\"https://www.osiptel.gob.pe/portal-del-usuario/noticias/osiptel-te- explicamos-cuando-y-como-presentar-un-reclamo-por-problemas-con-la-prestacion- de-tus-servicios-de-telecomunicaciones/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e [Consultado: 27 de mayo de 2025].\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePeng et al. RLclean: An unsupervised integrated data cleaning framework based on deep reinforcement learning, Information Sciences, vol. 682, p. 121281, [En l\u0026iacute;nea]. Disponible: (2024). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.ins.2024.121281\u003c/span\u003e\u003cspan address=\"10.1016/j.ins.2024.121281\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePlataforma Nacional de Datos Abiertos. Obtener datos abiertos del gobierno, 3 de agosto de 2022. [En l\u0026iacute;nea]. Disponible: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.gob.pe/pl/7381-obtener-datos- abiertos-del-gobierno\u003c/span\u003e\u003cspan address=\"https://www.gob.pe/pl/7381-obtener-datos- abiertos-del-gobierno\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. [Consultado: 27 de mayo de 2025].\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRamos, C. 
Dise\u0026ntilde;os de investigaci\u0026oacute;n experimental, CienciAm\u0026eacute;rica, vol. 10, no. 1, pp. 1\u0026ndash;7, [En l\u0026iacute;nea]. Disponible: (2021). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.33210/ca.v10i1.356\u003c/span\u003e\u003cspan address=\"10.33210/ca.v10i1.356\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRoca y, C., Mullor, M. \u0026amp; CONTRASTE O PRUEBA DE HIP\u0026Oacute;TESIS E INTRODUCCI\u0026Oacute;N AL AN\u0026Aacute;LISIS DE REGRESI\u0026Oacute;N LINEAL O AJUSTE DE M\u0026Iacute;NIMOS CUADRADOS. NOTAS PARA DOCTORANDOS, Revista Ingenier\u0026iacute;a, Matem\u0026aacute;ticas Y Ciencias De La Informaci\u0026oacute;n, vol. 11, no. 21, pp. 13\u0026ndash;25, [En l\u0026iacute;nea]. Disponible: (2024). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://ojs.urepublicana.edu.co/index.php/ingenieria/article/view/921\u003c/span\u003e\u003cspan address=\"https://ojs.urepublicana.edu.co/index.php/ingenieria/article/view/921\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSalman et al. Random Forest Algorithm Overview, Babylonian Journal of Machine Learning, pp. 69\u0026ndash;79, 2024. [En l\u0026iacute;nea]. Disponible: (2024). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.58496/BJML/2024/007\u003c/span\u003e\u003cspan address=\"10.58496/BJML/2024/007\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSeverino, M. K. \u0026amp; Peng, Y. y Machine learning algorithms for fraud prediction in property insurance: Empirical evidence using real-world microdata, Machine Learning with Applications, vol. 5, p. 100074, [En l\u0026iacute;nea]. 
Disponible: (2021). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.mlwa.2021.100074\u003c/span\u003e\u003cspan address=\"10.1016/j.mlwa.2021.100074\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSingh, P. Harnessing machine learning for predictive troubleshooting in telecom networks, Australian Journal of Machine Learning Research \u0026amp; Applications, vol. 3, no. 2, [En l\u0026iacute;nea]. Disponible: (2023). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://sydneyacademics.com/index.php/ajmlra/article/view/108\u003c/span\u003e\u003cspan address=\"https://sydneyacademics.com/index.php/ajmlra/article/view/108\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSumithra et al. Improving brain tumor diagnosis: A self-calibrated 1D residual network with random forest integration, Brain Research, [En l\u0026iacute;nea]. Disponible: (2025). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.brainres.2025.149704\u003c/span\u003e\u003cspan address=\"10.1016/j.brainres.2025.149704\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDiamantopoulou et al. Evaluation of the random forest regression machine learning technique as an alternative to ecoregional based regression taper modelling. \u003cem\u003eComput. Electron. Agric.\u003c/em\u003e \u003cb\u003e239\u003c/b\u003e, 110964. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.compag.2025.110964\u003c/span\u003e\u003cspan address=\"10.1016/j.compag.2025.110964\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e (Dec. 2025). 
Part A.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eValdivieso, D. \u0026amp; Ca\u0026ntilde;ar, J. y Transformational Leadership in Decision Making at the Virgen del Cisne Savings and Loan Cooperative, Latacunga Canton, LATAM Revista Latinoamericana De Ciencias Sociales Y Humanidades, vol. 6, no. 1, pp. 888\u0026ndash;906, [En l\u0026iacute;nea]. Disponible: (2025). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.56712/latam.v6i1.3388\u003c/span\u003e\u003cspan address=\"10.56712/latam.v6i1.3388\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eVentura et al. Predicting the success of banking telemarketing through the use of decision trees, Innovaci\u0026oacute;n y Software, vol. 4, no. 1, pp. 122\u0026ndash;137, [En l\u0026iacute;nea]. Disponible: (2023). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.redalyc.org/journal/6738/673874721009/673874721009.pdf\u003c/span\u003e\u003cspan address=\"https://www.redalyc.org/journal/6738/673874721009/673874721009.pdf\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eViana et al. Detecci\u0026oacute;n de fraudes por reclamos enga\u0026ntilde;osos de clientes en entidades bancarias a trav\u0026eacute;s de t\u0026eacute;cnicas de miner\u0026iacute;a de datos: una revisi\u0026oacute;n sistem\u0026aacute;tica, Revista Ib\u0026eacute;rica de Sistemas e Tecnologias de Informa\u0026ccedil;\u0026atilde;o, no. 43, pp. 276\u0026ndash;286, [En l\u0026iacute;nea]. Disponible: (2021). 

Citation neighborhood (no data yet)

We don't have any in-corpus citations linked to this paper yet. This is a recent paper (2025) — citers typically take a year or two to land, and the OpenAlex reference graph may still be filling in.

Source provenance

europepmc
last seen: 2026-05-20T01:45:00.602351+00:00