Data quality and Big Data in the health industry: a scoping review protocol

preprint OA: closed

Abstract

Introduction

Big Data is characterized by the large volume of data, the variety of types and formats, the speed at which the data are generated, and the veracity and value that can be extracted from them. However, the results obtained with this technology depend on the quality of the information obtained from the data. Big Data has great potential in healthcare and can be used to advance diagnosis, treatment, and healthcare management. Health data are highly sensitive because they contain personal and confidential information. If exposed or compromised, they could lead to privacy violations, inaccuracies, misuse, incorrect diagnoses, or misguided decision-making in patient care. It is therefore important to prioritize confidentiality, adhere to regulatory requirements, and maintain data integrity, which in turn requires efficient methods for obtaining quality data that are fit for their intended purpose.

Objective

In this context, the scoping review protocol aims to identify and map existing strategies, methods, or models that improve the quality of medical and health data in Big Data environments. The review explores methods that support the effective use of Big Data in healthcare while addressing the challenges of maintaining data integrity and ensuring safe decision-making.

Methods and analysis

This scoping review will be conducted based on the six-step process outlined in the framework proposed by Levac et al. in “Scoping Studies: Advancing the methodology” and will be reported following the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews) checklist. The research team will use Data Quality, Big Data, and Health terms to search for primary studies in the Scopus Document Search, IEEE Xplore Digital Library, and ACM Digital Library databases.

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

This study did not receive any funding.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes

Data Availability

All data produced in the present work are contained in the manuscript.
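The database search described under Methods and analysis combines three concepts: Data Quality, Big Data, and Health. The sketch below shows one way such a search string might be assembled. It is a minimal illustration, not the authors' actual search strategy. The function name `build_query` and the synonym "healthcare" are assumptions added here, and Scopus, IEEE Xplore, and ACM Digital Library each have their own field codes and syntax that a real query would need to follow.

```python
def build_query(concepts):
    """Join the terms of each concept with OR, then join the concepts with AND."""
    groups = []
    for terms in concepts:
        quoted = [f'"{term}"' for term in terms]   # quote terms so phrases are matched exactly
        groups.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(groups)


# The three concepts come from the protocol; any synonym is hypothetical.
concepts = [
    ["data quality"],
    ["big data"],
    ["health", "healthcare"],  # "healthcare" is an illustrative synonym, not from the protocol
]

query = build_query(concepts)
print(query)
# ("data quality") AND ("big data") AND ("health" OR "healthcare")
```

Keeping each concept as a separate list makes it easy to log exactly which terms were used in each database, which helps when the search is reported against the PRISMA-ScR checklist.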

Publication year: 2024.