Certainty or Confidence? A Conceptual Clarification of Construct Labeling in the GRADE Framework

doi:10.22541/au.177226207.79822218/v1



Authorea preprint · Version 1 · 28 February 2026

This is a preprint and has not been peer reviewed. Data may be preliminary.

Authors: Arturo Martí-Carvajal (ORCID 0000-0001-8677-3351) and David L. Streiner

Abstract

Background: The GRADE framework is widely used to assess the certainty of evidence in systematic reviews and guideline development. Its four categories (high, moderate, low, very low) represent structured judgments about how likely it is that the true effect lies within a specified range around the estimate. However, the construct labeled “certainty” is operationalized as a graded probabilistic judgment, which raises questions about potential conceptual ambiguity.
Objective: To examine whether the terminology “certainty of evidence” aligns with its operational definition within GRADE and to explore whether terminological clarification could enhance interpretive precision without altering methodological structure.

Methods: We conducted a conceptual analysis of key GRADE publications from 2004 to 2025, including GRADE Guidance papers, examining the evolution of terminology and its alignment with principles of construct validity in measurement theory.

Results: Early GRADE publications framed judgments primarily in terms of “confidence in estimates.” Subsequent guidance consolidated the terminology “certainty of evidence,” while retaining probabilistic and graded operational criteria. From a construct validity perspective, the operational definition corresponds to graded confidence rather than categorical epistemic certainty. Although this does not undermine the methodological integrity of GRADE, it may introduce interpretive ambiguity, particularly in interdisciplinary or high-stakes contexts.

Conclusions: Reframing “certainty of evidence” as “confidence in evidence” would preserve the analytic structure of GRADE while improving semantic alignment between construct label and operational function. Terminological refinement represents an incremental clarification consistent with GRADE’s tradition of methodological development.

Author affiliations

Arturo Martí-Carvajal¹,²*, David L. Streiner³

¹ Cátedra Rectoral de Medicina Basada en la Evidencia, Universidad de Carabobo, Valencia, Venezuela
² Facultad de Medicina, Universidad Francisco de Vitoria, Madrid, Spain
³ Department of Psychiatry and Behavioural Neurosciences, McMaster University, Hamilton, Ontario, Canada
* Corresponding author: Arturo Martí-Carvajal

Highlights

• GRADE operationalizes graded probabilistic judgments.
• The construct label “certainty” may not fully align with this operationalization.
• Construct-label alignment enhances interpretive clarity.
• A minimal terminological refinement could preserve methodology while improving precision.

1. Introduction

The GRADE framework (Grading of Recommendations, Assessment, Development and Evaluations) has become central to global evidence synthesis practice. Since its introduction in 2004, it has provided a structured and transparent approach for rating the certainty of evidence across five domains: risk of bias, inconsistency, indirectness, imprecision, and publication bias [1,2].

GRADE defines certainty as the extent to which one can be confident that an estimate of effect is correct [1]. Judgments are expressed in four ordered categories: high, moderate, low, and very low. While the operational criteria are explicit and structured, the terminology used to describe these judgments raises a conceptual question: does the construct labeled “certainty” align with its operationalization as a graded probabilistic judgment?

The purpose of this paper is not to challenge the analytic domains of GRADE. Rather, we examine whether terminological clarification could enhance conceptual alignment and interpretive precision, consistent with principles of construct validity in measurement theory [3].

2. Construct Validity and Terminological Alignment

In measurement science, a construct must demonstrate coherence between its conceptual label and its operational definition [3].
A misalignment between construct label and function does not invalidate a tool, but it may introduce interpretive ambiguity.

GRADE operationalizes certainty as a graded judgment regarding the likelihood that the true effect lies close to the estimated effect or beyond a decision threshold [4,5]. This structure corresponds to a graded probabilistic judgment of confidence in effect estimates. However, in broader epistemological discourse, certainty commonly denotes a categorical state exempt from reasonable doubt. By contrast, probabilistic confidence is inherently gradable.

The issue identified here concerns construct labeling rather than methodological function. GRADE evaluates domains that influence confidence in effect estimates; it does not claim to establish philosophical certainty. Nonetheless, the term “certainty” may carry stronger categorical connotations in some contexts, potentially generating interpretive variability. From a construct validity perspective, aligning terminology with operational definition enhances clarity and reduces semantic ambiguity.

3. Evolution of Terminology within GRADE

The foundational GRADE publication described quality of evidence in terms of “confidence that an estimate of effect is correct” [1]. Over time, subsequent publications consolidated the phrase “certainty of evidence” [2,4,6].

Recent guidance has clarified that certainty reflects the probability that the true effect lies within a specified range or beyond a defined threshold [4,6]. These clarifications emphasize the probabilistic nature of the construct. However, the categorical phrasing of “high certainty” or “very low certainty” remains.

This suggests that the tension is terminological rather than structural. The analytic domains and rating procedures remain consistent; the question concerns the semantic alignment of the label.

4. Interpretive Implications

Most experienced users of GRADE understand that its categories represent structured judgments of confidence.
However, systematic reviews increasingly inform regulatory, public policy, and legal decisions. In such contexts, language may influence interpretation beyond methodological communities. When graded probabilistic judgments are labeled as “certainty,” readers outside evidence-synthesis disciplines may interpret such judgments as stronger epistemic claims than intended.

Terminological refinement may therefore enhance interpretive precision without altering methodological content. This proposal should be understood as a refinement within the GRADE tradition of iterative methodological development, not as a challenge to its foundational architecture.

5. A Minimal Terminological Clarification

We propose replacing “certainty of evidence” with “confidence in evidence” or “confidence in effect estimates.” Such a change would:

• Align terminology with the original operational framing [1].
• Preserve the four-level structure.
• Maintain compatibility with existing GRADE tools.
• Reduce potential ambiguity between graded probabilistic judgment and categorical epistemic certainty.

Terminological refinement would not represent correction of a methodological error but clarification of the conceptual alignment between probabilistic judgment and linguistic expression.

6. Conclusion

GRADE represents a major advance in evidence synthesis methodology. Its structured domains and transparent processes have strengthened global practice. Our analysis suggests that the construct currently labeled “certainty” corresponds operationally to graded probabilistic judgment. Clarifying terminology may enhance semantic precision while preserving methodological integrity. Conceptual refinement at the level of construct labeling represents an incremental improvement consistent with ongoing methodological evolution.

Declarations

Conflicts of Interest: Arturo Martí-Carvajal has authored Cochrane systematic reviews using GRADE. David L.
Streiner has extensive experience in measurement theory. The authors declare no conflicts of interest related to this manuscript.

Funding: No specific funding was received.

Author Contributions: AMC: Conceptualization, drafting, revision. DLS: Conceptualization, measurement-theory framing, critical revision.

References

1. Atkins D, Best D, Briss PA, Eccles M, Falck-Ytter Y, Flottorp S, et al. Grading quality of evidence and strength of recommendations. BMJ. 2004;328(7454):1490. doi:10.1136/bmj.328.7454.1490
2. Guyatt GH, Oxman AD, Vist GE, Kunz R, Falck-Ytter Y, Alonso-Coello P, Schünemann HJ; GRADE Working Group. GRADE: an emerging consensus on rating quality of evidence and strength of recommendations. BMJ. 2008;336(7650):924-926. doi:10.1136/bmj.39489.470347.AD
3. Streiner DL, Norman GR, Cairney J. Health Measurement Scales: A Practical Guide to Their Development and Use. 5th ed. Oxford University Press; 2015.
4. Hultcrantz M, Rind D, Akl EA, Treweek S, Mustafa RA, Iorio A, et al. The GRADE Working Group clarifies the construct of certainty of evidence. J Clin Epidemiol. 2017;87:4-13. doi:10.1016/j.jclinepi.2017.05.006
5. Alper BS, Oettgen P, Kunnamo I, Iorio A, Ansari MT, Murad MH, et al. Defining certainty of net benefit: a GRADE concept paper. BMJ Open. 2019;9(3):e027445. doi:10.1136/bmjopen-2018-027445
6. Hultcrantz M, Schünemann HJ, Mustafa RA, Rind DM, Murad MH, Mayer M, et al. GRADE certainty ratings: thresholds rather than categories of contextualization (GRADE Guidance 41). Ann Intern Med. 2025;178(8):1183-1186. doi:10.7326/ANNALS-25-00548

COVER LETTER

Dear Editors,

We are pleased to submit our manuscript entitled “Certainty or Confidence? A Conceptual Clarification of Construct Labeling in the GRADE Framework” for consideration in Cochrane Evidence Synthesis and Methods. GRADE has profoundly shaped global evidence synthesis practice.
In this manuscript, we examine whether the construct labeled “certainty of evidence” aligns conceptually with its operationalization as a graded probabilistic judgment. Drawing on construct validity principles in measurement theory and reviewing the evolution of terminology in GRADE publications from 2004 to 2025, we suggest that a minimal terminological clarification, reframing “certainty” as “confidence,” could enhance semantic precision while preserving analytic structure.

Our intention is not to challenge the methodological architecture of GRADE, but to contribute to its ongoing refinement through conceptual clarification consistent with its probabilistic intent. We believe this manuscript fits the journal’s focus on methodological development and debate in evidence synthesis.

This manuscript is original, not under consideration elsewhere, and approved by all authors.

Sincerely,
Arturo Martí-Carvajal
David L. Streiner

Version history: Version 1, 28 February 2026.

Copyright: This work is licensed under a Non Exclusive No Reuse License.

Affiliations: Arturo Martí-Carvajal (ORCID 0000-0001-8677-3351), Universidad de Carabobo, Facultad de Ciencias de la Salud, Escuela de Medicina (Nucleo Carabobo). David L. Streiner, McMaster University, Department of Psychiatry and Behavioural Neurosciences.

Citation: Arturo Martí-Carvajal, David L. Streiner. Certainty or Confidence? A Conceptual Clarification of Construct Labeling in the GRADE Framework. Authorea. 28 February 2026. DOI: https://doi.org/10.22541/au.177226207.79822218/v1
