Exam Security and Technology Governance Questionnaire (ESTG-Q): A Modular Instrument with an Item-Level Attitude-Marking Mechanism for Exam Integrity Research

Research Article (Research Square preprint)

Dengfeng Hui, Shamala K. Subramaniam, Lili Nurliyana binti Abdullah, and 4 more

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-8502983/v1
This work is licensed under a CC BY 4.0 License. Status: Posted. Version 1.

Abstract

Background: High-stakes examinations are increasingly challenged by technology-enabled cheating and by governance dilemmas in implementing countermeasures. While prior research has examined student misconduct and proctoring technologies, the governance perspective of examination administrators, the frontline implementers of integrity policies, remains under-examined.
Methods: We developed the Exam Security and Technology Governance Questionnaire (ESTG-Q), a modular instrument comprising 280 questions. Development was guided by a provisional domain map (initially drafted as 16 topical categories and 22 hypothesized dimensions) that remains subject to refinement as the broader program progresses. For administration and reporting, items are presented in numerical order and organized into chapter-aligned thematic blocks; these blocks are navigation/reporting units rather than validated latent constructs. The instrument combines 233 Likert-type items with scenario-based dilemmas and contextual/background items. Item development followed domain mapping and iterative review. To improve traceability during item refinement, we used an LLM-assisted, CVI-style content-relevance screening (six independent runs under a fixed 4-point rubric) as additional evidence alongside human judgment. An optional “thumbs-up/thumbs-down/abstain” attitude-marking feature captured respondent-centered item acceptability. The instrument was fielded in a national sample of examination administrators in China (N = 207).

Results: The evidence indicated high item relevance agreement (>90% of items had I-CVI ≥ 0.75; S-CVI/Ave = 0.92 for all items and 0.97 for core-construct items). Whole-instrument internal consistency across the 233 Likert-type items was high (Cronbach’s α = 0.963). A seven-item Cheating Attribution subscale (Q236–Q242) showed acceptable reliability (α = 0.723) and a two-factor structure in EFA (Factor 1 = institutional deterrence beliefs, 39.77%; Factor 2 = role-based moral permissiveness, 28.09%; cumulative = 67.86%), supported by CFA (χ²/df = 2.10, CFI = 0.97, RMSEA = 0.06).

Conclusions: The ESTG-Q enables modular, profile-based assessment of exam security governance at scale while documenting item-level acceptability in sensitive integrity contexts.
It supports diagnostic use beyond a single total score and facilitates comparison between deterrence-oriented beliefs and role-based discretionary judgments.

Keywords: Educational integrity, exam security, technology governance, proctoring, questionnaire development, content relevance, psychometrics, administrators, China

1. Introduction

Educational integrity is governed at the boundary between rules, technology, and human judgment. As examination systems scale and digitize, cheating often adapts faster than governance, fueling an arms race between misconduct and control. Recent syntheses of online exam cheating describe an expanding repertoire of tactics that exploit consumer electronics, communication platforms, and loopholes in supervision (Noorbehbahani et al. 2022). Beyond opportunistic in-situ cheating, outsourcing markets and impersonation pressures, often discussed under the umbrella of contract cheating, add further governance load to assessment systems (Clarke & Lancaster 2013; Newton 2018; Bretag et al. 2019).

Countermeasures increasingly rely on surveillance and automation, including identity verification, signal detection/blocking, automated invigilation, and remote proctoring. While such tools may strengthen deterrence, they can also reshape the moral ecology of testing by altering trust, increasing stress, and raising concerns about privacy and procedural fairness (Kharbat & Abu Daabes 2021; Coghlan et al. 2021). Evidence on remote proctoring suggests that deterrence gains can coexist with concerns about student experience and validity; studies report score shifts after webcam proctoring and mixed evidence on whether cheating increased (Dendir & Maxwell 2020; Arnold 2022). Governance questions therefore extend beyond detection efficacy: they include legitimacy, proportionality, and the social acceptability of enforcement.

A conspicuous gap in the integrity literature is the governance perspective of the actors who implement exam policies.
Examination administrators interpret rules, deploy monitoring systems, manage staff, and decide how incidents are handled. Classic work on policy implementation emphasizes how such street-level discretion shapes policy as practiced (Lipsky 1980). Yet validated instruments that capture administrators’ perspectives across the legal, technical, and ethical dimensions of exam security remain scarce. China’s high-stakes examination system provides a distinctive governance context in which deterrence technologies, legal sanctions, and institutional accountability are tightly coupled, making administrators’ perspectives particularly informative for integrity research.

This paper introduces the Exam Security and Technology Governance Questionnaire (ESTG-Q), a modular instrument designed for examination administrators. The ESTG-Q is built to (i) support domain-specific scoring rather than forcing a single total score across heterogeneous constructs, and (ii) embed an item-level attitude-marking mechanism to capture respondent-centered acceptability signals for sensitive items. We report the instrument’s development and evidence package (content relevance indices, whole-instrument reliability, and respondent-centered acceptability signals), and provide a worked psychometric demonstration using the two-factor Cheating Attribution subscale (Q236–Q242), complemented by descriptive evidence from additional thematic item sets.

2. Background

Exam integrity is a moving target in high-stakes assessment contexts, as cheating methods increasingly leverage consumer electronics and communication tools (Noorbehbahani et al. 2022). In response, governance measures now combine legal deterrence with technical controls.
In China, for example, organizing cheating in legally prescribed national examinations and providing technical assistance can entail criminal liability under the Criminal Law (Amendment IX) (Standing Committee of the National People’s Congress 2015), alongside administrative sanctions and multi-agency enforcement campaigns documented by official government releases (The State Council of the People’s Republic of China 2025). This reflects a broader governance logic in which legal sanctions, technological countermeasures, and institutional accountability are expected to reinforce one another. Such interventions can strengthen deterrence while introducing ethical and practical trade-offs. Advanced proctoring technologies—from radio-frequency signal jammers and biometric identity checks to AI-supported invigilation and online video monitoring—can improve detection and deterrence of cheating. Yet they may also generate privacy, fairness, and psychological-stress concerns among test-takers, and can affect trust in institutions (Kharbat & Abu Daabes 2021; Coghlan et al. 2021). Criticism of remote proctoring frequently centres on perceived intrusiveness, surveillance risk, and opaque automated judgments (Coghlan et al. 2021). A persistent blind spot in the literature is the governance perspective of examination administrators. Much academic integrity research has focused on student behavior (e.g., why students cheat and how to prevent it), while comparatively less has examined the attitudes of the educators and implementers who decide on, operate, and oversee exam security measures (Balash et al. 2023; Chaudhry et al. 2023). In practice, administrators translate policy into action: they interpret rules, deploy monitoring technology, manage testing staff, and handle incidents. Classic work on policy implementation highlights how such “street-level” discretion can shape how rules are enacted in practice (Lipsky 1980). 
Their beliefs about why cheating occurs, which interventions are justified, and where responsibility lies can therefore influence both enforcement consistency and the perceived legitimacy of integrity measures. This gap suggests that understanding administrators’ viewpoints is critical for a complete picture of exam integrity governance. Measuring exam security governance is challenging because it is not a single attitude or behaviour; it is an interlocking profile of institutional, technological, and ethical domains. Governance work includes knowledge of policies and laws, awareness of emerging cheating tactics, openness to new monitoring technologies, perceptions of procedural fairness, and readiness to invest resources. Integrity policy research similarly emphasizes that policy design and accessibility, responsibility structures, and institutional support shape how integrity standards are enacted (Bretag et al. 2011). Collapsing such heterogeneous domains into one total score can obscure diagnostic information and encourage overinterpretation (Kline 2015). A modular instrument therefore offers a more faithful representation of governance as practiced. To address these gaps, we developed the Exam Security and Technology Governance Questionnaire (ESTG-Q). The ESTG-Q is a structured instrument designed specifically for examination administrators, aiming to capture their perceptions of exam security, technology governance, policy trust, and integrity-related attitudes. We intentionally built the ESTG-Q as a modular framework: it contains multiple thematic sections that can be administered and analysed independently or in purposeful combinations, rather than imposing a monolithic, one-form instrument across heterogeneous governance domains. This design allows researchers or institutions to focus on specific governance components as needed (for example, item sets targeting technology readiness or perceptions of policy legitimacy and trust). 
In addition, because many ESTG-Q items probe sensitive or controversial practices (e.g. invasive proctoring, strict punishments), we embedded an item-level “attitude marking” feature: a simple thumbs-up / thumbs-down / abstain tag that respondents can apply to each item. This feature lets administrators discreetly mark items they find well-phrased or problematic, providing real-time feedback on item acceptability and perceived clarity as respondent-centered face-validity signals.

In summary, this paper presents the development and initial validation of the ESTG-Q, emphasizing two methodological contributions: (1) a modular measurement architecture suited to complex, multi-factor governance domains, and (2) an embedded respondent-centric attitude-marking mechanism that augments standard psychometrics with direct acceptability signals. We posit that treating exam integrity governance as a socio-technical system, and measuring it with finer granularity, can yield richer insights to inform both scholarship and practice in educational integrity.

Study Objectives

The study had four main objectives:

i. Instrument design: Document the instrument development workflow, including translation and a traceable item-review process (human review supplemented by an LLM-assisted, CVI-style relevance screening step that does not replace expert judgment).
ii. Innovation (attitude marking): Introduce and operationalize an item-level attitude-marking mechanism that captures respondent-centered acceptability and perceived-clarity signals in real time, as input to face-validity appraisal.
iii. Psychometric evaluation: Provide initial psychometric evidence, including whole-instrument reliability for the Likert pool and a worked demonstration of structural validity in one reflective subscale, within a broader staged programme of module-level analyses to be reported in subsequent publications.
iv. Demonstration of use: Demonstrate mixed measurement logic via two applications: (a) a reflective, factor-analysable Cheating Attribution subscale (Q236–Q242) evaluated with EFA/CFA; and (b) a diagnostic anchoring item set on emerging technology-aided cheating and early governance (Q40–Q50), reported descriptively and via attitude-marking metrics (thumbs-up, thumbs-down, abstain).

3. Methods

Instrument Design and Modular Structure

The ESTG-Q comprises a total of 280 questions. Item generation was guided by a provisional domain map developed during early design (initially organised into 16 topical categories) and remains subject to refinement as the broader research programme progresses. Within the questionnaire, 233 core items are Likert-type statements capturing attitudes or perceptions, scored on a six-point agreement scale (0–5). We chose an even-point Likert scale to discourage routine selection of a neutral midpoint and to encourage respondents to take a position on sensitive governance dilemmas, a design choice discussed in survey-methodology research on midpoint response options (Garland 1991; Chyung et al. 2017). The specific phrasing of the scale anchors was tailored to each thematic item set (for example, from “strongly disagree” to “strongly agree” for attitude statements, or “very unfamiliar” to “very familiar” for knowledge-related items) so that administrators could express nuanced judgments in context.

In addition to Likert items, the ESTG-Q includes scenario-based questions and a set of factual or background items. The scenario-based prompts present hypothetical situations (e.g. an exam-room dilemma involving a suspected cheater or a technical failure) to elicit role-embedded ethical judgments. The background items gather contextual information such as the administrator’s experience level, the types of exams they oversee, and any institutional constraints on resources.
These items support interpretation by situating responses in real-world conditions (for instance, gauging whether certain attitudes correlate with working in resource-limited settings or with having encountered sophisticated cheating incidents before).

Because exam-security governance spans distinct legal, technical, and ethical domains, ESTG-Q is designed as a modular, profile-based instrument rather than a single-score scale. Table 1 summarises the questionnaire in chapter-aligned thematic blocks defined by stable Q-number ranges to support transparent administration and reporting. These blocks are practical reporting units; construct-level modelling is undertaken only where psychometrically warranted. Accordingly, we report block-level profiles and provide worked examples of reflective subscale validation, enabling researchers and practitioners to deploy the full instrument or select fit-for-purpose item sets for specific governance questions (DeVellis 2016).

Table 1 Provisional chapter-aligned thematic blocks of the ESTG-Q (current development stage) and intended reporting logic.

| Chapter-aligned thematic block | Question range | Illustrative submodules (examples) | Reporting logic |
|---|---|---|---|
| 1. Historical lens of exam fraud prevention | Q1–Q39 | 1.1 Traditional cognition; 1.2 Modern governance cognition; 1.3 Situation appraisal & failure attribution; 1.4 Current situation & comparative cognition | Profile indicators (descriptive) |
| 2. Foundations of high-tech cheating and early governance | Q40–Q90 | 2.1 Emerging-technology cheating map & early governance (Q40–Q50); 2.2 Concept shift in prevention and control (Q51–Q76); 2.3 CBT-era emerging methods (Q77–Q90). Q82–Q84 were reassigned to thematic block 8. | Mixed: diagnostic anchors + profile indicators (not forced into a single latent scale) |
| 3. Metal detection and signal shielding practice | Q91–Q119 | On-site screening (Q91–Q107); signal shielding use & controversies (Q108–Q119) | Profile indicators (descriptive) |
| 4. Implementation challenges of technical measures | Q120–Q163 | Implementation constraints and failure points | Profile indicators / optional subscales |
| 5. Systematic infrastructure building for technical governance | Q164–Q199 | System building, standardisation, interoperability | Profile indicators / optional subscales |
| 6. Trend appraisal and deeper analysis of high-tech cheating | Q200–Q252 | Trend appraisal (Q200–Q229); individual motives (Q230–Q239); governance confidence & punishment cognition (Q240–Q252) | Profile indicators; includes reflective subscale candidates |
| 7. Procurement preferences and collaboration/adoption attitudes | Q253–Q272 | Procurement preference (Q253–Q260); collaboration/adoption attitudes (Q261–Q272) | Primarily functional/context items (reported descriptively) |
| 8. Respondent background | Q273–Q280 | Demographics / role / tenure | Covariates / stratification variables |

Note: These blocks are reporting/administration units and are interpreted as governance profiles rather than collapsed into a single total score. A small number of items were reclassified during iterative refinement; stable Q-number identifiers are retained for traceability.

To mitigate common-method bias in self-report data, responses were de-identified prior to analysis and reported only in aggregate. The questionnaire was presented in stable Q-number order while mixing response formats (Likert statements, scenario prompts, and background items) to reduce patterned responding. Such procedural remedies help limit method effects such as social desirability and acquiescence (Podsakoff et al. 2003).

Sample and Data Collection

Data were collected between 2021 and 2023 under COVID-19-related constraints. We used convenience sampling to recruit examination administrators involved in high-stakes examinations. With logistical support from participating examination bodies and partner institutions, paper questionnaires were distributed to eligible respondents and returned via prepaid courier mail.
Before participation, respondents received a written information sheet describing the study purpose, voluntary participation, confidentiality protections, data‑retention/access arrangements, and de‑identified analytic practice. Respondents completed the questionnaire independently and returned it directly to the research team for data entry and secure storage. To enable traceability of returns while protecting privacy, each paper questionnaire was assigned a unique response ID. The analytic dataset contained ID‑coded responses only. Where identifying information was voluntarily provided for administrative follow‑up, the linkage between the response ID and any identifiers was stored separately in a sealed, access‑restricted archive and was not used in statistical analyses. As a procedural check, Q273 assessed respondents’ satisfaction with the confidentiality and anonymisation arrangements; among respondents who answered this item (n = 199), the median rating was 4 (0–5 scale) and 69.3% selected 4 or 5. Approximately 300 individuals initially indicated willingness to participate, and 253 questionnaires were returned. After manual prescreening, 12 clearly invalid questionnaires were removed. A further 34 cases were excluded following data‑quality checks (e.g., internal consistency and response‑pattern screening), yielding a final valid sample of 207 (valid yield ≈ 81.8% of returned questionnaires). The valid sample covered 27 provincial‑level administrative regions in mainland China (excluding Anhui, Guizhou, Tibet, and Ningxia; Hong Kong, Macao, and Taiwan were not included), reflecting substantial geographic diversity. Respondents spanned multiple organisational levels, examination domains, and operational roles relevant to exam security and technology governance (Table 2 ). Experience in this field was substantial: approximately three‑quarters reported more than five years of work experience, and more than half reported over ten years (Table 2 ). 
Because recruitment relied on convenience sampling, category proportions should not be interpreted as population estimates; findings are presented as exploratory.

Table 2 Data source and sample characteristics (N = 207)

| Variable and category | Code | n | % |
|---|---|---|---|
| Examination management level (Q274) | | | |
| Company / agent / vendor / manufacturer | 0 | 13 | 6 |
| Primary school–vocational college test site | 1 | 47 | 23 |
| University / research institute test site | 2 | 58 | 28 |
| District-level exam administration agency | 3 | 40 | 19 |
| Municipal (city)-level exam administration agency | 4 | 24 | 12 |
| Provincial-level (or above) exam administration agency | 5 | 21 | 10 |
| Examination domain (Q275)* | | | |
| Education examinations | 0 | 119 | 57 |
| Personnel examinations | 1 | 25 | 12 |
| Judicial examinations | 2 | 7 | 3 |
| Health/medical examinations | 3 | 20 | 10 |
| Overseas examination bodies operating in China | 4 | 23 | 11 |
| Other examinations | 5 | 29 | 14 |
| Role in the examination process (Q276)* | | | |
| Other | 0 | 24 | 12 |
| Frontline test-centre staff / invigilator | 1 | 74 | 36 |
| Frontline site lead / chief invigilator | 2 | 23 | 11 |
| Staff member at a higher-level supervising authority | 3 | 42 | 20 |
| Section/department head at a higher-level authority | 4 | 19 | 9 |
| Senior leader in charge at a higher-level authority | 5 | 21 | 10 |
| Procurement role (Q277)* | | | |
| Not involved in procurement | 0 | 52 | 25 |
| Assist / influence procurement | 1 | 21 | 10 |
| Define procurement requirements | 2 | 52 | 25 |
| Decide technical or commercial requirements | 3 | 40 | 19 |
| Approve payment / acceptance | 4 | 20 | 10 |
| Final decision maker | 5 | 12 | 6 |
| Years of work experience (Q279) | | | |
| 0–3 years | 0 | 28 | 14 |
| 3–5 years | 1 | 19 | 9 |
| 5–10 years | 2 | 38 | 18 |
| 10–15 years | 3 | 54 | 26 |
| 15–20 years | 4 | 31 | 15 |
| 20+ years | 5 | 32 | 15 |

Note. * Q275–Q277 are multi-response items (respondents could select multiple options). Percentages for these items are reported as the share of the full valid sample (N = 207) selecting each option and may therefore sum to more than 100%. Some respondents skipped background items; accordingly, totals for some variables are less than N = 207.
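The percentage convention in the Table 2 note can be illustrated with a minimal sketch. The counts below are the Q275 values printed in Table 2; the function name `multi_response_shares` and the rounding to whole percentages are assumptions made for illustration, not the authors' analysis code.

```python
def multi_response_shares(counts, n_valid):
    """Share of the full valid sample selecting each option, in whole percent.

    Options are not mutually exclusive, so the shares need not sum to 100.
    Rounding to whole percentages is an assumption that mirrors Table 2.
    """
    return {option: round(100 * n / n_valid) for option, n in counts.items()}


# Q275 counts as printed in Table 2 (N = 207 valid questionnaires).
q275_counts = {
    "Education examinations": 119,
    "Personnel examinations": 25,
    "Judicial examinations": 7,
    "Health/medical examinations": 20,
    "Overseas examination bodies operating in China": 23,
    "Other examinations": 29,
}

q275_shares = multi_response_shares(q275_counts, n_valid=207)
q275_total = sum(q275_shares.values())  # exceeds 100 because selections overlap
print(q275_shares, q275_total)
```

Dividing by the full valid sample rather than by the number of selections keeps each row interpretable as "share of respondents working in this domain", at the cost of a column total above 100%.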
Content Validity Evaluation

Instrument development followed an iterative cycle of domain mapping, item generation, expert review, and wording refinement. The initial item pool was drafted from the literature, policy documents, and examination-governance practice, covering governance structures, technology deployment, ethical trade-offs, and implementation constraints. An interdisciplinary review group (examination administrators and academic researchers) screened items for clarity, relevance, and sensitivity, leading to iterative revisions and removals.

To provide a traceable and reproducible item-relevance screen, we implemented an LLM-assisted, CVI-style workflow. Using a fixed prompt and a four-point relevance rubric (1 = not relevant; 4 = highly relevant), the model generated six independent rating runs. We computed CVI-style indices at the item level (I-CVI) and scale level (S-CVI/Ave) and applied commonly used screening cut-offs (I-CVI ≥ 0.75; S-CVI/Ave ≥ 0.90) (Polit et al. 2007). These model-based indices were used as additional screening evidence only and did not replace human judgment; final item-retention decisions were made by the research team. During refinement, we conducted cognitive-clarity checks of both language versions with domain experts and bilingual reviewers, focusing on comprehensibility, cultural appropriateness, and role realism in scenario-based items.

Translation Workflow and AI Assistance

The ESTG-Q was originally authored in Chinese for administration with examination administrators in China. To support international use and cross-cultural research, we produced an English version via a translation and back-translation procedure guided by established cross-cultural adaptation recommendations (Beaton et al. 2000), with a focus on conceptual equivalence rather than literal translation.
Initial English drafts were generated with assistance from a GPT-series large language model to accelerate drafting and maintain terminological consistency, then reviewed and edited by bilingual researchers for conceptual equivalence and domain-appropriate phrasing. An independent professional translator, not involved in drafting, performed back-translation into Chinese; discrepancies were reconciled iteratively with attention to semantic equivalence and policy-relevant nuance. Small-scale clarity checks with bilingual experts were then conducted for both language versions. AI assistance was used for drafting and consistency checks and for the additional CVI-style screening described above; all substantive decisions on wording, retention, and interpretation were made by the authors. We then proceeded to field administration and psychometric analyses, and report reliability, structural evidence for a reflective subscale, and illustrative uses of the attitude-marking data.

4. Results

Whole-Scale Reliability and Feedback Overview

Additional LLM-assisted content-relevance screening showed high item-level agreement: >90% of items met the screening benchmark (I-CVI ≥ 0.75), with S-CVI/Ave = 0.92 for all items and 0.97 for core-construct items (Table 3). Internal consistency for the 233 Likert-type items was high (Cronbach’s α = 0.963). Attitude-marking uptake was broad (~90% of participants marked at least one item); overall thumbs-up and thumbs-down rates were 34.1% and 2.8%, respectively (Table 3; rate definitions in Section 3).

“Thumbs-up rate” and “thumbs-down rate” were computed as full-sample proportions over item-by-respondent opportunities. For an item set containing K items answered by N respondents, the thumbs-up rate equals \(\frac{\sum_{i=1}^{N}\sum_{k=1}^{K}\mathbf{1}(Att_{ik}=\mathrm{Up})}{N\times K}\times 100\%\), where \(Att_{ik}\) is the mark respondent i gave to item k, and the thumbs-down rate is defined analogously. Unmarked responses and explicit abstentions were included in the denominator but not the numerator.
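Both the CVI-style indices and the marking rates reported above are simple proportions, and a minimal sketch may make the definitions concrete. Everything named here is hypothetical (`cvi_indices`, `marking_rate`, the toy data). The cut-off that treats ratings of 3 or 4 as "relevant" is an assumption borrowed from common CVI practice, since the text states only that a four-point rubric was used.

```python
def cvi_indices(ratings, relevant_min=3):
    """Return (per-item I-CVI list, S-CVI/Ave) from per-item rating lists.

    ratings: one inner list per item, holding one 1-4 rating per run.
    relevant_min=3 is an assumed cut-off (ratings of 3 or 4 count as relevant).
    """
    i_cvi = [sum(r >= relevant_min for r in item) / len(item) for item in ratings]
    s_cvi_ave = sum(i_cvi) / len(i_cvi)  # average of the item-level indices
    return i_cvi, s_cvi_ave


def marking_rate(marks, label):
    """Percent of item-by-respondent opportunities that carry `label`.

    marks: N rows (respondents) by K columns (items); each cell is
    "up", "down", "abstain", or None for an unmarked item.
    Abstentions and unmarked cells stay in the denominator (N x K).
    """
    n_respondents = len(marks)
    n_items = len(marks[0])
    hits = sum(cell == label for row in marks for cell in row)
    return 100 * hits / (n_respondents * n_items)


# Hypothetical toy data, not taken from the study.
toy_ratings = [
    [4, 4, 3, 4, 3, 4],  # all six runs rate the item relevant
    [3, 2, 4, 3, 2, 3],  # four of six runs rate the item relevant
]
toy_marks = [
    ["up", None, "down"],
    ["up", "abstain", None],
]

i_cvi, s_cvi_ave = cvi_indices(toy_ratings)
up_rate = marking_rate(toy_marks, "up")
down_rate = marking_rate(toy_marks, "down")
print(i_cvi, s_cvi_ave, up_rate, down_rate)
```

With six runs, an item-level index can only take values that are multiples of 1/6, so under the assumed cut-off the 0.75 benchmark is met only when at least five of the six runs rate the item as relevant.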
Table 3 Summary of content-relevance screening (CVI-style indices), internal consistency, and respondent-centered acceptability indicators.

| Indicator | Criterion / rationale | Result |
|---|---|---|
| I-CVI (item relevance; CVI-style) | Screening benchmark: I-CVI ≥ 0.75 (reported for transparency; interpretation guidelines in expert-panel CVI studies: Polit et al. 2007) | > 90% of items ≥ 0.75 |
| S-CVI/Ave (all items) | Screening benchmark: S-CVI/Ave ≥ 0.90 (commonly used in CVI reporting; Polit et al. 2007) | 0.92 |
| S-CVI/Ave (core-construct items) | Excluding primarily functional/context items (Q253–Q280) | 0.97 |
| Whole-instrument internal consistency | Cronbach’s α for 233 Likert-type items | α = 0.963 |
| Thumbs-up rate (acceptability) | Respondent-centered item approval signal | 34.1% |
| Thumbs-down rate (acceptability) | Respondent-centered item objection signal | 2.8% |

Note: CVI-style indices were computed from LLM-generated rating runs under a fixed prompt and 4-point rubric and are reported as additional screening evidence; final item-retention decisions were made by the research team based on substantive review.

Cheating Attribution: Institutional Deterrence and Role-based Discretion (Q236–Q242)

Subscale structure and reliability: The seven-item Cheating Attribution subscale (Q236–Q242) was analysed as a worked example of a reflective subscale. It was designed to capture two theorised dimensions: (1) institutional deterrence/governance beliefs (Q236–Q238) and (2) role-based moral permissiveness (Q239–Q242). For composite scoring and factor modelling, Q236–Q238 were reverse-scored so that higher values indicate greater permissiveness (Section 3); item-level descriptives are reported in original coding for interpretability (Table 6). Internal consistency was acceptable for an early-stage subscale (Cronbach’s α = 0.723; Nunnally & Bernstein 1994).
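The reverse-scoring and reliability steps described for this subscale can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' analysis script: the helper names, the 0-based column positions for Q236–Q238, and the toy rows are hypothetical, and the standard Cronbach's α formula is used with sample variances.

```python
from statistics import variance

SCALE_MAX = 5  # items are coded 0-5 in the source


def reverse_score(value, scale_max=SCALE_MAX):
    """Mirror a 0..scale_max response so that 0 becomes scale_max and vice versa."""
    return scale_max - value


def recode_rows(rows, reverse_idx):
    """Return a copy of rows with the columns listed in reverse_idx reverse-scored."""
    return [
        [reverse_score(v) if j in reverse_idx else v for j, v in enumerate(row)]
        for row in rows
    ]


def cronbach_alpha(rows):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of totals)."""
    k = len(rows[0])
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical raw responses for seven items; columns 0-2 stand in for Q236-Q238.
toy_raw = [
    [5, 5, 5, 0, 0, 0, 0],
    [0, 0, 0, 5, 5, 5, 5],
    [3, 3, 3, 2, 2, 2, 2],
]
toy_recoded = recode_rows(toy_raw, reverse_idx={0, 1, 2})

# The toy rows are perfectly consistent after recoding, so alpha is 1
# up to floating-point error; real data would give a lower value.
toy_alpha = cronbach_alpha(toy_recoded)
print(toy_recoded, toy_alpha)
```

Recoding before computing α matters because α assumes all items point in the same direction; leaving the three deterrence items in their original coding would mix positively and negatively keyed items in one total score.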
Table 4 Item analysis and factor loadings for the Cheating Attribution subscale (Q236–Q242) using principal axis factoring with oblimin rotation. CITC = corrected item–total correlation. Items Q236–Q238 are reverse-scored.

| Item | CITC | α if deleted | Factor 1 loading (institutional) | Factor 2 loading (role-based) | Reverse-scored |
|---|---|---|---|---|---|
| Q236 | 0.174 | 0.741 | 0.58 | -0.06 | Yes |
| Q237 | 0.197 | 0.737 | 0.95 | 0.05 | Yes |
| Q238 | 0.269 | 0.724 | 0.58 | 0.08 | Yes |
| Q239 | 0.568 | 0.654 | 0.03 | 0.72 | No |
| Q240 | 0.665 | 0.625 | 0.05 | 0.85 | No |
| Q241 | 0.534 | 0.665 | -0.02 | 0.75 | No |
| Q242 | 0.581 | 0.650 | 0.00 | 0.76 | No |

Exploratory and confirmatory factor analyses: EFA used principal-axis factoring with oblimin rotation (Fabrigar et al. 1999; Costello & Osborne 2005). Sampling adequacy supported factor analysis (KMO = 0.728; Bartlett’s test of sphericity χ² = 474.26, df = 21, p < 0.001). Two factors with eigenvalues > 1 were extracted, explaining 67.86% of the total variance (Factor 1: 39.77%; Factor 2: 28.09%). The loading pattern was clean, with minimal cross-loadings (absolute values < 0.10), supporting the intended separation between institutional deterrence/governance beliefs and role-based moral permissiveness (Table 4). Confirmatory factor analysis (CFA) further supported the two-factor model with good fit (χ²/df = 2.10, CFI = 0.97, RMSEA = 0.06), and the two-factor model outperformed a one-factor alternative (Δχ² test, p < 0.01) (Hu & Bentler 1999; Brown 2015). Key EFA and CFA indices are summarized in Table 5.

Table 5 Key indices for exploratory (EFA) and confirmatory (CFA) factor analyses for Q236–Q242 (N = 207).
| Analysis | Index | Value | Suggested criterion |
|---|---|---|---|
| EFA | Eigenvalue (Factor 1) | 2.784 | > 1 |
| EFA | Eigenvalue (Factor 2) | 1.966 | > 1 |
| EFA | Variance explained (Factor 1) | 39.77% | n/a |
| EFA | Variance explained (Factor 2) | 28.09% | n/a |
| EFA | Cumulative variance explained | 67.86% | ≥ 50% |
| EFA | KMO | 0.728 | ≥ 0.70 |
| EFA | Bartlett’s test | χ²(21) = 474.26; p < 0.001 | Significant |
| CFA | Model fit | χ²/df = 2.10; CFI = 0.97; RMSEA = 0.06 | χ²/df < 3; CFI ≥ 0.95; RMSEA ≤ 0.06 |

Scenario-based moral leniency: Item-level response patterns underscored the practical difficulty of relying exclusively on human discretion at the point of enforcement. Institutional deterrence beliefs were comparatively consensual. For example, Q236 showed a high central tendency (M = 3.61, SD = 1.29), with 30.0% selecting endpoint 5 and 4.4% selecting endpoint 0 (Table 6). By contrast, role-based scenario items showed wider dispersion. Q242 showed a lower mean and wider dispersion (M = 1.81, SD = 1.84), with strong condemnation (42.5% at 0) but also a non-trivial permissive tail (10.5% at 5) (Table 6). Taken together, the role-based items (Q239–Q242) suggest that a persistent minority selects highly permissive response options (approximately one in ten at the endpoint 5 across items), while a larger middle band falls between strict prohibition and explicit tolerance. This distributional heterogeneity is not direct evidence of real-world misconduct; rather, it indicates that administrators’ attitudinal responses to governance dilemmas are more variable than their confidence in institutional deterrence. In operational terms, these patterns highlight a governance vulnerability: even when formal rules and technical controls are viewed as strong, integrity outcomes may hinge on discretionary judgments in ambiguous scenarios, especially in cases where reporting creates reputational or relational costs.
Attitude-marking data indicated that this subscale was broadly acceptable to respondents (Table 7), with thumbs-up marks substantially exceeding thumbs-down marks across items. Given the sensitive nature of these ethical scenarios, this imbalance of positive over negative marks indicates that most respondents found the questions themselves fair and pertinent, even if their answers diverged. In sum, the Cheating Attribution subscale demonstrated a meaningful two-factor division with practical significance: administrators largely endorse the importance of institutional measures (Factor 1), yet diverge in role-based moral discretion (Factor 2), with a minority expressing permissive attitudes that may complicate enforcement.

Descriptive Insights from Other Modules

Beyond Q236–Q242, the ESTG-Q yielded descriptive evidence across three domains: (a) awareness of technology-aided cheating trends, (b) governance policy and resource attitudes, and (c) historical perspectives on exam integrity. Illustrative items suggest broad recognition of the historical shift toward more covert, technology-aided cheating (e.g., Q40 and Q42; Table 6) and mixed views on macro-level rationales and governance resourcing (e.g., Q50, Q266, and Q267; Table 6). The juxtaposition of institutional deterrence beliefs (Q236) and a reporting-related dilemma (Q242) further indicates that higher endorsement of institutional deterrence statements can coexist with heterogeneous discretionary judgments (Table 6). Open-ended comments (not systematically analysed) noted needs for clearer guidance and training for handling ambiguous cases. Some items were intentionally retained as forward-looking ‘stress tests’ (e.g., speculative AI-enabled cheating scenarios or highly intrusive countermeasures). Given their exploratory intent and potential contestability across contexts, users may treat these items as optional probes rather than core attitude indicators.
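The thumbs-up and thumbs-down rates reported in Table 7 are defined in the table note as marks divided by N × K, with abstentions and unmarked items counted in the denominator only. A minimal sketch of that calculation follows; the cell encoding (`"up"`, `"down"`, `None`) and the function name `marking_rates` are assumptions made for illustration, not the authors' data format.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical encoding (assumption): each respondent-by-item cell holds
# "up", "down", or None for an abstention or unmarked item.
Mark = Optional[str]


def marking_rates(marks: Sequence[Sequence[Mark]]) -> Tuple[float, float]:
    """Return (thumbs-up %, thumbs-down %) over all N x K opportunities."""
    n_respondents = len(marks)
    n_items = len(marks[0]) if marks else 0
    opportunities = n_respondents * n_items  # N x K, abstentions included
    if opportunities == 0:
        raise ValueError("need at least one respondent and one item")
    if any(len(row) != n_items for row in marks):
        raise ValueError("every respondent needs one cell per item")
    ups = sum(cell == "up" for row in marks for cell in row)
    downs = sum(cell == "down" for row in marks for cell in row)
    return 100.0 * ups / opportunities, 100.0 * downs / opportunities


# Toy data: 3 respondents x 4 items, mostly unmarked.
toy = [
    ["up", None, "down", None],
    [None, "up", None, None],
    ["up", None, None, None],
]
up_rate, down_rate = marking_rates(toy)
print(f"thumbs-up {up_rate:.1f}%, thumbs-down {down_rate:.1f}%")
```

Because unmarked cells stay in the denominator, the two rates need not sum to 100%; the remainder is the share of item-by-respondent opportunities left unmarked.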
Attitude-marking complements psychometric indices by providing respondent-centred acceptability signals. As summarised in Table 7, thumbs-down rates remained consistently low across item sets (~ 2–4%), whereas thumbs-up rates varied by topic. This pattern can help prioritise which item sets may warrant rephrasing or additional contextualisation in future refinement.

Table 6 presents a summary of key items from sections (a)–(c) above, alongside the earlier-mentioned Q236 and Q242, to illustrate the range of findings. These examples showcase the ESTG-Q’s granularity: it captures both the broad trends (e.g. general confidence in technology, recognition of hidden cheating) and the pinpoint dilemmas (e.g. whether to report a colleague’s lapse, whether to invest huge sums in proctoring systems) that characterize exam security governance today.

Table 6 Descriptive statistics for selected ESTG-Q items spanning multiple content areas.

| Item (ID) | n | Mean (SD) | Endpoint 5 (%) | Endpoint 0 (%) |
|---|---|---|---|---|
| Pre-2000 exams were manageable; high-tech cheating not yet a major threat. (Q40) | 203 | 3.38 (1.51) | 28.1 | 8.4 |
| Cheating tech vs. counter-tech is an arms race; cheating is now more covert. (Q42) | 205 | 3.77 (1.38) | 38.5 | 4.9 |
| Investing heavily in exam-security tech yields large social/economic returns. (Q50) | 204 | 3.18 (1.49) | 22.1 | 8.8 |
| A research-based pilot project with enterprise technical support is needed. (Q266) | 196 | 3.39 (1.49) | 28.6 | 8.2 |
| Our institution needs auxiliary project-funding support. (Q267) | 195 | 2.79 (1.85) | 24.1 | 22.1 |
| Current exam rules and punishments are sufficient to deter cheating. (Q236) | 203 | 3.61 (1.29) | 30.0 | 4.4 |
| Student opportunism / role-based moral permissiveness. (Q239) | 202 | 2.16 (1.89) | 13.9 | 36.1 |
| If an exam ends ‘smoothly’, a cheating incident need not be reported up the chain. (Q242) | 200 | 1.81 (1.84) | 10.5 | 42.5 |

Note. Items are coded 0–5 (higher scores indicate stronger agreement/endorsement of the statement).
“Endpoint 5 (%)” and “Endpoint 0 (%)” report valid percentages selecting the scale endpoints (5 or 0), computed using item-level valid responses (n). Here n is the number of valid responses for each item (varying due to item-level nonresponse).

Selected items in Table 6, especially Q236 and Q242, highlight a policy–practice tension in exam governance. Figure 2 provides a compact visualization by contrasting endpoint response rates (percentage selecting 0 vs. 5) for an institutional deterrence item (Q236) and a role-based moral dilemma item (Q242). Whereas Q236 shows low outright rejection (4.4% at 0) alongside substantial strong agreement (30.0% at 5), Q242 exhibits a much larger rejection endpoint (42.5% at 0) alongside a non-trivial permissive endpoint (10.5% at 5) (Table 6; Fig. 2). This endpoint asymmetry illustrates how institutional confidence can coexist with heterogeneous moral discretion in scenario-based judgments.

Attitude-marking feedback aggregated by item set (Table 7) shows consistently low thumbs-down rates (~ 2–4%) and topic-dependent variation in thumbs-up rates. Across all item sets, positive marks outweighed negative marks, indicating broad acceptability of the instrument content.

Table 7 Attitude-marking feedback summary by major item sets.

| Item set / Topic | Thumbs-Up Rate | Thumbs-Down Rate |
|---|---|---|
| Historical Cognition & Policy environment (Q1–Q39) | 36.5% | 3.5% |
| Technology-aided cheating & countermeasures (Q40–Q50) | 36.2% | 2.3% |
| Technology governance & readiness (Q51–Q119) | 34.4% | 3.0% |
| Proctoring technology implementation & ethics (Q120–Q199) | 32.6% | 2.9% |
| Cheating psychology, incidents & governance dilemmas (Q200–Q252) | 33.6% | 2.5% |
| Cheating Attribution subscale (Q236–Q242; subset of Q200–Q252) | 33.7% | 3.3% |
| Governance strategy, resources & respondent context (Q253–Q280) | 33.8% | 1.8% |
| Whole instrument (Q1–Q280) | 34.1% | 2.8% |

Note. Rates are computed over item-by-respondent opportunities within each item set, using the full valid sample (N = 207) as the denominator.
“Thumbs-Up Rate” = total number of thumbs-up marks divided by \(N \times K\) (where K is the number of items in the set), expressed as a percentage; “Thumbs-Down Rate” is defined analogously. Unmarked responses and abstentions are included in the denominator but not the numerator.

5. Discussion

This study introduces the ESTG-Q as a modular instrument for characterising examination administrators’ perspectives on exam security and technology governance. The pooled Likert item set showed high internal consistency (Cronbach’s α = 0.963). Given the instrument’s breadth, this value should be interpreted as reliability of the overall item pool rather than as evidence of a single latent construct (Cronbach 1951). Accordingly, we recommend reporting governance domains as profiles and validating reflective subscales only where psychometrically warranted (DeVellis 2016). Across respondents from 27 provinces, attitude-marking feedback indicated broad acceptability of the items (low thumbs-down rates; Table 7), suggesting that issues such as high-tech cheating, policy enforcement, and exam ethics are widely recognised within this administrative population. At the same time, module-level analyses revealed heterogeneity that would be obscured by any single aggregate score. In particular, items Q236–Q242 supported a two-factor structure separating institutional deterrence/governance beliefs from role-based moral permissiveness, consistent with discretion in “street-level” implementation contexts (Lipsky 1980). Scenario-based items exhibited greater dispersion than deterrence-oriented items, indicating heterogeneity in how administrators evaluate discretionary leniency in role-conflict dilemmas. These responses represent attitudes to hypothetical situations rather than evidence of actual misconduct; nevertheless, they imply that integrity outcomes may partly depend on discretionary judgments at the point of implementation.
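The caution above about the pooled α can be illustrated with the standardized form of Cronbach's alpha, which depends only on the number of items k and the mean inter-item correlation. The sketch below is a simplification (it treats all items as having equal variance), and the values of `k` and `mean_r` are hypothetical, chosen for illustration rather than estimated from the ESTG-Q data.

```python
def standardized_alpha(k: int, mean_r: float) -> float:
    """Standardized Cronbach's alpha for k items with mean inter-item correlation mean_r."""
    if k < 2:
        raise ValueError("alpha needs at least two items")
    return k * mean_r / (1.0 + (k - 1) * mean_r)


# Hypothetical values for illustration only; not estimates from the study.
mean_r = 0.12
for k in (7, 50, 200):
    alpha = standardized_alpha(k, mean_r)
    print(f"k = {k:3d}, mean r = {mean_r:.2f} -> alpha = {alpha:.3f}")
```

With the same modest mean correlation, alpha rises from about 0.49 at 7 items to about 0.96 at 200 items. A very high α in a large item pool is therefore compatible with a multidimensional instrument and does not by itself indicate a single latent construct.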
In line with “street-level” governance accounts emphasising frontline discretion (Lipsky 1980 ), this pattern motivates clearer escalation and reporting pathways, reporting protections, and scenario-based training targeted at ambiguous cases. Taken together, the two factors point to a policy–practice tension: respondents can express strong endorsement of institutional deterrence/governance beliefs while differing in role-based judgments under discretionary dilemmas. Figure 2 illustrates this contrast by comparing endpoint response rates for a deterrence-oriented item (Q236) and a reporting-related dilemma item (Q242). Conceptually, this aligns with implementation research showing that formal rules and technical controls are enacted through frontline interpretation and discretion (Lipsky 1980 ). For exam governance design, the findings underscore a socio-technical implication: detection and monitoring tools are only as effective as the organisational pathways through which alerts and suspicions are documented, escalated, and reviewed. Designing audit trails and escalation mechanisms may reduce reliance on single-actor discretion, yet stronger automation must be balanced against legitimacy concerns around privacy, proportionality, transparency, and trust (Kharbat & Abu Daabes 2021 ; Coghlan et al. 2021 ). Institutionally, the results motivate investments in procedural clarity—particularly for reporting and escalation—alongside protections for reporting and structured scenario-based training that targets morally ambiguous cases. Such supports may help align discretionary judgment with policy expectations while recognising the legitimacy challenges that can accompany strict enforcement (Kharbat & Abu Daabes 2021 ; Coghlan et al. 2021 ). 
Beyond psychometrics, the ESTG-Q offers a structured way to document governance trade-offs in a context where exam security infrastructures are comparatively mature, enabling international readers to examine how integrity is operationalised under strong deterrence regimes. Ethically, the findings reinforce that integrity governance is not only about examinee conduct: administrators’ discretionary judgments and moral reasoning form part of the socio-technical system and warrant empirical attention. Supporting consistent practice may therefore require not only technical controls but also clearer guidance and institutional support for handling morally ambiguous cases.

6. Limitations and Future Directions

This initial study is a first step, and several limitations point to clear, testable next steps.

Sampling and generalisability. Although respondents were drawn from 27 provinces, the sample was modest (N = 207) and anchored in China’s high-stakes examination environment. The participant group (examination administrators) may not represent other roles or other governance regimes. Replication with larger and more diverse samples is therefore needed, alongside cross-context comparisons that test whether the same response patterns, factor structure, and parameter estimates emerge. Where cross-context data are available, formal measurement-invariance tests can help distinguish stable construct variance from context-specific functioning.

Scope of evidence provided in this paper. The ESTG-Q is intentionally broad and modular. In this manuscript, we report (a) whole-instrument internal consistency for the Likert item pool, (b) a worked structural-validity demonstration for one reflective subscale (Q236–Q242), and (c) descriptive profiling for selected items from other parts of the questionnaire to illustrate response distributions and potential areas of sensitivity.
Structural validation of additional reflective subscales—and fit‑for‑purpose modelling strategies for diagnostic or scenario‑based item sets—should be undertaken in targeted follow‑up studies with larger samples and module‑specific analytic plans. This staged approach aligns instrument development with the instrument’s modular design rather than forcing a single, one‑shot validation model. Measurement and modelling considerations. The present analyses rely primarily on classical indices and factor‑analytic techniques. Given the instrument’s multi‑domain design, future work could apply multidimensional item response theory to estimate domain‑specific traits and their relations (Reckase 2009 ). Where response patterns suggest both a broad tendency and domain‑specific variance, bifactor models may be used to evaluate whether a general factor (e.g., broad response propensity) can be separated from domain factors (Reise 2012 ). For scenario‑based item sets that may induce local dependence, testlet approaches offer a principled option (Wainer et al. 2007 ). Importantly, the very high whole‑instrument α should be interpreted as internal consistency across a large Likert pool rather than evidence of a single latent construct; future work may also consider item reduction and short forms while preserving conceptual coverage (DeVellis 2016 ). Response process, behavioural evidence, and criterion‑related validity. Findings are based on self‑reported attitudes and responses to hypothetical scenarios, which may not map cleanly onto real‑world decision behaviour—especially when responses are shaped by perceived interpersonal costs, institutional incentives, or risk of negative consequences. 
Future studies should triangulate survey responses with behavioural indicators where feasible (e.g., incident-handling records, audit trails, or structured simulations) and incorporate qualitative follow-ups (e.g., cognitive interviews) to clarify how respondents interpret scenarios and choose responses. Test–retest reliability, convergent/discriminant evidence with related constructs, and criterion-related (including predictive) validity evidence would further strengthen the validity argument.

Attitude-marking as validity-relevant evidence. The embedded thumbs-up/thumbs-down/abstain feature provides respondent-centred signals of acceptability and clarity, but its measurement properties warrant systematic study. Future work could test the stability of marking behaviour, its relationship to missingness and response styles (e.g., satisficing, acquiescence), and whether marking predicts item-functioning issues that conventional psychometrics may miss (e.g., differential item functioning or scenario-specific misinterpretation). If supported, marking behaviour could become a practical diagnostic layer for iterative item refinement and cross-context adaptation.

Overall, the ESTG-Q provides a foundation for a staged programme of module-level validation and application across contexts, enabling more granular study of exam-integrity governance than single-score instruments.

7. Conclusion

In this paper, we presented the development and initial validation of the Exam Security and Technology Governance Questionnaire (ESTG-Q), a modular instrument designed to capture examination administrators’ perspectives on exam security and technology governance. The instrument supported reliable measurement in this initial deployment and yielded interpretable descriptive profiling across domains.
We provided evidence for a two-factor structure in the Cheating Attribution subscale (Q236–Q242), separating institutional deterrence/governance beliefs from role-based moral permissiveness, indicating that governance attitudes may be partially independent rather than reducible to a single integrity dimension. We also introduced an item-level attitude-marking mechanism (thumbs-up/thumbs-down/abstain) as a source of respondent-centred acceptability and perceived-clarity signals that complement psychometric evidence and can guide iterative item refinement. In this study, thumbs-up marks substantially exceeded thumbs-down marks, suggesting that respondents perceived few problems with the survey content, while still allowing sensitivity differences to be detected at the item-set level. Substantively, the findings highlight a core governance implication: integrity regimes that rely on legal deterrence and technical countermeasures still depend on discretionary human decisions at implementation points. Consistent with scale-development practice, the ESTG-Q is designed for staged validation, with module-specific analyses reported progressively as part of a broader research programme.

Overall, the ESTG-Q offers a flexible tool for examining exam integrity governance as a socio-technical system. Its modular structure supports targeted use for specific governance questions, and its embedded feedback channel adds a practical layer of respondent-centred evidence alongside standard psychometrics. We hope this work supports future research and evaluation that moves beyond prevalence-only accounts toward a more granular understanding of how policies, technologies, and human judgment jointly shape integrity outcomes in high-stakes examinations.

Declarations

Participant consent statement: All participants were adults. Prior to participation, participants received a written information sheet describing the study purpose, voluntary participation, confidentiality protections, and data-handling procedures.
Participants could skip any question and could discontinue participation prior to submitting the questionnaire. Return of a completed paper questionnaire was taken as informed consent to participate.

10. Funding

This research was supported by XXXXX Co., Ltd.

13. Ethics approval and consent to participate

All procedures involving human participants were conducted in accordance with the Declaration of Helsinki and relevant national requirements, including the National Health Commission of China’s Measures for the Ethical Review of Life Science and Medical Research Involving Human Participants (2023). The study was assessed as minimal risk. Data were collected via paper questionnaires administered to adult participants. Questionnaires were not anonymous at collection because contact/administrative details could be used for distribution and return logistics. An information sheet described the study purpose, voluntary participation, confidentiality protections, and data-handling procedures; participants could skip any item or discontinue prior to submission, and return of a completed questionnaire was taken as informed consent. For analysis, responses were de-identified and stored in an ID-coded dataset. Any linkage between response IDs and identifying information was stored separately under access-restricted, sealed conditions and was not used in statistical analyses.

14. Use of AI-assisted tools

A GPT-series large language model was used to assist translation drafting and semantic consistency checks, and to generate simulated relevance ratings for an additional CVI-style content-relevance screening step under a fixed prompt and rubric, as described in the Methods. The model was not used to generate survey responses or to perform statistical analyses. All AI outputs were reviewed by the authors; the model does not meet authorship criteria and all authors take responsibility for the work.
References Arnold IJM (2022) Online proctored assessment during COVID-19: Has cheating increased? Journal of Economic Education 53(4):277–295. https://doi.org/10.1080/00220485.2022.2111384 Balash DG, Korkes E, Grant M, Aviv AJ, Fainchtein RA, Sherr M (2023) Educators’ perspectives of using (or not using) online exam proctoring. In: Proceedings of the 32nd USENIX Security Symposium (USENIX Security 23). USENIX Association, pp 5091–5108. (Online) https://www.usenix.org/conference/usenixsecurity23/presentation/balash Beaton DE, Bombardier C, Guillemin F, Ferraz MB (2000) Guidelines for the process of cross-cultural adaptation of self-report measures. Spine (Phila Pa 1976) 25(24):3186–3191. https://doi.org/10.1097/00007632-200012150-00014 Bretag T, Harper R, Burton M, Ellis C, Newton P, Rozenberg P, Saddiqui S, van Haeringen K (2019) Contract cheating: a survey of Australian university students. Studies in Higher Education 44(11):1837–1856. https://doi.org/10.1080/03075079.2018.1462788 Bretag T, Mahmud S, Wallace M, Walker R, James C, Green M, East J, McGowan U, Partridge L (2011) Core elements of exemplary academic integrity policy in Australian higher education. International Journal for Educational Integrity 7(2):3–12. https://ojs.unisa.edu.au/index.php/IJEI/article/view/759 Brown TA (2015) Confirmatory factor analysis for applied research. 2nd edn. Guilford Press, New York Chaudhry K, Theus AL, Assal H, Chiasson S (2023) “It’s not that I want to see the student’s bedroom…”: Instructor perceptions of e‑proctoring software. In: Proceedings of the 2023 European Symposium on Usable Security (EuroUSEC ’23). Association for Computing Machinery, New York, NY, USA, pp 15–26. https://doi.org/10.1145/3617072.3617103 Chyung SY, Roberts K, Swanson I, Hankinson A (2017) Evidence-based survey design: the use of a midpoint on the Likert scale. Performance Improvement 56(10):15–23. 
https://doi.org/10.1002/pfi.21727 Clarke R, Lancaster T (2013) Commercial aspects of contract cheating. In: Proceedings of the 18th ACM Conference on Innovation and Technology in Computer Science Education (ITiCSE ’13). Association for Computing Machinery, New York, NY, USA, pp 219–224. https://doi.org/10.1145/2462476.2462497 Coghlan S, Miller T, Paterson J (2021) Good proctor or “Big Brother”? Ethics of online exam supervision technologies. Philosophy & Technology 34(4):1581–1606. https://doi.org/10.1007/s13347-021-00476-1 Costello AB, Osborne JW (2005) Best practices in exploratory factor analysis: four recommendations for getting the most from your analysis. Practical Assessment, Research, and Evaluation 10(7):1–9. https://doi.org/10.7275/jyj1-4868 Cronbach LJ (1951) Coefficient alpha and the internal structure of tests. Psychometrika 16:297–334. https://doi.org/10.1007/BF02310555 DeVellis RF (2016) Scale development: theory and applications. 4th edn. Sage, Thousand Oaks Dendir S, Maxwell RS (2020) Cheating in online courses: Evidence from online proctoring. Computers in Human Behavior Reports 2:100033. https://doi.org/10.1016/j.chbr.2020.100033 Fabrigar LR, Wegener DT, MacCallum RC, Strahan EJ (1999) Evaluating the use of exploratory factor analysis in psychological research. Psychol Methods 4(3):272–299. https://doi.org/10.1037/1082-989X.4.3.272 Garland R (1991) The mid-point on a rating scale: is it desirable? Marketing Bulletin 2:66–70 Hu L, Bentler PM (1999) Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives. Struct Equ Modeling 6(1):1–55. https://doi.org/10.1080/10705519909540118 Kharbat FF, Abu Daabes AS (2021) E-proctored exams during the COVID-19 pandemic: a close understanding. Educ Inf Technol 26:6589–6605. https://doi.org/10.1007/s10639-021-10458-7 Kline RB (2015) Principles and practice of structural equation modeling. 4th edn. 
Guilford Press, New York Lipsky M (1980) Street-level bureaucracy: dilemmas of the individual in public services. Russell Sage Foundation, New York Newton PM (2018) How common is commercial contract cheating in higher education and is it increasing? A systematic review. Frontiers in Education 3:67. https://doi.org/10.3389/feduc.2018.00067 Noorbehbahani F, Mohammadi A, Aminazadeh M (2022) A systematic review of research on cheating in online exams from 2010 to 2021. Educ Inf Technol 27:8413–8460. https://doi.org/10.1007/s10639-022-10927-7 Nunnally JC, Bernstein IH (1994) Psychometric theory. 3rd edn. McGraw-Hill, New York Podsakoff PM, MacKenzie SB, Lee JY, Podsakoff NP (2003) Common method biases in behavioral research: a critical review of the literature and recommended remedies. J Appl Psychol 88(5):879–903. https://doi.org/10.1037/0021-9010.88.5.879 Polit DF, Beck CT, Owen SV (2007) Is the CVI an acceptable indicator of content validity? Appraisal and recommendations. Res Nurs Health 30(4):459–467. https://doi.org/10.1002/nur.20199 Reckase MD (2009) Multidimensional item response theory. Springer, New York Reise SP (2012) The rediscovery of bifactor measurement models. Multivar Behav Res 47(5):667–696. https://doi.org/10.1080/00273171.2012.715555 Standing Committee of the National People’s Congress (2015) Criminal Law of the People’s Republic of China (Amendment IX) (in Chinese). Supreme People’s Procuratorate of the People’s Republic of China. https://www.spp.gov.cn/spp/fl/201802/t20180205_364562.shtml. Accessed 30 Dec 2025 The State Council of the People’s Republic of China (2025) Campaign targets online misdeeds that threaten integrity of gaokao. https://english.www.gov.cn/news/202506/04/content_WS683fa661c6d0868f4e8f3107.html. Accessed 30 Dec 2025 Wainer H, Bradlow ET, Wang X (eds) (2007) Testlet response theory and its applications. 
Cambridge University Press, Cambridge Weber EU, Blais A-R, Betz NE (2002) A domain-specific risk-attitude scale: measuring risk perceptions and risk behaviors. J Behav Decis Making 15:263–290. https://doi.org/10.1002/bdm.414

Additional Declarations

No competing interests reported.

Author affiliations: Dengfeng Hui (corresponding author) and Hongyan Liu, Beijing Wholeway Centron Technology Co., LTD; Shamala K. Subramaniam, Lili Nurliyana binti Abdullah, and Abdullah bin Muhammed, Universiti Putra Malaysia; Yuchun Hui, Beijing Normal University; Zhongliang Liu, Guangdong Light Industry Technician College.

Figure 1 Instrument development workflow for the ESTG-Q (from domain mapping to psychometric evidence; item-level attitude marking embedded throughout).

Figure 2 Endpoint response distributions for Q236 (institutional deterrence) and Q242 (reporting dilemma). Percentages are valid responses; intermediate categories (1–4) are aggregated.
2019).\u003c/p\u003e\n\u003cp\u003eCountermeasures increasingly rely on surveillance and automation, including identity verification, signal detection/blocking, automated invigilation, and remote proctoring. While such tools may strengthen deterrence, they can also reshape the moral ecology of testing by altering trust, increasing stress, and raising concerns about privacy and procedural fairness (Kharbat \u0026amp; Abu Daabes 2021; Coghlan et al. 2021). Evidence on remote proctoring suggests that deterrence gains can coexist with concerns about student experience and validity; studies report score shifts after webcam proctoring and mixed evidence on whether cheating increased (Dendir \u0026amp; Maxwell 2020; Arnold 2022). Governance questions therefore extend beyond detection efficacy: they include legitimacy, proportionality, and the social acceptability of enforcement.\u003c/p\u003e\n\u003cp\u003eA conspicuous gap in the integrity literature is the governance perspective of the actors who implement exam policies. Examination administrators interpret rules, deploy monitoring systems, manage staff, and decide how incidents are handled. Classic work on policy implementation emphasizes how such street-level discretion shapes policy as practiced (Lipsky 1980). Yet validated instruments that capture administrators’ perspectives across the legal, technical, and ethical dimensions of exam security remain scarce.\u003c/p\u003e\n\u003cp\u003eChina’s high-stakes examination system provides a distinctive governance context in which deterrence technologies, legal sanctions, and institutional accountability are tightly coupled—making administrators’ perspectives particularly informative for integrity research.\u003c/p\u003e\n\u003cp\u003eThis paper introduces the Exam Security and Technology Governance Questionnaire (ESTG-Q), a modular instrument designed for examination administrators. 
The ESTG-Q is built to (i) support domain-specific scoring rather than forcing a single total score across heterogeneous constructs, and (ii) embed an item-level attitude-marking mechanism to capture respondent-centered acceptability signals for sensitive items. We report the instrument’s development and evidence package (content relevance indices, whole-instrument reliability, and respondent-centered acceptability signals), and provide a worked psychometric demonstration using the two-factor Cheating Attribution subscale (Q236–Q242), complemented by descriptive evidence from additional thematic item sets.\u003c/p\u003e"},{"header":"2. Background","content":"\u003cp\u003eExam integrity is a moving target in high-stakes assessment contexts, as cheating methods increasingly leverage consumer electronics and communication tools (Noorbehbahani et al. 2022). In response, governance measures now combine legal deterrence with technical controls. In China, for example, organizing cheating in legally prescribed national examinations and providing technical assistance can entail criminal liability under the Criminal Law (Amendment IX) (Standing Committee of the National People\u0026rsquo;s Congress 2015), alongside administrative sanctions and multi-agency enforcement campaigns documented by official government releases (The State Council of the People\u0026rsquo;s Republic of China 2025). This reflects a broader governance logic in which legal sanctions, technological countermeasures, and institutional accountability are expected to reinforce one another.\u003c/p\u003e\n\u003cp\u003eSuch interventions can strengthen deterrence while introducing ethical and practical trade-offs. Advanced proctoring technologies\u0026mdash;from radio-frequency signal jammers and biometric identity checks to AI-supported invigilation and online video monitoring\u0026mdash;can improve detection and deterrence of cheating. 
Yet they may also generate privacy, fairness, and psychological-stress concerns among test-takers, and can affect trust in institutions (Kharbat \u0026amp; Abu Daabes 2021; Coghlan et al. 2021). Criticism of remote proctoring frequently centres on perceived intrusiveness, surveillance risk, and opaque automated judgments (Coghlan et al. 2021).\u003c/p\u003e\n\u003cp\u003eA persistent blind spot in the literature is the governance perspective of examination administrators. Much academic integrity research has focused on student behavior (e.g., why students cheat and how to prevent it), while comparatively less has examined the attitudes of the educators and implementers who decide on, operate, and oversee exam security measures (Balash et al. 2023; Chaudhry et al. 2023). In practice, administrators translate policy into action: they interpret rules, deploy monitoring technology, manage testing staff, and handle incidents. Classic work on policy implementation highlights how such \u0026ldquo;street-level\u0026rdquo; discretion can shape how rules are enacted in practice (Lipsky 1980). Their beliefs about why cheating occurs, which interventions are justified, and where responsibility lies can therefore influence both enforcement consistency and the perceived legitimacy of integrity measures. This gap suggests that understanding administrators\u0026rsquo; viewpoints is critical for a complete picture of exam integrity governance.\u003c/p\u003e\n\u003cp\u003eMeasuring exam security governance is challenging because it is not a single attitude or behaviour; it is an interlocking profile of institutional, technological, and ethical domains. Governance work includes knowledge of policies and laws, awareness of emerging cheating tactics, openness to new monitoring technologies, perceptions of procedural fairness, and readiness to invest resources. 
Integrity policy research similarly emphasizes that policy design and accessibility, responsibility structures, and institutional support shape how integrity standards are enacted (Bretag et al. 2011). Collapsing such heterogeneous domains into one total score can obscure diagnostic information and encourage overinterpretation (Kline 2015). A modular instrument therefore offers a more faithful representation of governance as practiced.\u003c/p\u003e\n\u003cp\u003eTo address these gaps, we developed the Exam Security and Technology Governance Questionnaire (ESTG-Q). The ESTG-Q is a structured instrument designed specifically for examination administrators, aiming to capture their perceptions of exam security, technology governance, policy trust, and integrity-related attitudes. We intentionally built the ESTG-Q as a modular framework: it contains multiple thematic sections that can be administered and analysed independently or in purposeful combinations, rather than imposing a monolithic, one-form instrument across heterogeneous governance domains. This design allows researchers or institutions to focus on specific governance components as needed (for example, item sets targeting technology readiness or perceptions of policy legitimacy and trust). In addition, because many ESTG-Q items probe sensitive or controversial practices (e.g. invasive proctoring, strict punishments), we embedded an item-level \u0026ldquo;attitude marking\u0026rdquo; feature \u0026ndash; a simple thumbs-up / thumbs-down / abstain tag that respondents can apply to each item. 
This feature lets administrators discreetly mark items they find well-phrased or problematic, providing real-time feedback on item acceptability and perceived clarity as respondent-centered face-validity signals.\u003c/p\u003e\n\u003cp\u003eIn summary, this paper presents the development and initial validation of the ESTG-Q, emphasizing two methodological contributions: (1) a modular measurement architecture suited to complex, multi-factor governance domains, and (2) an embedded respondent-centric attitude-marking mechanism that augments standard psychometrics with direct acceptability signals. We posit that treating exam integrity governance as a socio-technical system \u0026ndash; and measuring it with finer granularity \u0026ndash; can yield richer insights to inform both scholarship and practice in educational integrity.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eStudy Objectives\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eThe study had four main objectives:\u003c/p\u003e\n\u003cp\u003ei.\u0026nbsp; \u0026nbsp;Instrument design: Document the instrument development workflow, including translation and a traceable item-review process (human review supplemented by an LLM-assisted, CVI-style relevance screening step that does not replace expert judgment).\u003c/p\u003e\n\u003cp\u003eii.\u0026nbsp;\u0026nbsp;Innovation \u0026ndash; Attitude Marking: Introduce and operationalize an item-level attitude-marking mechanism that captures, in real time, respondent-centered acceptability and perceived-clarity signals relevant to face-validity appraisal.\u003c/p\u003e\n\u003cp\u003eiii.\u0026nbsp;Psychometric evaluation: Provide initial psychometric evidence, including whole-instrument reliability for the Likert pool and a worked demonstration of structural validity in one reflective subscale, within a broader staged programme of module-level analyses to be reported in subsequent 
publications.\u003c/p\u003e\n\u003cp\u003eiv. Demonstration of use: Demonstrate mixed measurement logic via two applications: (a) a reflective, factor-analysable Cheating Attribution subscale (Q236\u0026ndash;Q242) evaluated with EFA/CFA; and (b) a diagnostic anchoring item set on emerging technology-aided cheating and early governance (Q40\u0026ndash;Q50), reported descriptively and via attitude-marking metrics (thumbs-up, thumbs-down, abstain).\u003c/p\u003e"},{"header":"3. Methods","content":"\u003cp\u003e\u003cem\u003eInstrument Design and Modular Structure\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eThe ESTG-Q comprises a total of 280 questions. Item generation was guided by a provisional domain map developed during early design (initially organised into 16 topical categories) and remains subject to refinement as the broader research programme progresses.\u003c/p\u003e\n\u003cp\u003eWithin the questionnaire, 233 core items are Likert-type statements capturing attitudes or perceptions, scored on a six-point agreement scale (0\u0026ndash;5). We chose an even-point Likert scale to discourage routine selection of a neutral midpoint and to encourage respondents to take a position on sensitive governance dilemmas\u0026mdash;a design choice discussed in survey-methodology research on midpoint response options (Garland \u003cspan class=\"CitationRef\"\u003e1991\u003c/span\u003e; Chyung et al. \u003cspan class=\"CitationRef\"\u003e2017\u003c/span\u003e). 
The specific phrasing of the scale anchors was tailored to each thematic item set\u0026mdash;for example, from \u0026ldquo;strongly disagree\u0026rdquo; to \u0026ldquo;strongly agree\u0026rdquo; for attitude statements, or \u0026ldquo;very unfamiliar\u0026rdquo; to \u0026ldquo;very familiar\u0026rdquo; for knowledge-related items\u0026mdash;so that administrators could express nuanced judgments in context.\u003c/p\u003e\n\u003cp\u003eIn addition to Likert items, the ESTG-Q includes scenario-based questions and a set of factual or background items. The scenario-based prompts present hypothetical situations (e.g. an exam-room dilemma involving a suspected cheater or a technical failure) to elicit role-embedded ethical judgments. The background items gather contextual information such as the administrator\u0026rsquo;s experience level, the types of exams they oversee, and any institutional constraints on resources. These items support interpretation by situating responses in real-world conditions (for instance, gauging whether certain attitudes correlate with working in resource-limited settings or with having encountered sophisticated cheating incidents before).\u003c/p\u003e\n\u003cp\u003eBecause exam-security governance spans distinct legal, technical, and ethical domains, ESTG-Q is designed as a modular, profile-based instrument rather than a single-score scale. Table \u003cspan class=\"InternalRef\"\u003e1\u003c/span\u003e summarises the questionnaire in chapter-aligned thematic blocks defined by stable Q-number ranges to support transparent administration and reporting. These blocks are practical reporting units; construct-level modelling is undertaken only where psychometrically warranted. 
Accordingly, we report block-level profiles and provide worked examples of reflective subscale validation, enabling researchers and practitioners to deploy the full instrument or select fit-for-purpose item sets for specific governance questions (DeVellis\u0026nbsp;\u003cspan class=\"CitationRef\"\u003e2016\u003c/span\u003e).\u003c/p\u003e\n\u003cdiv class=\"gridtable\"\u003e\u0026nbsp;\u003ctable id=\"Tab1\" border=\"1\"\u003e\n \u003ccaption language=\"En\"\u003e\n \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e\n \u003cdiv class=\"CaptionContent\"\u003e\n \u003cp\u003eProvisional chapter-aligned thematic blocks of the ESTG-Q (current development stage) and intended reporting logic.\u003c/p\u003e\n \u003c/div\u003e\n \u003c/caption\u003e\n \u003ccolgroup cols=\"4\"\u003e\u003c/colgroup\u003e\n \u003cthead\u003e\n \u003ctr\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eChapter-aligned thematic block\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eQuestion range\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eIllustrative submodules (examples)\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eReporting logic\u003c/p\u003e\n \u003c/th\u003e\n \u003c/tr\u003e\n \u003c/thead\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e1. 
Historical lens of exam fraud prevention\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eQ1\u0026ndash;Q39\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e1.1 Traditional cognition; 1.2 Modern governance cognition; 1.3 Situation appraisal \u0026amp; failure attribution; 1.4 Current situation \u0026amp; comparative cognition\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eProfile indicators (descriptive)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e2. Foundations of high-tech cheating and early governance\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eQ40\u0026ndash;Q90\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e2.1 Emerging-technology cheating map \u0026amp; early governance (Q40\u0026ndash;Q50); 2.2 Concept shift in prevention and control (Q51\u0026ndash;Q76); 2.3 CBT-era emerging methods (Q77\u0026ndash;Q90). Q82\u0026ndash;Q84 were reassigned to thematic block 8.\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eMixed: diagnostic anchors\u0026thinsp;+\u0026thinsp;profile indicators (not forced into a single latent scale)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e3. Metal detection and signal shielding practice\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eQ91\u0026ndash;Q119\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eOn-site screening (Q91\u0026ndash;Q107); signal shielding use \u0026amp; controversies (Q108\u0026ndash;Q119)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eProfile indicators (descriptive)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e4. 
Implementation challenges of technical measures\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eQ120\u0026ndash;Q163\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eImplementation constraints and failure points\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eProfile indicators / optional subscales\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e5. Systematic infrastructure building for technical governance\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eQ164\u0026ndash;Q199\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eSystem building, standardisation, interoperability\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eProfile indicators / optional subscales\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e6. Trend appraisal and deeper analysis of high-tech cheating\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eQ200\u0026ndash;Q252\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eTrend appraisal (Q200\u0026ndash;Q229); individual motives (Q230\u0026ndash;Q239); governance confidence \u0026amp; punishment cognition (Q240\u0026ndash;Q252)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eProfile indicators; includes reflective subscale candidates\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e7. 
Procurement preferences and collaboration/adoption attitudes\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eQ253\u0026ndash;Q272\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eProcurement preference (Q253\u0026ndash;Q260); collaboration/adoption attitudes (Q261\u0026ndash;Q272)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003ePrimarily functional/context items (reported descriptively)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e8. Respondent background\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eQ273\u0026ndash;Q280\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eDemographics / role / tenure\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eCovariates / stratification variables\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n \u003ctfoot\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"4\"\u003e\u003cem\u003eNote: These blocks are reporting/administration units and are interpreted as governance profiles rather than collapsed into a single total score. A small number of items were reclassified during iterative refinement; stable Q-number identifiers are retained for traceability.\u003c/em\u003e\u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tfoot\u003e\n \u003c/table\u003e\n\u003c/div\u003e\n\u003cp\u003eTo mitigate common-method bias in self-report data, responses were de-identified prior to analysis and reported only in aggregate. The questionnaire was presented in stable Q-number order while mixing response formats (Likert statements, scenario prompts, and background items) to reduce patterned responding. Such procedural remedies help limit method effects such as social desirability and acquiescence (Podsakoff et al. 
\u003cspan class=\"CitationRef\"\u003e2003\u003c/span\u003e).\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eSample and Data Collection\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eData were collected between 2021 and 2023 under COVID‑19‑related constraints. We used convenience sampling to recruit examination administrators involved in high‑stakes examinations. With logistical support from participating examination bodies and partner institutions, paper questionnaires were distributed to eligible respondents and returned via prepaid courier mail. Before participation, respondents received a written information sheet describing the study purpose, voluntary participation, confidentiality protections, data‑retention/access arrangements, and de‑identified analytic practice. Respondents completed the questionnaire independently and returned it directly to the research team for data entry and secure storage.\u003c/p\u003e\n\u003cp\u003eTo enable traceability of returns while protecting privacy, each paper questionnaire was assigned a unique response ID. The analytic dataset contained ID‑coded responses only. Where identifying information was voluntarily provided for administrative follow‑up, the linkage between the response ID and any identifiers was stored separately in a sealed, access‑restricted archive and was not used in statistical analyses. As a procedural check, Q273 assessed respondents\u0026rsquo; satisfaction with the confidentiality and anonymisation arrangements; among respondents who answered this item (n\u0026thinsp;=\u0026thinsp;199), the median rating was 4 (0\u0026ndash;5 scale) and 69.3% selected 4 or 5.\u003c/p\u003e\n\u003cp\u003eApproximately 300 individuals initially indicated willingness to participate, and 253 questionnaires were returned. After manual prescreening, 12 clearly invalid questionnaires were removed. 
A further 34 cases were excluded following data‑quality checks (e.g., internal consistency and response‑pattern screening), yielding a final valid sample of 207 (valid yield\u0026thinsp;\u0026asymp;\u0026thinsp;81.8% of returned questionnaires).\u003c/p\u003e\n\u003cp\u003eThe valid sample covered 27 provincial‑level administrative regions in mainland China (excluding Anhui, Guizhou, Tibet, and Ningxia; Hong Kong, Macao, and Taiwan were not included), reflecting substantial geographic diversity. Respondents spanned multiple organisational levels, examination domains, and operational roles relevant to exam security and technology governance (Table \u003cspan class=\"InternalRef\"\u003e2\u003c/span\u003e). Experience in this field was substantial: approximately three‑quarters reported more than five years of work experience, and more than half reported over ten years (Table\u0026nbsp;\u003cspan class=\"InternalRef\"\u003e2\u003c/span\u003e). Because recruitment relied on convenience sampling, category proportions should not be interpreted as population estimates; findings are presented as exploratory.\u003c/p\u003e\n\u003cdiv class=\"gridtable\"\u003e\u0026nbsp;\u003ctable id=\"Tab2\" border=\"1\"\u003e\n \u003ccaption language=\"En\"\u003e\n \u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e\n \u003cdiv class=\"CaptionContent\"\u003e\n \u003cp\u003eData source and sample characteristics (N\u0026thinsp;=\u0026thinsp;207)\u003c/p\u003e\n \u003c/div\u003e\n \u003c/caption\u003e\n \u003ccolgroup cols=\"4\"\u003e\u003c/colgroup\u003e\n \u003cthead\u003e\n \u003ctr\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eVariable and category\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eCode\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003en\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003e%\u003c/p\u003e\n \u003c/th\u003e\n \u003c/tr\u003e\n \u003c/thead\u003e\n 
\u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eExamination management level (Q274)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026mdash; Company / agent / vendor / manufacturer\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e13\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e6\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026mdash; Primary school\u0026ndash;vocational college test site\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e47\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e23\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026mdash; University / research institute test site\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e58\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e28\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026mdash; District-level exam administration agency\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e3\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e40\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e19\u003c/p\u003e\n \u003c/td\u003e\n 
\u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026mdash; Municipal (city)-level exam administration agency\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e4\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e24\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e12\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026mdash; Provincial-level (or above) exam administration agency\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e5\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e21\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e10\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eExamination domain (Q275)*\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026mdash; Education examinations\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e119\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e57\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026mdash; Personnel examinations\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e25\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e12\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd 
align=\"left\"\u003e\n \u003cp\u003e\u0026mdash; Judicial examinations\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e7\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e3\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026mdash; Health/medical examinations\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e3\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e20\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e10\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026mdash; Overseas examination bodies operating in China\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e4\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e23\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e11\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026mdash; Other examinations\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e5\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e29\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e14\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eRole in the examination process (Q276)*\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026mdash; 
Other\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e24\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e12\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026mdash; Frontline test-centre staff / invigilator\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e74\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e36\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026mdash; Frontline site lead / chief invigilator\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e23\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e11\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026mdash; Staff member at a higher-level supervising authority\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e3\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e42\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e20\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026mdash; Section/department head at a higher-level authority\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e4\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e19\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e9\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n 
\u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026mdash; Senior leader in charge at a higher-level authority\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e5\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e21\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e10\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eProcurement role (Q277)*\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026mdash; Not involved in procurement\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e52\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e25\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026mdash; Assist / influence procurement\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e21\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e10\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026mdash; Define procurement requirements\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e52\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e25\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n 
\u003cp\u003e\u0026mdash; Decide technical or commercial requirements\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e3\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e40\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e19\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026mdash; Approve payment / acceptance\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e4\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e20\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e10\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026mdash; Final decision maker\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e5\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e12\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e6\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eYears of work experience (Q279)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026mdash; 0\u0026ndash;3 years\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e28\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e14\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026mdash; 3\u0026ndash;5 years\u003c/p\u003e\n 
\u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e19\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e9\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026mdash; 5\u0026ndash;10 years\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e38\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e18\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026mdash; 10\u0026ndash;15 years\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e3\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e54\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e26\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026mdash; 15\u0026ndash;20 years\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e4\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e31\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e15\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026mdash; 20\u0026thinsp;+\u0026thinsp;years\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e5\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e32\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e15\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n \u003ctfoot\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"4\"\u003e\u003cem\u003eNote. 
Q275\u0026ndash;Q277 are multi‑response items (respondents could select multiple options). Percentages for these items are reported as the share of the full valid sample (N\u0026thinsp;=\u0026thinsp;207) selecting each option and may therefore sum to more than 100%. Some respondents skipped background items; accordingly, totals for some variables are less than N\u0026thinsp;=\u0026thinsp;207.\u003c/em\u003e\u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tfoot\u003e\n \u003c/table\u003e\n\u003c/div\u003e\n\u003cp\u003e\u003cem\u003eContent Validity Evaluation\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eInstrument development followed an iterative cycle of domain mapping, item generation, expert review, and wording refinement. The initial item pool was drafted from the literature, policy documents, and examination-governance practice, covering governance structures, technology deployment, ethical trade-offs, and implementation constraints. An interdisciplinary review group (examination administrators and academic researchers) screened items for clarity, relevance, and sensitivity, leading to iterative revisions and removals.\u003c/p\u003e\n\u003cp\u003eTo provide a traceable and reproducible item-relevance screen, we implemented an LLM-assisted, CVI-style workflow. Using a fixed prompt and a four-point relevance rubric (1\u0026thinsp;=\u0026thinsp;not relevant; 4\u0026thinsp;=\u0026thinsp;highly relevant), the model generated six independent rating runs. We computed CVI-style indices at the item level (I‑CVI) and scale level (S‑CVI/Ave) and applied commonly used screening cut-offs (I‑CVI\u0026thinsp;\u0026ge;\u0026thinsp;0.75; S‑CVI/Ave\u0026thinsp;\u0026ge;\u0026thinsp;0.90) (Polit et al. \u003cspan class=\"CitationRef\"\u003e2007\u003c/span\u003e). 
These model-based indices were used as additional screening evidence only and did not replace human judgment; final item-retention decisions were made by the research team.\u003c/p\u003e\n\u003cp\u003eDuring refinement, we conducted cognitive-clarity checks of both language versions with domain experts and bilingual reviewers, focusing on comprehensibility, cultural appropriateness, and role realism in scenario-based items.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eTranslation Workflow and AI Assistance\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eThe ESTG-Q was originally authored in Chinese for administration with examination administrators in China. To support international use and cross-cultural research, we produced an English version via a translation and back-translation procedure guided by established cross-cultural adaptation recommendations (Beaton et al. \u003cspan class=\"CitationRef\"\u003e2000\u003c/span\u003e), with a focus on conceptual equivalence rather than literal translation.\u003c/p\u003e\n\u003cp\u003eInitial English drafts were generated with assistance from a GPT-series large language model to accelerate drafting and maintain terminological consistency, then reviewed and edited by bilingual researchers for conceptual equivalence and domain-appropriate phrasing. An independent professional translator, not involved in drafting, performed back-translation into Chinese; discrepancies were reconciled iteratively with attention to semantic equivalence and policy-relevant nuance. Small-scale clarity checks with bilingual experts were then conducted for both language versions.\u003c/p\u003e\n\u003cp\u003eAI assistance was used for drafting and consistency checks and for the additional CVI-style screening described above; all substantive decisions on wording, retention, and interpretation were made by the authors. 
We then proceeded to field administration and psychometric analyses, and report reliability, structural evidence for a reflective subscale, and illustrative uses of the attitude-marking data.\u003c/p\u003e"},{"header":"4. Results","content":"\u003cp\u003e\u003cem\u003eWhole-Scale Reliability and Feedback Overview\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eAdditional LLM-assisted content-relevance screening showed high item-level agreement: \u0026gt;90% of items met the screening benchmark (I-CVI\u0026thinsp;\u0026ge;\u0026thinsp;0.75), with S-CVI/Ave\u0026thinsp;=\u0026thinsp;0.92 for all items and 0.97 for core-construct items (Table \u003cspan class=\"InternalRef\"\u003e3\u003c/span\u003e). Internal consistency for the 233 Likert-type items was high (Cronbach\u0026rsquo;s \u0026alpha;\u0026thinsp;=\u0026thinsp;0.963). Attitude-marking uptake was broad (~\u0026thinsp;90% of participants marked at least one item); overall thumbs-up and thumbs-down rates were 34.1% and 2.8%, respectively (Table \u003cspan class=\"InternalRef\"\u003e3\u003c/span\u003e; rate definitions in Section \u003cspan class=\"InternalRef\"\u003e3\u003c/span\u003e).\u003c/p\u003e\n\u003cp\u003e\u0026ldquo;Thumbs-up rate\u0026rdquo; and \u0026ldquo;thumbs-down rate\u0026rdquo; were computed as full-sample proportions over item-by-respondent opportunities. For an item set containing \u003cem\u003eK\u003c/em\u003e items, the thumbs-up rate equals \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\frac{\\sum_{i=1}^{N}\\sum_{k=1}^{K}\\mathbf{1}(\\mathrm{Att}_{ik}=\\mathrm{Up})}{N\\times K}\\times 100\\%\\)\u003c/span\u003e\u003c/span\u003e, where \u003cem\u003eN\u003c/em\u003e is the number of respondents and Att\u003csub\u003eik\u003c/sub\u003e is the mark given by respondent \u003cem\u003ei\u003c/em\u003e to item \u003cem\u003ek\u003c/em\u003e; the thumbs-down rate is defined analogously. 
Unmarked responses and explicit abstentions were included in the denominator but not the numerator.\u003c/p\u003e\u003cdiv class=\"gridtable\"\u003e\u0026nbsp;\u003ctable id=\"Tab3\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 3\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eSummary of content-relevance screening (CVI-style indices), internal consistency, and respondent-centered acceptability indicators.\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e \u003ccolgroup cols=\"3\"\u003e\u003c/colgroup\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\"\u003e\u003cp\u003eIndicator\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\"\u003e\u003cp\u003eCriterion / rationale\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\"\u003e\u003cp\u003eResult\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eI-CVI (item relevance)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eI-CVI (item relevance; CVI-style) \u0026mdash; Screening benchmark: I-CVI\u0026thinsp;\u0026ge;\u0026thinsp;0.75 (reported for transparency; interpretation guidelines in expert-panel CVI studies: Polit et al. \u003cspan class=\"CitationRef\"\u003e2007\u003c/span\u003e)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003e\u0026gt;\u0026thinsp;90% of items\u0026thinsp;\u0026ge;\u0026thinsp;0.75\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eS-CVI/Ave (all items)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eS-CVI/Ave (all items) \u0026mdash; Screening benchmark: S-CVI/Ave\u0026thinsp;\u0026ge;\u0026thinsp;0.90 (commonly used in CVI reporting; Polit et al. 
\u003cspan class=\"CitationRef\"\u003e2007\u003c/span\u003e)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003e0.92\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eS-CVI/Ave (core-construct items)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eExcluding primarily functional/context items (Q253\u0026ndash;Q280)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003e0.97\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eWhole-instrument internal consistency\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eCronbach\u0026rsquo;s \u0026alpha; for 233 Likert-type items\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003e\u0026alpha;\u0026thinsp;=\u0026thinsp;0.963\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eThumbs-up rate (acceptability)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eRespondent-centered item approval signal\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003e34.1%\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eThumbs-down rate (acceptability)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eRespondent-centered item objection signal\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003e2.8%\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003ctfoot\u003e\u003ctr\u003e\u003ctd colspan=\"3\"\u003e\u003cem\u003eNote: CVI-style indices were computed from LLM-generated rating runs under a fixed prompt and 4-point rubric and are reported as additional screening evidence; final item-retention decisions were made by the research team based on substantive 
review.\u003c/em\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tfoot\u003e\u003c/table\u003e\u003c/div\u003e\u003cp\u003e\u003cem\u003eCheating Attribution: Institutional Deterrence and Role-based Discretion (Q236\u0026ndash;Q242)\u003c/em\u003e\u003c/p\u003e\u003cp\u003eSubscale structure and reliability: The seven-item Cheating Attribution subscale (Q236\u0026ndash;Q242) was analysed as a worked example of a reflective subscale. It was designed to capture two theorised dimensions: (1) institutional deterrence/governance beliefs (Q236\u0026ndash;Q238) and (2) role-based moral permissiveness (Q239\u0026ndash;Q242). For composite scoring and factor modelling, Q236\u0026ndash;Q238 were reverse-scored so that higher values indicate greater permissiveness (Section \u003cspan class=\"InternalRef\"\u003e3\u003c/span\u003e); item-level descriptives are reported in original coding for interpretability (Table \u003cspan class=\"InternalRef\"\u003e6\u003c/span\u003e). Internal consistency was acceptable for an early-stage subscale (Cronbach\u0026rsquo;s \u0026alpha;\u0026thinsp;=\u0026thinsp;0.723; Nunnally \u0026amp; Bernstein \u003cspan class=\"CitationRef\"\u003e1994\u003c/span\u003e).\u003c/p\u003e\u003cdiv class=\"gridtable\"\u003e\u0026nbsp;\u003ctable id=\"Tab4\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 4\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eItem analysis and factor loadings for the Cheating Attribution subscale (Q236\u0026ndash;Q242) using principal axis factoring with oblimin rotation. CITC\u0026thinsp;=\u0026thinsp;corrected item\u0026ndash;total correlation. 
Items Q236\u0026ndash;Q238 are reverse-scored.\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e \u003ccolgroup cols=\"6\"\u003e\u003c/colgroup\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\"\u003e\u003cp\u003eItem\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\"\u003e\u003cp\u003eCITC\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\"\u003e\u003cp\u003e\u0026alpha; if deleted\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\"\u003e\u003cp\u003eFactor 1 loading (institutional)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\"\u003e\u003cp\u003eFactor 2 loading\u003c/p\u003e\u003cp\u003e(Role-based)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\"\u003e\u003cp\u003eReverse-scored\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eQ236\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e0.174\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e0.741\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e0.58\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e-0.06\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eYes\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eQ237\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e0.197\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e0.737\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e0.95\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e0.05\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eYes\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eQ238\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e0.269\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e0.724\u003c/p\u003e\u003c/td\u003e\u003ctd 
align=\"char\"\u003e\u003cp\u003e0.58\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e0.08\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eYes\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eQ239\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e0.568\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e0.654\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e0.03\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e0.72\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eNo\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eQ240\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e0.665\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e0.625\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e0.05\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e0.85\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eNo\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eQ241\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e0.534\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e0.665\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e-0.02\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e0.75\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eNo\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eQ242\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e0.581\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e0.650\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e0.00\u003c/p\u003e\u003c/td\u003e\u003ctd 
align=\"char\"\u003e\u003cp\u003e0.76\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eNo\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/table\u003e\u003c/div\u003e\u003cp\u003eExploratory and confirmatory factor analyses: EFA used principal-axis factoring with oblimin rotation (Fabrigar et al. \u003cspan class=\"CitationRef\"\u003e1999\u003c/span\u003e; Costello \u0026amp; Osborne \u003cspan class=\"CitationRef\"\u003e2005\u003c/span\u003e). Sampling adequacy supported factor analysis (KMO\u0026thinsp;=\u0026thinsp;0.728; Bartlett\u0026rsquo;s test of sphericity \u0026chi;\u0026sup2; = 474.26, df\u0026thinsp;=\u0026thinsp;21, p\u0026thinsp;\u0026lt;\u0026thinsp;0.001). Two factors with eigenvalues\u0026thinsp;\u0026gt;\u0026thinsp;1 were extracted, explaining 67.86% of the total variance (Factor 1: 39.77%; Factor 2: 28.09%). The loading pattern was clean, with minimal cross-loadings (absolute values\u0026thinsp;\u0026lt;\u0026thinsp;0.10), supporting the intended separation between institutional deterrence/governance beliefs and role-based moral permissiveness (Table \u003cspan class=\"InternalRef\"\u003e4\u003c/span\u003e). Confirmatory factor analysis (CFA) further supported the two-factor model with good fit (\u0026chi;\u0026sup2;/df\u0026thinsp;=\u0026thinsp;2.10, CFI\u0026thinsp;=\u0026thinsp;0.97, RMSEA\u0026thinsp;=\u0026thinsp;0.06), and the two-factor model outperformed a one-factor alternative (\u0026Delta;\u0026chi;\u0026sup2; test, p\u0026thinsp;\u0026lt;\u0026thinsp;0.01) (Hu \u0026amp; Bentler \u003cspan class=\"CitationRef\"\u003e1999\u003c/span\u003e; Brown \u003cspan class=\"CitationRef\"\u003e2015\u003c/span\u003e). 
Key EFA and CFA indices are summarized in Table \u003cspan class=\"InternalRef\"\u003e5\u003c/span\u003e.\u003c/p\u003e\u003cdiv class=\"gridtable\"\u003e\u0026nbsp;\u003ctable id=\"Tab5\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 5\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eKey indices for exploratory (EFA) and confirmatory (CFA) factor analyses for Q236\u0026ndash;Q242 (N\u0026thinsp;=\u0026thinsp;207).\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e \u003ccolgroup cols=\"4\"\u003e\u003c/colgroup\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\"\u003e\u003cp\u003eAnalysis\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\"\u003e\u003cp\u003eIndex\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\"\u003e\u003cp\u003eValue\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\"\u003e\u003cp\u003eSuggested criterion\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eEFA\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eEigenvalue (Factor 1)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003e2.784\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003e\u0026gt;\u0026thinsp;1\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eEFA\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eEigenvalue (Factor 2)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003e1.966\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003e\u0026gt;\u0026thinsp;1\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eEFA\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eVariance explained (Factor 1)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003e39.77%\u003c/p\u003e\u003c/td\u003e\u003ctd 
align=\"left\"\u003e\u003cp\u003e\u0026mdash;\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eEFA\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eVariance explained (Factor 2)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003e28.09%\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003e\u0026mdash;\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eEFA\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eCumulative variance explained\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003e67.86%\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003e\u0026ge;\u0026thinsp;50%\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eEFA\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eKMO\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003e0.728\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003e\u0026ge;\u0026thinsp;0.70\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eEFA\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eBartlett\u0026rsquo;s test\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003e\u0026chi;\u0026sup2;(21)\u0026thinsp;=\u0026thinsp;474.26;\u003c/p\u003e\u003cp\u003ep\u0026thinsp;\u0026lt;\u0026thinsp;0.001\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eSignificant\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eCFA\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eModel fit\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003e\u0026chi;\u0026sup2;/df\u0026thinsp;=\u0026thinsp;2.10; CFI\u0026thinsp;=\u0026thinsp;0.97; 
RMSEA\u0026thinsp;=\u0026thinsp;0.06\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\"\u003e\u003cp\u003e\u0026chi;\u0026sup2;/df\u0026thinsp;\u0026lt;\u0026thinsp;3; CFI\u0026thinsp;\u0026ge;\u0026thinsp;0.95; RMSEA\u0026thinsp;\u0026le;\u0026thinsp;0.06\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/table\u003e\u003c/div\u003e\u003cp\u003eScenario-based moral leniency: Item-level response patterns underscored the practical difficulty of relying exclusively on human discretion at the point of enforcement. Institutional deterrence beliefs were comparatively consensual. For example, Q236 showed a high central tendency (M\u0026thinsp;=\u0026thinsp;3.61, SD\u0026thinsp;=\u0026thinsp;1.29), with 30.0% selecting endpoint 5 and 4.4% selecting endpoint 0 (Table\u0026nbsp;\u003cspan class=\"InternalRef\"\u003e6\u003c/span\u003e). By contrast, role-based scenario items showed wider dispersion. Q242 showed a lower mean and wider dispersion (M\u0026thinsp;=\u0026thinsp;1.81, SD\u0026thinsp;=\u0026thinsp;1.84), with strong condemnation (42.5% at 0) but also a non-trivial permissive tail (10.5% at 5) (Table\u0026nbsp;\u003cspan class=\"InternalRef\"\u003e6\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eTaken together, the role-based items (Q239\u0026ndash;Q242) suggest that a persistent minority selects highly permissive response options (approximately one in ten at the endpoint 5 across items), while a larger middle band falls between strict prohibition and explicit tolerance. This distributional heterogeneity is not direct evidence of real-world misconduct; rather, it indicates that administrators\u0026rsquo; attitudinal responses to governance dilemmas are more variable than their confidence in institutional deterrence. 
In operational terms, these patterns highlight a governance vulnerability: even when formal rules and technical controls are viewed as strong, integrity outcomes may hinge on discretionary judgments in ambiguous scenarios, especially in cases where reporting creates reputational or relational costs.\u003c/p\u003e\u003cp\u003eAttitude-marking data indicated that this subscale was broadly acceptable to respondents (Table\u0026nbsp;\u003cspan class=\"InternalRef\"\u003e7\u003c/span\u003e): thumbs-up marks substantially exceeded thumbs-down marks on every item, despite the sensitive nature of these ethical scenarios. This imbalance indicates that most respondents found the questions themselves fair and pertinent, even if their answers diverged. In sum, the Cheating Attribution subscale demonstrated a meaningful two-factor division with practical significance: administrators largely endorse the importance of institutional measures (Factor 1), yet diverge in role-based moral discretion (Factor 2), with a minority expressing permissive attitudes that may complicate enforcement.\u003c/p\u003e\u003cp\u003e\u003cem\u003eDescriptive Insights from Other Modules\u003c/em\u003e\u003c/p\u003e\u003cp\u003eBeyond Q236\u0026ndash;Q242, the ESTG‑Q yielded descriptive evidence across three domains: (a) awareness of technology‑aided cheating trends, (b) governance policy and resource attitudes, and (c) historical perspectives on exam integrity.\u003c/p\u003e\u003cp\u003eIllustrative items suggest broad recognition of the historical shift toward more covert, technology‑aided cheating (e.g., Q40 and Q42; Table\u0026nbsp;\u003cspan class=\"InternalRef\"\u003e6\u003c/span\u003e) and mixed views on macro‑level rationales and governance resourcing (e.g., Q50, Q266, and Q267; Table\u0026nbsp;\u003cspan class=\"InternalRef\"\u003e6\u003c/span\u003e). 
The juxtaposition of institutional deterrence beliefs (Q236) and a reporting‑related dilemma (Q242) further indicates that higher endorsement of institutional deterrence statements can coexist with heterogeneous discretionary judgments (Table\u0026nbsp;\u003cspan class=\"InternalRef\"\u003e6\u003c/span\u003e). Open‑ended comments (not systematically analysed) noted needs for clearer guidance and training for handling ambiguous cases.\u003c/p\u003e\u003cp\u003eSome items were intentionally retained as forward‑looking \u0026lsquo;stress tests\u0026rsquo; (e.g., speculative AI‑enabled cheating scenarios or highly intrusive countermeasures). Given their exploratory intent and potential contestability across contexts, users may treat these items as optional probes rather than core attitude indicators.\u003c/p\u003e\u003cp\u003eAttitude‑marking complements psychometric indices by providing respondent‑centred acceptability signals. As summarised in Table \u003cspan class=\"InternalRef\"\u003e7\u003c/span\u003e, thumbs‑down rates remained consistently low across item sets (~\u0026thinsp;2\u0026ndash;4%), whereas thumbs‑up rates varied by topic. This pattern can help prioritise which item sets may warrant rephrasing or additional contextualisation in future refinement. Table \u003cspan class=\"InternalRef\"\u003e6\u003c/span\u003e presents a summary of key items from sections (a)\u0026ndash;(c) above, alongside the earlier-mentioned Q236 and Q242, to illustrate the range of findings. These examples showcase the ESTG-Q\u0026rsquo;s granularity: it captures both the broad trends (e.g. general confidence in technology, recognition of hidden cheating) and the pinpoint dilemmas (e.g. 
whether to report a colleague\u0026rsquo;s lapse, whether to invest huge sums in proctoring systems) that characterize exam security governance today.\u003c/p\u003e\u003cdiv class=\"gridtable\"\u003e\u0026nbsp;\u003ctable id=\"Tab6\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 6\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eDescriptive statistics for selected ESTG‑Q items spanning multiple content areas.\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e \u003ccolgroup cols=\"5\"\u003e\u003c/colgroup\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\"\u003e\u003cp\u003eItem (ID)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\"\u003e\u003cp\u003en\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\"\u003e\u003cp\u003eMean (SD)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\"\u003e\u003cp\u003eEndpoint 5 (%)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\"\u003e\u003cp\u003eEndpoint 0 (%)\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003ePre‑2000 exams were manageable; high‑tech cheating not yet a major threat. (Q40)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e203\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e3.38 (1.51)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e28.1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e8.4\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eCheating tech vs. counter‑tech is an arms race; cheating is now more covert. 
(Q42)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e205\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e3.77 (1.38)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e38.5\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e4.9\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eInvesting heavily in exam‑security tech yields large social/economic returns. (Q50)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e204\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e3.18 (1.49)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e22.1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e8.8\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eA research‑based pilot project with enterprise technical support is needed. (Q266)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e196\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e3.39 (1.49)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e28.6\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e8.2\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eOur institution needs auxiliary project‑funding support. (Q267)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e195\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e2.79 (1.85)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e24.1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e22.1\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eCurrent exam rules and punishments are sufficient to deter cheating. 
(Q236)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e203\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e3.61 (1.29)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e30.0\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e4.4\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eStudent opportunism / role-based moral permissiveness.(Q239)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e202\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e2.16 (1.89)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e13.9\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e36.1\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eIf an exam ends \u0026lsquo;smoothly\u0026rsquo;, a cheating incident need not be reported up the chain. (Q242)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e200\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e1.81 (1.84)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e10.5\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e42.5\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003ctfoot\u003e\u003ctr\u003e\u003ctd colspan=\"5\"\u003e\u003cem\u003eNote. Items are coded 0\u0026ndash;5 (higher scores indicate stronger agreement/endorsement of the statement). \u0026ldquo;Endpoint 5 (%)\u0026rdquo; and \u0026ldquo;Endpoint 0 (%)\u0026rdquo; report valid percentages selecting the scale endpoints (5 or 0), computed using item-level valid responses (n). 
Here n is the number of valid responses for each item (it varies because of item-level nonresponse).\u003c/em\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tfoot\u003e\u003c/table\u003e\u003c/div\u003e\u003cp\u003eSelected items in Table \u003cspan class=\"InternalRef\"\u003e6\u003c/span\u003e, especially Q236 and Q242, highlight a policy\u0026ndash;practice tension in exam governance. Figure \u003cspan class=\"InternalRef\"\u003e2\u003c/span\u003e provides a compact visualization by contrasting endpoint response rates (percentage selecting 0 vs. 5) for an institutional deterrence item (Q236) and a role-based moral dilemma item (Q242). Whereas Q236 shows low outright rejection (4.4% at 0) alongside substantial strong agreement (30.0% at 5), Q242 exhibits a much larger rejection endpoint (42.5% at 0) alongside a non-trivial permissive endpoint (10.5% at 5) (Table \u003cspan class=\"InternalRef\"\u003e6\u003c/span\u003e; Fig. \u003cspan class=\"InternalRef\"\u003e2\u003c/span\u003e). This endpoint asymmetry illustrates how institutional confidence can coexist with heterogeneous moral discretion in scenario-based judgments.\u003c/p\u003e\u003cp\u003eAttitude‑marking feedback aggregated by item set (Table \u003cspan class=\"InternalRef\"\u003e7\u003c/span\u003e) shows consistently low thumbs‑down rates (~\u0026thinsp;2\u0026ndash;4%) and topic‑dependent variation in thumbs‑up rates. 
Across all item sets, positive marks outweighed negative marks, indicating broad acceptability of the instrument content.\u003c/p\u003e\u003cdiv class=\"gridtable\"\u003e\u0026nbsp;\u003ctable id=\"Tab7\" border=\"1\" class=\"fr-table-selection-hover\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 7\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eAttitude-marking feedback summary by major item sets.\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e \u003ccolgroup cols=\"3\"\u003e\u003c/colgroup\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\"\u003e\u003cp\u003eItem set / Topic\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\"\u003e\u003cp\u003eThumbs-Up Rate\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\"\u003e\u003cp\u003eThumbs-Down Rate\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eHistorical Cognition \u0026amp; Policy environment (Q1\u0026ndash;Q39)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e36.5%\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e3.5%\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eTechnology-aided cheating \u0026amp; countermeasures (Q40\u0026ndash;Q50)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e36.2%\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e2.3%\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eTechnology governance \u0026amp; readiness (Q51\u0026ndash;Q119)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e34.4%\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e3.0%\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eProctoring technology implementation \u0026amp; ethics 
(Q120\u0026ndash;Q199)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e32.6%\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e2.9%\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eCheating psychology, incidents \u0026amp; governance dilemmas (Q200\u0026ndash;Q252)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e33.6%\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e2.5%\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eCheating Attribution subscale (Q236\u0026ndash;Q242; subset of Q200\u0026ndash;Q252)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e33.7%\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e3.3%\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003eGovernance strategy, resources \u0026amp; respondent context (Q253\u0026ndash;Q280)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e33.8%\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e1.8%\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\"\u003e\u003cp\u003e\u003cem\u003eWhole instrument (Q1\u0026ndash;Q280)\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e34.1%\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\"\u003e\u003cp\u003e2.8%\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003ctfoot\u003e\u003ctr\u003e\u003ctd colspan=\"3\"\u003e\u003cem\u003eNote. Rates are computed over item-by-respondent opportunities within each item set, using the full valid sample (N\u0026thinsp;=\u0026thinsp;207) as the denominator. 
\u0026ldquo;Thumbs-Up Rate\u0026rdquo; = total number of thumbs-up marks divided by\u003c/em\u003e \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:N\\times\\:K\\)\u003c/span\u003e\u003c/span\u003e\u003cem\u003e(where K is the number of items in the set), expressed as a percentage; \u0026ldquo;Thumbs-Down Rate\u0026rdquo; is defined analogously. Unmarked responses and abstentions are included in the denominator but not the numerator.\u003c/em\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tfoot\u003e\u003c/table\u003e\u003c/div\u003e"},{"header":"5. Discussion","content":"\u003cp\u003eThis study introduces the ESTG-Q as a modular instrument for characterising examination administrators’ perspectives on exam security and technology governance. The pooled Likert item set showed high internal consistency (Cronbach’s α = 0.963). Given the instrument’s breadth, this value should be interpreted as reliability of the overall item pool rather than as evidence of a single latent construct (Cronbach \u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e1951\u003c/span\u003e). Accordingly, we recommend reporting governance domains as profiles and validating reflective subscales only where psychometrically warranted (DeVellis \u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e2016\u003c/span\u003e). Across respondents from 27 provinces, attitude-marking feedback indicated broad acceptability of the items (low thumbs-down rates; Table\u0026nbsp;\u003cspan refid=\"Tab7\" class=\"InternalRef\"\u003e7\u003c/span\u003e), suggesting that issues such as high-tech cheating, policy enforcement, and exam ethics are widely recognised within this administrative population. At the same time, module-level analyses revealed heterogeneity that would be obscured by any single aggregate score. 
In particular, items Q236–Q242 supported a two-factor structure separating institutional deterrence/governance beliefs from role-based moral permissiveness, consistent with discretion in “street-level” implementation contexts (Lipsky \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e1980\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eScenario-based items exhibited greater dispersion than deterrence-oriented items, indicating heterogeneity in how administrators evaluate discretionary leniency in role-conflict dilemmas. These responses represent attitudes to hypothetical situations rather than evidence of actual misconduct; nevertheless, they imply that integrity outcomes may partly depend on discretionary judgments at the point of implementation. In line with “street-level” governance accounts emphasising frontline discretion (Lipsky \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e1980\u003c/span\u003e), this pattern motivates clearer escalation and reporting pathways, reporting protections, and scenario-based training targeted at ambiguous cases.\u003c/p\u003e \u003cp\u003eTaken together, the two factors point to a policy–practice tension: respondents can express strong endorsement of institutional deterrence/governance beliefs while differing in role-based judgments under discretionary dilemmas. Figure\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e illustrates this contrast by comparing endpoint response rates for a deterrence-oriented item (Q236) and a reporting-related dilemma item (Q242). 
Conceptually, this aligns with implementation research showing that formal rules and technical controls are enacted through frontline interpretation and discretion (Lipsky \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e1980\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eFor exam governance design, the findings underscore a socio-technical implication: detection and monitoring tools are only as effective as the organisational pathways through which alerts and suspicions are documented, escalated, and reviewed. Designing audit trails and escalation mechanisms may reduce reliance on single-actor discretion, yet stronger automation must be balanced against legitimacy concerns around privacy, proportionality, transparency, and trust (Kharbat \u0026amp; Abu Daabes \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e2021\u003c/span\u003e; Coghlan et al. \u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e2021\u003c/span\u003e). Institutionally, the results motivate investments in procedural clarity—particularly for reporting and escalation—alongside protections for reporting and structured scenario-based training that targets morally ambiguous cases. Such supports may help align discretionary judgment with policy expectations while recognising the legitimacy challenges that can accompany strict enforcement (Kharbat \u0026amp; Abu Daabes \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e2021\u003c/span\u003e; Coghlan et al. \u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e2021\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eBeyond psychometrics, the ESTG-Q offers a structured way to document governance trade-offs in a context where exam security infrastructures are comparatively mature, enabling international readers to examine how integrity is operationalised under strong deterrence regimes. 
Ethically, the findings reinforce that integrity governance is not only about examinee conduct: administrators’ discretionary judgments and moral reasoning form part of the socio-technical system and warrant empirical attention. Supporting consistent practice may therefore require not only technical controls but also clearer guidance and institutional support for handling morally ambiguous cases.\u003c/p\u003e\n\n "},{"header":"6. Limitations and Future Directions","content":"\u003cp\u003eThis initial study is a first step and several limitations point to clear, testable next steps.\u003c/p\u003e\u003cp\u003eSampling and generalisability. Although respondents were drawn from 27 provinces, the sample was modest (N = 207) and anchored in China’s high‑stakes examination environment. The participant group—examination administrators—may not represent other roles or other governance regimes. Replication with larger and more diverse samples is therefore needed, alongside cross‑context comparisons that test whether the same response patterns, factor structure, and parameter estimates emerge. Where cross‑context data are available, formal measurement‑invariance tests can help distinguish stable construct variance from context‑specific functioning.\u003c/p\u003e\u003cp\u003eScope of evidence provided in this paper. The ESTG‑Q is intentionally broad and modular. In this manuscript, we report (a) whole‑instrument internal consistency for the Likert item pool, (b) a worked structural‑validity demonstration for one reflective subscale (Q236–Q242), and (c) descriptive profiling for selected items from other parts of the questionnaire to illustrate response distributions and potential areas of sensitivity. Structural validation of additional reflective subscales—and fit‑for‑purpose modelling strategies for diagnostic or scenario‑based item sets—should be undertaken in targeted follow‑up studies with larger samples and module‑specific analytic plans. 
This staged approach aligns instrument development with the instrument’s modular design rather than forcing a single, one‑shot validation model.\u003c/p\u003e\u003cp\u003eMeasurement and modelling considerations. The present analyses rely primarily on classical indices and factor‑analytic techniques. Given the instrument’s multi‑domain design, future work could apply multidimensional item response theory to estimate domain‑specific traits and their relations (Reckase \u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e2009\u003c/span\u003e). Where response patterns suggest both a broad tendency and domain‑specific variance, bifactor models may be used to evaluate whether a general factor (e.g., broad response propensity) can be separated from domain factors (Reise \u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e2012\u003c/span\u003e). For scenario‑based item sets that may induce local dependence, testlet approaches offer a principled option (Wainer et al. \u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e2007\u003c/span\u003e). Importantly, the very high whole‑instrument α should be interpreted as internal consistency across a large Likert pool rather than evidence of a single latent construct; future work may also consider item reduction and short forms while preserving conceptual coverage (DeVellis \u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e2016\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eResponse process, behavioural evidence, and criterion‑related validity. Findings are based on self‑reported attitudes and responses to hypothetical scenarios, which may not map cleanly onto real‑world decision behaviour—especially when responses are shaped by perceived interpersonal costs, institutional incentives, or risk of negative consequences. 
Future studies should triangulate survey responses with behavioural indicators where feasible (e.g., incident‑handling records, audit trails, or structured simulations) and incorporate qualitative follow‑ups (e.g., cognitive interviews) to clarify how respondents interpret scenarios and choose responses. Test–retest reliability, convergent/discriminant evidence with related constructs, and criterion‑related (including predictive) validity evidence would further strengthen the validity argument.\u003c/p\u003e\u003cp\u003eAttitude‑marking as validity‑relevant evidence. The embedded thumbs‑up/ thumbs‑down/ abstain feature provides respondent‑centred signals of acceptability and clarity, but its measurement properties warrant systematic study. Future work could test the stability of marking behaviour, its relationship to missingness and response styles (e.g., satisficing, acquiescence), and whether marking predicts item‑functioning issues that conventional psychometrics may miss (e.g., differential item functioning or scenario‑specific misinterpretation). If supported, marking behaviour could become a practical diagnostic layer for iterative item refinement and cross‑context adaptation.\u003c/p\u003e\u003cp\u003eOverall, the ESTG‑Q provides a foundation for a staged programme of module‑level validation and application across contexts, enabling more granular study of exam‑integrity governance than single‑score instruments.\u003c/p\u003e"},{"header":"7. Conclusion","content":"\u003cp\u003eIn this paper, we presented the development and initial validation of the Exam Security and Technology Governance Questionnaire (ESTG-Q), a modular instrument designed to capture examination administrators\u0026rsquo; perspectives on exam security and technology governance. The instrument supported reliable measurement in this initial deployment and yielded interpretable descriptive profiling across domains. 
We provided evidence for a two-factor structure in the Cheating Attribution subscale (Q236\u0026ndash;Q242), separating institutional deterrence/governance beliefs from role-based moral permissiveness, indicating that governance attitudes may be partially independent rather than reducible to a single integrity dimension.\u003c/p\u003e \u003cp\u003eWe also introduced an item-level attitude-marking mechanism (thumbs-up/thumbs-down/abstain) that provides respondent-centred signals of acceptability and perceived clarity; these signals complement psychometric evidence and can guide iterative item refinement. In this study, thumbs-up marks substantially exceeded thumbs-down marks, suggesting that respondents perceived few problems with the survey content, while differences in sensitivity could still be detected at the item-set level.\u003c/p\u003e \u003cp\u003eSubstantively, the findings highlight a core governance implication: integrity regimes that rely on legal deterrence and technical countermeasures still depend on discretionary human decisions at implementation points. Consistent with scale-development practice, ESTG-Q is designed for staged validation, with module-specific analyses reported progressively as part of a broader research programme.\u003c/p\u003e \u003cp\u003eOverall, ESTG-Q offers a flexible tool for examining exam integrity governance as a socio-technical system. Its modular structure supports targeted use for specific governance questions, and its embedded feedback channel adds a practical layer of respondent-centred evidence alongside standard psychometrics. We hope this work supports future research and evaluation that moves beyond prevalence-only accounts toward a more granular understanding of how policies, technologies, and human judgment jointly shape integrity outcomes in high-stakes examinations.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003eParticipant consent statement: All participants were adults. 
Prior to participation, participants received a written information sheet describing the study purpose, voluntary participation, confidentiality protections, and data-handling procedures. Participants could skip any question and could discontinue participation prior to submitting the questionnaire. Return of a completed paper questionnaire was taken as informed consent to participate.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis research was supported by \u0026nbsp;XXXXX \u0026nbsp;Co., Ltd.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eEthics approval and consent to participate\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eAll procedures involving human participants were conducted in accordance with the Declaration of Helsinki and relevant national requirements, including the National Health Commission of China’s Measures for the Ethical Review of Life Science and Medical Research Involving Human Participants (2023). The study was assessed as minimal risk. Data were collected via paper questionnaires administered to adult participants. Questionnaires were not anonymous at collection because contact/administrative details could be used for distribution and return logistics. An information sheet described the study purpose, voluntary participation, confidentiality protections, and data-handling procedures; participants could skip any item or discontinue prior to submission, and return of a completed questionnaire was taken as informed consent. For analysis, responses were de-identified and stored in an ID-coded dataset. Any linkage between response IDs and identifying information was stored separately under access-restricted, sealed conditions and was not used in statistical analyses.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e
Use of AI-assisted tools\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eA GPT-series large language model was used to assist translation drafting and semantic consistency checks, and to generate simulated relevance ratings for an additional CVI-style content-relevance screening step under a fixed prompt and rubric, as described in the Methods. The model was not used to generate survey responses or to perform statistical analyses. All AI outputs were reviewed by the authors; the model does not meet authorship criteria and all authors take responsibility for the work.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eArnold IJM (2022) Online proctored assessment during COVID-19: Has cheating increased? Journal of Economic Education 53(4):277\u0026ndash;295. https://doi.org/10.1080/00220485.2022.2111384\u003c/li\u003e\n\u003cli\u003eBalash DG, Korkes E, Grant M, Aviv AJ, Fainchtein RA, Sherr M (2023) Educators\u0026rsquo; perspectives of using (or not using) online exam proctoring. In: Proceedings of the 32nd USENIX Security Symposium (USENIX Security 23). USENIX Association, pp 5091\u0026ndash;5108. (Online) https://www.usenix.org/conference/usenixsecurity23/presentation/balash\u003c/li\u003e\n\u003cli\u003eBeaton DE, Bombardier C, Guillemin F, Ferraz MB (2000) Guidelines for the process of cross-cultural adaptation of self-report measures. Spine (Phila Pa 1976) 25(24):3186\u0026ndash;3191. https://doi.org/10.1097/00007632-200012150-00014\u003c/li\u003e\n\u003cli\u003eBretag T, Harper R, Burton M, Ellis C, Newton P, Rozenberg P, Saddiqui S, van Haeringen K (2019) Contract cheating: a survey of Australian university students. \u003cem\u003eStudies in Higher Education\u003c/em\u003e 44(11):1837\u0026ndash;1856. 
https://doi.org/10.1080/03075079.2018.1462788
Bretag T, Mahmud S, Wallace M, Walker R, James C, Green M, East J, McGowan U, Partridge L (2011) Core elements of exemplary academic integrity policy in Australian higher education. International Journal for Educational Integrity 7(2):3–12. https://ojs.unisa.edu.au/index.php/IJEI/article/view/759
Brown TA (2015) Confirmatory factor analysis for applied research. 2nd edn. Guilford Press, New York
Chaudhry K, Theus AL, Assal H, Chiasson S (2023) “It’s not that I want to see the student’s bedroom…”: Instructor perceptions of e-proctoring software. In: Proceedings of the 2023 European Symposium on Usable Security (EuroUSEC ’23). Association for Computing Machinery, New York, NY, USA, pp 15–26. https://doi.org/10.1145/3617072.3617103
Chyung SY, Roberts K, Swanson I, Hankinson A (2017) Evidence-based survey design: the use of a midpoint on the Likert scale. Performance Improvement 56(10):15–23. https://doi.org/10.1002/pfi.21727
Clarke R, Lancaster T (2013) Commercial aspects of contract cheating. In: Proceedings of the 18th ACM Conference on Innovation and Technology in Computer Science Education (ITiCSE ’13). Association for Computing Machinery, New York, NY, USA, pp 219–224. https://doi.org/10.1145/2462476.2462497
Coghlan S, Miller T, Paterson J (2021) Good proctor or “Big Brother”? Ethics of online exam supervision technologies. Philosophy & Technology 34(4):1581–1606. https://doi.org/10.1007/s13347-021-00476-1
Costello AB, Osborne JW (2005) Best practices in exploratory factor analysis: four recommendations for getting the most from your analysis. Practical Assessment, Research, and Evaluation 10(7):1–9. https://doi.org/10.7275/jyj1-4868
Cronbach LJ (1951) Coefficient alpha and the internal structure of tests. Psychometrika 16:297–334. https://doi.org/10.1007/BF02310555
DeVellis RF (2016) Scale development: theory and applications. 4th edn. Sage, Thousand Oaks
Dendir S, Maxwell RS (2020) Cheating in online courses: Evidence from online proctoring. Computers in Human Behavior Reports 2:100033. https://doi.org/10.1016/j.chbr.2020.100033
Fabrigar LR, Wegener DT, MacCallum RC, Strahan EJ (1999) Evaluating the use of exploratory factor analysis in psychological research. Psychol Methods 4(3):272–299. https://doi.org/10.1037/1082-989X.4.3.272
Garland R (1991) The mid-point on a rating scale: is it desirable? Marketing Bulletin 2:66–70
Hu L, Bentler PM (1999) Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives. Struct Equ Modeling 6(1):1–55. https://doi.org/10.1080/10705519909540118
Kharbat FF, Abu Daabes AS (2021) E-proctored exams during the COVID-19 pandemic: a close understanding. Educ Inf Technol 26:6589–6605. https://doi.org/10.1007/s10639-021-10458-7
Kline RB (2015) Principles and practice of structural equation modeling. 4th edn. Guilford Press, New York
Lipsky M (1980) Street-level bureaucracy: dilemmas of the individual in public services. Russell Sage Foundation, New York
Newton PM (2018) How common is commercial contract cheating in higher education and is it increasing? A systematic review. Frontiers in Education 3:67. https://doi.org/10.3389/feduc.2018.00067
Noorbehbahani F, Mohammadi A, Aminazadeh M (2022) A systematic review of research on cheating in online exams from 2010 to 2021. Educ Inf Technol 27:8413–8460. https://doi.org/10.1007/s10639-022-10927-7
Nunnally JC, Bernstein IH (1994) Psychometric theory. 3rd edn. McGraw-Hill, New York
Podsakoff PM, MacKenzie SB, Lee JY, Podsakoff NP (2003) Common method biases in behavioral research: a critical review of the literature and recommended remedies. J Appl Psychol 88(5):879–903. https://doi.org/10.1037/0021-9010.88.5.879
Polit DF, Beck CT, Owen SV (2007) Is the CVI an acceptable indicator of content validity? Appraisal and recommendations. Res Nurs Health 30(4):459–467. https://doi.org/10.1002/nur.20199
Reckase MD (2009) Multidimensional item response theory. Springer, New York
Reise SP (2012) The rediscovery of bifactor measurement models. Multivar Behav Res 47(5):667–696. https://doi.org/10.1080/00273171.2012.715555
Standing Committee of the National People’s Congress (2015) Criminal Law of the People’s Republic of China (Amendment IX) (in Chinese). Supreme People’s Procuratorate of the People’s Republic of China. https://www.spp.gov.cn/spp/fl/201802/t20180205_364562.shtml. Accessed 30 Dec 2025
The State Council of the People’s Republic of China (2025) Campaign targets online misdeeds that threaten integrity of gaokao. https://english.www.gov.cn/news/202506/04/content_WS683fa661c6d0868f4e8f3107.html. Accessed 30 Dec 2025
Wainer H, Bradlow ET, Wang X (eds) (2007) Testlet response theory and its applications. Cambridge University Press, Cambridge
Weber EU, Blais A-R, Betz NE (2002) A domain-specific risk-attitude scale: measuring risk perceptions and risk behaviors. J Behav Decis Making 15:263–290. https://doi.org/10.1002/bdm.414

Keywords: Educational integrity, exam security, technology governance, proctoring, questionnaire development, content relevance, psychometrics, administrators, China

Version 1 posted January 12th, 2026. DOI: https://doi.org/10.21203/rs.3.rs-8502983/v1. License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/).

