Evaluating the Multiple-Choice Questions Quality at the College of Medicine, University of Bisha, Saudi Arabia: A Three-Year Experience

Research Article

A. M. S. Eleragi, Elhadi Miskeen, Kamal Hussein, Assad Ali Rezigalla, Masoud I. E. Adam, Jaber Ahmed Al-Faifi, Abdullah Alhalafi, Ahmed Y. Al Ameer, Osama A. Mohammed

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-4635200/v1
This work is licensed under a CC BY 4.0 License.
Status: Published. Journal publication published 13 Feb, 2025; the published version is available in BMC Medical Education (https://doi.org/10.1186/s12909-025-06700-2). Version 1 is the latest preprint version.

Abstract

Background
Assessment is a central tool that drives and shapes students' learning. Multiple-choice questions (MCQs) are crucial in medical education assessment because they evaluate knowledge across large cohorts. Good-quality items help to achieve the learning objectives and provide trustworthy results.
This study aims to evaluate the quality of the MCQs used in the final exams of the Principal of Diseases (PRD) course over three academic years at the College of Medicine, University of Bisha, Saudi Arabia.

Method
This cross-sectional, institution-based study was conducted at the College of Medicine, University of Bisha (UBCOM), Saudi Arabia (SA). It applied item analysis (IA) to the PRD final theoretical examinations of the 2016–2017, 2017–2018, and 2018–2019 academic years, which contained 80, 70, and 60 MCQ items, respectively (210 in total). The IA covered reliability (KR20), difficulty index (DIF), discrimination index (DI), and distractor effectiveness (DE). The data were analyzed using SPSS (version 25.0), and statistical significance was set at P < 0.05.

Results
The exams included 210 items. The reliability (KR20) ranged from 0.804 to 0.906. The DI indicated that 56.7% of items were excellent, 20.9% were good, 13.8% were poor, and 8.6% were defective. The DIF showed that 50.5% of items had acceptable difficulty, 37.6% were easy, and 11.9% were difficult. DE analysis revealed that 70.2% of distractors were functional, and DI, DIF, and DE were significantly associated (P < 0.05).

Conclusion
The analyzed MCQs had good discrimination and acceptable difficulty, making them generally of high quality. The study underscores the importance of continuous item analysis for maintaining and improving the quality of assessment tools used in medical education.

Keywords: discrimination index, difficulty index, distractor effectiveness, multiple-choice questions, items, item analysis

Figure 1. Analysis of Multiple-Choice Questions Quality Indices (DF = Difficult; AcDF = Acceptable difficulty; NFD = Non-functional distractor)

BACKGROUND
Assessment is a central tool that drives and shapes student learning. Assessment tools are many, and MCQs are the most accepted and widely used.
MCQs constitute a mainstay of assessment [1–4]. One-best-answer (Type A) questions are the most used format [4]. They consist of a stem (with or without a scenario) and a lead-in question, followed by the options: typically one correct answer (the key) and three or four distractors. The distractors should act as a placebo and convey a misunderstanding or be less accurate than the key [5, 6]. MCQs are advantageous for assessing a large amount of knowledge. Moreover, well-constructed MCQs can assess higher knowledge domains, such as application, analysis, and synthesis, and they can be marked automatically. However, their reported limitations include the difficulty and time needed to construct them, cueing, and technical errors. Furthermore, some authors regard MCQs as multiple-guess items that can only assess factual information and are ill-suited for testing higher domains [5, 7, 8].

The quality of MCQs is assessed through item analysis. Item analysis, a statistical analysis of student responses on a test or exam, can provide post-examination feedback to students and item writers [6, 9]. Its parameters include reliability (internal consistency), which is assessed by KR20; the difficulty and discrimination indices of items; and distractor efficiency. Items with an average difficulty index, acceptable discrimination indices, and efficient distractors improve examination reliability and validity. Items that are too difficult or too easy cannot discriminate between examinees, and non-functional distractors make items easier [5, 6, 10, 11]. Accordingly, scrutinizing assessment tools through item analysis is a central process that helps in detecting flaws, editing items, and improving their quality [3, 12–15].
This study aims to assess the quality of MCQs through item analysis of the final examinations of the PRD course over three successive academic years at Bisha College of Medicine, Saudi Arabia. The item analysis parameters used as study variables are reliability (KR20), difficulty index (DIF), discrimination index (DI), and distractor effectiveness (DE).

MATERIALS AND METHODS

Setting
The study was cross-sectional and institution-based and was conducted at the College of Medicine, University of Bisha (UBCOM) [9, 16]. It used the final exams of the Principal of Diseases (PRD) course. The PRD course (7CH) is offered to second-year MBBS students in the second semester of each academic year [9]. The final exams were designed and constructed using an examination blueprint, which aligns items with the specific learning outcomes and the required knowledge domain [17]. All items were constructed by the subject experts who conducted the course activities. The total number of items in each exam was determined according to the examination blueprint and policy. At UBCOM, MCQs represent 80% of the weight of each theory exam. The final examinations of three academic years (2016–2019) were used in this study. They contained 80, 70, and 60 items, respectively (210 in total).

Item analysis (IA)
After verification, students' responses in the final exams were marked electronically using the Datalink 3000 marking machine (Washington, USA). Datalink ACT/SAT uses built-in software that scores the responses and generates the item analysis [4].

Reliability (KR20)
The internal consistency and dimensionality of the exam items are assessed through Cronbach's alpha (KR20), which is widely accepted as an estimate of exam reliability. A KR20 value of 0.7 is acceptable for short exams and in-class assessments, and 0.8 for long ones. A value of more than 0.8 is required for high-stakes and certifying exams [18].
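As a rough illustration of how a KR20 value is obtained, the sketch below computes Kuder-Richardson Formula 20, KR20 = (k / (k − 1)) × (1 − Σ p·q / σ²), from a matrix of right/wrong item scores, where k is the number of items, p and q are the proportions answering each item correctly and incorrectly, and σ² is the variance of the examinees' total scores. It is not the Datalink software's implementation: the function name `kr20`, the toy score matrix, and the use of the population variance of total scores are assumptions made for this example.

```python
from typing import Sequence


def kr20(scores: Sequence[Sequence[int]]) -> float:
    """Kuder-Richardson Formula 20 for a 0/1 score matrix.

    Rows are examinees, columns are items (1 = correct, 0 = incorrect).
    ASSUMPTION: the variance of total scores is the population variance.
    """
    n_examinees = len(scores)
    n_items = len(scores[0])
    if n_items < 2:
        raise ValueError("KR20 needs at least two items")

    # p = proportion correct per item; q = 1 - p; accumulate sum of p * q.
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in scores) / n_examinees
        sum_pq += p * (1.0 - p)

    # Variance of the examinees' total scores.
    totals = [sum(row) for row in scores]
    mean_total = sum(totals) / n_examinees
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_examinees
    if var_total == 0:
        raise ValueError("all examinees have the same total score")

    return (n_items / (n_items - 1)) * (1.0 - sum_pq / var_total)


# Hypothetical data: 5 examinees, 4 items (not taken from the study).
toy_scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(toy_scores), 3))  # prints 0.8
```

For dichotomously scored items such as one-best-answer MCQs, KR20 is the special case of Cronbach's alpha, which is why the two names are used together here. With the toy data, the result of 0.8 would sit at the threshold quoted above for long exams.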
Discrimination index (DI)
The discrimination index measures how well an item discriminates between higher- and lower-achieving students. It normally ranges between 0 and 1; however, the DI can be negative, which means that lower achievers answered the item correctly more often than higher achievers [1, 18, 19]. DI in the present study was interpreted as follows [1]: < 0 (negative) = defective or wrong item; 0 to < 0.15 = poor (discard/revise); 0.15–0.24 = good (store); and ≥ 0.25 = excellent (store).

Difficulty index (DIF)
The DIF, or P-value, is the percentage of students who answered the item correctly. It defines the degree of difficulty or easiness of an item. An ideal difficulty index is 50–60%, but 30–70% is acceptable in most situations [20–22]. An item with a DIF < 30% is difficult, and an item with a DIF > 70% is easy [23].

Distractor effectiveness analysis (DE)
Non-functional distractors (NFDs) were defined as options other than the key that were selected by < 5% of the examinees, whereas functional distractors (FDs) were selected by 5% or more of the examinees. Based on the number of NFDs in an item, DE ranges from 0 to 100%. If an item contains three, two, one, or no NFDs, its DE is 0%, 33.3%, 66.6%, or 100%, respectively [23].

Statistical analysis
Data from the final exams were tabulated and analyzed using SPSS 25.0 (IBM, Armonk, NY, United States of America). Study variables were presented as frequency tables and as mean ± standard deviation. The associations between the DI, DIF, and DE values for all items across the three years were determined using Pearson correlation analysis and ANOVA. P < 0.05 was considered statistically significant.

RESULTS

Characteristics of the assessed tests
Table 1 shows the characteristics of the final exams used: the number of items per exam, the number of examinees, a summary of examinees' scores, and KR20.
Table 1 Descriptive statistics of the studied examinations

Variables | Academic year 2016–17 | Academic year 2017–18 | Academic year 2018–19 | Total
No. of items | 80 | 70 | 60 | 210
No. of examinees | 40 | 46 | 38 | 124
Highest score | 78 (97.5%) | 62 (88.6%) | 48 (80.0%) | -
Lowest score | 35 (43.8%) | 20 (28.6%) | 22 (36.7%) | -
Class average | 55.5 (69.4%) | 37 (52.92%) | 34.1 (56.8%) | -
Class median | 56.0 (70%) | 35 (50.0%) | 33.0 (55%) | -
KR20 | 0.906 | 0.856 | 0.804 | -

Discrimination Index (DI)
Across the three years, of the 210 items included in the study, the DI showed that 119 (56.7%) items were excellent and 45 (20.9%) were good. Poor and wrong items were 29 (13.8%) and 18 (8.6%), respectively. Most MCQs in the study period had excellent discrimination; poor and wrong MCQs had the lowest percentages (Fig. 1.A). The items to be stored were 164 (78.1%), while those to be discarded or revised were 46 (21.9%). No significant differences in DI were observed across the three years (Table 2).

Table 2 Discrimination index of items (n = 210)

DI | Academic year 2016–17 | Academic year 2017–18 | Academic year 2018–19 | Total | P value | Action
Wrong (< 0.0) | 3 (3.8) | 5 (7.1) | 9 (16.7) | 17 (8.6) | 0.117 | Discard/Revise
Poor (0–0.14) | 12 (15) | 9 (12.9) | 8 (13.3) | 29 (13.8) | | Discard/Revise
Good (0.15–0.24) | 14 (17.5) | 15 (21.4) | 16 (25) | 45 (20.9) | | Store
Excellent (≥ 0.25) | 51 (63.7) | 41 (58.6) | 27 (45) | 119 (56.7) | | Store
NI | 80 (38.1%) | 70 (33.3%) | 60 (28.6%) | 210 (100%) | | -

Values are number of items (%). DI = Discrimination index; NI = number of items.

Difficulty Index (DIF)
Table 3 shows that, during the study period, 106 (50.5%) MCQs had acceptable DIF, while 79 (37.6%) were easy and 25 (11.9%) were difficult. The mean DIF was 59.99 (SD ± 23.5). Acceptable questions numbered 32/80 (40%) in the first year, 45/70 (64.3%) in the second year, and 29/60 (48.3%) in the third year. Just over half of the MCQs in the study period were within the acceptable difficulty range, and difficult MCQs were less frequent than easy ones (Fig. 1.B).
All the acceptable MCQs were stored in the question bank. A significant linear-by-linear association was found across the three academic years (P = 0.001), with the overall pattern dominated by acceptable items.

Table 3 Difficulty index analysis of items (n = 210)

DIF | Academic year 2016–17 | Academic year 2017–18 | Academic year 2018–19 | Total (%) | P value | Action
DF (< 30%) | 4 (5%) | 10 (14.3%) | 11 (18.3%) | 25 (11.9%) | 0.001 | Revise
AcDF (30–70%) | 32/80 (40%) | 45/70 (64.3%) | 29/60 (48.3%) | 106 (50.5%) | | Store
Easy (> 70%) | 44 (55%) | 15 (21.4%) | 20 (33.3%) | 79 (37.6%) | | Revise/Discard
NI | 80 (38.1%) | 70 (33.3%) | 60 (28.6%) | 210 (100%) | | -

DIF = Difficulty index; NI = number of items; DF = Difficult; AcDF = acceptable difficulty.

Distractor Effectiveness (DE)
The total number of distractors in the study was 630. The FDs were 442 (70.2%), distributed as 127/240 (52.9%) in the first year, 174/210 (82.9%) in the second year, and 141/180 (78.3%) in the third year. Items without NFDs were 95 (45.2%), and 18 (8.6%) had three NFDs (Table 4). During the study, FDs comprised the majority of distractors (Fig. 1.C). Regarding the number of NFDs per MCQ, items with 0 NFDs and 1 NFD formed the majority (Fig. 1.D). FDs across the three academic years were significantly associated (P < 0.001) (Table 4).

Table 4 Analysis of the Distractor Efficiency and Non-Functional Distractors of Multiple-Choice Questions in this study (N = 630)

Measure | DE | Academic year 2016–17 | Academic year 2017–18 | Academic year 2018–19 | Total | P value
NI | | 80 | 70 | 60 | 210 | < 0.001
ND | | 240 | 210 | 180 | 630 |
FDs (%) | | 127/240 (52.9%) | 174/210 (82.9%) | 141/180 (78.3%) | 442/630 (70.2%) |
NFDs (%) | | 113 (47.1%) | 36 (17.1%) | 39 (21.7%) | 188 (29.8%) |
0 NFDs | 100% | 19 (23.7%) | 43 (61.4%) | 33 (55%) | 95 (45.2%) |
1 NFD | 66.6% | 22 (27.5%) | 20 (28.6%) | 18 (30%) | 60 (28.6%) |
2 NFDs | 33.3% | 26 (32.5%) | 5 (7.1%) | 6 (10%) | 37 (17.6%) |
3 NFDs | 0% | 13 (16.3%) | 2 (2.9%) | 3 (5%) | 18 (8.6%) |

DE = Distractor Effectiveness; NI = Number of items; ND = Number of distractors; FD = functional distractor; NFD = non-functional distractor.
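The cut-offs stated in Materials and Methods, which underlie the categories in Tables 2 to 4, can be sketched in Python as follows. This is an illustrative sketch, not the authors' or the Datalink software's procedure: the function names and the example option counts are hypothetical, and the upper/lower-group formula used for DI is an assumption, because the text does not state how the software derives DI.

```python
from typing import Dict, Tuple


def difficulty_index(n_correct: int, n_examinees: int) -> float:
    """DIF: percentage of examinees who answered the item correctly."""
    return 100.0 * n_correct / n_examinees


def classify_dif(dif: float) -> str:
    """Categories as in Table 3: < 30% difficult, 30-70% acceptable, > 70% easy."""
    if dif < 30:
        return "difficult"
    if dif <= 70:
        return "acceptable"
    return "easy"


def discrimination_index(upper_correct: int, lower_correct: int, group_size: int) -> float:
    """ASSUMPTION: classic upper/lower-group DI = (U - L) / n with equal-sized groups."""
    return (upper_correct - lower_correct) / group_size


def classify_di(di: float) -> str:
    """Categories as in Table 2: < 0 wrong, < 0.15 poor, < 0.25 good, else excellent."""
    if di < 0:
        return "wrong"
    if di < 0.15:
        return "poor"
    if di < 0.25:
        return "good"
    return "excellent"


def distractor_effectiveness(option_counts: Dict[str, int], key: str,
                             threshold: float = 0.05) -> Tuple[int, float]:
    """Return (number of NFDs, DE in percent) for one item.

    A distractor chosen by fewer than `threshold` of the examinees counts as
    non-functional; DE is the share of distractors that are functional.
    """
    total = sum(option_counts.values())
    distractors = [opt for opt in option_counts if opt != key]
    nfds = sum(1 for opt in distractors if option_counts[opt] / total < threshold)
    de = 100.0 * (len(distractors) - nfds) / len(distractors)
    return nfds, de


# Hypothetical item answered by 40 examinees; "A" is the key.
counts = {"A": 22, "B": 10, "C": 7, "D": 1}
dif = difficulty_index(counts["A"], sum(counts.values()))
nfds, de = distractor_effectiveness(counts, key="A")
di = discrimination_index(upper_correct=9, lower_correct=4, group_size=11)
print(classify_dif(dif), classify_di(di), nfds, round(de, 1))
# prints: acceptable excellent 1 66.7
```

In this made-up item, option D is chosen by 2.5% of examinees and is therefore the only NFD, which gives a DE of two functional distractors out of three (reported as 66.6% in Table 4). Summing the NFD counts per item in this way over all 210 items is what yields the 188 NFDs and 442 FDs out of 630 distractors.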
Correlation between DI, DIF, and DE
The DI was positively associated with the DIF: most excellent and good items fell in the acceptable DIF category (Pearson Chi-Square, P = .003) (Fig. 1.D). The DIF and DE were also positively associated: most items with 0 or 1 NFD fell in the acceptable DIF category (Pearson Chi-Square, P < .001).

DISCUSSION
UBCOM is a newly established medical school. It was launched to produce knowledgeable, skilled, and competent graduates with high social accountability locally, nationally, and globally. In this context, a student-centered, problem-based, integrated, community-based, elective, and systematic (SPICES) curriculum was developed and adopted for the whole learning process [4].

Assessment is generally regarded as the driver of the learning process. It should provide direction and motivation for future learning, protect the public by producing highly professional, knowledgeable, skilled, and competent graduates, and screen out incompetent ones [24]. Type A MCQs are a principal component of assessment in medical education; in the UBCOM curriculum, they constitute 80% of the theoretical examinations [4]. This high proportion necessitates continuous scrutiny and vetting to ensure proper construction and, in turn, a powerful tool for discriminating between students. In this context, this study was designed to check the quality of the MCQs used and to ensure that the examinations are valid and reliable.

The investigated exams were found to be reliable, with Cronbach's alpha or Kuder-Richardson formula 20 (KR-20) values of 0.906, 0.856, and 0.804, respectively. These values meet or exceed the range of 0.70–0.89 that Downing (2004) and van de Watering and van der Rijt (2006) proposed as ideal and as demonstrating excellent reliability of examinations [25, 26].
The three years' experience showed a dominance of excellent items (56.7%) and good ones (20.9%), with an overall percentage of 77.6%. This high percentage favors storing them for subsequent use. The finding is consistent with the report that such items can discriminate between high- and low-achieving students, in contrast to poor or wrong items, which have low discriminating efficiency [7]. The poor (13.8%) and wrong (8.6%) items, with a cumulative percentage of 22.4%, need to be revised and amended for subsequent use or be discarded, especially the wrong ones, since such items are reported to adversely affect exam validity [27]. It is well documented that poor or wrong items allow lower-performing students to answer correctly more often than those with higher ability. This is probably because lower-ability students may select the correct response by guessing rather than through real understanding, whereas a good student, suspicious of an easy question, takes a harder path to the solution and is less successful. Probable explanations are a wrong key, ambiguous framing of the question, or generally poor preparation of students [1–3]. The correlation between the DIF and DI in this study supports the statement that they are in a reciprocal relationship [7].

The mean DIF in the present investigation was 59.99 (SD ± 23.5), which lies within the acceptable range of 30–70% reported by several investigations [1, 8, 23, 28]. Just over half of the items in this study were acceptable [106/210 (50.5%)]. This was broadly in line with the findings of Gajjar et al. (2014), Patil et al. (2016), and Hassan and Hod (2017). Difficult items need to be reviewed or updated because they result in deflated scores and have a significantly reduced ability to discriminate.
Rewriting such items is necessary to address errors in language and syntax, ambiguity, contentious statements, and an incorrect key [1, 28]. Furthermore, easy items may contain NFDs, which makes them answerable by both upper and lower performers. They are also reported to generate inflated scores and to adversely affect the motivation of lower achievers [1, 28].

The 210 items in this investigation contained 630 distractors, of which 442 (70.2%) were functional. Items without NFDs (DE 100%) numbered 95 (45.2%), and those with 1 NFD (DE 66.6%) numbered 60 (28.6%). Items with 2 NFDs (DE 33.3%) numbered 37 (17.6%), while items with 3 NFDs (DE 0%) numbered 18 (8.6%). The distractors (incorrect alternatives) are analyzed to determine their relative usefulness in each item. Items need to be modified if students consistently fail to select certain distractors; such alternatives are probably implausible and, therefore, of little use as decoys. Designing plausible distractors and reducing the NFDs are thus important aspects of framing quality MCQs. More NFDs in an item increase the DIF (make the item easier) and reduce the DE. Conversely, more functional distractors decrease the DIF (make the item more difficult) and increase the DE. The higher the DE, the more difficult the question, and vice versa, which ultimately depends on the presence or absence of NFDs in the item [2, 3, 10, 11]. Items with 2 and 3 NFDs constituted 26.2% of the analyzed items; this percentage suggests that teachers have difficulty developing plausible distractors, possibly because most of them are newly recruited to this newly established medical school. These items should be removed from the question bank or have their NFDs replaced with more plausible options to improve their quality.

In general, this investigation revealed that the evaluated MCQs effectively identified the students with higher achievement levels.
This may be attributed to the involvement of subject matter experts and the use of an examination blueprint. It aligns with previous reports that emphasize the importance of expert involvement, faculty development, and regular item analysis [4, 29–33]. Moreover, the electronic marking and analysis system used in this study provides immediate feedback on item performance, enabling timely interventions to address any issues with specific questions [33].

Strengths of This Study
The study evaluates item and test quality using the KR20, DI, DIF, and DE indices. It analyzes data from three consecutive academic years using Pearson correlation analysis and ANOVA. The exams showed high reliability, and the involvement of subject matter experts supported the relevance and accuracy of the MCQs. The Datalink 3000 electronic marking system enhanced assessment efficiency. The MCQs were aligned with educational objectives through the examination blueprint. The study contributes to the existing literature on MCQ quality in medical education and emphasizes the importance of identifying high-quality, reusable MCQs so that educators and institutions can improve their assessment methods.

Limitations
This research has several limitations. The data come from a single institution, which could restrict how broadly the conclusions can be applied. Furthermore, the study examined only the final exams of a single course, so the quality of MCQs used in other courses or universities may not be represented. Differences in the item construction practices of individual instructors could also have affected the outcomes.

CONCLUSIONS
Post-examination analysis using IA in this study provided information on the reliability and validity of the items and tests by determining DIF, DI, DE, and their interrelationships. This investigation showed that most MCQs used in the PRD course at Bisha College of Medicine have good discrimination and acceptable difficulty levels, making them generally of high quality.
To maintain high-quality MCQs, regular IA and the participation of subject matter experts in question writing are critical. Future research should explore the impact of different item construction practices and expand the analysis to other courses and institutions to validate these findings further.

Recommendations
This study suggests several recommendations to improve the quality of multiple-choice questions (MCQs) in medical education: systematic IA involving subject matter experts, use of electronic marking systems, alignment of MCQs with learning outcomes, regular faculty training on best practices, and a robust question bank of validated items. These measures can lead to better learning outcomes and more accurate evaluation of student performance.

Abbreviations
MCQs: Multiple Choice Questions
PRD: Principal of Diseases
IA: Item Analysis
DIF: Difficulty Index
DI: Discrimination Index
DE: Distractor Effectiveness
KR20: Kuder-Richardson Formula 20
UBCOM: University of Bisha College of Medicine
FD: Functional Distractor
NFD: Non-Functional Distractor
NI: Number of items
ND: Number of distractors

Declarations

Ethics approval and consent to participate
This work was approved by the University of Bisha's Research and Ethics Committees. All students were informed that their academic test grades would be used for quality assurance and academic research.

Clinical Trial Number
Not applicable.

Consent for publication
Not applicable.

FUNDING: Not funded.

AUTHORS' CONTRIBUTIONS: All authors have made significant contributions, whether in conception, study design, conduct, data collection, analysis, interpretation, or all of the above; have participated in drafting, revising, or critically reviewing the article; have given final approval for the version to be published; have agreed on the journal to which the article was submitted; and agree to be responsible for all aspects of the work.
Data availability
The datasets used and analyzed during the current study are available from the corresponding author upon reasonable request.

Competing interests
The authors declare that they have no conflicts of interest.

ACKNOWLEDGEMENTS
The authors thank the Dean of the College of Medicine, University of Bisha, for permission to carry out and publish this work. The authors also thank the Deanship of Graduate Studies and Scientific Research at the University of Bisha for supporting this work through the Fast-Track Research Support Program.

References
1. Gajjar S, Sharma R, Kumar P, Rana M: Item and Test Analysis to Identify Quality Multiple Choice Questions (MCQs) from an Assessment of Medical Students of Ahmedabad, Gujarat. Indian J Community Med 2014, 39(1):17-20.
2. Kolte V: Item analysis of multiple choice questions in physiology examination. Indian J of Basic & Applied Medical Research 2015, 4(4):320-326.
3. Ingale AS, Giri PA, Doibale MK: Study on item and test analysis of multiple choice questions amongst undergraduate medical students. International Journal of Community Medicine and Public Health 2017, 4(5):1562-1565.
4. Rezigalla AA: AI in medical education: uses of AI in construction type A MCQs. BMC Medical Education 2024, 24(1):1-9.
5. Singh T: Principles of assessment in medical education, 2 edn. New Delhi, India: Jaypee Brothers Medical Publishers; 2021.
6. Rezigalla AA, Eleragi AMESA, Elhussein AB, Alfaifi J, Alghamdi MA, Al Ameer AY, Yahia AIO, Mohammed OA, Adam MIE: Item analysis: the impact of distractor efficiency on the difficulty index and discrimination power of multiple-choice items. BMC Medical Education 2024, 24(1):445-447.
7. Mehta G, Mokhasi V: Item analysis of multiple choice questions-An assessment of the assessment tool. Int J Health Sci Res 2014, 4(7):197-202.
8. Patil PS, Dhobale MR, Mudiraj NR: Item analysis of MCQs'-Myths and realities when applying them as an assessment tool for medical students.
International Journal of Current Research and Review 2016, 8(13):12-18.
9. Rezigalla AA, Eleragi AME, Ishag M: Comparison between Students' Perception toward an examination and item analysis, reliability and validity of the examination. Sudan Journal of Medical Sciences 2020, 15(2):114-123.
10. Hingorjo MR, Jaleel F: Analysis of one best MCQs the difficulty index discrimination index and distractor efficiency. J Pak Med Assoc 2012, 62(2):142-147.
11. Tarrant M, Ware J, Mohammed AM: An assessment of functioning and non-functioning distractors in multiple-choice questions: a descriptive analysis. BMC Medical Education 2009, 9(1):1-8.
12. Haladyna TM: Developing and validating multiple-choice test items, 3 edn. New York: Routledge; 2004.
13. Wallach PM, Crespo LM, Holtzman KZ, Galbraith RM, Swanson DB: Use of a committee review process to improve the quality of course examinations. Adv Health Sci Educ Theory Pract 2006, 11(1):61-68.
14. Wadi MM, Abdul Rahim AF, Yusoff MSB, Baharuddin KA: The effect of MCQ vetting on students' examination performance. Education in Medicine Journal 2014, 6(2):16-26.
15. Sowdani KM, Jaber H, Thomas AM: Item Analysis as a Tool for Educational Assessment as Compared to Students, Evaluation to lectures. Al Mustansiriyah Journal of Pharmaceutical Sciences 2018, 18(2):105-113.
16. Rezigalla AA: Observational study designs: Synopsis for selecting an appropriate study design. Cureus 2020, 12(1):6692-6700.
17. Salih KM, Al-faifi J, Abbas M, Alghamdi MA, Rezigalla AA: Methods of blueprint of a paediatric course in innovative curriculum. oncology 2021, 15(8):1-4.
18. Rezigalla AA: Item analysis: Concept and application. In: Medical Education for the 21st Century. Edited by Firstenberg MS, Stawicki SP. London, United Kingdom: IntechOpen; 2022: 105-120.
19. Suryadevara VK, Bano Z: Item analysis to identify quality multiple choice questions/items in an assessment in Pharmacology of II MBBS students in Guntur Medical College of Andhra Pradesh, India.
International Journal of Basic & Clinical Pharmacology 2018, 7(8):1517-1521.
20. Amin Z, Khoo HE: Basics in medical education. World Scientific; 2003.
21. Odukoya JA, Adekeye O, Igbinoba AO, Afolabi A: Item analysis of university-wide multiple choice objective examinations: the experience of a Nigerian private university. Quality & Quantity 2018, 52(3):983-997.
22. Sowdani KM, Jaber H, Thomas AM: Item Analysis as a Tool for Educational Assessment as Compared to Students, Evaluation to lectures. Al-Mustansiriyah Journal for Pharmaceutical Sciences 2018, 18(2):105-113.
23. Rao C, Kishan Prasad HL, Sajitha K, Permi H, Shetty J: Item analysis of multiple choice questions: Assessing an assessment tool in medical students. International Journal of Educational and Psychological Researches 2016, 2(4):201-204.
24. Epstein RM: Assessment in medical education. New England Journal of Medicine 2007, 356(4):387-396.
25. Downing SM: Reliability: on the reproducibility of assessment data. Medical Education 2004, 38(9):1006-1012.
26. van de Watering G, van der Rijt J: Teachers' and students' perceptions of assessments: A review and a study into the ability and accuracy of estimating the difficulty levels of assessment items. Educational Research Review 2006, 1(2):133-147.
27. Ingale AS, Giri PA, Doibale MK: Study on item and test analysis of multiple choice questions amongst undergraduate medical students. International Journal of Community Medicine and Public Health 2017, 4(5):1562-1565.
28. Hassan S, Hod R: Use of Item Analysis to Improve the Quality of Single Best Answer Multiple Choice Question in Summative Assessment of Undergraduate Medical Students in Malaysia. Education in Medicine Journal 2017, 9(3):33-43.
29. Abdulghani HM, Ahmad F, Irshad M, Khalil MS, Al-Shaikh GK, Syed S, Aldrees AA, Alrowais N, Haque S: Faculty development programs improve the quality of Multiple Choice Questions items' writing. Scientific Reports 2015, 5(1):1-7.
30. Elgadal AH, Mariod AA: Item analysis of multiple-choice questions (MCQs): assessment tool for quality assurance measures. Sudan Journal of Medical Sciences 2021, 16(3):334-346.
31. Gottlieb M, Bailitz J, Fix M, Shappell E, Wagner MJ: Educator's blueprint: A how-to guide for developing high-quality multiple-choice questions. AEM Education and Training 2023, 7(1):e10836.
32. Touissi Y, Hjiej G, Hajjioui A, Ibrahimi A, Fourtassi M: Does developing multiple-choice questions improve medical students' learning? A systematic review. Medical Education Online 2022, 27(1):1-14.
33. Belay LM, Sendekie TY, Eyowas FA: Quality of multiple-choice questions in medical internship qualification examination determined by item response theory at Debre Tabor University, Ethiopia. BMC Medical Education 2022, 22(1):1-11.

Version 1 posted. Editorial decision: Revision requested, 01 Jul, 2024. Editor assigned by journal, 01 Jul, 2024. Submission checks completed at journal, 01 Jul, 2024. First submitted to journal, 25 Jun, 2024.
Experience","fulltext":[{"header":"BACKGROUND","content":"\u003cp\u003eAssessment is a central tool that drives and shapes student learning. There are many assessment tools, and MCQs are the most accepted and widely used. MCQs constitute a mainstay of student assessment [\u003cspan additionalcitationids=\"CR2 CR3\" citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e]. One-best-answer (Type A) questions are the most commonly used format [\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e]. They consist of a stem (with or without a scenario) and a lead-in question, followed by the options: typically one correct answer (the key) and three or four distractors. The distractors should act as a placebo and convey misunderstanding or be less accurate than the key answer [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e, \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eMCQs are advantageous in assessing a large amount of knowledge. Moreover, well-constructed MCQs can assess higher knowledge domains, such as application, analysis, and synthesis. MCQs can be marked automatically. However, their reported limitations include the difficulty and time required to construct them, cueing, and technical errors. Furthermore, some authors consider MCQs to be multiple-guess items that can assess only factual information and are ill-suited for testing higher domains [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e, \u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e, \u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eThe quality of MCQs is assessed through item analysis. 
Item analysis, a statistical analysis of student responses on the test or exam, can provide post-examination feedback to students and item composers [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e, \u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e]. The parameters of item analysis include reliability (internal consistency), which is assessed by KR20, the difficulty and discrimination indices of items, and distractor efficiency. Items with an average difficulty index, acceptable discrimination indices, and efficient distractors improve examination reliability and validity. Items that are excessively difficult or easy cannot discriminate between examinees. Non-functional distractors make items easier [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e, \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e, \u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e, \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eAccordingly, the scrutiny of assessment tools through item analysis is a central process that helps in detecting flaws, editing items, and improving their quality [\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e, \u003cspan additionalcitationids=\"CR13 CR14\" citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eThis study aims to assess the quality of MCQs through item analysis of the final examinations of the PRD course throughout three successive academic years at Bisha College of Medicine, Saudi Arabia. 
The parameters of item analysis used as study variables are reliability (KR20), difficulty index (DIF), discrimination index (DI), and distractor effectiveness (DE).\u003c/p\u003e"},{"header":"MATERIALS AND METHODS","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003eSetting\u003c/h2\u003e \u003cp\u003eThis cross-sectional, institution-based study was conducted at the College of Medicine, University of Bisha (UBCOM) [\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e, \u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e]. The study used the final exams of the Principal of Diseases (PRD) course. The PRD course (7CH) is offered to second-year MBBS students in the second semester of each academic year [\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e]. The final exams were designed and constructed through an examination blueprint. The examination blueprint aligns items to the specific learning outcomes and the required knowledge domain [\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e]. All items were constructed by subject experts who conducted the course activities. The total number of items in each exam was determined according to the examination blueprint and policy. At UBCOM, MCQs account for 80% of the weight of each theory exam. The final examinations of three academic years (2016\u0026ndash;2019) were utilized in this study. The final exams contained 80, 70, and 60 items, respectively (210 in total).\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec4\" class=\"Section2\"\u003e \u003ch2\u003eItem analysis (IA)\u003c/h2\u003e \u003cp\u003eAfter verification, students' responses in the final exams were marked electronically using the Datalink 3000 marking machine (Washington, USA). 
Datalink ACT/SAT uses built-in software that scores responses and generates item analysis [\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e].\u003c/p\u003e \u003cp\u003e \u003cb\u003eReliability\u003c/b\u003e (KR20)\u003c/p\u003e \u003cp\u003eThe internal consistency and dimensionality of the exam items are assessed through the Kuder-Richardson formula 20 (KR20), a special case of Cronbach's alpha for dichotomously scored items. It is widely accepted as an estimate of exam reliability. A KR20 value of 0.7 is acceptable for short exams and in-class assessments and 0.8 for long ones. A value of more than 0.8 is required for high-stakes and certifying exams [\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e].\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec5\" class=\"Section2\"\u003e \u003ch2\u003eDiscrimination index (DI)\u003c/h2\u003e \u003cp\u003eThe discrimination index measures how well an item discriminates between higher- and lower-achieving students. It usually ranges between 0 and 1; however, the DI can be negative, which means that lower achievers answered the item correctly more often than higher achievers [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e, \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e, \u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e]. 
DI in the present study was interpreted as follows [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e]: \u0026lt;0 (negative)\u0026thinsp;=\u0026thinsp;Defective or wrong item, 0.0-\u0026lt; 0.15\u0026thinsp;=\u0026thinsp;Poor (Discard/Revise), 0.15\u0026ndash;0.24\u0026thinsp;=\u0026thinsp;Good (Store) and \u0026ge;\u0026thinsp;0.25\u0026thinsp;=\u0026thinsp;Excellent (Store).\u003c/p\u003e \u003cdiv id=\"Sec6\" class=\"Section3\"\u003e \u003ch2\u003e\u003cb\u003eDifficulty index (DIF)\u003c/b\u003e\u003c/h2\u003e \u003cp\u003eThe DIF, or P-value, is the percentage of students who answered the item correctly. It defines the degree of difficulty or easiness of an item. An ideal difficulty index is 50\u0026ndash;60%, but 30\u0026ndash;70% is acceptable in most situations [\u003cspan additionalcitationids=\"CR21\" citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e]. An item with DIF\u0026thinsp;\u0026lt;\u0026thinsp;30% is difficult, 30\u0026ndash;70% is acceptable, and \u0026gt;\u0026thinsp;70% is easy [\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e].\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec7\" class=\"Section3\"\u003e \u003ch2\u003eDistractor effectiveness analysis (DE)\u003c/h2\u003e \u003cp\u003eNon-functional distractors (NFDs) were defined as options other than the key that were selected by \u0026lt;\u0026thinsp;5% of the examinees, whereas functional distractors (FDs) were selected by more than 5% of examinees. Based on the number of NFDs in an item, DE ranges from 0 to 100%. 
If an item contains three, two, one, or no NFDs, its DE is 0%, 33.3%, 66.6%, and 100%, respectively [\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e].\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv id=\"Sec8\" class=\"Section2\"\u003e \u003ch2\u003eStatistical analysis\u003c/h2\u003e \u003cp\u003eData from the final exams were tabulated and analyzed using SPSS 25.0 (IBM, Armonk, NY, United States of America). Study variables were presented as frequencies and as mean\u0026thinsp;\u0026plusmn;\u0026thinsp;standard deviation. The associations between the DI, DIF, and DE values for all items throughout the three years were determined using Pearson correlation analysis and ANOVA. \u003cem\u003eP\u003c/em\u003e\u0026thinsp;\u0026lt;\u0026thinsp;0.05 was considered to indicate statistical significance.\u003c/p\u003e \u003c/div\u003e"},{"header":"RESULTS","content":"\u003cdiv id=\"Sec10\" class=\"Section2\"\u003e \u003ch2\u003eCharacteristics of the assessed tests\u003c/h2\u003e \u003cp\u003eTable\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e shows the characteristics of the utilized final exams. 
It lists the number of items per exam, the number of examinees, a summary of examinees' scores, and the KR20.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eDescriptive statistics of the studied examinations\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"5\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eVariables\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colspan=\"3\" nameend=\"c4\" namest=\"c2\"\u003e \u003cp\u003eAcademic Year\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eTotal\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003e2016-17\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003e2017-8\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003e2018-9\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNo. 
of items\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e80\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e70\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e60\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e210\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNo. of examinees\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e40\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e46\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e38\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e124\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eHighest score\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e78 (97.5%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e62 (88.6%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e48 (80.0%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e-\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eLowest score\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e35 (43.8%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e20 (28.6%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e22 (36.7%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e-\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" 
colname=\"c1\"\u003e \u003cp\u003eClass average\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e55.5 (69.4%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e37 (52.92%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e34.1 (56.8%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e-\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eClass median\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e56.0 (70%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e35 (50.0%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e33.0 (55%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e-\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eKR20\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e0.906\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e0.856\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e0.804\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e-\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec11\" class=\"Section2\"\u003e \u003ch2\u003eDiscrimination Index (DI)\u003c/h2\u003e \u003cp\u003eThroughout the three years, out of 210 items included in the study, the DI showed that 119 (56.7%) items were excellent and 45 (20.9%) were good. Poor and wrong items were 29 (13.8%) and 18 (8.6%) respectively. 
Most MCQs in the study durations have excellent discrimination; poor and wrong MCQs had the lowest percentage (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e.A). The items to be stored were 164 (78.1%), while those to be discarded or revised were 46 (21.9%). No significant differences were observed throughout the three years regarding the DI (Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eDiscrimination index of items (n\u0026thinsp;=\u0026thinsp;210)\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"7\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c7\" colnum=\"7\"\u003e\u003c/div\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eDI\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colspan=\"3\" nameend=\"c4\" namest=\"c2\"\u003e \u003cp\u003eAcademic Year\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eTotal\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\" 
morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eP value\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eAction\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e2016-17\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e2017-8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e2018-9\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eWrong (˂0.0)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e3 (3.8)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e5 (7.1)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e9 (16.7)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e17 (8.6)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\" morerows=\"4\" rowspan=\"5\"\u003e \u003cp\u003e0.117\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eDiscard/Revise\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003ePoor (0\u0026ndash;0.14)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e12 (15)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e9 (12.9)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e8 (13.3)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e29 (13.8)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eDiscard/Revise\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eGood 
(0.15\u0026ndash;0.24)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e14 (17.5)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e15 (21.4)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e16 (25)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e45 (20.9)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eStore\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eExcellent (\u0026ge;\u0026thinsp;0.25)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e51 (63.7)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e41 (58.6)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e27 (45)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e119 (56.7)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eStore\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNI\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e80\u003c/p\u003e \u003cp\u003e(38.1%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e70\u003c/p\u003e \u003cp\u003e(33.3%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e60\u003c/p\u003e \u003cp\u003e(28.6%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e210\u003c/p\u003e \u003cp\u003e(100%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e-\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e 
\u003c/p\u003e \u003cp\u003e \u003cb\u003eDI\u0026thinsp;=\u0026thinsp;Discrimination index; NI\u0026thinsp;=\u0026thinsp;number of items.\u003c/b\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec12\" class=\"Section2\"\u003e \u003ch2\u003eDifficulty Index (DIF)\u003c/h2\u003e \u003cp\u003eTable\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e shows that, during the study period, MCQs with acceptable DIF were 106 (50.5%), while the easy and difficult items were 79 (37.6%) and 25 (11.9%), respectively. The mean DIF was 59.99 (\u0026plusmn;\u0026thinsp;23.5 SD). Acceptable items numbered 32/80 (40%) in the first year, 45/70 (64.3%) in the second year, and 29/60 (48.3%) in the third year. Most MCQs over the study period were within the acceptable difficulty range, while the percentage of difficult MCQs was lower than that of easy ones (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e.B). All the acceptable MCQs were stored in the question bank. 
The association between the three academic years showed significant linear-by-linear associations, meaning that the pattern is similar to a greater extent with the dominance of acceptable items (P\u0026thinsp;=\u0026thinsp;0.001).\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab3\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 3\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eDifficulty index analysis of items (n\u0026thinsp;=\u0026thinsp;210)\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"7\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c7\" colnum=\"7\"\u003e\u003c/div\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eDIF\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colspan=\"3\" nameend=\"c4\" namest=\"c2\"\u003e \u003cp\u003eAcademic Year\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eTotal\u003c/p\u003e \u003cp\u003e(%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eP value\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eAction\u003c/p\u003e \u003c/td\u003e 
\u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e2016-17\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e2017-8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e2018-9\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eDF (˂ 30%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e4(5%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e10(14.3%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e11(18.3%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e25 (11.9%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\" morerows=\"3\" rowspan=\"4\"\u003e \u003cp\u003e0.001\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eRevise\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eAcDF (30\u0026ndash;70%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e32/80 (40%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e45/70 (64.3%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e29/60 (48.3%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e106\u003c/p\u003e \u003cp\u003e(50.5%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eStore\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eEasy (\u0026gt;\u0026thinsp;70%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e44(55%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" 
colname=\"c3\"\u003e \u003cp\u003e15(21.4%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e20(33.3%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e79 (37.6%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eRevise/Discard\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNI\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e80\u003c/p\u003e \u003cp\u003e(38.1%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e70\u003c/p\u003e \u003cp\u003e(33.3%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e60\u003c/p\u003e \u003cp\u003e(28.6%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e210\u003c/p\u003e \u003cp\u003e(100%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e-\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003cb\u003eDIF\u0026thinsp;=\u0026thinsp;Difficulty index; NI\u0026thinsp;=\u0026thinsp;number of items; DF\u0026thinsp;=\u0026thinsp;Difficult; AcDF\u0026thinsp;=\u0026thinsp;acceptable difficulty.\u003c/b\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec13\" class=\"Section2\"\u003e \u003ch2\u003eDistractor Effectiveness (DE)\u003c/h2\u003e \u003cp\u003eThe total number of distractors in the study was 630. The FDs were 442 (70.2%), distributed by years as 127/240 (52.9%) in the first year, 174/210 (82.9%) in the second year, and 141/180 (78.3%) in the third year. Items without NFDs were 95 (45.2%), and none had 3 NFDs (Table\u0026nbsp;\u003cspan refid=\"Tab4\" class=\"InternalRef\"\u003e4\u003c/span\u003e). 
Across the study period, FDs comprised the majority of distractors (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e.C). Regarding the number of NFDs per MCQ, items with 0 NFDs and 1 NFD formed the majority (Fig.\u0026nbsp;1.D). The proportion of FDs differed significantly across the three academic years (P\u0026thinsp;\u0026lt;\u0026thinsp;0.001) (Table\u0026nbsp;\u003cspan refid=\"Tab4\" class=\"InternalRef\"\u003e4\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab4\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 4\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003e\u003cb\u003eAnalysis of the Distractor Efficiency and Non-Functional Distractors of Multiple-Choice Questions in this study (N\u0026thinsp;=\u0026thinsp;630)\u003c/b\u003e\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"6\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eDE\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colspan=\"3\" nameend=\"c4\" namest=\"c2\"\u003e \u003cp\u003eAcademic Year\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eTotal\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\" morerows=\"1\" rowspan=\"2\"\u003e 
\u003cp\u003eP value\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e2016-17\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e2017-8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e2018-9\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNI\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e80\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e70\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e60\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e210\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\" morerows=\"7\" rowspan=\"8\"\u003e \u003cp\u003e0.000\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eND\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e240\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e210\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e180\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e630\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eFDs (%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e127/240 (52.9%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e174/210 (82.9%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e141/180 (78.3%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e442/630 (70.2%)\u003c/p\u003e 
\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNFDs (%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e113 (47.1%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e36 (17.1%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e39 (21.7%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e188 (29.8%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e0 NFDs 100%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e19 (23.7%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e43 (61.4%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e33 (55%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e95 (45.2%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e1 NFDs 66.6%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e22 (27.5%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e20 (28.6%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e18 (30%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e60 (28.6%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e2 NFDs 33.3%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e26 (32.5%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e5 (7.1%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e6 (10%)\u003c/p\u003e 
\u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e37 (17.6%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e3 NFDs 0%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e13 (16.3%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e2 (2.9%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e3 (5%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e18 (8.6%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003cb\u003eDE\u0026thinsp;=\u0026thinsp;Distractor Effectiveness; NI\u0026thinsp;=\u0026thinsp;Number of items; ND\u0026thinsp;=\u0026thinsp;Number of distractors; FD\u0026thinsp;=\u0026thinsp;functional distractor; NFD\u0026thinsp;=\u0026thinsp;non-functional distractor.\u003c/b\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec14\" class=\"Section2\"\u003e \u003ch2\u003eCorrelation between DI, DIF, and DE\u003c/h2\u003e \u003cp\u003eThe DI was significantly associated with the DIF: most excellent and good items fell within the acceptable DIF range (Pearson Chi-Square, P\u0026thinsp;=\u0026thinsp;.003) (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e.D). The DIF was likewise significantly associated with the DE: most 0 NFD and 1 NFD items fell within the acceptable DIF category (Pearson Chi-Square, P\u0026thinsp;\u0026lt;\u0026thinsp;.001).\u003c/p\u003e \u003c/div\u003e"},{"header":"DISCUSSION","content":"\u003cp\u003eThe UBCOM is a newly emerging medical school. It was established to produce knowledgeable, skilled, and competent graduates with high social accountability locally, nationally, and globally. 
In this context, a student-centered, problem-based, integrated, community-based, elective, and systematic (SPICES) curriculum was developed, launched, and adopted for the whole learning process [\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eAssessment is generally regarded as the driver of the learning process. It should provide direction and motivation for future learning, protect the public by producing highly professional, knowledgeable, skilled, and competent graduates, and screen out incompetent ones [\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e]. Type A MCQs constitute a principal component in medical education assessment processes; in the UBCOM curriculum, this tool constitutes 80% of the theoretical examinations [\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e]. This high proportion necessitates continuous scrutiny and vetting to ensure proper construction and, in turn, a powerful tool for discriminating among students. In this context, this study was designed to check the quality of the MCQs used and to ensure valid and reliable examinations.\u003c/p\u003e \u003cp\u003eThe investigated exams were found to be reliable based on their Kuder-Richardson formula 20 (KR-20) values, the form of the Cronbach alpha coefficient for dichotomously scored items: 0.906, 0.856, and 0.804, respectively. These values fell within or slightly above the range of 0.70\u0026ndash;0.89 reported by Downing (2004) and van de Watering and van der Rijt (2006), who proposed this range as ideal and indicative of excellent examination reliability [\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e, \u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eThe three years' experience showed a dominance of excellent items (56.7%) and good ones (20.9%), with a combined proportion of 77.6%. 
This high proportion favors storing these items for subsequent use when needed. This finding is consistent with reports that such items can discriminate between high- and low-achieving students, in contrast to poor or defective items, which have low discriminating efficiency [\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e]. The poor (13.8%) and defective (8.6%) items, with a combined proportion of 22.4%, need to be revised and amended for subsequent use or be discarded, especially the defective ones, since such items are reported to adversely affect exam validity [\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e]. It is well documented that poor or defective items enable lower-performing students to answer correctly more often than those with higher ability. This is probably because lower-ability students may select the correct response by guessing rather than through real understanding. In contrast, a good student, suspicious of an apparently easy question, takes a harder path to the answer and becomes less successful. Probable explanations include a wrong key, ambiguous framing of questions, or generally poor preparation of students [\u003cspan additionalcitationids=\"CR2\" citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e]. 
The association between the DIF and DI in this study supports the statement that the two indices are reciprocally related [\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eThe mean DIF in the present investigation was 59.99 (SD\u0026thinsp;\u0026plusmn;\u0026thinsp;23.5), which falls within the acceptable range of 30 to 70% reported by several investigations [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e, \u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e, \u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e, \u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e]. Just over half of the items in this study had acceptable difficulty [106/210 (50.5%)]. This was broadly in line with the findings of Gajjar et al. (2014), Patil et al. (2016), and Hassan and Hod (2017). Difficult items need to be reviewed or updated because they deflate scores and reduce the ability of the test to discriminate. Rewriting such items is necessary to address errors in language and syntax, ambiguity, contentious statements, and an incorrect key [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e, \u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e]. Furthermore, easy items may contain NFDs, making them answerable by both upper and lower performers. They have also been reported to inflate scores and adversely affect the lower achievers' motivation [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e, \u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eThe 210 items in this investigation carried 630 distractors, of which 442 (70.2%) were functional. 
Items without NFDs (DE 100%) numbered 95 (45.2%), whereas those with 1 NFD (DE 66.6%) numbered 60 (28.6%). Items with 2 NFDs (DE 33.3%) numbered 37 (17.6%), while those with 3 NFDs (DE 0%) numbered 18 (8.6%). The distractors (incorrect alternatives) are analyzed to determine their relative usefulness in each item. Items need to be modified if students consistently fail to select certain distractors. Such alternatives are probably implausible and, therefore, of little use as decoys. Designing plausible distractors and reducing the NFDs are thus important aspects of framing quality MCQs. More NFDs in an item increase the DIF (make the item easier) and reduce the DE. Conversely, more functioning distractors in an item decrease the DIF (make the item more difficult) and increase the DE. The higher the DE, the more difficult the question, and vice versa; this ultimately depends on the presence or absence of NFDs in an item [\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e, \u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e, \u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e, \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e]. Items with 2 or 3 NFDs constituted 26.2% of the analyzed items; this percentage suggests that teachers had difficulty developing plausible distractors for a substantial share of the MCQs, possibly because most of them were newly recruited to this newly emerging medical school. These implausible distractors should therefore be removed from the question bank or replaced with more plausible options to improve item quality.\u003c/p\u003e \u003cp\u003eIn general, this investigation revealed that the evaluated MCQs effectively identified the students with higher achievement levels. This may be attributed to subject matter experts' involvement and the use of an examination blueprint. 
This aligns with previous reports that emphasize the importance of expert involvement, faculty development, and regular item analysis [\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e, \u003cspan additionalcitationids=\"CR30 CR31 CR32\" citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e]. Moreover, the electronic marking and analysis system used in this study provides immediate feedback on item performance, enabling timely interventions to address any issues with specific questions [\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e].\u003c/p\u003e \u003cdiv id=\"Sec16\" class=\"Section2\"\u003e \u003ch2\u003eStrengths of This Study\u003c/h2\u003e \u003cp\u003eThe study evaluates test-taking quality using KR20, DI, DIF, and DE indices. It uses a longitudinal approach to analyze data from three academic years, employing advanced statistical tools. The study found high reliability in MCQs, with subject matter experts' involvement ensuring relevance and accuracy. The Datalink 3000 electronic marking system enhances assessment efficiency. The MCQs were aligned with educational objectives, contributing to the existing literature on MCQ quality in medical education. The study emphasizes the importance of identifying high-quality, reusable MCQs for educators and institutions to improve assessment methods.\u003c/p\u003e \u003cp\u003e \u003cstrong\u003eLimitations\u003c/strong\u003e \u003cp\u003eThis research has several limitations. The fact that the data comes from a single institution could restrict how broadly the conclusions can be applied. Furthermore, the study only examined final exams from a single course, which means that the quality of MCQs utilized in other courses or universities may not be represented. 
Differences in the item construction practices used by various instructors could also impact the outcomes.\u003c/p\u003e \u003c/p\u003e \u003c/div\u003e"},{"header":"CONCLUSIONS","content":"\u003cp\u003eIn conclusion, post-examination IA in this study provided information on the reliability and validity of the items and tests by determining the DIF, DI, DE, and their interrelationships. This investigation showed that most MCQs utilized in the PRD course at the College of Medicine, University of Bisha, have good discrimination and acceptable difficulty levels, making them generally of high quality. To maintain high-quality MCQs, regular IA and the participation of subject matter experts in question creation are critical. Future research should explore the impact of different item construction practices and expand the analysis to include other courses and institutions to validate these findings further.\u003c/p\u003e \u003cp\u003e \u003cstrong\u003eRecommendations\u003c/strong\u003e \u003cp\u003eThis study offers several recommendations to improve the quality of multiple-choice questions (MCQs) in medical education. These include systematic IA involving subject matter experts, utilizing electronic marking systems, aligning MCQs with learning outcomes, regular faculty training on best practices, and building a robust question bank with validated items. 
These measures can lead to better learning outcomes and more accurate student performance evaluations.\u003c/p\u003e \u003c/p\u003e"},{"header":"Abbreviations","content":"\u003cul\u003e\n \u003cli\u003eMCQs: Multiple Choice Questions\u003c/li\u003e\n \u003cli\u003ePRD: Principal of Diseases\u003c/li\u003e\n \u003cli\u003eIA: Item Analysis\u003c/li\u003e\n \u003cli\u003eDIF: Difficulty Index\u003c/li\u003e\n \u003cli\u003eDI: Discrimination Index\u003c/li\u003e\n \u003cli\u003eDE: Distractor Effectiveness\u003c/li\u003e\n \u003cli\u003eKR20: Kuder-Richardson Formula 20\u003c/li\u003e\n \u003cli\u003eUBCOM: University of Bisha College of Medicine\u003c/li\u003e\n \u003cli\u003eFD: Functional Distractor\u003c/li\u003e\n \u003cli\u003eNFD: Non-Functional Distractor\u003c/li\u003e\n \u003cli\u003eNI: Number of items\u003c/li\u003e\n \u003cli\u003eND: Number of distractors\u0026nbsp;\u003c/li\u003e\n\u003c/ul\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eEthics approval and consent to participate\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis work was approved by the University of Bisha\u0026apos;s Research and Ethics Committees. 
All students were informed that their academic test grades would be utilized for quality assurance and academic research.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eClinical Trial Number\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConsent for publication\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFUNDING: \u003cstrong\u003enot funded\u003c/strong\u003e\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAUTHORS CONTRIBUTIONS:\u0026nbsp;\u003c/strong\u003eAll authors have made significant contributions, whether in conception, study design, conduct, data collection, analysis, interpretation, or all of the above; have participated in drafting, revising, or critically reviewing the article; have given final approval for the version to be published; have agreed on the journal to which the article was submitted; and agree to be responsible for all aspects of the work.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eData availability\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe datasets used and analyzed during the current study are available from the corresponding author upon reasonable request\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting interests\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declared they have no conflicts of interest.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eACKNOWLEDGEMENTS\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThanks and appreciation were due to the Dean of the College of Medicine, University of Bisha, who offered the permit to carry out and publish this work. 
The authors thank the Deanship of Graduate Studies and Scientific Research at the University of Bisha for supporting this work through the Fast-Track Research Support Program.\u003cstrong\u003e\u003c/strong\u003e\u003cstrong\u003e\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eGajjar S, Sharma R, Kumar P, Rana M: \u003cstrong\u003eItem and Test Analysis to Identify Quality Multiple Choice Questions (MCQs) from an Assessment of Medical Students of Ahmedabad, Gujarat\u003c/strong\u003e. \u003cem\u003eIndian J Community Med \u003c/em\u003e2014, \u003cstrong\u003e39\u003c/strong\u003e(1):17-20.\u003c/li\u003e\n\u003cli\u003eKolte V: \u003cstrong\u003eItem analysis of multiple choice questions in physiology examination\u003c/strong\u003e. \u003cem\u003eIndian J of Basic \u0026amp; Applied Medical Research \u003c/em\u003e2015, \u003cstrong\u003e4\u003c/strong\u003e(4):320-326.\u003c/li\u003e\n\u003cli\u003eIngale AS, A. Giri P, Doibale MK: \u003cstrong\u003eStudy on item and test analysis of multiple choice questions amongst undergraduate medical students\u003c/strong\u003e. \u003cem\u003eInternational Journal Of Community Medicine And Public Health \u003c/em\u003e2017, \u003cstrong\u003e4\u003c/strong\u003e(5):1562-1565.\u003c/li\u003e\n\u003cli\u003eRezigalla AA: \u003cstrong\u003eAI in medical education: uses of AI in construction type A MCQs\u003c/strong\u003e. \u003cem\u003eBMC Medical Education \u003c/em\u003e2024, \u003cstrong\u003e24\u003c/strong\u003e(1):1-9.\u003c/li\u003e\n\u003cli\u003eSingh T: \u003cstrong\u003ePrinciples of assessment in medical education\u003c/strong\u003e, 2 edn. 
New Delhi, India: Jaypee Brothers Medical Publishers; 2021.\u003c/li\u003e\n\u003cli\u003eRezigalla AA, Eleragi AMESA, Elhussein AB, Alfaifi J, Alghamdi MA, Al Ameer AY, Yahia AIO, Mohammed OA, Adam MIE: \u003cstrong\u003eItem analysis: the impact of distractor efficiency on the difficulty index and discrimination power of multiple-choice items\u003c/strong\u003e. \u003cem\u003eBMC Medical Education \u003c/em\u003e2024, \u003cstrong\u003e24\u003c/strong\u003e(1):445-447.\u003c/li\u003e\n\u003cli\u003eMehta G, Mokhasi V: \u003cstrong\u003eItem analysis of multiple choice questions-An assessment of the assessment tool\u003c/strong\u003e. \u003cem\u003eInt J Health Sci Res \u003c/em\u003e2014, \u003cstrong\u003e4\u003c/strong\u003e(7):197-202.\u003c/li\u003e\n\u003cli\u003ePatil PS, Dhobale MR, Mudiraj NR: \u003cstrong\u003eItem analysis of MCQs\u0026apos;-Myths and realities when applying them as an assessment tool for medical students\u003c/strong\u003e. \u003cem\u003eInternational Journal of Current Research and Review \u003c/em\u003e2016, \u003cstrong\u003e8\u003c/strong\u003e(13):12-18.\u003c/li\u003e\n\u003cli\u003eRezigalla AA, Eleragi AME, Ishag M: \u003cstrong\u003eComparison between Students\u0026rsquo; Perception toward an examination and item analysis, reliability and validity of the examination\u003c/strong\u003e. \u003cem\u003eSudan Journal of Medical Sciences \u003c/em\u003e2020, \u003cstrong\u003e15\u003c/strong\u003e(2):114-123.\u003c/li\u003e\n\u003cli\u003eMozaffer Rahim Hingorjo FJ: \u003cstrong\u003eAnalysis of one best MCQs the difficulty index discrimination index and distractor efficiency\u003c/strong\u003e. \u003cem\u003eJ Pak Med Assoc \u003c/em\u003e2012, \u003cstrong\u003e62\u003c/strong\u003e(2):142 - 147.\u003c/li\u003e\n\u003cli\u003eTarrant M, Ware J, Mohammed AM: \u003cstrong\u003eAn assessment of functioning and non-functioning distractors in multiple-choice questions: a descriptive analysis\u003c/strong\u003e. 
\u003cem\u003eBMC medical education \u003c/em\u003e2009, \u003cstrong\u003e9\u003c/strong\u003e(1):1-8.\u003c/li\u003e\n\u003cli\u003eHaladyna TM: \u003cstrong\u003eDeveloping and validating multiple-choice test items\u003c/strong\u003e, 3 edn. New York: Routledge; 2004.\u003c/li\u003e\n\u003cli\u003eWallach PM, Crespo LM, Holtzman KZ, Galbraith RM, Swanson DB: \u003cstrong\u003eUse of a committee review process to improve the quality of course examinations\u003c/strong\u003e. \u003cem\u003eAdv Health Sci Educ Theory Pract \u003c/em\u003e2006, \u003cstrong\u003e11\u003c/strong\u003e(1):61-68.\u003c/li\u003e\n\u003cli\u003eWadi MM, Abdul Rahim AF, Yusoff MSB, Baharuddin KA: \u003cstrong\u003eThe effect of MCQ vetting on students\u0026apos; examination performance\u003c/strong\u003e. \u003cem\u003eEducation in Medicine Journal \u003c/em\u003e2014, \u003cstrong\u003e6\u003c/strong\u003e(2):16-26.\u003c/li\u003e\n\u003cli\u003eK. Mohammed Sowdani HJ, Ara Murad Thomas: \u003cstrong\u003eItem Analysis as a Tool for Educational Assessment as Compared to Students, Evaluation to lectures\u003c/strong\u003e. \u003cem\u003eAl Mustansiriyah Journal of Pharmaceutical Sciences \u003c/em\u003e2018, \u003cstrong\u003e18\u003c/strong\u003e(2):105-113.\u003c/li\u003e\n\u003cli\u003eRezigalla AA: \u003cstrong\u003eObservational study designs: Synopsis for selecting an appropriate study design\u003c/strong\u003e. \u003cem\u003eCureus \u003c/em\u003e2020, \u003cstrong\u003e12\u003c/strong\u003e(1):6692-6700.\u003c/li\u003e\n\u003cli\u003eSalih KM, Al-faifi J, Abbas M, Alghamdi MA, Rezigalla AA: \u003cstrong\u003eMethods of blueprint of a paediatric course in innovative curriculum\u003c/strong\u003e. \u003cem\u003eoncology \u003c/em\u003e2021, \u003cstrong\u003e15\u003c/strong\u003e(8):1-4.\u003c/li\u003e\n\u003cli\u003eRezigalla AA: \u003cstrong\u003eItem analysis: Concept and application\u003c/strong\u003e. In: \u003cem\u003eMedical Education for the 21st Century.\u003c/em\u003e edn. 
Edited by MS F, SP S. London, United Kingdom: IntechOpen; 2022: 105-120.\u003c/li\u003e\n\u003cli\u003eSuryadevara VK, Bano Z: \u003cstrong\u003eItem analysis to identify quality multiple choice questions/items in an assessment in Pharmacology of II MBBS students in Guntur Medical College of Andhra Pradesh, India\u003c/strong\u003e. \u003cem\u003eInternational Journal of Basic \u0026amp; Clinical Pharmacology \u003c/em\u003e2018, \u003cstrong\u003e7\u003c/strong\u003e(8):1517-1521.\u003c/li\u003e\n\u003cli\u003eAmin Z, Khoo HE: \u003cstrong\u003eBasics in medical education\u003c/strong\u003e: World Scientific; 2003.\u003c/li\u003e\n\u003cli\u003eOdukoya JA, Adekeye O, Igbinoba AO, Afolabi A: \u003cstrong\u003eItem analysis of university-wide multiple choice objective examinations: the experience of a Nigerian private university\u003c/strong\u003e. \u003cem\u003eQuality \u0026amp; quantity \u003c/em\u003e2018, \u003cstrong\u003e52\u003c/strong\u003e(3):983-997.\u003c/li\u003e\n\u003cli\u003eSowdani KM, Jaber H, Thomas AM: \u003cstrong\u003eItem Analysis as a Tool for Educational Assessment as Compared to Students, Evaluation to lectures\u003c/strong\u003e. \u003cem\u003eAl-Mustansiriyah Journal for Pharmaceutical Sciences \u003c/em\u003e2018, \u003cstrong\u003e18\u003c/strong\u003e(2):105-113.\u003c/li\u003e\n\u003cli\u003eRao C, Kishan Prasad HL, Sajitha K, Permi H, Shetty J: \u003cstrong\u003eItem analysis of multiple choice questions: Assessing an assessment tool in medical students\u003c/strong\u003e. \u003cem\u003eInternational Journal of Educational and Psychological Researches \u003c/em\u003e2016, \u003cstrong\u003e2\u003c/strong\u003e(4):201-204.\u003c/li\u003e\n\u003cli\u003eEpstein RM: \u003cstrong\u003eAssessment in medical education\u003c/strong\u003e. 
\u003cem\u003eNew England journal of medicine \u003c/em\u003e2007, \u003cstrong\u003e356\u003c/strong\u003e(4):387-396.\u003c/li\u003e\n\u003cli\u003eDowning SM: \u003cstrong\u003eReliability: on the reproducibility of assessment data\u003c/strong\u003e. \u003cem\u003eMedical education \u003c/em\u003e2004, \u003cstrong\u003e38\u003c/strong\u003e(9):1006-1012.\u003c/li\u003e\n\u003cli\u003evan de Watering G, van der Rijt J: \u003cstrong\u003eTeachers\u0026rsquo; and students\u0026rsquo; perceptions of assessments: A review and a study into the ability and accuracy of estimating the difficulty levels of assessment items\u003c/strong\u003e. \u003cem\u003eEducational Research Review \u003c/em\u003e2006, \u003cstrong\u003e1\u003c/strong\u003e(2):133-147.\u003c/li\u003e\n\u003cli\u003eIngale AS, Giri PA, Doibale MK: \u003cstrong\u003eStudy on item and test analysis of multiple choice questions amongst undergraduate medical students\u003c/strong\u003e. \u003cem\u003eInternational Journal of Community Medicine and Public Health \u003c/em\u003e2017, \u003cstrong\u003e4\u003c/strong\u003e(5):1562-1565.\u003c/li\u003e\n\u003cli\u003eHassan S, Hod R: \u003cstrong\u003eUse of Item Analysis to Improve the Quality of Single Best Answer Multiple Choice Question in Summative Assessment of Undergraduate Medical Students in Malaysia\u003c/strong\u003e. \u003cem\u003eEducation in Medicine Journal \u003c/em\u003e2017, \u003cstrong\u003e9\u003c/strong\u003e(3):33-43.\u003c/li\u003e\n\u003cli\u003eAbdulghani HM, Ahmad F, Irshad M, Khalil MS, Al-Shaikh GK, Syed S, Aldrees AA, Alrowais N, Haque S: \u003cstrong\u003eFaculty development programs improve the quality of Multiple Choice Questions items\u0026apos; writing\u003c/strong\u003e. 
\u003cem\u003eScientific Reports \u003c/em\u003e2015, \u003cstrong\u003e5\u003c/strong\u003e(1):1-7.\u003c/li\u003e\n\u003cli\u003eElgadal AH, Mariod AA: \u003cstrong\u003eItem analysis of multiple-choice questions (MCQs): assessment tool for quality assurance measures\u003c/strong\u003e. \u003cem\u003eSudan Journal of Medical Sciences \u003c/em\u003e2021, \u003cstrong\u003e16\u003c/strong\u003e(3):334-346.\u003c/li\u003e\n\u003cli\u003eGottlieb M, Bailitz J, Fix M, Shappell E, Wagner MJ: \u003cstrong\u003eEducator\u0026apos;s blueprint: A how‐to guide for developing high‐quality multiple‐choice questions\u003c/strong\u003e. \u003cem\u003eAEM Education and Training \u003c/em\u003e2023, \u003cstrong\u003e7\u003c/strong\u003e(1):e10836.\u003c/li\u003e\n\u003cli\u003eTouissi Y, Hjiej G, Hajjioui A, Ibrahimi A, Fourtassi M: \u003cstrong\u003eDoes developing multiple-choice questions improve medical students\u0026rsquo; learning? A systematic review\u003c/strong\u003e. \u003cem\u003eMedical Education Online \u003c/em\u003e2022, \u003cstrong\u003e27\u003c/strong\u003e(1):1-14.\u003c/li\u003e\n\u003cli\u003eBelay LM, Sendekie TY, Eyowas FA: \u003cstrong\u003eQuality of multiple-choice questions in medical internship qualification examination determined by item response theory at Debre Tabor University, Ethiopia\u003c/strong\u003e. \u003cem\u003eBMC Medical Education \u003c/em\u003e2022, \u003cstrong\u003e22\u003c/strong\u003e(1):1-11.\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":true,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
Version of record: BMC Medical Education, published February 13th, 2025, https://doi.org/10.1186/s12909-025-06700-2. Preprint version 1 posted July 23rd, 2024, https://doi.org/10.21203/rs.3.rs-4635200/v1, under a CC BY 4.0 license.