LLM-based generation of USMLE-style questions with ASPET/AMSPC knowledge objectives: All RAGs and no riches

2025 · doi:10.22541/au.174315614.40371705/v1

preprint OA: closed

Abstract

Developing high-quality pharmacology multiple-choice questions (MCQs) is challenging in large part due to continually evolving therapeutic guidelines and the complex integration of basic science and clinical medicine in this subject area. Large language models (LLMs) like ChatGPT-4 have repeatedly demonstrated proficiency in answering medical licensing exam questions, prompting interest in their use for generating high-stakes exam-style questions. This study evaluates the performance of ChatGPT-4o in generating USMLE-style pharmacology questions based on ASPET/AMSPC Knowledge Objectives and assesses the impact of retrieval-augmented generation (RAG) on question accuracy and quality. Using standardized prompts, 50 questions (25 RAG, 25 non-RAG) were generated and subsequently evaluated by expert reviewers. Results showed higher accuracy for non-RAG questions (88.0% vs 69.2%), though the difference was not statistically significant. No significant differences were observed in other quality dimensions. These findings suggest that sophisticated LLMs can generate high-quality pharmacology questions efficiently without RAG, though human oversight remains crucial.

Supplementary material: rag versus non-rag item generation.docx (158.25 KB)

Published: British Journal of Clinical Pharmacology, Version of Record 8 Jun 2025

Copyright: This work is licensed under a Non Exclusive No Reuse License.
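The accuracy comparison in the abstract (88.0% vs 69.2%, not statistically significant) is the kind of result a two-sided Fisher exact test on a 2x2 table would produce. The sketch below is only an illustration: the abstract names neither the test used nor the raw counts, so the counts here (22 of 25 and 18 of 26) are hypothetical values chosen solely because they round to the reported percentages, and the function name `fisher_exact_two_sided` is our own.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of every table at least as unlikely as the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# HYPOTHETICAL counts (not from the paper): rows are non-RAG and RAG,
# columns are accurate and inaccurate questions.
p_value = fisher_exact_two_sided(22, 3, 18, 8)
print(f"two-sided p = {p_value:.3f}")
```

With these assumed counts the p-value stays above the conventional 0.05 threshold, which is consistent with the abstract's statement that the difference was not statistically significant. The actual analysis in the paper may differ.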

Metrics

393 views · 252 downloads

Citation

Thomas Thesen, Rupa Tuan, Joe Blumer, et al. LLM-based generation of USMLE-style questions with ASPET/AMSPC knowledge objectives: All RAGs and no riches. Authorea. 28 March 2025. DOI: https://doi.org/10.22541/au.174315614.40371705/v1

Cited by

- Using assessment in pharmacology to drive learning in the right direction, British Journal of Clinical Pharmacology, (2026). https://doi.org/10.1002/bcp.70533
- A generative AI teaching assistant for personalized learning in medical education, npj Digital Medicine, 8, 1, (2025). https://doi.org/10.1038/s41746-025-02022-1