A Fully Automated, Expert-Perceptive Image Quality Assessment System for Whole-Body [18F]FDG PET/CT

doi:10.21203/rs.3.rs-5559102/v1


2025 · doi:10.21203/rs.3.rs-5559102/v1

preprint OA: closed



Research Article

Cong Zhang, Xin Gao, Xuebin Zheng, Jun Xie, Gang Feng, Yunchao Bao, and 4 more

This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-5559102/v1 This work is licensed under a CC BY 4.0 License. Status: Published. Journal publication published 18 Apr, 2025; the published version appears in EJNMMI Research. Version 1.

Abstract

Background The quality of clinical PET/CT images is critical for both accurate diagnosis and image-based research. However, current image quality assessment (IQA) methods predominantly rely on handcrafted features and region-specific analyses, thereby limiting automation in whole-body and multi-center evaluations. 
This study aims to develop an expert-perceptive deep learning-based IQA system for [18F]FDG PET/CT to tackle the lack of automated, interpretable assessments of clinical whole-body PET/CT image quality. Methods This retrospective multicenter study included clinical whole-body [18F]FDG PET/CT scans from 718 patients. Automated identification and localization algorithms were applied to select predefined pairs of PET and CT slices from whole-body images. Fifteen experienced experts, trained to conduct blinded slice-level subjective assessments, provided average visual scores as reference standards. Using the MANIQA framework, the developed IQA model integrates the Vision Transformer, Transposed Attention, and Scale Swin Transformer Blocks to categorize PET and CT images into five quality classes. The model’s correlation, consistency, and accuracy with expert evaluations on both PET and CT test sets were statistically analysed to assess the system's IQA performance. Additionally, the model's ability to distinguish high-quality images was evaluated using receiver operating characteristic (ROC) curves. Results The IQA model demonstrated high accuracy in predicting image quality categories and showed strong concordance with expert evaluations of PET/CT image quality. In predicting slice-level image quality across all body regions, the model achieved an average accuracy of 0.832 for PET and 0.902 for CT. The model’s scores showed substantial agreement with expert assessments, achieving average Spearman coefficients (ρ) of 0.891 for PET and 0.624 for CT, while the average Intraclass Correlation Coefficient (ICC) reached 0.953 for PET and 0.92 for CT. The PET IQA model demonstrated strong discriminative performance, achieving an area under the curve (AUC) of ≥ 0.88 for both the thoracic and abdominal regions. Conclusions This fully automated IQA system provides a robust and comprehensive framework for the objective evaluation of clinical image quality. 
Furthermore, it demonstrates significant potential as an impartial, expert-level tool for standardised multicenter clinical IQA. [18F]FDG Image quality assessment Positron emission tomography/computed tomography whole-body deep learning Figures Figure 1 Figure 2 Figure 3 Figure 4 Figure 5 Figure 6 Background [18F]fluorodeoxyglucose positron emission tomography/computed tomography ([18F]FDG PET/CT) is an indispensable non-invasive imaging tool for detecting and monitoring oncologic diseases [ 1 ]. The quality of PET/CT images directly influences clinical decision-making and has significant implications for patient outcomes. However, as medical centers accumulate vast PET/CT datasets daily, evaluating clinical PET/CT image quality on a multicenter scale has become a core goal in quality management within nuclear medicine. Traditional image quality assessment (IQA)[ 2 , 3 ] methods, which rely on subjective evaluations by nuclear medicine experts, face substantial obstacles, including inconsistent evaluation standards and inefficiency when managing large clinical datasets. This emphasizes the urgent need for objective, widely accepted artificial intelligence (AI) tools to support multicenter PET/CT IQA. Recent advances in computational power have enabled promising AI-driven solutions [ 4 ], with deep learning (DL)[ 5 , 6 ] demonstrating notable capabilities in extracting relevant patterns from large datasets [ 7 ]. DL has already achieved impressive results in medical image analysis tasks, such as reconstruction, denoising, registration, segmentation, and modeling [ 8 – 12 ]. However, significant challenges remain in applying DL to IQA [ 13 , 14 ]. DL applications rely extensively on curated datasets, but acquiring high-quality annotated data is labor-intensive and costly [ 15 ]. In IQA, objective and subjective methods have traditionally been applied. 
Objective metrics, such as mean squared error, signal-to-noise ratio, and structural similarity index [ 16 ], provide quick and repeatable evaluations. However, due to the non-linear nature of human visual processing [ 17 – 20 ], these content-independent objective metrics often fail to fully capture radiologists' diagnostic perceptions. Consequently, subjective assessments, particularly those from experienced radiologists, are considered more reliable for evaluating medical image quality [ 21 ]. Nonetheless, subjective assessments are limited by inconsistent evaluation criteria, variations in physician expertise, evaluator fatigue, and environmental factors, challenging their reliability and reproducibility across clinical settings [ 22 ]. Multiple factors, both equipment-related and human-related, contribute to the overall quality of clinical PET/CT images. Equipment-related factors, including device hardware configurations and image processing software within the acquisition pipeline, significantly affect image quality. Human-related variables, such as operator expertise, injection dosage, drug leakage and patient-specific variables (e.g., glucose levels, movement-induced artifacts), further complicate accurate image assessment. These factors make it challenging to evaluate one aspect without compromising others. Together, these factors make subjective evaluation of clinical PET/CT images a complex task. DL offers a promising solution for emulating expert-level evaluations, with its ability to capture subtle image features and adapt to complex visual contexts. Recent studies have focused on developing automated methods for medical image quality evaluation. For example, Lin et al. [ 23 ] found that handcrafted features were valuable in data-limited settings, but combining these features with convolutional neural networks (CNNs) achieved the best performance. 
Additionally, DL-based models are now commonly applied to evaluate the quality of whole-heart [ 24 ], liver [ 25 ] magnetic resonance images, and PET scans [ 26 , 27 ], often outperforming human observers. Although previous studies have shown promise, most region-specific approaches typically rely on handcrafted features to create objective metrics [ 28 , 29 ] or explore correlations with subjective human perception [ 30 , 31 ]. Additionally, these methods are often constrained by limited datasets and small numbers of evaluators, making them applicable only in specific settings and limiting their acceptance as unbiased, broadly applicable IQA tools for PET/CT images. Due to the inefficiency of handcrafted metrics and varying image quality across anatomical regions, whole-body PET/CT presents unique challenges, limiting the efficacy of single-model assessments. To address these challenges, we trained our model on a large clinical dataset with assessments from 15 experts as references—the largest dataset available to date. Our study introduces a fully automated, DL-based IQA framework for whole-body PET/CT images. This framework offers: (1) customizable algorithms for automatic slice selection across diverse regions, establishing a standardized foundation for multicenter IQA; and (2) expert-level modeling of subjective evaluations by integrating expert perceptual assessments with DL capabilities, enabling automated quality assessment. Our study aims to support standardized IQA for multicenter collaborations and advance high-quality, image-based nuclear medicine research in the future. Materials and methods Ethics approval for this retrospective analysis was obtained from the relevant institutional ethics committees, with a waiver of informed consent. Study design and data acquisition We retrospectively obtained DICOM data from 718 patients who underwent whole-body [18F]FDG PET/CT scans across four institutions. 
All PET/CT scanners underwent routine quality assurance (QA) and quality control (QC) procedures; their system characteristics are detailed in Table 1. These QA/QC programs ensured that scanner performance met the standards required for multi-center [18F]FDG PET/CT studies. The inclusion criteria were: (a) whole-body [18F]FDG PET/CT scans (from head to mid-thigh) and (b) complete visualization of all major organs. The exclusion criteria were: (a) incomplete or corrupted PET/CT scan data, with image loss or distortion due to data errors; (b) missing organ data or misalignment between CT and PET data, where the distance between corresponding key slices in the two modalities is unequal, preventing identification of the key slices; and (c) severe artifacts or patient motion.

Table 1 Patient Demographics and Image Acquisition Parameters for Data from Different Centers

Characteristics | PLAGH (n = 104) | SHCenter (n = 428) | HZCenter (n = 142) | GZCenter (n = 44)
Age (years) | 57.16 ± 12.10 | 51.30 ± 12.39 | 64.38 ± 10.82 | 61.27 ± 10.56
Sex, male | 81 | 249 | 93 | 28
Sex, female | 23 | 179 | 49 | 16
BMI (kg/m²) | 24.70 ± 3.66 | 23.60 ± 3.99 | 23.19 ± 3.60 | 21.95 ± 3.85
Devices | United Imaging (uMI 510, uEXPLORER*); Siemens (Biograph Vision 600); GE (Discovery 710) | Siemens (Biograph mCT) | GE (Discovery 710) | Siemens (Biograph mCT)
PET matrix | 192×192 & (head) 128×128; 192×192 & (head) 256×256*; 440×440; 192×192 | 200×200 | 192×192 | 200×200
PET slice thickness (mm) | 2.44, 2.886*; 3; 5 | 5 | 5 | 5
PET reconstruction method | OSEM; True X + TOF; Vue Point HD/FX | True X + TOF | Vue Point HD/FX | True X + TOF
Gaussian filter FWHM (mm) | 3.0, 3.0*; 5.0 & (head) 2.0; 3.0 | 5.0 | 3.0 | 5.0

* Data derived from United Imaging (uEXPLORER); BMI, body mass index; n, number of patients. Data are presented as the mean ± standard deviation (SD). 
PLAGH: Chinese PLA General Hospital; SHCenter: Shanghai Universal Medical Imaging Diagnostic Center; HZCenter: Hangzhou Universal Medical Imaging Diagnostic Center; GZCenter: Guangzhou Universal Medical Imaging Diagnostic Center. To facilitate modeling, a preliminary assessment was performed by two experienced radiologists, each with over 10 years in practice, to ensure a balanced distribution of images classified as good, moderate, or poor quality across the dataset. We developed customized algorithms to automatically localize seven key axial slices of the whole body that capture critical anatomical landmarks (including the basal ganglia, cerebellar vermis, aortic arch, tracheal carina, liver, pancreas, and iliac bone). These slices were selected for their anatomical significance and reproducibility across patients. A full description of our segmentation process is provided in Supplementary Section 1.1, while the localization methodology is detailed in Supplementary Section 1.2 (Key Slices Selection Logic). Following standardized on-table training, 15 nuclear medicine experts conducted blinded subjective evaluations on these predefined slices for 718 patients. After data preprocessing, 4,394 CT and 4,180 PET images were randomly divided into training, validation, and test sets at an 8:1:1 ratio [ 32 ], as shown in Fig. 1 , which illustrates the flow diagram of PET/CT data collection and dataset division. The complete workflow of the IQA system is outlined in Fig. 2 . Segmentation of Anatomical Regions Accurate identification and localization of key CT images requires a sophisticated segmentation model. For this purpose, we utilized TotalSegmentator [ 33 ], which is based on the nnUNet framework (Fig. S2 ) [ 34 ], to perform fine-grained segmentation of whole-body regions (five torso and two cranial regions), as illustrated in Fig. S1 . The selection of axial key slices was guided by both the segmentation results and our defined anatomical criteria. 
CT Key-Slice Localization Scheme

Key-slice localization is crucial for both physician-guided and automated DL assessments of PET/CT image quality. To facilitate this process, several functions from the OpenCV package (version 3.4.2) [ 35 ] were used, with cv2.matchShapes() serving as the core of the automatic identification and extraction of key slices, which acted as benchmarks to support physician-based IQA. The automated identification process involved several steps. First, the extracted image was converted to grayscale and then binarized. The contour detection function cv2.findContours() was then used to identify all contours in the CT image. Subsequently, cv2.matchShapes() was applied to compare these contours with those of a reference image provided by experts. By identifying the contour with the minimum degree of mismatch, the algorithm pinpointed the key slices within the image. To better reflect the varying characteristics of different anatomical regions, we tailored the localization scheme by incorporating region-specific selection logic. For example, we customized the automatic recognition function for the aortic arch in the thorax and implemented tailored evaluation criteria for this slice. Details of the key-slice selection logic are summarized in Supplementary Table S1. This approach enabled us to accurately pinpoint and segment each relevant slice in the PET/CT images.

Aligned Key-Slices Acquisition

After selecting the key slices from the CT images, the PET images were aligned with the initial axial head slices of the CT scans to ensure consistent ranges, as shown in Fig. 2a, which depicts the process of segmenting organs on CT images and identifying corresponding slices in PET images. 
Key PET slices were then identified in tandem with their CT counterparts, using the physical distance to streamline the process:

Real physical distance = slice thickness × number of slices

If the distances for the CT and PET data were not equal, the images were considered misaligned and were excluded. Because of its specific characteristics, head data for PET images required a customized localization strategy involving an initial PET/CT head slice alignment followed by sagittal planning and tilt angle adjustments based on the segmentation of landmarks such as the condyle (see Fig. S3 for the slice of condyle segmentation). A detailed head slice positioning scheme, with visualizations of the rotating slice view of the pitch angle of the head (Fig. S4) and the image after rotation (Fig. S5), is provided in Supplementary Section 1.3. This alignment enables the identification and evaluation of key cerebral slices using morphometric maxima derived from the relative juxtaposition of cerebral diameters and surface areas, thereby facilitating subsequent processes. All key slices were reviewed and confirmed by an expert with over 10 years of experience in nuclear medicine, establishing these slices as the gold standard. The automatically localized positions were compared with the gold standard, achieving an overall accuracy of 83.17% within an error margin of ± 2 slices and 79.35% within an error margin of ± 1 slice, as detailed in Table 2, which presents the performance of the IQA model and the accuracy of PET slice selection. For each position, slices whose localization error exceeded ± 2 slices were excluded from further analysis. 
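The physical-distance rule and the ± 1 / ± 2 slice accuracy figures above can be illustrated with a minimal sketch. It assumes both series start at the same head position; the function names, the example slice counts, and the floating-point tolerance `tol_mm` are hypothetical and are not taken from the authors' implementation.

```python
def physical_distance_mm(slice_thickness_mm, n_slices):
    """Real physical distance = slice thickness x number of slices."""
    return slice_thickness_mm * n_slices


def matching_pet_index(ct_index, ct_thickness_mm, ct_n_slices,
                       pet_thickness_mm, pet_n_slices, tol_mm=1e-6):
    """Return the PET slice index at the same physical offset as a CT key slice.

    Returns None when the CT and PET series cover different distances,
    i.e. the pair would be treated as misaligned and excluded.
    tol_mm is an assumed tolerance for floating-point comparison only.
    """
    ct_extent = physical_distance_mm(ct_thickness_mm, ct_n_slices)
    pet_extent = physical_distance_mm(pet_thickness_mm, pet_n_slices)
    if abs(ct_extent - pet_extent) > tol_mm:
        return None
    offset_mm = physical_distance_mm(ct_thickness_mm, ct_index)
    return round(offset_mm / pet_thickness_mm)


def accuracy_within(auto_indices, gold_indices, margin):
    """Fraction of automatically localized slices within +/- margin of the gold standard."""
    hits = sum(abs(a - g) <= margin for a, g in zip(auto_indices, gold_indices))
    return hits / len(gold_indices)


# Hypothetical example: 1000 CT slices of 1 mm and 200 PET slices of 5 mm.
pet_index = matching_pet_index(437, 1.0, 1000, 5.0, 200)
misaligned = matching_pet_index(437, 1.0, 1000, 5.0, 199)

# Hypothetical localization results compared with expert-confirmed positions.
acc_1 = accuracy_within([87, 40, 12, 150], [87, 42, 15, 150], margin=1)
acc_2 = accuracy_within([87, 40, 12, 150], [87, 42, 15, 150], margin=2)
```

In this toy case the CT key slice at index 437 maps to PET index 87, and the second PET series is rejected because it covers 995 mm rather than 1000 mm.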
Table 2 Performance of the IQA model on the test dataset and accuracy of PET slice selection

Description | CT Accuracy | PET Accuracy | Slice Selection Accuracy# | Slice Selection Accuracy*
Basal ganglia | 0.869 | 0.887 | 0.684 | 0.726
Cerebellar vermis | 0.787 | 0.887 | 0.694 | 0.726
Aortic arch | 0.969 | 0.846 | 0.836 | 0.879
Tracheal carina | 0.875 | 0.818 | 0.819 | 0.857
Liver | 0.921 | 0.755 | 0.843 | 0.882
Pancreas | 0.969 | 0.755 | 0.83 | 0.868
Iliac bone | 0.922 | 0.875 | 0.85 | 0.886
Average | 0.902 | 0.832 | 0.794 | 0.832

# Accuracy of PET slice selection within a ± 1 slice error margin. * Accuracy of PET slice selection within a ± 2 slice error margin.

Subjective IQA

All anonymized PET/CT data were stored in accordance with the highest standards of personal data security and then uploaded to the Digital Micro Image Analysis System (Shanghai Aitrox Technology Corporation Limited). After the aforementioned processing steps, the extracted axial images of seven key slices from the CT and PET datasets were randomized and then assigned to evaluators for blinded subjective assessment. Following the single stimulus method outlined in the BT.500-15 guidelines of the International Telecommunication Union Radiocommunication Sector [ 36 ], test images were displayed on a FlexScan EV 2750 monitor (EIZO, Ishikawa, Japan) in a controlled environment. Corresponding prompts and scoring labels were provided for blind evaluation. Fifteen physicians, each with over 9 years of experience in nuclear medicine, participated in the subjective evaluation. All evaluators took part in a collaborative discussion and were trained on educational images (not included in the testing dataset), as shown in Fig. 3, which presents suggested PET examples for the 5-point Likert scale. Evaluators conducted subjective assessments of the overall image quality for axial PET and CT images, assigning scores based on a five-point Likert scale. 
Importantly, they were blinded to any identifiable patient information, including identity, diagnosis, treatment, and other specific details. We defined an image quality score of 1 as very poor, definitely affecting diagnosis; a score of 2 as poor, affecting diagnosis; a score of 3 as average, possibly affecting diagnosis; a score of 4 as good, not affecting diagnosis; and a score of 5 as excellent, not affecting diagnosis. All 15 nuclear medicine experts from different centers evaluated all 718 patients. According to our expert consensus, if the assessments of any two evaluators differed by three or more points, it indicated the need for further discussion or reevaluation. To resolve such discrepancies, we consulted the opinions of three senior experts, each with over 20 years of experience, to reach a consensus. Their final evaluations were considered the ground truth for these images. The scores were then averaged to create the mean opinion score (MOS)[ 37 ]. Based on the MOS, we categorized image quality into five levels as follows: worst , [1, 1.5), poor , [1.5, 2.5), average , [2.5, 3.5), good [3.5, 4.5), excellent [4.5, 5]. Image Quality Scoring Model Construction The IQA scoring model applied the multi-dimensional attention network for reference-free image quality assessment [ 38 ] ( https://github.com/IIGROUP/MANIQA ), which combines a vision transformer (ViT) for feature extraction with a transposed attention block (TAB) and a scaled shifted-window transformer block (SSTB) to deploy the attention mechanism across both channels and spatial dimensions. In this multi-dimensional manner, the interconnected modules enhance the global and local interactions of features, thereby improving the accuracy of the scoring system (Fig. 2 b). 
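The MOS labeling rule described in the Subjective IQA section (averaging the evaluators' Likert scores, flagging images where any two evaluators differ by three or more points, and binning the MOS into five levels) can be sketched as follows. This is an illustrative sketch only; the function names and the example ratings are hypothetical, and the senior-expert consensus step is represented merely by a flag.

```python
def needs_review(scores, max_gap=3):
    """True if any two evaluators differ by max_gap or more points."""
    return max(scores) - min(scores) >= max_gap


def mean_opinion_score(scores):
    """Average of the individual Likert scores (1 to 5)."""
    return sum(scores) / len(scores)


def quality_category(mos):
    """Map a MOS to the five levels: [1, 1.5), [1.5, 2.5), [2.5, 3.5), [3.5, 4.5), [4.5, 5]."""
    if not 1 <= mos <= 5:
        raise ValueError("MOS must lie in [1, 5]")
    if mos < 1.5:
        return "worst"
    if mos < 2.5:
        return "poor"
    if mos < 3.5:
        return "average"
    if mos < 4.5:
        return "good"
    return "excellent"


# Hypothetical ratings from 15 evaluators for one slice.
ratings = [4, 4, 5, 4, 3, 4, 4, 5, 4, 4, 3, 4, 4, 4, 5]
flagged = needs_review(ratings)        # largest gap is 2 points, so no review
mos = mean_opinion_score(ratings)      # 61 / 15
label = quality_category(mos)
```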
The scoring model was trained to predict the quality of key slices using the dataset of images labeled through the subjective scoring procedure described above. Training on these expert-vetted labels is intended to keep the quality predictions aligned with the experts' standards. A detailed description of the IQA scoring model and the training parameters is provided in Supplementary Sections 1.4 (IQA Scoring Model) and 1.5 (Training Protocol). To reflect real-world clinical practice, the model outputs a decimal score between 1 and 5. The final quality category is determined by the same criteria as those used in the subjective image assessment.

Statistical Analysis

Statistical analysis was performed using IBM SPSS Statistics for Windows, version 25.0 (IBM Corp., Armonk, NY, USA). The Spearman correlation coefficient ( ρ )[ 39 ] and intraclass correlation coefficient (ICC), both commonly used metrics for assessing agreement among multiple raters, were calculated to evaluate the consistency of the image quality scores assigned by the nuclear medicine experts and those generated by the IQA model. The Spearman correlation coefficient quantifies the strength and direction of the association between the ranked scores from the two sources of evaluation (human experts and the IQA model). In contrast, the ICC measures the reliability of ratings or measurements across these evaluators when assessing the same quantity. The values of ρ and ICC were categorized as follows: [0.75, 1], excellent agreement; [0.60, 0.75), good agreement; [0.40, 0.60), fair agreement; and [0, 0.40), poor agreement. The test statistics were approximated using a normal distribution to calculate the 95% confidence interval (CI) and the p-value, with p < 0.05 considered statistically significant. 
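As an illustration of the rank-based agreement measure and the agreement categories defined above, the following sketch computes Spearman's ρ with average ranks for ties and maps a coefficient to its category. The study used SPSS; this pure-Python version and its example scores are hypothetical and only show the calculation. Treating values below 0 as "poor" is an assumption, since the stated intervals start at 0.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


def agreement_label(value):
    """Categories from the text; values below 0.40 (including negatives) map to 'poor'."""
    if value >= 0.75:
        return "excellent"
    if value >= 0.60:
        return "good"
    if value >= 0.40:
        return "fair"
    return "poor"


# Hypothetical expert MOS values and model scores for six slices.
expert_mos = [4.2, 3.8, 2.1, 3.0, 4.6, 1.4]
model_score = [4.0, 3.4, 2.5, 3.6, 4.8, 1.2]
rho = spearman_rho(expert_mos, model_score)
```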
Results

Our study developed a DL-based system tailored for automated, expert-level image quality assessment (IQA) of whole-body [18F]FDG PET/CT scans. By customizing algorithms to automatically select and localize seven predefined key slices across four anatomical regions, our model eliminates the dependency on handcrafted IQA metrics. This automated approach enables precise, consistent IQA that aligns with expert subjective evaluations, establishing a robust, scalable framework for multi-center clinical image quality standardization.

Dataset

The mean age of the 718 patients was 55.35 years (range 18–88 years), and the majority of patients (62.8%) were male. Patient characteristics, including demographics and image acquisition parameters, taken from the DICOM data are shown in Table 1. Given that the resolution of CT images was sufficient for anatomical localization, we focused on documenting the relevant parameters of the PET systems. Detailed numbers of images in the training, validation, and test datasets are provided in Supplementary Table S2.

IQA Model Performances

This deep learning-based IQA model predicts PET/CT image quality in accordance with physicians' subjective assessments, classifying images into five levels (from worst to excellent). Detailed performance metrics of the IQA model on the test set are shown in Table 2. For the PET dataset, the model achieved an average accuracy of 0.832 (SD = 0.058), while for the CT dataset, the average accuracy was 0.902 (SD = 0.064). Across anatomical regions, the accuracy relative to expert scoring ranged from 75.47% to 88.68% for the PET dataset and from 78.68% to 96.88% for the CT dataset. These results demonstrate that our model predicts the quality of PET and CT images with high accuracy. The confusion matrix analysis in Fig. 4 shows the distribution of the DL model's scoring outcomes in the PET and CT test sets for different regions. 
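The reported averages and SDs can be cross-checked against the per-region accuracies in Table 2. The short sketch below does this with the Python standard library; the values are copied from Table 2, and the use of the sample standard deviation is an assumption made here because it reproduces the reported figures, not something the text states.

```python
import statistics

# Per-region accuracies from Table 2 (basal ganglia, cerebellar vermis,
# aortic arch, tracheal carina, liver, pancreas, iliac bone).
ct_accuracy = [0.869, 0.787, 0.969, 0.875, 0.921, 0.969, 0.922]
pet_accuracy = [0.887, 0.887, 0.846, 0.818, 0.755, 0.755, 0.875]


def summarize(values):
    """Unweighted mean and sample SD across regions, rounded to 3 decimals."""
    return round(statistics.mean(values), 3), round(statistics.stdev(values), 3)


ct_mean, ct_sd = summarize(ct_accuracy)
pet_mean, pet_sd = summarize(pet_accuracy)
```

Under these assumptions the results match the reported 0.902 (SD = 0.064) for CT and 0.832 (SD = 0.058) for PET, which suggests that the averages are unweighted means over the seven key slices.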
Given the unevenness in the scoring distribution, ROC curves (Fig. 5 ) were employed to assess the discriminative capacity between high- and low-quality images (with high quality defined as good or excellent, and low quality as worst, poor, or average) across various regions. The PET images demonstrated an AUC of ≥ 0.88, signifying the robust performance of the IQA models. However, due to the inherent characteristics of head data, there were no high-quality PET images available, precluding the plotting of an ROC curve. Figure 6 demonstrates the detailed performance of recall, precision, and F1 score [ 40 – 44 ] which indicates the robustness of the IQA model. To validate the model's efficacy, we have calculated the Spearman correlation coefficients and ICCs comparing the model's predicted scores with the expert-derived MOS. These calculations confirmed the relatedness and consistency of the assessments, as detailed in Table 3 . Within the PET dataset, the model's scores demonstrated strong agreement with MOS, with Spearman coefficients (ρ) ranging from 0.765 to 0.915. The overall ρ across all PET regions reached 0.891. Similarly, in the CT dataset, the model's scores for key slices indicated good agreement with the scores from 15 experts, with correlation coefficients varying from 0.422 to 0.823, and an average of 0.624 for overall body parts. In both the PET and CT datasets, the model's scores exhibit moderate to excellent agreement with the assessments of experts (ρ for CT: 0.624, PET: 0.891, p < 0.001). 
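The high- versus low-quality ROC analysis can be sketched in a few lines. The example below binarizes expert MOS values at 3.5 (good or excellent versus the rest, following the category intervals given in the Methods) and computes the AUC as the probability that a high-quality image receives a higher model score than a low-quality one. How the authors computed their ROC curves is not stated, so the function names, the pairwise AUC formulation, and the example values are all assumptions.

```python
def binarize_quality(mos_values, threshold=3.5):
    """1 for high quality (good or excellent, MOS >= 3.5), 0 otherwise."""
    return [1 if m >= threshold else 0 for m in mos_values]


def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        # Mirrors the head PET case: with no high-quality images, no ROC curve exists.
        raise ValueError("AUC is undefined without both high- and low-quality images")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical expert MOS values and model scores for six slices.
expert_mos = [4.2, 3.8, 2.1, 3.0, 4.6, 1.4]
model_score = [4.0, 3.4, 2.5, 3.6, 4.8, 1.2]
labels = binarize_quality(expert_mos)
auc = roc_auc(labels, model_score)

# A region with only low-quality images yields no positive labels.
head_labels = binarize_quality([2.0, 3.0, 1.5])
```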
Table 3 Correlation of IQA model score with MOS across anatomical regions in PET/CT

Description | ρ (p-value), PET | ρ (p-value), CT | ICC (95% CI), PET | ICC (95% CI), CT
Basal ganglia | 0.776 (< 0.001) | 0.823 (< 0.001) | 0.873 (0.781–0.927) | 0.972 (0.953–0.983)
Cerebellar vermis | 0.765 (< 0.001) | 0.666 (< 0.001) | 0.874 (0.782–0.927) | 0.919 (0.865–0.951)
Aortic arch | 0.891 (< 0.001) | 0.422 (0.001) | 0.952 (0.917–0.973) | 0.758 (0.602–0.853)
Tracheal carina | 0.915 (< 0.001) | 0.441 (< 0.001) | 0.965 (0.940–0.979) | 0.540 (0.243–0.721)
Liver | 0.898 (< 0.001) | 0.664 (< 0.001) | 0.965 (0.940–0.980) | 0.779 (0.636–0.866)
Pancreas | 0.887 (< 0.001) | 0.719 (< 0.001) | 0.951 (0.915–0.972) | 0.831 (0.722–0.897)
Iliac bone | 0.886 (< 0.001) | 0.606 (< 0.001) | 0.956 (0.926–0.974) | 0.837 (0.731–0.901)
Average | 0.891 (< 0.001) | 0.624 (< 0.001) | 0.953 (0.942–0.962) | 0.920 (0.906–0.932)

Spearman correlation coefficient (ρ) and ICC quantifying the agreement between the IQA model score and the MOS from 15 evaluators. p < 0.05 was considered statistically significant; an ICC ≥ 0.75 suggests excellent reliability.

The ICC analysis revealed a comparable pattern: barring the tracheal carina in the CT dataset, which exhibited fair agreement (ICC = 0.54, 95% CI: 0.243–0.721), the model generally exhibited excellent agreement with MOS. The ICCs ranged from 0.758 to 0.972 (ICC ≥ 0.75) in the remaining areas of both datasets, indicating robust agreement. Notable concordance between the algorithm's output and the expert-derived scores (ICC for CT: 0.92, 95% CI: 0.906–0.932; ICC for PET: 0.953, 95% CI: 0.942–0.962) corroborated the credibility of the deep learning IQA system.

Discussion

We successfully developed an advanced DL-based system for the comprehensive, expert-level evaluation of whole-body [18F]FDG PET/CT images, laying a solid foundation for advancements in automated IQA. 
Our model was trained on a large clinical dataset of 3,517 CT and 3,430 PET images, using assessments from 15 experts as references, and demonstrated high accuracy in predicting image quality across multiple anatomical regions, including the brain, thorax, abdomen, and pelvis. Notably, this fully automated IQA system shows strong alignment with physician assessments, enhancing the interpretability of clinical images and highlighting its potential for consistent, expert-level evaluation in multi-center clinical settings. A validated IQA system may relieve physicians from repetitive tasks, enhance diagnostic confidence in clinical images, and promote the reproducibility of image-based applications [ 45 ]. DL algorithms have emerged as effective tools for image evaluation [ 46 , 47 ], and several PET studies have investigated the application of artificial intelligence to IQA. Hopson et al. [ 48 ] found that a pre-training strategy improved the performance of CNNs in predicting PET image quality. Reynés-Llompart et al. [ 49 ] analysed the correlation between image quality parameters and subjective scores for 112 PET scans, employing a radiomics-based machine learning model to predict the subjective scores. Zhang et al. [ 27 ] validated the feasibility of using a DL model to assess PET image quality by scoring 89 head-free three-dimensional PET images. Using manually established objective image metrics, Qi et al. [ 26 ] developed a CNN-based system to perform rapid IQA of PET images of the thoracic, abdominal, and pelvic regions. While AI has made significant advancements in PET IQA, several challenges remain unresolved. These include the need for large, high-quality, and diverse datasets for robust model training [ 50 ], the necessity of external validation [ 51 ] to establish model reliability, and the challenge of seamlessly integrating AI tools into clinical workflows. 
Overcoming these obstacles is essential to advancing the field and ensuring the clinical utility of AI-based IQA systems. Previous studies have proposed various AI-driven methods for the automated IQA of [18F]FDG PET images [ 13 , 26 , 27 , 52 ]. However, these methods often lack comprehensiveness, focusing either on maximum intensity projection (MIP) images [ 27 ] or manually selected axial slices that exclude specific regions, such as the head [ 26 ], which limits their clinical applicability and generalizability. Amini et al. [ 13 ] introduced a region-specific IQA framework that aimed to replicate human perceptual standards. However, this method was limited by its reliance on evaluations conducted by only two independent raters and the absence of a consensus-based scoring system, which may compromise label consistency and model robustness. Additionally, these studies employ datasets of limited size and diversity, failing to account for variations in imaging protocols, scanner types, or patient populations, which restricts the generalizability and reproducibility [ 53 ] of the resulting AI models. Our study addresses these challenges through the development of a novel, fully automated, DL-based IQA framework. Leveraging one of the largest known datasets of [18F]FDG PET/CT images, along with subjective evaluations by 15 nuclear medicine experts, we ensured higher reliability and robustness in data labeling. The dataset incorporates multi-center data, spanning a diverse range of imaging protocols and scanner types, thereby enhancing the generalizability of the proposed model. Furthermore, our framework employs advanced deep learning techniques to automatically identify key anatomical regions, such as the basal ganglia, cerebellar vermis, aortic arch, and iliac region, and applies region-specific evaluation criteria. 
This ensures a standardized and interpretable approach to IQA across the whole body, addressing both reproducibility and explainability in clinical practice. In our study, head movement variability and respiratory motion in the thorax led to inconsistent image quality across key slices within the same patient, making it difficult to evaluate the overall image quality accurately. To address this, we implemented a slice-level subjective evaluation, allowing for a more precise assessment of image quality across different anatomical regions. To avoid bias stemming from a limited training set size, we adopted a data collection strategy using multi-center clinical data, supervised and guided by domain experts. Recognizing the different and complex factors that affect PET and CT image quality, we innovatively developed an automated scoring system capable of reliably assessing the quality of both whole-body CT and PET images. Confusion matrix analysis (Fig. 4 ) showed very few misclassifications, indicating that the model is capable of robustly classifying image quality across different regions. The PET IQA models demonstrated excellent performance in distinguishing high-quality images, with AUC values ≥ 0.88 for thoracic and abdominal regions (Fig. 5 ). The model achieved up to 96.9% accuracy, 100% recall, 100% precision, and a 93.8% F-score (Fig. 6 ), surpassing or matching the performance of similar studies [ 26 , 27 , 41 – 44 ]. These metrics indicate a robust ability to identify high-quality images with minimal misclassifications. However, due to dataset imbalance, the lack of high-quality head PET images limited the evaluation in this region. Our benchmark was based on visual diagnostic standards of expert physicians, and the consistency between the model and expert evaluations (Table 3 ) underscores its ability to capture subtle subjective perceptions of image quality. 
Although challenges remain in interpreting PET/CT scans because of inter- and intra-observer variability, our subjective assessment involving 15 experts helped mitigate potential biases from individual evaluators and provided a more reliable and robust assessment, which marks a significant advancement in the field. In addition to addressing dataset diversity and annotation practices, our study adheres strictly to the CLAIM guidelines [54], as outlined in Table S3, to enhance reproducibility and ensure compliance with established standards for AI research in medical imaging. Unlike prior studies, which often lack clarity in reporting dataset characteristics and evaluator scoring standards [53], our work combines advanced AI techniques with well-defined, reproducible, expert-driven evaluation methodologies, supporting both the integrity of the research process and the credibility of the results. This approach not only improves research transparency but also enhances clinical applicability, facilitating the integration of the proposed system into diverse clinical environments.

In this proof-of-concept study, we introduced a novel approach to IQA for PET/CT and demonstrated the feasibility of using DL in clinical settings. The individual modules of this IQA system can be customized independently and provide functionalities such as automatic segmentation and selection of key slices, cross-modality matching, and three-dimensional alignment, which serve as powerful tools for clinical research.

Limitations
Although our DL model shows promising results, several limitations should be acknowledged. First, our initial work involved the automated identification and scoring of seven axial slices in PET and CT images; however, other slices may also hold significant evaluative importance.
Deficiencies in the localization algorithm could cause slight misalignments of these key slices, potentially leading to an imbalanced training dataset and affecting the performance of the system in clinical settings. Moreover, the model's predicted scores agreed more closely with expert ratings for PET images than for CT images. To address this limitation, future work should focus on improving the accuracy of the algorithm. Second, even with well-defined IQA criteria, discrepancies in the subjective evaluation of PET/CT scans remain because of differences in reader experience and expertise. To address this, we used senior experts' image quality annotations as the ground truth. However, this expert consensus approach may introduce new biases. Future work should involve a larger number of senior specialists for further validation. Third, although data from multiple institutions were used, the performance of the IQA system may still be influenced by the quality and diversity of the training data. Additionally, our method has not yet been tested in real-world clinical settings. To advance this work, it is crucial to expand the training dataset and test the model in diverse clinical settings.

Conclusion
This study developed a foundational, automated IQA system that provides a reliable, expert-level assessment of whole-body [18F]FDG PET/CT images. The system's versatility would allow for automated selection of high-quality clinical images from multi-center datasets and provide customizable support for the development of image-based models.

Declarations

Supplementary Information
Supplementary Material 1.

Data availability
The data from this study are available from the corresponding author on reasonable request.

Acknowledgments
We are very grateful to the Shanghai Universal Medical Imaging Diagnostic Center for their assistance in this research.
We thank Liwen Bianji (Edanz) (www.liwenbianji.cn/ac) for editing the language of a draft of this manuscript.

Funding
This work was supported by the Shanghai Central Government-led Local Development Fund (Project No.: YDZX20223100003001).

Authors and Affiliations
Department of Nuclear Medicine, The First Medical Center of Chinese PLA General Hospital, Beijing, China: Cong Zhang, Ruimin Wang, Jiahe Tian
Shanghai Universal Medical Imaging Diagnostic Center, Shanghai, China: Xin Gao, Gang Feng
Department of Scientific Research, Shanghai Aitrox Technology Corporation Limited, Shanghai, China: Xuebin Zheng, Jun Xie, Yunchao Bao, Pengchen Gu, Chuan He

Authors' contributions
CZ: conceptualization, formal analysis, investigation, project administration, writing of the original draft, and visualization. XG: funding, resources, and project administration. XZ: project administration, validation, and investigation. JX: data processing and editing. GF: data processing. YB: validation and investigation. PG: validation and investigation. YZ: resources and data processing. CH: resources and data processing. RW: supervision. JT: review and supervision. All authors read and approved the final manuscript.

Ethics declarations

Competing interests
All authors have no conflicts of interest to report.

Ethics approval and consent to participate
All procedures performed in studies involving human participants were in accordance with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. The study was carried out in compliance with the CLAIM guidelines. This study was approved by the Clinical Research Ethics Committee of the Chinese PLA General Hospital (approval number: S2024-029-01) and the Ethics Committee of Universal Medical Imaging Diagnostic Center (approval numbers: SHQJ-2022-08, HZQJ-2022-08, GZQJ-2022-08). Written informed consent was waived owing to the retrospective design of the study.

Consent for publication
Not applicable.
References
1. Pinker K, Riedl C, Weber WA. Evaluating tumor response with FDG PET: updates on PERCIST, comparison with EORTC criteria and clues to future developments. Eur J Nucl Med Mol Imaging. 2017;44(Suppl 1):55–66. https://doi.org/10.1007/s00259-017-3687-3.
2. Leveque L, Outtas M, Liu H, Zhang L. Comparative study of the methodologies used for subjective medical image quality assessment. Phys Med Biol. 2021;66(15). https://doi.org/10.1088/1361-6560/ac1157.
3. Chow LS, Paramesran R. Review of medical image quality assessment. Biomedical Signal Processing and Control. 2016;27:145–54.
4. Syed AB, Zoga AC. Artificial Intelligence in Radiology: Current Technology and Future Directions. Semin Musculoskelet Radiol. 2018;22(5):540–5. https://doi.org/10.1055/s-0038-1673383.
5. Schmidhuber J. Deep learning in neural networks: an overview. Neural Netw. 2015;61:85–117. https://doi.org/10.1016/j.neunet.2014.09.003.
6. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436–44.
7. Chen X, Wang X, Zhang K, et al. Recent advances and clinical applications of deep learning in medical image analysis. Med Image Anal. 2022;79:102444. https://doi.org/10.1016/j.media.2022.102444.
8. Atasever S, Azginoglu N, Terzi DS, Terzi R. A comprehensive survey of deep learning research on medical image analysis with focus on transfer learning. Clin Imaging. 2023;94:18–41. https://doi.org/10.1016/j.clinimag.2022.11.003.
9. Tajbakhsh N, Jeyaseelan L, Li Q, Chiang JN, Wu Z, Ding X. Embracing imperfect datasets: A review of deep learning solutions for medical image segmentation. Med Image Anal. 2020;63:101693. https://doi.org/10.1016/j.media.2020.101693.
10. El-Shafai W, El-Nabi SA, El-Rabaie E-SM, et al. Efficient Deep-Learning-Based Autoencoder Denoising Approach for Medical Image Diagnosis. Computers Mater Continua. 2022;70(3).
11. Loft M, Ladefoged CN, Johnbeck CB, et al. An Investigation of Lesion Detection Accuracy for Artificial Intelligence-Based Denoising of Low-Dose (64)Cu-DOTATATE PET Imaging in Patients with Neuroendocrine Neoplasms. J Nucl Med. 2023;64(6):951–9. https://doi.org/10.2967/jnumed.122.264826.
12. Kaur A, Dong G. A complete review on image denoising techniques for medical images. Neural Process Lett. 2023;55(6):7807–50.
13. Amini M, Salimi Y, Hajianfar G, et al. Fully Automated Region-Specific Human-Perceptive-Equivalent Image Quality Assessment: Application to 18F-FDG PET Scans. Clin Nucl Med. 2024;49(12):1079–90.
14. Yang J, Lyu M, Qi Z, Shi Y. Deep learning based image quality assessment: A survey. Procedia Comput Sci. 2023;221:1000–5.
15. Litjens G, Kooi T, Bejnordi BE, et al. A survey on deep learning in medical image analysis. Med Image Anal. 2017;42:60–88. https://doi.org/10.1016/j.media.2017.07.005.
16. Wang Z, Bovik AC, Sheikh HR, Simoncelli EP. Image quality assessment: from error visibility to structural similarity. IEEE Trans Image Process. 2004;13(4):600–12. https://doi.org/10.1109/tip.2003.819861.
17. Sui X, Tan H, Yu H, et al. Exploration of the total-body PET/CT reconstruction protocol with ultra-low 18F-FDG activity over a wide range of patient body mass indices. EJNMMI Phys. 2022;9(1):17.
18. Ukita J, Yoshida T, Ohki K. Characterisation of nonlinear receptive fields of visual neurons by convolutional neural network. Sci Rep. 2019;9(1):3791. https://doi.org/10.1038/s41598-019-40535-4.
19. Cheng Y, Abadi E, Smith TB, et al. Validation of algorithmic CT image quality metrics with preferences of radiologists. Med Phys. 2019;46(11):4837–46. https://doi.org/10.1002/mp.13795.
20. Sartoretti T, Skawran S, Gennari AG, et al. Fully automated computational measurement of noise in positron emission tomography. Eur Radiol. 2024;34(3):1716–23. https://doi.org/10.1007/s00330-023-10056-w.
21. Nikiforaki K, Karatzanis I, Dovrou A, et al. Image Quality Assessment Tool for Conventional and Dynamic Magnetic Resonance Imaging Acquisitions. J Imaging. 2024;10(5). https://doi.org/10.3390/jimaging10050115.
22. Hoeijmakers EJI, Martens B, Hendriks BMF, et al. How subjective CT image quality assessment becomes surprisingly reliable: pairwise comparisons instead of Likert scale. Eur Radiol. 2024;34(7):4494–503. https://doi.org/10.1007/s00330-023-10493-7.
23. Lin W, Hasenstab K, Moura Cunha G, Schwartzman A. Comparison of handcrafted features and convolutional neural networks for liver MR image adequacy assessment. Sci Rep. 2020;10(1):20336. https://doi.org/10.1038/s41598-020-77264-y.
24. Piccini D, Demesmaeker R, Heerfordt J, et al. Deep Learning to Automate Reference-Free Image Quality Assessment of Whole-Heart MR Images. Radiol Artif Intell. 2020;2(3):e190123. https://doi.org/10.1148/ryai.2020190123.
25. Esses SJ, Lu X, Zhao T, et al. Automated image quality evaluation of T(2)-weighted liver MRI utilizing deep learning architecture. J Magn Reson Imaging. 2018;47(3):723–8. https://doi.org/10.1002/jmri.25779.
26. Qi C, Wang S, Yu H, et al. An artificial intelligence-driven image quality assessment system for whole-body [(18)F]FDG PET/CT. Eur J Nucl Med Mol Imaging. 2023;50(5):1318–28. https://doi.org/10.1007/s00259-022-06078-z.
27. Zhang H, Liu Y, Wang Y, et al. Deep learning model for automatic image quality assessment in PET. BMC Med Imaging. 2023;23(1):75. https://doi.org/10.1186/s12880-023-01017-2.
28. Afshar P, Mohammadi A, Plataniotis KN, Oikonomou A, Benali H. From handcrafted to deep-learning-based cancer radiomics: challenges and opportunities. IEEE Signal Process Mag. 2019;36(4):132–60.
29. Rodrigues R, Lévêque L, Gutiérrez J, et al. Objective quality assessment of medical images and videos: Review and challenges. Multimedia Tools Appl. 2024:1–34.
30. Kim J, Nguyen AD, Lee S. Deep CNN-Based Blind Image Quality Predictor. IEEE Trans Neural Netw Learn Syst. 2019;30(1):11–24. https://doi.org/10.1109/TNNLS.2018.2829819.
31. Thanki R, Borra S, Dey N, Ashour AS. Medical imaging and its objective quality assessment: an introduction. Classification in BioApps: Automation of Decision Making. 2018:3–32.
32. Zheng X, Jing B, Zhao Z, et al. An interpretable deep learning model for identifying the morphological characteristics of dMMR/MSI-H gastric cancer. iScience. 2024;27(3):109243. https://doi.org/10.1016/j.isci.2024.109243.
33. Wasserthal J, Breit HC, Meyer MT, et al. TotalSegmentator: Robust Segmentation of 104 Anatomic Structures in CT Images. Radiol Artif Intell. 2023;5(5):e230024. https://doi.org/10.1148/ryai.230024.
34. Isensee F, Petersen J, Klein A, et al. nnU-Net: Self-adapting framework for U-Net-based medical image segmentation. arXiv preprint arXiv:1809.10486. 2018.
35. Zelinsky A et al. Learning OpenCV: Computer vision with the OpenCV library (Bradski GR; 2008) [On the Shelf]. IEEE Robotics & Automation Magazine. 2009;16(3):100.
36. ITU-R BT.500-15. Methodologies for the subjective assessment of the quality of television images. International Telecommunication Union; 2012. https://www.itu.int/rec/R-REC-BT.500/en
37. Obuchowicz R, Oszust M, Piorkowski A. Interobserver variability in quality assessment of magnetic resonance images. BMC Med Imaging. 2020;20(1):109. https://doi.org/10.1186/s12880-020-00505-z.
38. Yang S, Wu T, Shi S, et al., editors. MANIQA: Multi-dimension attention network for no-reference image quality assessment. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2022.
39. Dodge Y. The concise encyclopedia of statistics. Springer Science & Business Media; 2008.
40. Mukhlif AA, Al-Khateeb B, Mohammed MA. An extensive review of state-of-the-art transfer learning techniques used in medical imaging: Open issues and challenges. J Intell Syst. 2022;31(1):1085–111.
41. Nesamani SL, Rajini SNS. Breast Cancer Detection with Transfer Learning Technique in Convolutional Neural Networks. Des Eng. 2021:11102–9.
42. Kassem MA, Hosny KM, Fouad MM. Skin lesions classification into eight classes for ISIC 2019 using deep convolutional neural network and transfer learning. IEEE Access. 2020;8:114822–32.
43. Saber A, Sakr M, Abo-Seida OM, Keshk A, Chen H. A novel deep-learning model for automatic detection and classification of breast cancer using the transfer-learning technique. IEEE Access. 2021;9:71194–209.
44. Wang SH, Xie S, Chen X, et al. Alcoholism Identification Based on an AlexNet Transfer Learning Model. Front Psychiatry. 2019;10:205. https://doi.org/10.3389/fpsyt.2019.00205.
45. Bradshaw TJ, Boellaard R, Dutta J, et al. Nuclear Medicine and Artificial Intelligence: Best Practices for Algorithm Development. J Nucl Med. 2022;63(4):500–10. https://doi.org/10.2967/jnumed.121.262567.
46. Kang L, Ye P, Li Y, Doermann D, editors. Convolutional neural networks for no-reference image quality assessment. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; 2014.
47. Kim J, Nguyen A-D, Lee S. Deep CNN-based blind image quality predictor. IEEE Trans Neural Netw Learn Syst. 2018;30(1):11–24.
48. Hopson JB, Neji R, Dunn JT, et al. Pre-training via Transfer Learning and Pretext Learning a Convolutional Neural Network for Automated Assessments of Clinical PET Image Quality. IEEE Trans Radiat Plasma Med Sci. 2023;7(4):372–81. https://doi.org/10.1109/TRPMS.2022.3231702.
49. Reynes-Llompart G, Sabate-Llobera A, Llinares-Tello E, Marti-Climent JM, Gamez-Cenzano C. Image quality evaluation in a modern PET system: impact of new reconstructions methods and a radiomics approach. Sci Rep. 2019;9(1):10640. https://doi.org/10.1038/s41598-019-46937-8.
50. Binuya MAE, Engelhardt EG, Schats W, Schmidt MK, Steyerberg EW. Methodological guidance for the evaluation and updating of clinical prediction models: a systematic review. BMC Med Res Methodol. 2022;22(1):316. https://doi.org/10.1186/s12874-022-01801-8.
51. Yu AC, Mohajer B, Eng J. External Validation of Deep Learning Algorithms for Radiologic Diagnosis: A Systematic Review. Radiol Artif Intell. 2022;4(3):e210064. https://doi.org/10.1148/ryai.210064.
52. Schwyzer M, Skawran S, Gennari AG, et al. Automated F18-FDG PET/CT image quality assessment using deep neural networks on a latest 6-ring digital detector system. Sci Rep. 2023;13(1):11332. https://doi.org/10.1038/s41598-023-37182-1.
53. Moassefi M, Rouzrokh P, Conte GM, et al. Reproducibility of Deep Learning Algorithms Developed for Medical Imaging Analysis: A Systematic Review. J Digit Imaging. 2023;36(5):2306–12. https://doi.org/10.1007/s10278-023-00870-5.
54. Mongan J, Moy L, Kahn CE Jr. Checklist for Artificial Intelligence in Medical Imaging (CLAIM): A Guide for Authors and Reviewers. Radiol Artif Intell. 2020;2(2):e200029. https://doi.org/10.1148/ryai.2020200029.

Supplementary Files
SupplementaryMaterials0121.docx
Pointtopointresponse3.20.docx

Status: Published. Journal publication published 18 Apr, 2025 (EJNMMI Research).
Editorial decision: Accept, 05 Apr, 2025
Reviewers agreed at journal, 24 Mar, 2025
Reviewers invited by journal, 21 Mar, 2025
Editor assigned by journal, 21 Mar, 2025
First submitted to journal, 20 Mar, 2025
{"props":{"pageProps":{"initialData":{"identity":"rs-5559102","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":432275067,"identity":"a42c662f-b2ec-47b3-8634-a905ef637a6b","order_by":0,"name":"Cong Zhang","email":"","orcid":"","institution":"Chinese PLA General Hospital","correspondingAuthor":false,"prefix":"","firstName":"Cong","middleName":"","lastName":"Zhang","suffix":""},{"id":432275068,"identity":"2101f510-8874-4f31-b65d-d2c84299867a","order_by":1,"name":"Xin Gao","email":"","orcid":"","institution":"Shanghai Universal Medical Imaging Diagnostic Center","correspondingAuthor":false,"prefix":"","firstName":"Xin","middleName":"","lastName":"Gao","suffix":""},{"id":432275069,"identity":"826dacd3-8144-4ba0-ae83-e211f11c93f5","order_by":2,"name":"Xuebin Zheng","email":"","orcid":"","institution":"Shanghai Aitrox Technology Corporation limited","correspondingAuthor":false,"prefix":"","firstName":"Xuebin","middleName":"","lastName":"Zheng","suffix":""},{"id":432275070,"identity":"f71aff3f-dd4e-4c5d-aee4-44378cfea4fc","order_by":3,"name":"Jun Xie","email":"","orcid":"","institution":"Shanghai Aitrox Technology Corporation Limited","correspondingAuthor":false,"prefix":"","firstName":"Jun","middleName":"","lastName":"Xie","suffix":""},{"id":432275071,"identity":"988b3d81-0fd1-4f2a-9936-5f5161b33aaf","order_by":4,"name":"Gang Feng","email":"","orcid":"","institution":"Shanghai Universal Medical Imaging Diagnostic 
Center","correspondingAuthor":false,"prefix":"","firstName":"Gang","middleName":"","lastName":"Feng","suffix":""},{"id":432275072,"identity":"c9c4b9d1-dbe5-422c-8a0f-ed001d75cf49","order_by":5,"name":"Yunchao Bao","email":"","orcid":"","institution":"Shanghai Aitrox Technology Corporation Limited","correspondingAuthor":false,"prefix":"","firstName":"Yunchao","middleName":"","lastName":"Bao","suffix":""},{"id":432275073,"identity":"bcfde4e6-2af6-4694-b3b7-d944e588cbcc","order_by":6,"name":"Pengchen Gu","email":"","orcid":"","institution":"Shanghai Aitrox Technology Corporation Limited","correspondingAuthor":false,"prefix":"","firstName":"Pengchen","middleName":"","lastName":"Gu","suffix":""},{"id":432275074,"identity":"e71f6dc9-e290-4708-ac48-5631f7766ac2","order_by":7,"name":"Chuan He","email":"","orcid":"","institution":"Shanghai Aitrox Technology Corporation Limited","correspondingAuthor":false,"prefix":"","firstName":"Chuan","middleName":"","lastName":"He","suffix":""},{"id":432275075,"identity":"8511690a-ae0d-414a-9b33-bde138a4aa64","order_by":8,"name":"Ruimin Wang","email":"","orcid":"","institution":"Chinese PLA General Hospital","correspondingAuthor":false,"prefix":"","firstName":"Ruimin","middleName":"","lastName":"Wang","suffix":""},{"id":432275076,"identity":"ba1faa63-a215-4615-883f-576a348e57d5","order_by":9,"name":"Jiahe Tian","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAA5klEQVRIie3PsWrDMBCA4TMCZRHJeqHBfgUFQ+nghzkt2goZXQjEhmAPTenara/QseMFgbuou8dk79BsHUpJ9gTb2Trom+/n7gCC4B+SyQcz5lmcjJ4PO8qX/ckYleE7b9P5hlO9801/EiOk/FA5U7R0O92vxYDDbgriVtqoLMjmppAwqR+pO5ltefuiMjECblrzPgP0n2/dCRA5RCujsqxa4yVovO9NtPvVToETcmEqMSBB0oynRdBICcMS5YmRrZ5vlEDyjer9Jalr9z39y1avyVd0+MmX8aR+6k7OqOvGgyAIgouO3NBN4h/jb+IAAAAASUVORK5CYII=","orcid":"","institution":"Chinese PLA General 
Hospital","correspondingAuthor":true,"prefix":"","firstName":"Jiahe","middleName":"","lastName":"Tian","suffix":""}],"badges":[],"createdAt":"2024-12-01 15:12:00","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-5559102/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-5559102/v1","draftVersion":[],"editorialEvents":[{"content":"https://doi.org/10.1186/s13550-025-01238-2","type":"published","date":"2025-04-18T15:57:54+00:00"}],"editorialNote":"","failedWorkflow":false,"files":[{"id":79256069,"identity":"d1b8ae1e-61e2-471b-b416-5d4e7504f793","added_by":"auto","created_at":"2025-03-26 08:55:59","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":106008,"visible":true,"origin":"","legend":"\u003cp\u003eFlow diagram of PET/CT data collection and dataset division\u003c/p\u003e\n\u003cp\u003ePET/CT data were collected from four medical centers, with \u003cem\u003en\u003c/em\u003e representing the number of patients. After data cleaning and processing, PET and CT images were separately divided into training, validation, and testing sets.\u003c/p\u003e","description":"","filename":"OnlineFig1.png","url":"https://assets-eu.researchsquare.com/files/rs-5559102/v1/2277d82dd86f5e3b56eb4c19.png"},{"id":79258738,"identity":"501d5fab-5c3e-47c7-9734-03db182d4d50","added_by":"auto","created_at":"2025-03-26 09:11:59","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":257940,"visible":true,"origin":"","legend":"\u003cp\u003eOverall workflow of the deep learning-driven IQA system\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003ea\u003c/strong\u003e The process involves segmenting organs on CT images and locating key slices, as well as using algorithms to identify corresponding slices in PET images.\u003cstrong\u003e b\u003c/strong\u003eProcedure for developing the IQA system based on the subjective scores of 
physicians.\u003c/p\u003e","description":"","filename":"OnlineFig2.png","url":"https://assets-eu.researchsquare.com/files/rs-5559102/v1/b939a2dae0c21e302220b863.png"},{"id":79258739,"identity":"5df7baab-ec42-4eee-b63c-da61c1119b7b","added_by":"auto","created_at":"2025-03-26 09:11:59","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":302850,"visible":true,"origin":"","legend":"\u003cp\u003eSuggested PET examples of the 5-point Likert scale\u003c/p\u003e\n\u003cp\u003eExamples of PET images (not included in the testing dataset) ranging from excellent (score 5) to worst (score 1) quality were provided. Since a perfectly flawless image is not achievable in real-world conditions, these examples were solely used as reference images for education. For head images, we set an initial tracer concentration of 15 g/mL, and for other images, the concentration was set to 7 g/mL, allowing physicians to make slight adjustments according to their diagnostic preferences.\u003c/p\u003e","description":"","filename":"OnlineFig3.png","url":"https://assets-eu.researchsquare.com/files/rs-5559102/v1/c831497cb142deaaa017ca4d.png"},{"id":79256078,"identity":"bebab557-6e3d-40cd-888c-ad452130d156","added_by":"auto","created_at":"2025-03-26 08:55:59","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":272106,"visible":true,"origin":"","legend":"\u003cp\u003eConfusion matrix analysis for different regions\u003c/p\u003e\n\u003cp\u003eThe horizontal and vertical axes give the Likert scores given by the physicians. A diagonal matrix would indicate complete agreement. 
\u003cstrong\u003eW\u003c/strong\u003e: worst ; \u003cstrong\u003eP\u003c/strong\u003e: poor; \u003cstrong\u003eA\u003c/strong\u003e: average; \u003cstrong\u003eG\u003c/strong\u003e: good; \u003cstrong\u003eE\u003c/strong\u003e: excellent.\u003c/p\u003e","description":"","filename":"OnlineFig4.png","url":"https://assets-eu.researchsquare.com/files/rs-5559102/v1/2f8f4df579b6d385cf63f8ba.png"},{"id":79257215,"identity":"f61929ea-4278-446a-b47d-66393c3db952","added_by":"auto","created_at":"2025-03-26 09:03:59","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":193045,"visible":true,"origin":"","legend":"\u003cp\u003e\u003ca href=\"https://developer.baidu.com/article/details/3323193\" target=\"https://cn.bing.com/_blank\"\u003eReceiver Operating Characteristic (ROC) Curve\u003c/a\u003es for different regions in classifying high quality/poor quality images\u003c/p\u003e","description":"","filename":"Onlinefig5.png","url":"https://assets-eu.researchsquare.com/files/rs-5559102/v1/26327fa7ae58101f8bb62e54.png"},{"id":79256079,"identity":"7c2aa0d6-fed5-4993-bf02-bd8eab13434a","added_by":"auto","created_at":"2025-03-26 08:55:59","extension":"png","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":102627,"visible":true,"origin":"","legend":"\u003cp\u003eConfusion matrix analysis of the IQA model performance\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003ea\u003c/strong\u003e Recall, precision, and F1 score of the IQA model for PET images, \u003cstrong\u003eb\u003c/strong\u003e Recall, precision, and F1 score of the IQA model for CT images.\u003c/p\u003e","description":"","filename":"Onlinefig6.png","url":"https://assets-eu.researchsquare.com/files/rs-5559102/v1/77758f3864c39e47782589d8.png"},{"id":81051498,"identity":"05fc0f44-5d91-4fc5-b22f-58c5bec6f552","added_by":"auto","created_at":"2025-04-21 
16:10:49","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":2957604,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-5559102/v1/e8f04174-534d-4290-9aa2-fcc878ff6b83.pdf"},{"id":79256070,"identity":"3a926a15-5751-4a09-a6d5-c1cc9bf05433","added_by":"auto","created_at":"2025-03-26 08:55:59","extension":"docx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":3198431,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryMaterials0121.docx","url":"https://assets-eu.researchsquare.com/files/rs-5559102/v1/214d71c639feafda8d0b6cec.docx"},{"id":79256071,"identity":"5a76cfc2-8bf4-440a-8fc3-664ca52bbac2","added_by":"auto","created_at":"2025-03-26 08:55:59","extension":"docx","order_by":2,"title":"","display":"","copyAsset":false,"role":"supplement","size":27770,"visible":true,"origin":"","legend":"","description":"","filename":"Pointtopointresponse3.20.docx","url":"https://assets-eu.researchsquare.com/files/rs-5559102/v1/f0420f9e8f3ab7fbe6239270.docx"}],"financialInterests":"","formattedTitle":"A Fully Automated, Expert-Perceptive Image Quality Assessment System for Whole-Body [18F]FDG PET/CT","fulltext":[{"header":"Background","content":"\u003cp\u003e[18F]fluorodeoxyglucose positron emission tomography/computed tomography ([18F]FDG PET/CT) is an indispensable non-invasive imaging tool for detecting and monitoring oncologic diseases [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e]. The quality of PET/CT images directly influences clinical decision-making and has significant implications for patient outcomes. However, as medical centers accumulate vast PET/CT datasets daily, evaluating clinical PET/CT image quality on a multicenter scale has become a core goal in quality management within nuclear medicine. 
Traditional image quality assessment (IQA)[\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e, \u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e] methods, which rely on subjective evaluations by nuclear medicine experts, face substantial obstacles, including inconsistent evaluation standards and inefficiency when managing large clinical datasets. This emphasizes the urgent need for objective, widely accepted artificial intelligence (AI) tools to support multicenter PET/CT IQA. Recent advances in computational power have enabled promising AI-driven solutions [\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e], with deep learning (DL)[\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e, \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e] demonstrating notable capabilities in extracting relevant patterns from large datasets [\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e]. DL has already achieved impressive results in medical image analysis tasks, such as reconstruction, denoising, registration, segmentation, and modeling [\u003cspan additionalcitationids=\"CR9 CR10 CR11\" citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e]. However, significant challenges remain in applying DL to IQA [\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e, \u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e]. DL applications rely extensively on curated datasets, but acquiring high-quality annotated data is labor-intensive and costly [\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eIn IQA, objective and subjective methods have traditionally been applied. 
Objective metrics, such as mean squared error, signal-to-noise ratio, and structural similarity index [\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e], provide quick and repeatable evaluations. However, due to the non-linear nature of human visual processing [\u003cspan additionalcitationids=\"CR18 CR19\" citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e], these content-independent objective metrics often fail to fully capture radiologists' diagnostic perceptions. Consequently, subjective assessments, particularly those from experienced radiologists, are considered more reliable for evaluating medical image quality [\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e]. Nonetheless, subjective assessments are limited by inconsistent evaluation criteria, variations in physician expertise, evaluator fatigue, and environmental factors, challenging their reliability and reproducibility across clinical settings [\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eMultiple factors, both equipment-related and human-related, contribute to the overall quality of clinical PET/CT images. Equipment-related factors, including device hardware configurations and image processing software within the acquisition pipeline, significantly affect image quality. Human-related variables, such as operator expertise, injection dosage, drug leakage and patient-specific variables (e.g., glucose levels, movement-induced artifacts), further complicate accurate image assessment. These factors make it challenging to evaluate one aspect without compromising others. Together, these factors make subjective evaluation of clinical PET/CT images a complex task. 
DL offers a promising solution for emulating expert-level evaluations, with its ability to capture subtle image features and adapt to complex visual contexts.\u003c/p\u003e \u003cp\u003eRecent studies have focused on developing automated methods for medical image quality evaluation. For example, Lin et al. [\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e] found that handcrafted features were valuable in data-limited settings, but combining these features with convolutional neural networks (CNNs) achieved the best performance. Additionally, DL-based models are now commonly applied to evaluate the quality of whole-heart [\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e], liver [\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e] magnetic resonance images, and PET scans [\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e, \u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e], often outperforming human observers.\u003c/p\u003e \u003cp\u003eAlthough previous studies have shown promise, most region-specific approaches typically rely on handcrafted features to create objective metrics [\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e, \u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e] or explore correlations with subjective human perception [\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e, \u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e]. Additionally, these methods are often constrained by limited datasets and small numbers of evaluators, making them applicable only in specific settings and limiting their acceptance as unbiased, broadly applicable IQA tools for PET/CT images. 
Because handcrafted metrics are inefficient and image quality varies across anatomical regions, whole-body PET/CT presents unique challenges that limit the efficacy of single-model assessments.\u003c/p\u003e \u003cp\u003eTo address these challenges, we trained our model on a large clinical dataset with assessments from 15 experts as references (the largest dataset available to date). Our study introduces a fully automated, DL-based IQA framework for whole-body PET/CT images. This framework offers: (1) customizable algorithms for automatic slice selection across diverse regions, establishing a standardized foundation for multicenter IQA; and (2) expert-level modeling of subjective evaluations by integrating expert perceptual assessments with DL capabilities, enabling automated quality assessment. Our study aims to support standardized IQA for multicenter collaborations and advance high-quality, image-based nuclear medicine research in the future.\u003c/p\u003e"},{"header":"Materials and methods","content":"\u003cp\u003e\u003cstrong\u003eEthics approval\u003c/strong\u003e for this retrospective analysis was obtained from the relevant institutional ethics committees, with a waiver of informed consent.\u003c/p\u003e \u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003eStudy design and data acquisition\u003c/h2\u003e \u003cp\u003eWe retrospectively obtained DICOM data from 718 patients who underwent whole-body [18F]FDG PET/CT scans across four institutions. All PET/CT scanners underwent routine quality assurance (QA) and quality control (QC) procedures, with their system characteristics detailed in Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e. These QA/QC programs ensured that scanner performance met the standards required for multi-center [18F]FDG PET/CT studies. The inclusion criteria were: (a) whole-body [18F]FDG PET/CT scans (from head to mid-thigh), and (b) complete visualization of all major organs.
The exclusion criteria are as follows: (a) incomplete or corrupted PET/CT scan data, with image loss or distortion due to data errors; (b) missing organ data or misalignment between CT and PET data, where the distance between corresponding key slices in the two modalities is unequal, preventing identification of the key slices; (c) severe artifacts or patient motion.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003ePatient Demographics and Image Acquisition Parameters for Data from Different Centers\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"5\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eCharacteristics\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003ePLAGH\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eSHCenter\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eHZCenter\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eGZCenter\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003e(n\u0026thinsp;=\u0026thinsp;104)\u003c/p\u003e \u003c/th\u003e \u003cth 
align=\"left\" colname=\"c3\"\u003e \u003cp\u003e(n\u0026thinsp;=\u0026thinsp;428)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003e(n\u0026thinsp;=\u0026thinsp;142)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003e(n\u0026thinsp;=\u0026thinsp;44)\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eAge (years)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e57.16\u0026thinsp;\u0026plusmn;\u0026thinsp;12.10\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e51.30\u0026thinsp;\u0026plusmn;\u0026thinsp;12.39\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e64.38\u0026thinsp;\u0026plusmn;\u0026thinsp;10.82\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e61.27\u0026thinsp;\u0026plusmn;\u0026thinsp;10.56\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSex\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMale\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e81\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e249\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e93\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e28\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e 
\u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eFemale\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e23\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e179\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e49\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e16\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eBMI (kg/m\u003csup\u003e2\u003c/sup\u003e)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e24.70\u0026thinsp;\u0026plusmn;\u0026thinsp;3.66\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e23.60\u0026thinsp;\u0026plusmn;\u0026thinsp;3.99\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e23.19\u0026thinsp;\u0026plusmn;\u0026thinsp;3.60\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e21.95\u0026thinsp;\u0026plusmn;\u0026thinsp;3.85\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eDevices\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eUnited Imaging (uMI 510, uEXPLORER*); Siemens (Biograph Vision 600); GE (Discovery 710)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eSiemens (Biograph mCT)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eGE (Discovery 710)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eSiemens (Biograph mCT)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003ePET Matrix\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e
\u003cp\u003e192\u0026times;192\u0026amp;(head)128\u0026times;128;192\u0026times;192\u0026amp;\u003c/p\u003e \u003cp\u003e(head)256\u0026times;256*; 440\u0026times;440;192\u0026times;192\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e200\u0026times;200\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e192\u0026times;192\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e200\u0026times;200\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003ePET Slice thickness, mm\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e2.44, 2.886*; 3; 5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e5\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003ePET Recon method\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eOSEM; True X\u0026thinsp;+\u0026thinsp;TOF; Vue Point HD/FX\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eTrue X\u0026thinsp;+\u0026thinsp;TOF\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eVue Point HD/FX\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eTrue X\u0026thinsp;+\u0026thinsp;TOF\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eGaussian filter, FWHM, mm\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e3.0,3.0*;5.0\u0026amp;(head)2.0;3.0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" 
colname=\"c3\"\u003e \u003cp\u003e5.0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e3.0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e5.0\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003ctfoot\u003e \u003ctr\u003e\u003ctd colspan=\"5\"\u003e* Data derived from United Imaging (uEXPLORER); BMI, body mass index; n, number of patients. Data are presented as the mean\u0026thinsp;\u0026plusmn;\u0026thinsp;standard deviation (SD). PLAGH: Chinese PLA General Hospital; SHCenter: Shanghai Universal Medical Imaging Diagnostic Center; HZCenter: Hangzhou Universal Medical Imaging Diagnostic Center; GZCenter: Guangzhou Universal Medical Imaging Diagnostic Center.\u003c/td\u003e\u003c/tr\u003e \u003c/tfoot\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eTo facilitate modeling, a preliminary assessment was performed by two experienced radiologists, each with over 10 years in practice, to ensure a balanced distribution of images classified as good, moderate, or poor quality across the dataset. We developed customized algorithms to automatically localize seven key axial slices of the whole body that capture critical anatomical landmarks (including the basal ganglia, cerebellar vermis, aortic arch, tracheal carina, liver, pancreas, and iliac bone). These slices were selected for their anatomical significance and reproducibility across patients. A full description of our segmentation process is provided in Supplementary Section 1.1, while the localization methodology is detailed in Supplementary Section 1.2 (Key Slices Selection Logic). Following standardized on-table training, 15 nuclear medicine experts conducted blinded subjective evaluations on these predefined slices for 718 patients. 
After data preprocessing, 4,394 CT and 4,180 PET images were randomly divided into training, validation, and test sets at an 8:1:1 ratio [\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e], as shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e, which illustrates the flow diagram of PET/CT data collection and dataset division. The complete workflow of the IQA system is outlined in Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e.\u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003eSegmentation of Anatomical Regions\u003c/h3\u003e\n\u003cp\u003eAccurate identification and localization of key CT images requires a sophisticated segmentation model. For this purpose, we utilized TotalSegmentator [\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e], which is based on the nnUNet framework (Fig. \u003cspan refid=\"MOESM2\" class=\"InternalRef\"\u003eS2\u003c/span\u003e) [\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e], to perform fine-grained segmentation of whole-body regions (five torso and two cranial regions), as illustrated in Fig. \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e. The selection of axial key slices was guided by both the segmentation results and our defined anatomical criteria.\u003c/p\u003e\n\u003ch3\u003eCT Key-Slice Localization Scheme\u003c/h3\u003e\n\u003cp\u003eKey-slice localization is crucial for both physician-guided and automated DL assessments of PET/CT image quality. To facilitate this process, numerous functions from the OpenCV package (version 3.4.2) [\u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e] were used. The cv2.matchShapes() function was used for the automatic identification and extraction of key slices, which acted as benchmarks to support physician-based IQA. 
The automated identification process involved several steps. First, the extracted image was converted to grayscale and then binarized. The OpenCV contour detection function cv2.findContours() was then used to identify all contours in the CT image. Subsequently, the cv2.matchShapes() function was applied to compare the contours of the image with those of a reference image provided by experts. By identifying the contour with the minimum degree of mismatch, the algorithm pinpointed the key slices within the image.\u003c/p\u003e \u003cp\u003eTo better align with the varying characteristics of different anatomical regions, we tailored the localization scheme by incorporating selection logic. For example, we customized the automatic recognition function for the aortic arch in the thorax and implemented tailored evaluation criteria for this slice. Details of the key-slice selection logic are summarized in Supplementary Table \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e. This approach enabled us to accurately pinpoint and segment each relevant slice in the PET/CT images.\u003c/p\u003e\n\u003ch3\u003eAligned Key-Slices Acquisition\u003c/h3\u003e\n\u003cp\u003eAfter selecting the key slices from the CT images, the PET images were aligned with the initial axial head slices of the CT scans to ensure consistent ranges, as shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003ea, which depicts the process of segmenting organs on CT images and identifying corresponding slices in PET images.
Key PET slices were then identified in tandem with their CT counterparts, using the physical distance metric to streamline the process:\u003c/p\u003e\n\u003cp\u003eReal physical distance = slice thickness × number of slices\u003c/p\u003e\n\u003cp\u003eIf the distances for the CT and PET data were not equal, the images were considered to be misaligned and were therefore excluded.\u003c/p\u003e \u003cp\u003eBecause of the particular characteristics of the head, PET head data required a customized localization strategy involving an initial PET/CT head slice alignment followed by sagittal planning and tilt angle adjustments based on the segmentation of landmarks such as the condyle (see Fig. \u003cspan refid=\"MOESM3\" class=\"InternalRef\"\u003eS3\u003c/span\u003e for the slice of condyle segmentation). A detailed head slice positioning scheme, with visualizations of the rotating slice view of the pitch angle of the head (Fig. S4) and the image after rotation (Fig. S5), is provided in Supplementary Section 1.3. This alignment enables key cerebral slices to be identified and evaluated using morphometric maxima derived from comparing cerebral diameters and surface areas, which supports the subsequent processing steps.\u003c/p\u003e \u003cp\u003eAll key slices were reviewed and confirmed by an expert with over 10 years of experience in nuclear medicine, establishing these slices as the gold standard. The automatically localized positions were compared with the gold standard, achieving an overall accuracy of 83.17% within an error margin of \u0026plusmn;\u0026thinsp;2 slices and 79.35% within an error margin of \u0026plusmn;\u0026thinsp;1 slice, as detailed in Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e, which presents the performance of the IQA model and the accuracy of PET slice selection.
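As a rough illustration of the physical-distance check and the slice-tolerance accuracy described above, the following Python sketch may help. It is not the authors' implementation: the function names, the 0.5 mm tolerance, and the rounding used to map a CT index to a PET index are all assumptions made for this example.

```python
def physical_distance_mm(slice_thickness_mm, n_slices):
    """Real physical distance = slice thickness x number of slices."""
    return slice_thickness_mm * n_slices


def same_extent(ct_thickness_mm, ct_slices, pet_thickness_mm, pet_slices, tol_mm=0.5):
    """Return True if CT and PET cover the same physical distance.

    The text only says unequal distances lead to exclusion; the tolerance
    tol_mm is a hypothetical parameter added for floating-point safety.
    """
    ct_mm = physical_distance_mm(ct_thickness_mm, ct_slices)
    pet_mm = physical_distance_mm(pet_thickness_mm, pet_slices)
    return abs(ct_mm - pet_mm) <= tol_mm


def pet_index_for_ct_index(ct_index, ct_thickness_mm, pet_thickness_mm):
    """Map a CT slice index to the PET slice at the same physical distance.

    Assumes both series start at the same aligned head slice (index 0).
    """
    distance_mm = physical_distance_mm(ct_thickness_mm, ct_index)
    return round(distance_mm / pet_thickness_mm)


def accuracy_within(predicted, reference, max_error):
    """Fraction of predicted slice indices within +/- max_error of the reference."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("predicted and reference must be non-empty and equal in length")
    hits = sum(abs(p - r) <= max_error for p, r in zip(predicted, reference))
    return hits / len(reference)
```

For example, with a hypothetical 3 mm CT series and a 5 mm PET series, CT slice 100 lies 300 mm from the starting slice and maps to PET slice 60. Calling `accuracy_within` with `max_error` set to 1 or 2 corresponds to the two error margins reported in Table 2.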
Slices whose localization error exceeded\u0026thinsp;\u0026plusmn;\u0026thinsp;2 slices were excluded from further analysis for each position.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003ePerformance of the IQA model on the test dataset and accuracy of PET slice selection\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"5\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eDescription\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eCT Accuracy\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003ePET Accuracy\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eSlice Selection Accuracy\u003csup\u003e#\u003c/sup\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eSlice Selection Accuracy*\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eBasal ganglia\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.869\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e
\u003cp\u003e0.887\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.684\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.726\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCerebellar vermis\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.787\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.887\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.694\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.726\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eAortic arch\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.969\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.846\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.836\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.879\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eTracheal carina\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.875\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.818\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.819\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.857\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e 
\u003cp\u003eLiver\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.921\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.755\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.843\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.882\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003ePancreas\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.969\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.755\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.83\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.868\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eIliac bone\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.922\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.875\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.85\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.886\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eAverage\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.902\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.832\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.794\u003c/p\u003e 
\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.832\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003ctfoot\u003e \u003ctr\u003e\u003ctd colspan=\"5\"\u003e\u003csup\u003e#\u003c/sup\u003e Accuracy of PET slice selection within an error margin of \u0026plusmn;\u0026thinsp;1 slice. *Accuracy of PET slice selection within an error margin of \u0026plusmn;\u0026thinsp;2 slices.\u003c/td\u003e\u003c/tr\u003e \u003c/tfoot\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cdiv id=\"Sec8\" class=\"Section2\"\u003e \u003ch2\u003eSubjective IQA\u003c/h2\u003e \u003cp\u003eAll anonymized PET/CT data were stored in accordance with the highest standards of personal data security and then uploaded to the Digital Micro Image Analysis System (Shanghai Aitrox Technology Corporation Limited). After the aforementioned processing steps, the extracted axial images of the seven key slices from the CT and PET datasets were randomized and then assigned to evaluators for blinded subjective assessment. Following the single stimulus method outlined in the BT.500-15 guidelines of the International Telecommunication Union Radiocommunication Sector [\u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e], test images were displayed on a FlexScan EV 2750 monitor (EIZO, Ishikawa, Japan) in a controlled environment. Corresponding prompts and scoring labels were provided for blind evaluation.\u003c/p\u003e \u003cp\u003eFifteen physicians, each with over 9 years of experience in nuclear medicine, participated in the subjective evaluation. All evaluators participated in a collaborative discussion and received training on educational images (not included in the testing dataset), as shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e, which presents suggested PET examples for the 5-point Likert scale.
Evaluators conducted subjective assessments of the overall image quality for axial PET and CT images, assigning scores based on a five-point Likert scale. Importantly, they were blinded to any identifiable patient information, including identity, diagnosis, treatment, and other specific details. We defined an image quality score of 1 as very poor, definitely affecting diagnosis; a score of 2 as poor, affecting diagnosis; a score of 3 as average, possibly affecting diagnosis; a score of 4 as good, not affecting diagnosis; and a score of 5 as excellent, not affecting diagnosis.\u003c/p\u003e \u003cp\u003eAll 15 nuclear medicine experts from different centers evaluated the images from all 718 patients. According to our expert consensus, if the assessments of any two evaluators differed by three or more points, the image was flagged for further discussion or reevaluation. To resolve such discrepancies, we consulted three senior experts, each with over 20 years of experience, to reach a consensus. Their final evaluations were considered the ground truth for these images. The scores were then averaged to create the mean opinion score (MOS) [\u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e].
Based on the MOS, we categorized image quality into five levels as follows: \u003cb\u003eworst\u003c/b\u003e, [1, 1.5); \u003cb\u003epoor\u003c/b\u003e, [1.5, 2.5); \u003cb\u003eaverage\u003c/b\u003e, [2.5, 3.5); \u003cb\u003egood\u003c/b\u003e, [3.5, 4.5); and \u003cb\u003eexcellent\u003c/b\u003e, [4.5, 5].\u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003eImage Quality Scoring Model Construction\u003c/h3\u003e\n\u003cp\u003eThe IQA scoring model applied the multi-dimensional attention network for no-reference image quality assessment (MANIQA) [\u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e] (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://github.com/IIGROUP/MANIQA\u003c/span\u003e\u003cspan address=\"https://github.com/IIGROUP/MANIQA\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e), which combines a vision transformer (ViT) for feature extraction with a transposed attention block (TAB) and a scaled shifted-window transformer block (SSTB) to apply the attention mechanism across both channel and spatial dimensions. In this multi-dimensional manner, the interconnected modules enhance the global and local interactions of features, thereby improving the accuracy of the scoring system (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eb).\u003c/p\u003e \u003cp\u003eThe scoring model was trained to predict the quality of key slices using the dataset of images labeled through the subjective scoring procedure described above, so that its quality predictions follow the same standard as the expert ratings. A detailed description of the IQA scoring model and the training parameters are provided in Supplementary Sections 1.4 (IQA Scoring Model) and 1.5 (Training Protocol). To align with real-world clinical practice, the output of the model is a decimal between 1 and 5.
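To make the scoring convention concrete, the Python sketch below shows one way that individual Likert ratings could be averaged into an MOS and how a decimal score between 1 and 5 could be mapped onto the five levels listed above. The function names and the table layout are assumptions for illustration only, and the consensus step for discrepant ratings is not modeled.

```python
from statistics import mean

# Lower bounds of the five quality levels, taken from the MOS ranges in the text:
# worst [1, 1.5), poor [1.5, 2.5), average [2.5, 3.5), good [3.5, 4.5), excellent [4.5, 5].
QUALITY_LEVELS = (
    (4.5, "excellent"),
    (3.5, "good"),
    (2.5, "average"),
    (1.5, "poor"),
    (1.0, "worst"),
)


def mean_opinion_score(ratings):
    """Average a list of integer Likert ratings (1 to 5) into an MOS."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("ratings must be integers from 1 to 5")
    return mean(ratings)


def quality_level(score):
    """Map an MOS or a model output in [1, 5] to its quality level."""
    if not 1.0 <= score <= 5.0:
        raise ValueError("score must lie between 1 and 5")
    for lower_bound, label in QUALITY_LEVELS:
        if score >= lower_bound:
            return label
    raise AssertionError("unreachable: every score in [1, 5] matches a level")
```

Because the same mapping can be applied to the expert MOS and to the model output, the two can be compared level by level, which is how a five-class accuracy could be computed under these assumptions.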
The criteria used for determining the final quality score adhere to the same standard as that employed in subjective image assessment.\u003c/p\u003e \u003cdiv id=\"Sec10\" class=\"Section2\"\u003e \u003ch2\u003eStatistical Analysis\u003c/h2\u003e \u003cp\u003eStatistical analysis was performed using IBM SPSS Statistics for Windows, version 25.0 (IBM Corp., Armonk, NY, USA). The Spearman correlation coefficient (\u003cem\u003eρ\u003c/em\u003e)[\u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e] and intraclass correlation coefficient (ICC), both commonly used metrics for assessing agreement among multiple raters, were calculated to evaluate the consistency of the image quality scores assigned by the nuclear medicine experts and those generated by the IQA model. The Spearman correlation coefficient quantifies the strength and direction of the association between the ranked scores from the two sources of evaluation (human experts and the IQA model). In contrast, the ICC measures the reliability of ratings or measurements across these evaluators when assessing the same quantity. The values of \u003cem\u003eρ\u003c/em\u003e and ICC were categorized as follows: [0.75, 1], excellent agreement; [0.60, 0.75), good agreement; [0.40, 0.60), fair agreement; and [0, 0.40), poor agreement. The test statistics were approximated using a normal distribution to calculate the 95% confidence interval (CI) and the \u003cem\u003ep\u003c/em\u003e-value, with \u003cem\u003ep\u003c/em\u003e\u0026thinsp;\u0026lt;\u0026thinsp;0.05 being considered statistically significant.\u003c/p\u003e \u003c/div\u003e"},{"header":"Results","content":"\u003cp\u003eOur study developed a DL-based system tailored for automated, expert-level image quality assessment (IQA) of whole-body [18F]FDG PET/CT scans. 
By customizing algorithms to automatically select and localize seven predefined key slices across four anatomical regions, our model eliminates the dependency on handcrafted IQA metrics. This automated approach enables precise, consistent IQA that aligns with expert subjective evaluations, establishing a robust, scalable framework for multi-center clinical image quality standardization.\u003c/p\u003e \u003cdiv id=\"Sec12\" class=\"Section2\"\u003e \u003ch2\u003eDataset\u003c/h2\u003e \u003cp\u003eThe mean age of the 718 patients was 55.35 years (range 18\u0026ndash;88 years), with the majority of patients (62.8%) being male. Patient characteristics, including demographics and image acquisition parameters, taken from the DICOM data are shown in Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e. Given that the resolution of CT images was sufficient for anatomical localization, we focused on documenting the relevant parameters of the PET systems. Detailed numbers of images in the training, validation, and test datasets are provided in Supplementary Table S2.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec13\" class=\"Section2\"\u003e \u003ch2\u003eIQA Model Performances\u003c/h2\u003e \u003cp\u003eThis deep learning-based IQA model can predict PET/CT image quality in accordance with physicians' subjective assessments, classifying images into five levels (from worst to excellent). Detailed performance metrics of the IQA model on the test set are shown in Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e. For the PET dataset, the model achieved an average accuracy of 0.832 (SD\u0026thinsp;=\u0026thinsp;0.058), while for the CT dataset, the average accuracy was 0.902 (SD\u0026thinsp;=\u0026thinsp;0.064).
Across anatomical regions, the model's accuracy against expert scoring ranged from 75.47% to 88.68% for the PET dataset and from 78.68% to 96.88% for the CT dataset. These results indicate that the model predicts the quality of PET and CT images with high accuracy.\u003c/p\u003e \u003cp\u003eThe confusion matrix analysis in Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e shows the distribution of the DL model's scoring outcomes in the PET and CT test sets for each region. Because the scoring distribution was uneven, ROC curves (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e) were used to assess how well the model discriminated between high- and low-quality images (with high quality defined as good or excellent, and low quality as worst, poor, or average) across regions. For PET images, the AUC was \u0026ge;\u0026thinsp;0.88, indicating robust performance of the IQA models. However, owing to the inherent characteristics of head data, no high-quality head PET images were available, which precluded plotting an ROC curve for that region. Figure\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e shows the recall, precision, and F1 score [\u003cspan additionalcitationids=\"CR41 CR42 CR43\" citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e] in detail, which further indicate the robustness of the IQA model.\u003c/p\u003e \u003cp\u003eTo validate the model's efficacy, we calculated the Spearman correlation coefficients and ICCs comparing the model's predicted scores with the expert-derived MOS. 
These calculations confirmed the relatedness and consistency of the assessments, as detailed in Table\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e. Within the PET dataset, the model's scores demonstrated excellent agreement with MOS, with Spearman coefficients (ρ) ranging from 0.765 to 0.915 and an overall ρ of 0.891 across all PET regions. In the CT dataset, the model's scores for key slices showed good agreement on average with the scores from the 15 experts (average ρ\u0026thinsp;=\u0026thinsp;0.624 across all body parts), although the region-level coefficients varied from 0.422 to 0.823. Overall, the model's scores exhibited good to excellent agreement with the experts' assessments in both datasets (ρ for CT: 0.624, PET: 0.891, \u003cem\u003ep\u003c/em\u003e\u0026thinsp;\u0026lt;\u0026thinsp;0.001).\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab3\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 3\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eCorrelation of IQA model score with MOS across anatomical regions in PET/CT\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"5\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eDescription\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colspan=\"2\" nameend=\"c3\" 
namest=\"c2\"\u003e \u003cp\u003eρ (\u003cem\u003ep\u003c/em\u003e-value)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colspan=\"2\" nameend=\"c5\" namest=\"c4\"\u003e \u003cp\u003eICC (95% CI)\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003ePET\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eCT\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003ePET\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eCT\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eBasal ganglia\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.776 (\u0026lt;\u0026thinsp;0.001)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.823 (\u0026lt;\u0026thinsp;0.001)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.873 (0.781\u0026ndash;0.927)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.972 (0.953\u0026ndash;0.983)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCerebellar vermis\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.765 (\u0026lt;\u0026thinsp;0.001)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.666 (\u0026lt;\u0026thinsp;0.001)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.874 (0.782\u0026ndash;0.927)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.919 (0.865\u0026ndash;0.951)\u003c/p\u003e \u003c/td\u003e 
\u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eAortic arch\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.891 (\u0026lt;\u0026thinsp;0.001)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.422 (0.001)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.952 (0.917\u0026ndash;0.973)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.758 (0.602\u0026ndash;0.853)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eTracheal carina\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.915 (\u0026lt;\u0026thinsp;0.001)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.441 (\u0026lt;\u0026thinsp;0.001)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.965 (0.940\u0026ndash;0.979)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.540 (0.243\u0026ndash;0.721)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eLiver\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.898 (\u0026lt;\u0026thinsp;0.001)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.664 (\u0026lt;\u0026thinsp;0.001)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.965 (0.940\u0026ndash;0.980)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.779 (0.636\u0026ndash;0.866)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd 
align=\"left\" colname=\"c1\"\u003e \u003cp\u003ePancreas\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.887 (\u0026lt;\u0026thinsp;0.001)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.719 (\u0026lt;\u0026thinsp;0.001)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.951 (0.915\u0026ndash;0.972)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.831 (0.722\u0026ndash;0.897)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eIliac bone\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.886 (\u0026lt;\u0026thinsp;0.001)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.606 (\u0026lt;\u0026thinsp;0.001)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.956 (0.926\u0026ndash;0.974)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.837 (0.731\u0026ndash;0.901)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eAverage\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.891 (\u0026lt;\u0026thinsp;0.001)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.624 (\u0026lt;\u0026thinsp;0.001)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.953 (0.942\u0026ndash;0.962)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.920 (0.906\u0026ndash;0.932)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e 
\u003ctfoot\u003e \u003ctr\u003e\u003ctd colspan=\"5\"\u003eSpearman correlation coefficient (ρ) and ICC quantifying the agreement between the IQA model score and the MOS from 15 evaluators. \u003cem\u003ep\u003c/em\u003e\u0026thinsp;\u0026lt;\u0026thinsp;0.001 indicates a highly significant correlation. ICC\u0026thinsp;\u0026ge;\u0026thinsp;0.75 suggests excellent reliability.\u003c/td\u003e\u003c/tr\u003e \u003c/tfoot\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eThe ICC analysis revealed a comparable pattern. The tracheal carina in the CT dataset showed only fair agreement (ICC\u0026thinsp;=\u0026thinsp;0.540, 95% CI: 0.243\u0026ndash;0.721), whereas the ICCs in all remaining regions of both datasets ranged from 0.758 to 0.972 (ICC\u0026thinsp;\u0026ge;\u0026thinsp;0.75), indicating excellent agreement with MOS. The overall concordance between the algorithm's output and the expert-derived scores (ICC for CT: 0.920, 95% CI: 0.906\u0026ndash;0.932; ICC for PET: 0.953, 95% CI: 0.942\u0026ndash;0.962) corroborated the credibility of the deep learning IQA system.\u003c/p\u003e \u003c/div\u003e"},{"header":"Discussion","content":"\u003cp\u003eWe developed a DL-based system for the comprehensive, expert-level evaluation of whole-body [18F]FDG PET/CT images, laying a foundation for further advances in automated IQA. Our model was trained on a large clinical dataset of 3,517 CT and 3,430 PET images, using assessments from 15 experts as references, and demonstrated high accuracy in predicting image quality across multiple anatomical regions, including the brain, thorax, abdomen, and pelvis. 
Notably, this fully automated IQA system shows strong alignment with physician assessments, enhancing the interpretability of clinical images and highlighting its potential for consistent, expert-level evaluation in multi-center clinical settings.\u003c/p\u003e \u003cp\u003eA validated IQA system may relieve physicians from repetitive tasks, enhance diagnostic confidence in clinical images, and promote the reproducibility of image-based applications [\u003cspan citationid=\"CR45\" class=\"CitationRef\"\u003e45\u003c/span\u003e]. DL algorithms have emerged as effective tools for image evaluation [\u003cspan citationid=\"CR46\" class=\"CitationRef\"\u003e46\u003c/span\u003e, \u003cspan citationid=\"CR47\" class=\"CitationRef\"\u003e47\u003c/span\u003e], and several PET studies have investigated the application of artificial intelligence to IQA. Hopson et al. [\u003cspan citationid=\"CR48\" class=\"CitationRef\"\u003e48\u003c/span\u003e] found that a pre-training strategy improved the performance of CNNs in predicting PET image quality. Reyn\u0026eacute;s-Llompart et al. [\u003cspan citationid=\"CR49\" class=\"CitationRef\"\u003e49\u003c/span\u003e] analysed the correlation between image quality parameters and subjective scores for 112 PET scans, employing a radiomics-based machine learning model to predict the subjective scores. Zhang et al. [\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e] validated the feasibility of using a DL model to assess PET image quality by scoring 89 head-free three-dimensional PET images. Using manually established objective image metrics, Qi et al. [\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e] developed a CNN-based system to perform rapid IQA of PET images of the thoracic, abdominal, and pelvic regions.\u003c/p\u003e \u003cp\u003eWhile AI has made significant advancements in PET IQA, several challenges remain unresolved. 
These include the need for large, high-quality, and diverse datasets for robust model training [\u003cspan citationid=\"CR50\" class=\"CitationRef\"\u003e50\u003c/span\u003e], the necessity of external validation [\u003cspan citationid=\"CR51\" class=\"CitationRef\"\u003e51\u003c/span\u003e] to establish model reliability, and the challenge of seamlessly integrating AI tools into clinical workflows. Overcoming these obstacles is essential to advancing the field and ensuring the clinical utility of AI-based IQA systems.\u003c/p\u003e \u003cp\u003ePrevious studies have proposed various AI-driven methods for the automated IQA of [18F]FDG PET images [\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e, \u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e, \u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e, \u003cspan citationid=\"CR52\" class=\"CitationRef\"\u003e52\u003c/span\u003e]. However, these methods often lack comprehensiveness, focusing either on maximum intensity projection (MIP) images [\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e] or manually selected axial slices that exclude specific regions, such as the head [\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e], which limits their clinical applicability and generalizability. Amini et al. [\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e] introduced a region-specific IQA framework that aimed to replicate human perceptual standards. However, this method was limited by its reliance on evaluations conducted by only two independent raters and the absence of a consensus-based scoring system, which may compromise label consistency and model robustness. 
Additionally, these studies employ datasets of limited size and diversity, failing to account for variations in imaging protocols, scanner types, or patient populations, which restricts the generalizability and reproducibility [\u003cspan citationid=\"CR53\" class=\"CitationRef\"\u003e53\u003c/span\u003e] of the resulting AI models.\u003c/p\u003e \u003cp\u003eOur study addresses these challenges through the development of a novel, fully automated, DL-based IQA framework. Leveraging one of the largest known datasets of [18F]FDG PET/CT images, along with subjective evaluations by 15 nuclear medicine experts, we ensured higher reliability and robustness in data labeling. The dataset incorporates multi-center data, spanning a diverse range of imaging protocols and scanner types, thereby enhancing the generalizability of the proposed model. Furthermore, our framework employs advanced deep learning techniques to automatically identify key anatomical regions, such as the basal ganglia, cerebellar vermis, aortic arch, and iliac region, and applies region-specific evaluation criteria. This ensures a standardized and interpretable approach to IQA across the whole body, addressing both reproducibility and explainability in clinical practice.\u003c/p\u003e \u003cp\u003eIn our study, head movement variability and respiratory motion in the thorax led to inconsistent image quality across key slices within the same patient, making it difficult to evaluate the overall image quality accurately. To address this, we implemented a slice-level subjective evaluation, allowing for a more precise assessment of image quality across different anatomical regions. To avoid bias stemming from a limited training set size, we adopted a data collection strategy using multi-center clinical data, supervised and guided by domain experts. 
Recognizing the different and complex factors that affect PET and CT image quality, we innovatively developed an automated scoring system capable of reliably assessing the quality of both whole-body CT and PET images. Confusion matrix analysis (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e) showed very few misclassifications, indicating that the model is capable of robustly classifying image quality across different regions. The PET IQA models demonstrated excellent performance in distinguishing high-quality images, with AUC values\u0026thinsp;\u0026ge;\u0026thinsp;0.88 for thoracic and abdominal regions (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e). The model achieved up to 96.9% accuracy, 100% recall, 100% precision, and a 93.8% F-score (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e), surpassing or matching the performance of similar studies [\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e, \u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e, \u003cspan additionalcitationids=\"CR42 CR43\" citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e]. These metrics indicate a robust ability to identify high-quality images with minimal misclassifications. However, due to dataset imbalance, the lack of high-quality head PET images limited the evaluation in this region. 
Our benchmark was based on visual diagnostic standards of expert physicians, and the consistency between the model and expert evaluations (Table\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e) underscores its ability to capture subtle subjective perceptions of image quality.\u003c/p\u003e \u003cp\u003eAlthough challenges remain in interpreting PET/CT scans due to inter- and intra-observer variability, our subjective assessment involving 15 experts helped mitigate potential biases from individual evaluators, providing a more reliable and robust assessment, marking a significant advancement in the field. In addition to tackling dataset diversity and annotation practices, our study adheres strictly to the CLAIM guidelines [\u003cspan citationid=\"CR54\" class=\"CitationRef\"\u003e54\u003c/span\u003e], as outlined in Table \u003cspan refid=\"MOESM3\" class=\"InternalRef\"\u003eS3\u003c/span\u003e, to enhance reproducibility and ensure compliance with established standards for AI research in medical imaging. Unlike prior studies, which often lack clarity in reporting dataset characteristics and evaluator scoring standards [\u003cspan citationid=\"CR53\" class=\"CitationRef\"\u003e53\u003c/span\u003e], our work combines advanced AI techniques with well-defined, reproducible, expert-driven evaluation methodologies, ensuring both the integrity of the research process and the credibility of the results. This approach not only improves research transparency but also enhances clinical applicability, facilitating the integration of the proposed system into diverse clinical environments.\u003c/p\u003e \u003cp\u003eIn this proof-of-concept study, we introduced a novel approach to IQA for PET/CT and demonstrated the feasibility of using DL in clinical settings. 
The individual modules in this IQA system can be independently customized, providing functionalities such as automatic segmentation and selection of key slices, cross-modality matching, and three-dimensional alignment, which can serve as tools for clinical research.\u003c/p\u003e \u003cdiv id=\"Sec15\" class=\"Section2\"\u003e \u003ch2\u003eLimitations\u003c/h2\u003e \u003cp\u003eAlthough our DL model shows promising results, several limitations should be acknowledged. First, our initial work involved the automated identification and scoring of seven axial slices in PET and CT images; however, other slices may also hold significant evaluative importance. Deficiencies in the localization algorithm could cause slight misalignments of these key slices, potentially leading to an imbalanced training dataset and affecting the performance of the system in clinical settings. Moreover, the model's predicted scores agreed more closely with expert ratings for PET images than for CT images. To address these limitations, future work should focus on improving the accuracy of the algorithm.\u003c/p\u003e \u003cp\u003eSecond, even with well-defined IQA criteria, discrepancies in the subjective evaluation of PET/CT scans remain because of differences in reader experience and expertise. To address this, we used senior experts' image quality annotations as the ground truth. However, this expert consensus approach may introduce new biases. Future work should involve a larger number of senior specialists for further validation.\u003c/p\u003e \u003cp\u003eThird, although data from multiple institutions were used, the performance of the IQA system may still be influenced by the quality and diversity of the training data. Additionally, our method has not yet been tested in real-world clinical settings. 
To advance this work, it is crucial to expand the training dataset and test the model in diverse clinical settings.\u003c/p\u003e \u003c/div\u003e"},{"header":"Conclusion","content":"\u003cp\u003eThis study developed a foundational, automated IQA system that provides a reliable, expert-level assessment of whole-body [18F]FDG PET/CT images. The system\u0026rsquo;s versatility allows for the automated selection of high-quality clinical images from multi-center datasets and provides customizable support for the development of image-based models.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eSupplementary Information\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eSupplementary Material 1.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eData availability\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe data from this study are available from the corresponding author on reasonable request.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAcknowledgments\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe are very grateful to the Shanghai Universal Medical Imaging Diagnostic Center for their assistance in this research. 
We thank Liwen Bianji (Edanz) (www.liwenbianji.cn/ac) for editing the language of a draft of this manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis work was supported by the Shanghai Central Government-led Local Development Fund (Project No.: YDZX20223100003001).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthors and Affiliations\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eDepartment of Nuclear Medicine, The First Medical Center of Chinese PLA General Hospital, Beijing, China\u003c/p\u003e\n\u003cp\u003eCong Zhang, Ruimin Wang, Jiahe Tian\u003c/p\u003e\n\u003cp\u003eShanghai Universal Medical Imaging Diagnostic Center, Shanghai, China\u003c/p\u003e\n\u003cp\u003eXin Gao, Gang Feng\u003c/p\u003e\n\u003cp\u003eDepartment of Scientific Research, Shanghai Aitrox Technology Corporation Limited, Shanghai, China\u003c/p\u003e\n\u003cp\u003eXuebin Zheng, Jun Xie, Yunchao Bao, Pengchen Gu, Chuan He\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthors\u0026apos; contributions\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eCZ: conceptualization, formal analysis, investigation, project administration, writing of the original draft, and visualization. XG: funding, resources, and project administration. XZ: project administration, validation, and investigation. JX: data processing and editing. GF: data processing. YB: validation and investigation. PG: validation and investigation. YZ: resources and data processing. CH: resources and data processing. RW: supervision. JT: review and supervision. 
All authors read and approved the final manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eEthics declarations\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting interests\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors have no conflicts of interest to report.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eEthics approval and consent to participate\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eAll procedures performed in studies involving human participants were in accordance with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. The study was carried out in compliance with the CLAIM guidelines.\u003c/p\u003e\n\u003cp\u003eThis study was approved by the Clinical Research Ethics Committee of the Chinese PLA General Hospital (approval number: S2024-029-01) and the Ethics Committee of Universal Medical Imaging Diagnostic Center (approval numbers: SHQJ-2022-08, HZQJ-2022-08, GZQJ-2022-08). Written informed consent was waived owing to the retrospective design of the study.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConsent for publication\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003ePinker K, Riedl C, Weber WA. Evaluating tumor response with FDG PET: updates on PERCIST, comparison with EORTC criteria and clues to future developments. Eur J Nucl Med Mol Imaging. 2017;44(Suppl 1):55\u0026ndash;66. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://10.1007/s00259-017-3687-3\u003c/span\u003e\u003cspan address=\"http://10.1007/s00259-017-3687-3\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLeveque L, Outtas M, Liu H, Zhang L. 
Comparative study of the methodologies used for subjective medical image quality assessment. Phys Med Biol. 2021;66(15). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://10.1088/1361-6560/ac1157\u003c/span\u003e\u003cspan address=\"http://10.1088/1361-6560/ac1157\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChow LS, Paramesran R. Review of medical image quality assessment. Biomed Signal Process Control. 2016;27:145\u0026ndash;54.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSyed AB, Zoga AC. Artificial Intelligence in Radiology: Current Technology and Future Directions. Semin Musculoskelet Radiol. 2018;22(5):540\u0026ndash;5. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://10.1055/s-0038-1673383\u003c/span\u003e\u003cspan address=\"http://10.1055/s-0038-1673383\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSchmidhuber J. Deep learning in neural networks: an overview. Neural Netw. 2015;61:85\u0026ndash;117. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://10.1016/j.neunet.2014.09.003\u003c/span\u003e\u003cspan address=\"http://10.1016/j.neunet.2014.09.003\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436\u0026ndash;44.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChen X, Wang X, Zhang K, et al. Recent advances and clinical applications of deep learning in medical image analysis. Med Image Anal. 2022;79:102444. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://10.1016/j.media.2022.102444\u003c/span\u003e\u003cspan address=\"http://10.1016/j.media.2022.102444\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAtasever S, Azginoglu N, Terzi DS, Terzi R. A comprehensive survey of deep learning research on medical image analysis with focus on transfer learning. Clin Imaging. 2023;94:18\u0026ndash;41. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://10.1016/j.clinimag.2022.11.003\u003c/span\u003e\u003cspan address=\"http://10.1016/j.clinimag.2022.11.003\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTajbakhsh N, Jeyaseelan L, Li Q, Chiang JN, Wu Z, Ding X. Embracing imperfect datasets: A review of deep learning solutions for medical image segmentation. Med Image Anal. 2020;63:101693. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://10.1016/j.media.2020.101693\u003c/span\u003e\u003cspan address=\"http://10.1016/j.media.2020.101693\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eEl-Shafai W, El-Nabi SA, El-Rabaie E-SM, et al. Efficient Deep-Learning-Based Autoencoder Denoising Approach for Medical Image Diagnosis. Computers Mater Continua. 2022;70(3).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLoft M, Ladefoged CN, Johnbeck CB, et al. An Investigation of Lesion Detection Accuracy for Artificial Intelligence-Based Denoising of Low-Dose (64)Cu-DOTATATE PET Imaging in Patients with Neuroendocrine Neoplasms. J Nucl Med. 2023;64(6):951\u0026ndash;9. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://10.2967/jnumed.122.264826\u003c/span\u003e\u003cspan address=\"http://10.2967/jnumed.122.264826\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKaur A, Dong G. A complete review on image denoising techniques for medical images. Neural Process Lett. 2023;55(6):7807\u0026ndash;50.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAmini M, Salimi Y, Hajianfar G, et al. Fully Automated Region-Specific Human-Perceptive-Equivalent Image Quality Assessment: Application to 18F-FDG PET Scans. Clin Nucl Med. 2024;49(12):1079\u0026ndash;90.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYang J, Lyu M, Qi Z, Shi Y. Deep learning based image quality assessment: A survey. Procedia Comput Sci. 2023;221:1000\u0026ndash;5.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLitjens G, Kooi T, Bejnordi BE, et al. A survey on deep learning in medical image analysis. Med Image Anal. 2017;42:60\u0026ndash;88. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://10.1016/j.media.2017.07.005\u003c/span\u003e\u003cspan address=\"http://10.1016/j.media.2017.07.005\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWang Z, Bovik AC, Sheikh HR, Simoncelli EP. Image quality assessment: from error visibility to structural similarity. IEEE Trans Image Process. 2004;13(4):600\u0026ndash;12. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://10.1109/tip.2003.819861\u003c/span\u003e\u003cspan address=\"http://10.1109/tip.2003.819861\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSui X, Tan H, Yu H, et al. 
Exploration of the total-body PET/CT reconstruction protocol with ultra-low 18F-FDG activity over a wide range of patient body mass indices. EJNMMI Phys. 2022;9(1):17.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eUkita J, Yoshida T, Ohki K. Characterisation of nonlinear receptive fields of visual neurons by convolutional neural network. Sci Rep. 2019;9(1):3791. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://10.1038/s41598-019-40535-4\u003c/span\u003e\u003cspan address=\"http://10.1038/s41598-019-40535-4\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCheng Y, Abadi E, Smith TB, et al. Validation of algorithmic CT image quality metrics with preferences of radiologists. Med Phys. 2019;46(11):4837\u0026ndash;46. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://10.1002/mp.13795\u003c/span\u003e\u003cspan address=\"http://10.1002/mp.13795\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSartoretti T, Skawran S, Gennari AG, et al. Fully automated computational measurement of noise in positron emission tomography. Eur Radiol. 2024;34(3):1716\u0026ndash;23. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://10.1007/s00330-023-10056-w\u003c/span\u003e\u003cspan address=\"http://10.1007/s00330-023-10056-w\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eNikiforaki K, Karatzanis I, Dovrou A, et al. Image Quality Assessment Tool for Conventional and Dynamic Magnetic Resonance Imaging Acquisitions. J Imaging. 2024;10(5). 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://10.3390/jimaging10050115\u003c/span\u003e\u003cspan address=\"http://10.3390/jimaging10050115\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHoeijmakers EJI, Martens B, Hendriks BMF, et al. How subjective CT image quality assessment becomes surprisingly reliable: pairwise comparisons instead of Likert scale. Eur Radiol. 2024;34(7):4494\u0026ndash;503. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://10.1007/s00330-023-10493-7\u003c/span\u003e\u003cspan address=\"http://10.1007/s00330-023-10493-7\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLin W, Hasenstab K, Moura Cunha G, Schwartzman A. Comparison of handcrafted features and convolutional neural networks for liver MR image adequacy assessment. Sci Rep. 2020;10(1):20336. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://10.1038/s41598-020-77264-y\u003c/span\u003e\u003cspan address=\"http://10.1038/s41598-020-77264-y\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePiccini D, Demesmaeker R, Heerfordt J, et al. Deep Learning to Automate Reference-Free Image Quality Assessment of Whole-Heart MR Images. Radiol Artif Intell. 2020;2(3):e190123. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://10.1148/ryai.2020190123\u003c/span\u003e\u003cspan address=\"http://10.1148/ryai.2020190123\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eEsses SJ, Lu X, Zhao T, et al. Automated image quality evaluation of T(2) -weighted liver MRI utilizing deep learning architecture. 
J Magn Reson Imaging. 2018;47(3):723\u0026ndash;8. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://10.1002/jmri.25779\u003c/span\u003e\u003cspan address=\"http://10.1002/jmri.25779\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eQi C, Wang S, Yu H, et al. An artificial intelligence-driven image quality assessment system for whole-body [(18)F]FDG PET/CT. Eur J Nucl Med Mol Imaging. 2023;50(5):1318\u0026ndash;28. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://10.1007/s00259-022-06078-z\u003c/span\u003e\u003cspan address=\"http://10.1007/s00259-022-06078-z\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhang H, Liu Y, Wang Y, et al. Deep learning model for automatic image quality assessment in PET. BMC Med Imaging. 2023;23(1):75. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://10.1186/s12880-023-01017-2\u003c/span\u003e\u003cspan address=\"http://10.1186/s12880-023-01017-2\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAfshar P, Mohammadi A, Plataniotis KN, Oikonomou A, Benali H. From handcrafted to deep-learning-based cancer radiomics: challenges and opportunities. IEEE Signal Process Mag. 2019;36(4):132\u0026ndash;60.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRodrigues R, L\u0026eacute;v\u0026ecirc;que L, Guti\u0026eacute;rrez J et al. Objective quality assessment of medical images and videos: Review and challenges. Multimedia Tools Appl:1\u0026ndash;34. 2024.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKim J, Nguyen AD, Lee S, Deep, CNN-Based Blind Image Quality Predictor. IEEE Trans Neural Netw Learn Syst. 2019;30(1):11\u0026ndash;24. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://10.1109/TNNLS.2018.2829819\u003c/span\u003e\u003cspan address=\"http://10.1109/TNNLS.2018.2829819\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eThanki R, Borra S, Dey N, Ashour AS. Medical imaging and its objective quality assessment: an introduction. Classification in BioApps: Automation of Decision Making:3\u0026ndash;32. 2018.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZheng X, Jing B, Zhao Z et al. An interpretable deep learning model for identifying the morphological characteristics of dMMR/MSI-H gastric cancer. iScience;27(3):109243. 2024. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://10.1016/j.isci.2024.109243\u003c/span\u003e\u003cspan address=\"http://10.1016/j.isci.2024.109243\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWasserthal J, Breit HC, Meyer MT, et al. TotalSegmentator: Robust Segmentation of 104 Anatomic Structures in CT Images. Radiol Artif Intell. 2023;5(5):e230024. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://10.1148/ryai.230024\u003c/span\u003e\u003cspan address=\"http://10.1148/ryai.230024\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eIsensee F, Petersen J, Klein A et al. nnu-net: Self-adapting framework for u-net-based medical image segmentation. arXiv preprint arXiv:180910486. 2018.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZelinsky A et al. Learning OpenCV\u0026mdash;Computer vision with the OpenCV library (Bradski, GR. ; 2008)[On the Shelf]. IEEE Robotics \u0026amp; Automation Magazine;16(3):100-. 
2009.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eITU-R BT.500\u0026thinsp;\u0026ndash;\u0026thinsp;15. Methodologies for the subjective assessment of the quality of television images. Int Telecommunication Union; 2012. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.itu.int/rec/R-REC-BT.500/en\u003c/span\u003e\u003cspan address=\"https://www.itu.int/rec/R-REC-BT.500/en\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eObuchowicz R, Oszust M, Piorkowski A. Interobserver variability in quality assessment of magnetic resonance images. BMC Med Imaging. 2020;20(1):109. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://10.1186/s12880-020-00505-z\u003c/span\u003e\u003cspan address=\"http://10.1186/s12880-020-00505-z\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYang S, Wu T, Shi S, et al. editors. Maniqa: Multi-dimension attention network for no-reference image quality assessment. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition; 2022.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDodge Y. The concise encyclopedia of statistics. Springer Science \u0026amp; Business Media; 2008.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMukhlif AA, Al-Khateeb B, Mohammed MA. An extensive review of state-of-the-art transfer learning techniques used in medical imaging: Open issues and challenges. J Intell Syst. 2022;31(1):1085\u0026ndash;111.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eNesamani SL, Rajini SNS. Breast Cancer Detection with Transfer Learning Technique in Convolutional Neural Networks. Des Eng:11102\u0026ndash;9. 2021.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKassem MA, Hosny KM, Fouad MM. 
Skin lesions classification into eight classes for ISIC 2019 using deep convolutional neural network and transfer learning. IEEE Access. 2020;8:114822\u0026ndash;32.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSaber A, Sakr M, Abo-Seida OM, Keshk A, Chen H. A novel deep-learning model for automatic detection and classification of breast cancer using the transfer-learning technique. IEEE Access. 2021;9:71194\u0026ndash;209.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWang SH, Xie S, Chen X, et al. Alcoholism Identification Based on an AlexNet Transfer Learning Model. Front Psychiatry. 2019;10:205. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://10.3389/fpsyt.2019.00205\u003c/span\u003e\u003cspan address=\"http://10.3389/fpsyt.2019.00205\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBradshaw TJ, Boellaard R, Dutta J, et al. Nuclear Medicine and Artificial Intelligence: Best Practices for Algorithm Development. J Nucl Med. 2022;63(4):500\u0026ndash;10. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://10.2967/jnumed.121.262567\u003c/span\u003e\u003cspan address=\"http://10.2967/jnumed.121.262567\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKang L, Ye P, Li Y, Doermann D, editors. Convolutional neural networks for no-reference image quality assessment. Proceedings of the IEEE conference on computer vision and pattern recognition; 2014.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKim J, Nguyen A-D, Lee S. Deep CNN-based blind image quality predictor. IEEE Trans Neural Netw Learn Syst. 2018;30(1):11\u0026ndash;24.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHopson JB, Neji R, Dunn JT, et al. 
Pre-training via Transfer Learning and Pretext Learning a Convolutional Neural Network for Automated Assessments of Clinical PET Image Quality. IEEE Trans Radiat Plasma Med Sci. 2023;7(4):372\u0026ndash;81. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://10.1109/TRPMS.2022.3231702\u003c/span\u003e\u003cspan address=\"http://10.1109/TRPMS.2022.3231702\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eReynes-Llompart G, Sabate-Llobera A, Llinares-Tello E, Marti-Climent JM, Gamez-Cenzano C. Image quality evaluation in a modern PET system: impact of new reconstructions methods and a radiomics approach. Sci Rep. 2019;9(1):10640. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://10.1038/s41598-019-46937-8\u003c/span\u003e\u003cspan address=\"http://10.1038/s41598-019-46937-8\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBinuya MAE, Engelhardt EG, Schats W, Schmidt MK, Steyerberg EW. Methodological guidance for the evaluation and updating of clinical prediction models: a systematic review. BMC Med Res Methodol. 2022;22(1):316. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://10.1186/s12874-022-01801-8\u003c/span\u003e\u003cspan address=\"http://10.1186/s12874-022-01801-8\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYu AC, Mohajer B, Eng J. External Validation of Deep Learning Algorithms for Radiologic Diagnosis: A Systematic Review. Radiol Artif Intell. 2022;4(3):e210064. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://10.1148/ryai.210064\u003c/span\u003e\u003cspan address=\"http://10.1148/ryai.210064\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSchwyzer M, Skawran S, Gennari AG, et al. Automated F18-FDG PET/CT image quality assessment using deep neural networks on a latest 6-ring digital detector system. Sci Rep. 2023;13(1):11332. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://10.1038/s41598-023-37182-1\u003c/span\u003e\u003cspan address=\"http://10.1038/s41598-023-37182-1\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMoassefi M, Rouzrokh P, Conte GM, et al. Reproducibility of Deep Learning Algorithms Developed for Medical Imaging Analysis: A Systematic Review. J Digit Imaging. 2023;36(5):2306\u0026ndash;12. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://10.1007/s10278-023-00870-5\u003c/span\u003e\u003cspan address=\"http://10.1007/s10278-023-00870-5\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMongan J, Moy L, Kahn CE. Jr. Checklist for Artificial Intelligence in Medical Imaging (CLAIM): A Guide for Authors and Reviewers. Radiol Artif Intell. 2020;2(2):e200029. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://10.1148/ryai.2020200029\u003c/span\u003e\u003cspan address=\"http://10.1148/ryai.2020200029\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":true,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"ejnmmi-research","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"ejre","sideBox":"Learn more about [EJNMMI Research](http://ejnmmires.springeropen.com/)","snPcode":"","submissionUrl":"https://www.editorialmanager.com/ejre/default.aspx","title":"EJNMMI Research","twitterHandle":"@officialEANM","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"em","reportingPortfolio":"BMC/SO AJ","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"[18F]FDG, Image quality assessment, Positron emission tomography/computed tomography, whole-body, deep learning","lastPublishedDoi":"10.21203/rs.3.rs-5559102/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-5559102/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003ch2\u003eBackground\u003c/h2\u003e \u003cp\u003eThe quality of clinical PET/CT images is critical for both accurate diagnosis and image-based research. However, current image quality assessment (IQA) methods predominantly rely on handcrafted features and region-specific analyses, thereby limiting automation in whole-body and multi-center evaluations. 
This study aims to develop an expert-perceptive deep learning-based IQA system for [18F]FDG PET/CT to tackle the lack of automated, interpretable assessments of clinical whole-body PET/CT image quality.\u003c/p\u003e\u003ch2\u003eMethods\u003c/h2\u003e \u003cp\u003eThis retrospective multicenter study included clinical whole-body [18F]FDG PET/CT scans from 718 patients. Automated identification and localization algorithms were applied to select predefined pairs of PET and CT slices from whole-body images. Fifteen experienced experts, trained to conduct blinded slice-level subjective assessments, provided average visual scores as reference standards. Using the MANIQA framework, the developed IQA model integrates the Vision Transformer, Transposed Attention, and Scale Swin Transformer Blocks to categorize PET and CT images into five quality classes. The model\u0026rsquo;s correlation, consistency, and accuracy with expert evaluations on both PET and CT test sets were statistically analysed to assess the system's IQA performance. Additionally, the model's ability to distinguish high-quality images was evaluated using receiver operating characteristic (ROC) curves.\u003c/p\u003e\u003ch2\u003eResults\u003c/h2\u003e \u003cp\u003eThe IQA model demonstrated high accuracy in predicting image quality categories and showed strong concordance with expert evaluations of PET/CT image quality. In predicting slice-level image quality across all body regions, the model achieved an average accuracy of 0.832 for PET and 0.902 for CT. The model\u0026rsquo;s scores showed substantial agreement with expert assessments, achieving average Spearman coefficients (ρ) of 0.891 for PET and 0.624 for CT, while the average Intraclass Correlation Coefficient (ICC) reached 0.953 for PET and 0.92 for CT. 
The PET IQA model demonstrated strong discriminative performance, achieving an area under the curve (AUC) of \u0026ge;\u0026thinsp;0.88 for both the thoracic and abdominal regions.\u003c/p\u003e\u003ch2\u003eConclusions\u003c/h2\u003e \u003cp\u003eThis fully automated IQA system provides a robust and comprehensive framework for the objective evaluation of clinical image quality. Furthermore, it demonstrates significant potential as an impartial, expert-level tool for standardised multicenter clinical IQA.\u003c/p\u003e","manuscriptTitle":"A Fully Automated, Expert-Perceptive Image Quality Assessment System for Whole-Body [18F]FDG PET/CT","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-03-26 08:55:55","doi":"10.21203/rs.3.rs-5559102/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"decision","content":"Accept","date":"2025-04-05T05:19:58+00:00","index":"","fulltext":""},{"type":"reviewerAgreed","content":"","date":"2025-03-24T08:43:55+00:00","index":0,"fulltext":""},{"type":"reviewersInvited","content":"","date":"2025-03-21T20:29:28+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2025-03-21T17:38:55+00:00","index":"","fulltext":""},{"type":"submitted","content":"EJNMMI Research","date":"2025-03-20T06:45:17+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"ejnmmi-research","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"ejre","sideBox":"Learn more about [EJNMMI Research](http://ejnmmires.springeropen.com/)","snPcode":"","submissionUrl":"https://www.editorialmanager.com/ejre/default.aspx","title":"EJNMMI Research","twitterHandle":"@officialEANM","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"em","reportingPortfolio":"BMC/SO AJ","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"a3c1e558-b80d-4ddf-b486-4a8df0047916","owner":[],"postedDate":"March 
26th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"published-in-journal","subjectAreas":[],"tags":[],"updatedAt":"2025-04-21T16:09:48+00:00","versionOfRecord":{"articleIdentity":"rs-5559102","link":"https://doi.org/10.1186/s13550-025-01238-2","journal":{"identity":"ejnmmi-research","isVorOnly":false,"title":"EJNMMI Research"},"publishedOn":"2025-04-18 15:57:54","publishedOnDateReadable":"April 18th, 2025"},"versionCreatedAt":"2025-03-26 08:55:55","video":"","vorDoi":"10.1186/s13550-025-01238-2","vorDoiUrl":"https://doi.org/10.1186/s13550-025-01238-2","workflowStages":[]},"version":"v1","identity":"rs-5559102","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-5559102","identity":"rs-5559102","version":["v1"]},"buildId":"XKTyCvWXoU3ODBz1xrDgd","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
