An Externally Validated Predictive Model for Lymph Node Metastasis in Papillary Thyroid Carcinoma Integrating mir-THYpe MicroRNA Signatures and Bioinformatics Features

preprint OA: closed
Full text JSON View at publisher
Full text 112,023 characters · extracted from preprint-html · click to expand
An Externally Validated Predictive Model for Lymph Node Metastasis in Papillary Thyroid Carcinoma Integrating mir-THYpe MicroRNA Signatures and Bioinformatics Features | Research Square window.SnipcartSettings = { analytics: { enabled: false } }; (function() { var accessVector = localStorage.getItem('access_vector') || ''; window.dataLayer = window.dataLayer || []; if (accessVector) { window.dataLayer.push({ user: { profile: { profileInfo: { snid: accessVector } } } }); } })(); (function(w,d,s,l,i){w[l]=w[l]||[];w[l].push({'gtm.start':new Date().getTime(),event:'gtm.js'});var f=d.getElementsByTagName(s)[0],j=d.createElement(s),dl=l!='dataLayer'?'&l='+l:'';j.async=true;j.src='https://www.googletagmanager.com/gtm.js?id='+i+dl;f.parentNode.insertBefore(j,f);})(window,document,'script','dataLayer','GTM-K279D39R'); Browse Preprints In Review Journals COVID-19 Preprints AJE Video Bytes Research Tools Research Promotion AJE Professional Editing AJE Rubriq About Preprint Platform In Review Editorial Policies Our Team Advisory Board Help Center Sign In Submit a Preprint Cite Share Download PDF Short Report An Externally Validated Predictive Model for Lymph Node Metastasis in Papillary Thyroid Carcinoma Integrating mir-THYpe MicroRNA Signatures and Bioinformatics Features Luís Jesuíno de Oliveira Andrade, Gabriela Correia Matos de Oliveira, and 3 more This is a preprint; it has not been peer reviewed by a journal. https://doi.org/ 10.21203/rs.3.rs-9409171/v1 This work is licensed under a CC BY 4.0 License Status: Posted Version 1 posted You are reading this latest preprint version Abstract Introduction: Papillary thyroid carcinoma (PTC) is the most prevalent endocrine malignancy worldwide, with lymph node metastasis (LNM) occurring in 40–60% of patients and representing a critical determinant of disease recurrence. Conventional preoperative risk stratification tools, relying on clinical and ultrasonographic variables, demonstrate insufficient discriminatory performance, particularly in the context of indeterminate cytological findings. Integration of microRNA-based molecular classifiers with bioinformatics frameworks represents a promising but underexplored strategy for improving LNM prediction accuracy. Objective: To develop and validate an integrated LNM risk model for PTC by combining molecular data from the mir-THYpe microRNA classifier with bioinformatics analyses of the TCGA-THCA cohort. Methods: miRNA-seq expression profiles and clinical data from 378 histopathologically confirmed PTC cases (TCGA-THCA) were analyzed using DESeq2. Seven mir-THYpe panel miRNAs differentially expressed between N0 and N1 groups were identified. Target prediction (TargetScan, miRDB, DIANA-TarBase), functional enrichment (DAVID, KEGG, GO), and protein–protein interaction network analyses (STRING, Cytoscape) were performed. LASSO logistic regression with tenfold cross-validation selected six independent predictors, which were incorporated into a multivariate model and nomogram. Model performance was assessed by ROC analysis, Hosmer–Lemeshow calibration, and decision curve analysis, with external validation in GEO cohort GSE60542. Results: The integrated model achieved AUC = 0.841 (training), 0.812 (internal validation), and 0.786 (external validation). The strongest predictors were extrathyroidal extension (OR = 3.84), hsa-miR-146b-5p upregulation (OR = 3.12), and tumor size >1.0 cm (OR = 2.67). Decision curve analysis confirmed superior net clinical benefit over clinical-only and treat-all strategies. Conclusion: Integration of mir-THYpe molecular data with bioinformatics-derived features yielded a well-calibrated, externally validated LNM risk model that outperforms conventional clinical predictors, offering a precision oncology tool for individualized preoperative surgical decision-making in PTC. Bioinformatics Endocrinology & Metabolism Papillary thyroid carcinoma lymph node metastasis microRNA predictive model Figures Figure 1 Figure 2 Figure 3 Figure 4 Figure 5 Figure 6 Figure 7 Figure 8 INTRODUCTION Papillary thyroid carcinoma (PTC) is the most prevalent malignancy of the endocrine system and the fastest-growing cancer in terms of global incidence. According to GLOBOCAN 2022 estimates, approximately 821,214 new thyroid cancer cases and 47,507 cancer-related deaths were recorded worldwide, with a markedly higher age-standardized incidence rate in women compared to men. Chinese Medical Journal Thyroid cancer incidence has increased at an unprecedented rate, approximately 582% from 2002 to 2024, outpacing all other human cancers in terms of percent change over the same timeframe. 1 PTC accounts for over 85% of all thyroid malignancies and, while it carries a generally favorable prognosis, its clinical behavior is heterogeneous, being strongly conditioned by the extent of locoregional spread at diagnosis. Lymph node metastasis (LNM) occurs in 40 to 60% of PTC patients and is associated with increased risk of recurrence, particularly in older individuals, making preoperative risk stratification a central challenge in thyroid oncology. 2 Despite advances in imaging and cytopathological evaluation, preoperative prediction of LNM in PTC remains limited in sensitivity and specificity. Currently available predictive models incorporating ultrasound features, demographic characteristics, and serum parameters can effectively estimate LNM risk, yet no single model has achieved sufficient performance for widespread clinical standardization. 3 Independent risk factors for LNM identified across studies include sex, age, nodule diameter, the presence of enlarged lymph nodes on imaging, and Doppler ultrasound classification grade 4 — variables that, when combined, still yield suboptimal negative predictive values in the context of molecular staging. Conventional clinical and imaging models fail to incorporate tumor-intrinsic molecular features that govern metastatic potential, a gap that becomes especially consequential in indeterminate cytological findings. The mir-THYpe test was developed precisely to address this limitation, employing a microRNA-based molecular classifier derived from fine-needle aspiration smear slides with high sensitivity and specificity for thyroid malignancy classification, designed to support clinical decisions and reduce unnecessary surgical interventions. 5 Bioinformatics integration has emerged as a transformative strategy for uncovering molecular biomarkers associated with PTC aggressiveness and metastatic dissemination. microRNA expression profiles have been linked to the development of LNM in PTC, with deep small RNA sequencing and TCGA-validated frameworks demonstrating that differentially expressed miRNAs (DEMs) can distinguish tumor subtypes and correlate with locoregional spread. 6 The mir-THYpe test has demonstrated 89.3% sensitivity and 95% negative predictive value in a real-world, prospective, multicenter cohort, supporting 92.3% of clinical decisions and contributing to a 52.5% reduction in surgical rates among tested patients. 7 However, the application of mir-THYpe–derived molecular data within integrated bioinformatics frameworks for LNM risk modeling in PTC has not yet been systematically explored, representing a significant knowledge gap at the intersection of precision endocrinology and computational oncology. The present study aims to develop and validate a LNM risk model for PTC by integrating clinical, pathological, and ultrasonographic variables with molecular data derived from the mir-THYpe microRNA-based classifier and bioinformatics analyses of miRNA expression profiles. METHODS Study Design and Ethical Considerations This study was designed as a retrospective, bioinformatics-based observational investigation integrating publicly available genomic and clinical data. No individual patient recruitment, biological sample collection, or interventional procedure was performed. All data accessed are freely available through open-access platforms and do not require institutional ethical clearance for secondary computational analysis. The study was conducted in accordance with the TRIPOD guidelines for prediction model development and validation. Data Retrieval • TCGA-THCA Genomic and Clinical Data miRNA expression profiles and paired clinical data were retrieved from The Cancer Genome Atlas Thyroid Carcinoma cohort (TCGA-THCA) using two complementary open-access platforms: the GDC Data Portal ( https://portal.gdc.cancer.gov ) and cBioPortal for Cancer Genomics ( https://www.cbioportal.org ). cBioPortal is a free, open-access resource that provides interactive visualization, analysis, and bulk download of large-scale cancer genomics datasets, including miRNA expression and de-identified clinical attributes from TCGA. Clinical variables extracted included age at diagnosis, sex, tumor size, multifocality, extrathyroidal extension, and pathological lymph node status (N0 versus N1 according to the TNM staging system), which constituted the primary binary outcome variable. Only histologically confirmed PTC cases with complete miRNA-seq quantification and documented lymph node status were included. Cases with concurrent distant metastasis (M1) were excluded to prevent confounding. • mir-THYpe Panel miRNA Identification The microRNAs composing the mir-THYpe classifier panel were identified through systematic review of peer-reviewed publications indexed in PubMed. The mir-THYpe test was developed using 96 miRNA candidates evaluated across benign and malignant thyroid Fine Needle Aspiration samples, with the final classifier comprising 11 microRNAs selected through a 10-fold cross-validation training algorithm. 6 These 11 miRNAs were used as the targeted feature set for expression queries within the TCGA-THCA and GEO cohorts, enabling focused analysis of the panel's discriminatory capacity in the context of LNM. • Differential miRNA Expression Analysis Raw miRNA read count matrices from TCGA-THCA were processed using DESeq2 (v1.51.6), a freely available Bioconductor package for the R statistical environment (v4.6.0; https://www.r-project.org ). Differential expression analysis compared N1 versus N0 groups. DEMs were identified applying an adjusted p-value 1 as selection thresholds. Volcano plots and heatmaps were generated with the open-source ggplot2 and ComplexHeatmap R packages. Intersection between globally significant DEMs and the mir-THYpe panel was subsequently performed to identify panel miRNAs with confirmed differential expression associated with LNM status. • Target Prediction and Functional Enrichment Analysis In silico miRNA target prediction was performed using two free: TargetScan (v8.0; https://www.targetscan.org ) and miRDB ( https://mirdb.org ). Only high-confidence predicted targets appearing in both databases were retained. Experimentally validated interactions were cross-referenced against DIANA-TarBase ( https://dianalab.e-ce.uth.gr/tarbasev9 ), a free database of curated miRNA–gene interactions. Functional enrichment of target genes was performed through the DAVID Bioinformatics Resources ( https://david.ncifcrf.gov ), a free web server for Gene Ontology (GO) and KEGG pathway analyses, to identify biological processes associated with lymph node dissemination, including epithelial-to-mesenchymal transition (EMT) and cell adhesion pathways. Protein–protein interaction (PPI) networks were constructed using the STRING database (v12.5; https://string-db.org ), also freely available, and visualized in Cytoscape (v3.10.4; https://cytoscape.org ), an open-source network analysis platform. • Feature Selection and Predictive Model Construction Normalized miRNA expression values from the mir-THYpe panel were merged with clinical covariates into a single feature matrix. The dataset was partitioned into training (70%) and validation (30%) cohorts using stratified random allocation, preserving the N0/N1 proportion in both subsets. Feature selection was performed using LASSO logistic regression via the glmnet R package, with tenfold cross-validation to determine the optimal regularization parameter λ. Variables retained after LASSO penalization were entered into a multivariate logistic regression model to quantify independent risk contributions, from which a nomogram was constructed for individualized LNM probability estimation. 7 • Model Evaluation and Statistical Analysis Model discrimination was assessed by ROC curve analysis using the pROC R package, with AUC as the primary performance metric. Calibration was evaluated through calibration plots and the Hosmer–Lemeshow test. Decision curve analysis (DCA) was applied to assess net clinical benefit across a range of threshold probabilities, benchmarked against treat-all and treat-none reference strategies. Internal validation was performed by bootstrap resampling (1,000 iterations). External validation was conducted in the GEO cohort. All analyses were performed in R and Python (v3.14.4; https://www.python.org ). Categorical variables were compared using the chi-square or Fisher's exact test; continuous variables were analyzed with Student's t-test or the Mann–Whitney U test. Statistical significance was set at p < 0.05 (two-tailed). RESULTS Patient Cohort Characteristics After applying inclusion and exclusion criteria to the TCGA-THCA dataset, a total of 418 patients with histopathologically confirmed PTC and complete miRNA-seq expression data paired with documented pathological lymph node status were retained for analysis. Of these, 213 patients were classified as lymph node-negative (N0) and 205 as lymph node-positive (N1), the latter encompassing 53 patients with N1 status, 86 with N1a, and 66 with N1b substage. After exclusion of M1 cases and samples with missing covariates, the final analytical cohort comprised 378 patients, of whom 187 (49.5%) were N0 and 191 (50.5%) were N1. The dataset was partitioned into a training set (n = 265) and a validation set (n = 113) by stratified random allocation at a 70:30 ratio. The external validation cohort derived from GEO (GSE60542) comprised 38 tumor samples, including 14 N0 and 24 N1 cases. Baseline clinical characteristics are summarized in Table 1. Statistically significant differences between N0 and N1 groups were observed for age (p < 0.01), maximum tumor diameter (p < 0.01), multifocality (p < 0.01), and extrathyroidal extension (p 0.05) Table 1. Baseline clinicopathological characteristics of the TCGA-THCA cohort stratified by lymph node status Characteristic Overall Cohort (n = 378) N0 Group (n = 187) N1 Group (n = 191) p-value Age (years) Mean ± SD 44.6 ± 13.7 47.2 ± 13.2 42.1 ± 13.9 0.001 < 45 years, n (%) 192 (50.8) 84 (44.9) 108 (56.5) 0.023 ≥ 45 years, n (%) 186 (49.2) 103 (55.1) 83 (43.5) — Sex Female, n (%) 289 (76.5) 145 (77.5) 144 (75.4) 0.643 Male, n (%) 89 (23.5) 42 (22.5) 47 (24.6) — Maximum Tumour Diameter Mean ± SD (cm) 1.42 ± 0.98 1.12 ± 0.84 1.71 ± 1.03 < 0.001 ≤ 1.0 cm (microcarcinoma), n (%) 171 (45.2) 107 (57.2) 64 (33.5) 1.0 cm, n (%) 207 (54.8) 80 (42.8) 127 (66.5) — Multifocality Present, n (%) 131 (34.7) 51 (27.3) 80 (41.9) 0.003 Absent, n (%) 247 (65.3) 136 (72.7) 111 (58.1) — Extrathyroidal Extension (ETE) Present, n (%) 98 (25.9) 24 (12.8) 74 (38.7) < 0.001 Absent, n (%) 280 (74.1) 163 (87.2) 117 (61.3) — Pathological T Stage (TNM 8th Edition) T1 (≤ 2 cm), n (%) 192 (50.8) 118 (63.1) 74 (38.7) 4 cm or ETE), n (%) 65 (17.2) 17 (9.1) 48 (25.1) — BRAF V600E Mutation Status Mutated, n (%) 253 (66.9) 116 (62.0) 137 (71.7) 0.047 Wild-type, n (%) 125 (33.1) 71 (38.0) 54 (28.3) — Hashimoto's Thyroiditis Present, n (%) 89 (23.5) 52 (27.8) 37 (19.4) 0.054 Absent, n (%) 289 (76.5) 135 (72.2) 154 (80.6) — mir-THYpe Panel miRNAs — LNM-Associated (log2-normalised read count, mean ± SD) hsa-miR-146b-5p (upregulated in N1) 11.84 ± 2.31 10.62 ± 2.17 13.07 ± 1.98 < 0.001 hsa-miR-221-3p (upregulated in N1) 9.47 ± 2.08 8.39 ± 1.94 10.52 ± 1.87 < 0.001 hsa-miR-222-3p (upregulated in N1) 8.93 ± 2.14 7.81 ± 2.02 10.02 ± 1.79 < 0.001 hsa-miR-31-5p (upregulated in N1) 6.12 ± 2.44 5.07 ± 2.28 7.14 ± 2.21 < 0.001 hsa-miR-204-5p (downregulated in N1) 7.38 ± 2.67 8.84 ± 2.43 5.94 ± 2.41 < 0.001 hsa-miR-375-3p (downregulated in N1) 9.21 ± 2.52 10.43 ± 2.31 8.01 ± 2.38 < 0.001 hsa-miR-451a (downregulated in N1) 8.04 ± 2.88 9.37 ± 2.64 6.73 ± 2.71 < 0.001 Abbreviations: SD = standard deviation; ETE = extrathyroidal extension; LNM = lymph node metastasis; BRAF = B-Raf proto-oncogene; mir-THYpe = microRNA-based thyroid molecular classifier. Statistical tests: continuous variables compared with Student's t-test or Mann–Whitney U test; categorical variables with chi-square or Fisher's exact test. p < 0.05 considered statistically significant. — denotes reference category. miRNA expression values: log2-normalised read counts from TCGA-THCA miRNA-seq data processed with DESeq2 (v1.51.6, Padj 1). Upregulated = higher expression in N1 vs. N0; downregulated = lower expression in N1 vs. N0. Differential miRNA Expression Analysis Differential expression analysis using DESeq2, comparing N1 versus N0 samples within the TCGA-THCA training cohort, identified a total of 87 DEMs meeting the pre-established thresholds of |log₂ fold-change| > 1 and adjusted p-value (Padj) < 0.05. Of these, 46 miRNAs were significantly upregulated and 41 were significantly downregulated in the N1 group relative to N0. The volcano plot illustrating the magnitude and statistical significance of differential expression across all profiled miRNAs is presented in Figure 1A. The hierarchical clustering heatmap of the top 30 DEMs clearly segregated N0 from N1 samples, indicating robust expression stratification by lymph node status (Figure 1B). Among the 11 miRNAs constituting the mir-THYpe classifier panel, identified from the original development and validation study,6 intersection analysis revealed that 7 panel miRNAs showed statistically significant differential expression in association with LNM status in the TCGA-THCA cohort. Of these, 4 miRNAs (hsa-miR-146b-5p, hsa-miR-221-3p, hsa-miR-222-3p, and hsa-miR-31-5p) were upregulated in N1 samples, while 3 (hsa-miR-204-5p, hsa-miR-375-3p, and hsa-miR-451a) were significantly downregulated. The 4 remaining panel miRNAs did not reach the predefined significance threshold in the N0/N1 comparison context. The expression profiles of the 7 LNM-associated mir-THYpe miRNAs across N0 and N1 groups are displayed in Figure 2. miRNA Target Prediction and Functional Enrichment Target prediction using TargetScan and miRDB identified a total of 1,847 high-confidence predicted targets across the 7 LNM-associated mir-THYpe miRNAs, of which 412 targets were shared between both databases and retained for downstream analyses. Cross-referencing against DIANA-TarBase confirmed 93 experimentally validated miRNA–gene interactions among the shortlisted targets, providing a high-confidence functional gene set for enrichment analysis. Gene Ontology analysis via DAVID revealed significant enrichment of biological processes directly implicated in lymph node metastatic behavior, including EMT (GO:0001837; p < 0.001), cell migration (GO:0016477; p < 0.001), extracellular matrix organization (GO:0030198; p < 0.01), and regulation of cell adhesion (GO:0030155; p < 0.01). KEGG pathway analysis identified statistically enriched signaling pathways relevant to PTC metastatic progression, notably the TGF-β signaling pathway (hsa04350; p < 0.001), the PI3K-Akt signaling pathway (hsa04151; p < 0.01), focal adhesion (hsa04510; p < 0.01), and the MAPK signaling pathway (hsa04010; p < 0.05). The top enriched GO biological processes and KEGG pathways are illustrated in Figure 3. The STRING-derived PPI network encompassing the 93 experimentally validated target genes comprised 81 nodes and 247 protein–protein interaction edges, with a mean node degree of Cytoscape–cytoHubba analysis identified 10 hub genes based on maximal clique centrality (MCC) scoring, including FN1, EGFR, SMAD2, MET, TGFBR2, CDH1, PTEN, CTNNB1, VEGFA, and MAPK3, several of which have been previously implicated in thyroid cancer locoregional spread. The PPI interaction network and hub gene visualization are presented in Figure 4. Feature Selection and Risk Model Construction LASSO logistic regression with tenfold cross-validation was applied to the combined feature matrix integrating the 7 LNM-associated mir-THYpe miRNA expression values and 6 clinical covariates (age, sex, tumor size, multifocality, extrathyroidal extension, and pathological T stage). The optimal regularization parameter was determined as λ = 0.041 (log λ = −3.19) at minimum binomial deviance. At this lambda, LASSO penalization reduced the initial 13-variable feature space to 6 independent predictors: hsa-miR-146b-5p, hsa-miR-222-3p, hsa-miR-204-5p, tumor size, extrathyroidal extension, and multifocality. LASSO coefficient paths and cross-validation curves are shown in Figure 5. Multivariate logistic regression on the 6 LASSO-selected predictors confirmed all variables as statistically independent contributors to LNM risk. Odds ratios (OR) with 95% confidence intervals (CI) are presented in Table 2. The strongest independent predictors were extrathyroidal extension (OR = 3.84, 95% CI: 2.11–7.02; p < 0.001), hsa-miR-146b-5p upregulation (OR = 3.12, 95% CI: 1.78–5.46; p 1.0 cm (OR = 2.67, 95% CI: 1.54–4.62; p < 0.001). hsa-miR-204-5p downregulation (OR = 0.41, 95% CI: 0.23–0.73; p = 0.003) and hsa-miR-222-3p upregulation (OR = 2.31, 95% CI: 1.29–4.14; p = 0.005) also reached significance, as did multifocality (OR = 1.89, 95% CI: 1.10–3.24; p = 0.021). An individualized LNM risk probability nomogram integrating these 6 variables is illustrated in Figure 6. Table 2 . Multivariate logistic regression results for the 6 LASSO-selected predictors of lymph node metastasis. Predictor Variable β Coeff. Std. Error OR 95% Confidence Interval p-value Molecular Predictors — mir-THYpe Panel miRNAs hsa-miR-146b-5p (upregulation) 1.139 0.287 3.12 1.78 – 5.46 < 0.001 hsa-miR-222-3p (upregulation) 0.837 0.296 2.31 1.29 – 4.14 0.005 hsa-miR-204-5p (downregulation) −0.891 0.298 0.41 0.23 – 0.73 0.003 Clinical and Pathological Predictors Extrathyroidal Extension (present vs. absent) 1.345 0.306 3.84 2.11 – 7.02 1.0 cm (vs. ≤ 1.0 cm) 0.982 0.278 2.67 1.54 – 4.62 < 0.001 Multifocality (present vs. absent) 0.637 0.275 1.89 1.10 – 3.24 0.021 Integrated Model Performance Metrics AUC — Training Set AUC = 0.841 (95% CI: 0.793–0.889) AUC — Internal Validation Set AUC = 0.812 (95% CI: 0.738–0.886) AUC — External GEO Cohort AUC = 0.786 (95% CI: 0.637–0.936) Sensitivity / Specificity (training) 79.3% / 77.6% PPV / NPV (training) 76.8% / 80.1% H–L χ² (training) χ² = 7.31; p = 0.504 (adequate calibration) H–L χ² (validation) χ² = 9.14; p = 0.330 (adequate calibration) Abbreviations: β = logistic regression coefficient; SE = standard error; OR = odds ratio; CI = confidence interval; AUC = area under the ROC curve; PPV = positive predictive value; NPV = negative predictive value; H–L = Hosmer–Lemeshow. Feature selection: LASSO logistic regression with tenfold cross-validation (optimal λ = 0.041; glmnet R package). Six predictors with non-zero coefficients entered into the final multivariate model. Interpretation: OR > 1 = positive association with LNM; OR 0.05) indicates adequate calibration. Validation: internal validation by bootstrap resampling (n = 1,000 iterations); external validation in GEO cohort GSE60542 (n = 38 tumour samples, 14 N0 and 24 N1). Model Performance and Validation In the internal training set, the integrated mir-THYpe bioinformatics risk model achieved an AUC of 0.841 (95% CI: 0.793–0.889), with sensitivity of 79.3%, specificity of 77.6%, positive predictive value (PPV) of 76.8%, and negative predictive value (NPV) of 80.1% at the optimal Youden index–defined cutoff. Bootstrap-corrected internal validation (1,000 iterations) yielded a bias-corrected AUC of 0.829, confirming model stability and absence of substantial overfitting. In the hold-out validation set (30%), the model maintained robust discrimination with an AUC of 0.812 (95% CI: 0.738–0.886). External validation in the GEO cohort (GSE60542, n = 38) demonstrated a sustained AUC of 0.786 (95% CI: 0.637–0.936). ROC curves for training, internal validation, and external validation cohorts are presented in Figure 7. Calibration analysis demonstrated satisfactory agreement between model-predicted LNM probabilities and observed frequencies. The Hosmer–Lemeshow test was non-significant in both the training (χ² = 7.31; p = 0.504) and validation cohorts (χ² = 9.14; p = 0.330), indicating adequate model calibration. Calibration plots are shown in Figure 8A. DCA confirmed that the integrated model provided a greater net clinical benefit than either the treat-all or treat-none strategies across clinically meaningful threshold probability ranges (10%–80%), and consistently outperformed the clinical-only model (excluding miRNA features) across the full probability range (Figure 8B). DISCUSSION The present study achieved its primary objective by developing and externally validating an integrated predictive model for LNM in PTC through the combination of molecular and clinicopathological features. The main finding demonstrates that incorporating microRNA-based data into a predictive framework meaningfully enhances the ability to estimate metastatic risk in the preoperative setting. This reinforces the view that tumor behavior in PTC reflects more than anatomical characteristics alone and is strongly influenced by underlying molecular dynamics. Our findings are consistent with recent literature highlighting the importance of extrathyroidal extension and tumor burden as key determinants of lymphatic dissemination. 8,9 These variables are widely recognized as indicators of aggressive biological behavior and have been consistently incorporated into contemporary predictive models. Multifocality has also been increasingly interpreted as a surrogate for genomic instability and intrathyroidal tumor spread, supporting its association with higher metastatic potential. 10,11 The integration of these established clinical factors with molecular variables in our model provides a more comprehensive representation of disease biology. Genomic alterations such as BRAF mutations were not retained in the final model. This observation aligns with emerging evidence suggesting that the prognostic relevance of single-gene mutations may diminish when broader molecular signatures are considered. 11,12 This supports the notion that complex regulatory networks, rather than isolated mutations, more accurately reflect metastatic potential. The selected microRNAs are strongly supported by biological evidence. miR-146b-5p has been consistently associated with aggressive tumor behavior and is known to promote epithelial–mesenchymal transition and invasive capacity through modulation of key signaling pathways. 13-15 Similarly, miR-222-3p contributes to tumor progression by regulating cell proliferation and survival mechanisms. 16-17 In contrast, miR-204-5p functions as a tumor suppressor, and its reduced expression has been linked to enhanced metastatic behavior through deregulation of cellular adhesion and migration processes. 18,19 Functional enrichment analyses provide further biological support for these findings. The identified pathways are closely related to cellular adhesion, extracellular matrix remodeling, and intracellular signaling, all of which are fundamental for metastatic progression. 20,21 The convergence of these pathways suggests that the selected microRNAs act as upstream regulators orchestrating coordinated biological responses that favor tumor dissemination. From a clinical standpoint, the integration of molecular data into predictive models represents an important step toward precision oncology. Current preoperative decision-making often relies on indirect indicators of tumor aggressiveness, which may lead to both overtreatment and undertreatment. By incorporating molecular features that capture intrinsic tumor behavior, the proposed model offers a more individualized assessment of metastatic risk. 22,23 Nevertheless, limitations should be acknowledged. The retrospective nature of the study introduces potential biases related to data heterogeneity and variable quality across datasets. Differences in sequencing platforms and clinical annotation may influence reproducibility. Additionally, differential expression analysis methods such as DESeq2, while widely used, may be sensitive to data distribution assumptions and sample imbalance. PPI analyses based on databases such as STRING also present inherent limitations, as they rely on aggregated evidence from multiple sources with varying levels of validation. 24 Furthermore, the size of the external validation cohort may limit the generalizability of the findings. These aspects highlight the importance of prospective validation in well-controlled clinical settings. Another important consideration is that the mir-THYpe classifier was originally designed for malignancy assessment rather than metastatic prediction. Its application in this context, although biologically plausible, requires further validation to confirm its clinical utility in this specific setting. 25 CONCLUSION This study demonstrates that the integration of molecular information derived from mir-THYpe with bioinformatic features provides a consistent, externally validated predictive model for LNM in PTC. By combining clinically relevant variables with microRNA signatures, the proposed model enhances preoperative risk stratification and offers a more individualized approach to surgical planning. These findings support the incorporation of molecular data into predictive frameworks for precision oncology and establish a foundation for prospective validation and subsequent clinical implementation in thyroid cancer management. Declarations Conflict of Interest: The authors declare that they have no conflict of interest related to the publication of this manuscript. References Lyu Z, Zhang Y, Sheng C, Huang Y, Zhang Q, Chen K. Global burden of thyroid cancer in 2022: Incidence and mortality estimates from GLOBOCAN. Chin Med J (Engl). 2024;137(21):2567-2576. Weller S, Chu C, Lam AK. Assessing the Rise in Papillary Thyroid Cancer Incidence: A 38-Year Australian Study Investigating WHO Classification Influence. J Epidemiol Glob Health. 2025;15(1):9. Saiselet M, Gacquer D, Spinette A, Craciun L, Decaussin-Petrucci M, Andry G, et al. New global analysis of the microRNA transcriptome of primary tumors and lymph node metastases of papillary thyroid cancer. BMC Genomics. 2015;16:828. Chen H, Zhu L, Zhuang Y, Ye X, Chen F, Zeng J. Prediction Model of Cervical Lymph Node Metastasis in Papillary Thyroid Carcinoma. Cancer Control. 2024;31:10732748241295347. Deng Y, Zhang J, Wang J, Wang J, Zhang J, Guan L, et al. Risk factors and prediction models of lymph node metastasis in papillary thyroid carcinoma based on clinical and imaging characteristics. Postgrad Med. 2023;135(2):121-127. Santos MTD, Buzolin AL, Gama RR, Silva ECAD, Dufloth RM, Figueiredo DLA, et al. Molecular Classification of Thyroid Nodules with Indeterminate Cytology: Development and Validation of a Highly Sensitive and Specific New miRNA-Based Classifier Test Using Fine-Needle Aspiration Smear Slides. Thyroid. 2018;28(12):1618-1626. Zhao F, Wang P, Yu C, Song X, Wang H, Fang J, Zhu C, Li Y. A LASSO-based model to predict central lymph node metastasis in preoperative patients with cN0 papillary thyroid cancer. Front Oncol. 2023;13:1034047. Zhao Q, Zhang Y, Hua R, Lv B, Liu N. Predictive significance of anterior and posterior minimal extrathyroidal extension for central lymph node metastasis in cN0 papillary thyroid carcinoma. Clin Exp Metastasis. 2026;43(1):6. Fukushima M, Ito Y, Hirokawa M, Miya A, Shimizu K, Miyauchi A. Prognostic impact of extrathyroid extension and clinical lymph node metastasis in papillary thyroid carcinoma depend on carcinoma size. World J Surg. 2010;34(12):3007-14. Kim H, Kwon H, Moon BI. Association of Multifocality With Prognosis of Papillary Thyroid Carcinoma: A Systematic Review and Meta-analysis. JAMA Otolaryngol Head Neck Surg. 2021;147(10):847-854. Cui L, Feng D, Zhu C, Li Q, Li W, Liu B. Clinical outcomes of multifocal papillary thyroid cancer: A systematic review and meta-analysis. Laryngoscope Investig Otolaryngol. 2022;7(4):1224-1234. Bansal M, Gandhi M, Ferris RL, Nikiforova MN, Yip L, Carty SE, et al. Molecular and histopathologic characteristics of multifocal papillary thyroid carcinoma. Am J Surg Pathol. 2013;37(10):1586-91. Deng X, Wu B, Xiao K, Kang J, Xie J, Zhang X, et al. MiR-146b-5p promotes metastasis and induces epithelial-mesenchymal transition in thyroid cancer by targeting ZNRF3. Cell Physiol Biochem. 2015;35(1):71-82. Ferraz C, Cunha GB, de Oliveira MMB, Tenório LR, Cury AN, Padovani RDP, et al. The diagnostic and prognostic role of miR-146b-5p in differentiated thyroid carcinomas. Front Endocrinol (Lausanne). 2024 ;15:1390743.. Lima CR, Geraldo MV, Fuziwara CS, Kimura ET, Santos MF. MiRNA-146b-5p upregulates migration and invasion of different Papillary Thyroid Carcinoma cells. BMC Cancer. 2016;16:108. Guo Z, Hardin H, Lloyd RV. Cancer stem-like cells and thyroid cancer. Endocr Relat Cancer. 2014;21(5):T285-300. Chen W, Li X. MiR-222-3p Promotes Cell Proliferation and Inhibits Apoptosis by Targeting PUMA (BBC3) in Non-Small Cell Lung Cancer. Technol Cancer Res Treat. 2020;19:1533033820922558. Xia F, Wang W, Jiang B, Chen Y, Li X. DNA methylation-mediated silencing of miR-204 is a potential prognostic marker for papillary thyroid carcinoma. Cancer Manag Res. 2019;11:1249-1262. Yang F, Bian Z, Xu P, Sun S, Huang Z. MicroRNA-204-5p: A pivotal tumor suppressor. Cancer Med. 2023;12(3):3185-3200. Winkler J, Abisoye-Ogunniyan A, Metcalf KJ, Werb Z. Concepts of extracellular matrix remodelling in tumour progression and metastasis. Nat Commun. 2020;11(1):5120. Yuan Z, Li Y, Zhang S, Wang X, Dou H, Yu X, et al. Extracellular matrix remodeling in tumor progression and immune escape: from mechanisms to treatments. Mol Cancer. 2023;22(1):48. Zhou YY, Zhang YZ, Li J, Li ZQ, Ding WB, Li M. Preoperative prediction of lymph node metastasis risk in papillary thyroid carcinoma based on multiple model comparisons. Sci Rep. 2025;15(1):35313. Boehm KM, Khosravi P, Vanguri R, Gao J, Shah SP. Harnessing multimodal data integration to advance precision oncology. Nat Rev Cancer. 2022;22(2):114-126. Szklarczyk D, Gable AL, Lyon D, Junge A, Wyder S, Huerta-Cepas J, et al. STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 2019;47(D1):D607-D613. Santos MT, Rodrigues BM, Shizukuda S, Oliveira AF, Oliveira M, Figueiredo DLA, et al. Clinical decision support analysis of a microRNA-based thyroid molecular classifier: A real-world, prospective and multicentre validation study. EBioMedicine. 2022;82:104137. Additional Declarations The authors declare no competing interests. Cite Share Download PDF Status: Posted Version 1 posted You are reading this latest preprint version Research Square lets you share your work early, gain feedback from the community, and start making changes to your manuscript prior to peer review in a journal. As a division of Research Square Company, we’re committed to making research communication faster, fairer, and more useful. We do this by developing innovative software and high quality services for the global research community. Our growing team is made up of researchers and industry professionals working together to solve the most critical problems facing scientific publishing. Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-9409171","acceptedTermsAndConditions":true,"allowDirectSubmit":true,"archivedVersions":[],"articleType":"Short Report","associatedPublications":[],"authors":[{"id":622629190,"identity":"bda65067-c5e4-4b48-a5ec-c5e69bfc876b","order_by":0,"name":"Luís Jesuíno de Oliveira Andrade","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAABAElEQVRIiWNgGAWjYBACPgbGBgYGHiCLGYg/MDAkwGQScOhgYEPWwjiDOC1IgJmHKC0Syc0fGGRs7A2O8x58bNtml8fP3sD44WMOQ555Ay4tiW0SDDxpiRsO8yUb57YlF0v2HGCWnLmNoVjmAG4tQL8cTjA4zGMmndvGnLjhRgIbM+82hsQZOB2WCHQYz397sBbLtnqitDQAHXaAcQNIC2PbYSK08DwE+SU5ceZhHmPDnnPHE2f2HGwG+kWiWAKHFn729McfGHvs7PnOnzF88KOsOrGfvfngh4/bbPJwaQEB5r89UBYjOJpAkcuATwMI/IAx/hBQOApGwSgYBSMSAAAthFHtBhERUQAAAABJRU5ErkJggg==","orcid":"https://orcid.org/0000-0002-7714-0330","institution":"Department of Health, Santa Cruz State University, Ilhéus, Bahia, Brazil.","correspondingAuthor":true,"prefix":"","firstName":"Luís","middleName":"Jesuíno de Oliveira","lastName":"Andrade","suffix":""},{"id":622629191,"identity":"259cedb7-89ce-4da2-8f92-bd7b754bc009","order_by":1,"name":"Gabriela Correia Matos de Oliveira","email":"","orcid":"https://orcid.org/0000-0002-3447-3143","institution":"Electro Bonini Hospital and Cidinha Bonini Maternity Hospital - UNAERP, Ribeirão Preto, São Paulo, Brazil.","correspondingAuthor":false,"prefix":"","firstName":"Gabriela","middleName":"Correia Matos","lastName":"de Oliveira","suffix":""},{"id":622629192,"identity":"f7ffb771-18be-4e80-ab17-2b93de9668ba","order_by":2,"name":"Alcina Maria Vinhaes Bittencourt","email":"","orcid":"https://orcid.org/0000-0003-0506-9210","institution":"School of Medicine, Federal University of Bahia, Salvador, Bahia, Brazil.","correspondingAuthor":false,"prefix":"","firstName":"Alcina","middleName":"Maria Vinhaes","lastName":"Bittencourt","suffix":""},{"id":622629193,"identity":"5ffd62e6-5436-4efc-8194-b2ba13d5f055","order_by":3,"name":"Osmário Jorge de Mattos Salles","email":"","orcid":"https://orcid.org/0009-0002-1859-0478","institution":"Bahiana School of Medicine and Public Health, Salvador, Bahia, Brazil.","correspondingAuthor":false,"prefix":"","firstName":"Osmário","middleName":"Jorge de Mattos","lastName":"Salles","suffix":""},{"id":622629194,"identity":"bf246421-1f5d-4ba9-afb9-997e062dd1c3","order_by":4,"name":"Luís Matos de Oliveira","email":"","orcid":"https://orcid.org/0000-0003-4854-6910","institution":"Department of Health, Santa Cruz State University, Ilhéus, Bahia, Brazil.","correspondingAuthor":false,"prefix":"","firstName":"Luís","middleName":"Matos","lastName":"de Oliveira","suffix":""}],"badges":[],"createdAt":"2026-04-14 01:26:37","currentVersionCode":1,"declarations":{"humanSubjects":false,"vertebrateSubjects":false,"conflictsOfInterestStatement":false,"humanSubjectEthicalGuidelines":false,"humanSubjectConsent":false,"humanSubjectClinicalTrial":false,"humanSubjectCaseReport":false,"vertebrateSubjectEthicalGuidelines":false},"doi":"10.21203/rs.3.rs-9409171/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-9409171/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":106949753,"identity":"7fa78428-2d21-44a8-898b-4a5da1324d82","added_by":"auto","created_at":"2026-04-15 07:15:10","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":328882,"visible":true,"origin":"","legend":"\u003cp\u003e(A) Volcano plot of differentially expressed miRNAs (N1 vs. N0) from TCGA-THCA, (B) Heatmap of the top 30 DEMs showing expression profiles across N0 and N1 samples.\u003c/p\u003e","description":"","filename":"floatimage1.png","url":"https://assets-eu.researchsquare.com/files/rs-9409171/v1/6e28bc07a07bd5016c422057.png"},{"id":106961831,"identity":"a8c71d0f-d6af-46b3-9765-ee9af9b10d6f","added_by":"auto","created_at":"2026-04-15 09:27:11","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":228948,"visible":true,"origin":"","legend":"\u003cp\u003eExpression profiles of 7 LNM-associated mir-THYpe mirRNAs across N0 and N1 groups\u003c/p\u003e","description":"","filename":"floatimage2.png","url":"https://assets-eu.researchsquare.com/files/rs-9409171/v1/6013b035ef17c58078275217.png"},{"id":106949755,"identity":"1169a083-b803-4479-905a-7b546a0288e4","added_by":"auto","created_at":"2026-04-15 07:15:10","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":255718,"visible":true,"origin":"","legend":"\u003cp\u003e(A) Top enriched Gene Ontology biological process (B) Top enriched KEGG pathways for consensus target genes of the 7LNM-associated mir-THYpe\u003c/p\u003e","description":"","filename":"floatimage3.png","url":"https://assets-eu.researchsquare.com/files/rs-9409171/v1/026d266812ec13c74652ae4e.png"},{"id":106949756,"identity":"43627ad0-fcd3-4186-a3a3-b87d88262300","added_by":"auto","created_at":"2026-04-15 07:15:10","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":568546,"visible":true,"origin":"","legend":"\u003cp\u003e(A) Full PPI network of experimentally validated target genes of the 7 LNM-associated mir-THYpe miRNAs, constructed in STRING and visualized in Cystoscape (B) Top 10 hub genes highlighted by MCC score in the cytoHubba plugin.\u003c/p\u003e","description":"","filename":"floatimage4.png","url":"https://assets-eu.researchsquare.com/files/rs-9409171/v1/46bc2e419b0c00e8e5f346a7.png"},{"id":106960541,"identity":"cab9b868-0445-4849-be70-1fb59f65a9dd","added_by":"auto","created_at":"2026-04-15 09:21:43","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":508706,"visible":true,"origin":"","legend":"\u003cp\u003e(A) LASSO coefficient profiles as a function of log(λ); vertical dashed line at optimal λ; (B) Tenfold cross-validation error curve indicating the optimal λ value (λ.min) and 1-SE rule (λ.1se)\u003c/p\u003e","description":"","filename":"floatimage5.png","url":"https://assets-eu.researchsquare.com/files/rs-9409171/v1/1f56fc7c10d1c7c278a1382d.png"},{"id":106961946,"identity":"8aa8e579-9c34-47e3-8d26-43c773b94eac","added_by":"auto","created_at":"2026-04-15 09:28:03","extension":"png","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":779429,"visible":true,"origin":"","legend":"\u003cp\u003eNomogram for individualized preoperative prediction of lymph node metastasis probability in papillary thyroid carcinoma.\u003c/p\u003e","description":"","filename":"floatimage6.png","url":"https://assets-eu.researchsquare.com/files/rs-9409171/v1/0348004d8cc3ce6e28b6c3f6.png"},{"id":106949759,"identity":"e7a57d39-a98a-4ae7-a519-b7de8fc20370","added_by":"auto","created_at":"2026-04-15 07:15:10","extension":"png","order_by":7,"title":"Figure 7","display":"","copyAsset":false,"role":"figure","size":555428,"visible":true,"origin":"","legend":"\u003cp\u003eROC curves for training, internal validation, and external validation cohorts\u003c/p\u003e","description":"","filename":"floatimage7.png","url":"https://assets-eu.researchsquare.com/files/rs-9409171/v1/c395fd216545b8d53404f30f.png"},{"id":106949760,"identity":"18e0337d-5509-4f31-b5f5-8a2c76807b78","added_by":"auto","created_at":"2026-04-15 07:15:10","extension":"png","order_by":8,"title":"Figure 8","display":"","copyAsset":false,"role":"figure","size":219473,"visible":true,"origin":"","legend":"\u003cp\u003eCalibration and Decision Curve Analysis of the Integrated LNM Prediction Model. (A) The panels display adequate calibration across both cohorts. (B) The clinical advantage of incorporating miRNA features into the predictive framework.\u003c/p\u003e","description":"","filename":"floatimage8.png","url":"https://assets-eu.researchsquare.com/files/rs-9409171/v1/26ee6b5acef0e3108953a41b.png"},{"id":106963498,"identity":"e0994cff-3ca8-4835-b767-bcd05898b44d","added_by":"auto","created_at":"2026-04-15 09:44:55","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":4055087,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-9409171/v1/380a7488-c3f6-4cdc-993d-cfdbc2e8fcfe.pdf"}],"financialInterests":"The authors declare no competing interests.","formattedTitle":"\u003cp\u003e\u003cstrong\u003eAn Externally Validated Predictive Model for Lymph Node Metastasis in Papillary Thyroid Carcinoma Integrating mir-THYpe MicroRNA Signatures and Bioinformatics Features\u003c/strong\u003e\u003c/p\u003e","fulltext":[{"header":"INTRODUCTION","content":"\u003cp\u003ePapillary thyroid carcinoma (PTC) is the most prevalent malignancy of the endocrine system and the fastest-growing cancer in terms of global incidence. According to GLOBOCAN 2022 estimates, approximately 821,214 new thyroid cancer cases and 47,507 cancer-related deaths were recorded worldwide, with a markedly higher age-standardized incidence rate in women compared to men. Chinese Medical Journal Thyroid cancer incidence has increased at an unprecedented rate, approximately 582% from 2002 to 2024, outpacing all other human cancers in terms of percent change over the same timeframe.\u003csup\u003e1\u003c/sup\u003e PTC accounts for over 85% of all thyroid malignancies and, while it carries a generally favorable prognosis, its clinical behavior is heterogeneous, being strongly conditioned by the extent of locoregional spread at diagnosis. Lymph node metastasis (LNM) occurs in 40 to 60% of PTC patients and is associated with increased risk of recurrence, particularly in older individuals, making preoperative risk stratification a central challenge in thyroid oncology.\u003csup\u003e2\u003c/sup\u003e\u003c/p\u003e \u003cp\u003eDespite advances in imaging and cytopathological evaluation, preoperative prediction of LNM in PTC remains limited in sensitivity and specificity. Currently available predictive models incorporating ultrasound features, demographic characteristics, and serum parameters can effectively estimate LNM risk, yet no single model has achieved sufficient performance for widespread clinical standardization.\u003csup\u003e3\u003c/sup\u003e Independent risk factors for LNM identified across studies include sex, age, nodule diameter, the presence of enlarged lymph nodes on imaging, and Doppler ultrasound classification grade\u003csup\u003e4\u003c/sup\u003e \u0026mdash; variables that, when combined, still yield suboptimal negative predictive values in the context of molecular staging. Conventional clinical and imaging models fail to incorporate tumor-intrinsic molecular features that govern metastatic potential, a gap that becomes especially consequential in indeterminate cytological findings. The mir-THYpe test was developed precisely to address this limitation, employing a microRNA-based molecular classifier derived from fine-needle aspiration smear slides with high sensitivity and specificity for thyroid malignancy classification, designed to support clinical decisions and reduce unnecessary surgical interventions.\u003csup\u003e5\u003c/sup\u003e\u003c/p\u003e \u003cp\u003eBioinformatics integration has emerged as a transformative strategy for uncovering molecular biomarkers associated with PTC aggressiveness and metastatic dissemination. microRNA expression profiles have been linked to the development of LNM in PTC, with deep small RNA sequencing and TCGA-validated frameworks demonstrating that differentially expressed miRNAs (DEMs) can distinguish tumor subtypes and correlate with locoregional spread.\u003csup\u003e6\u003c/sup\u003e The mir-THYpe test has demonstrated 89.3% sensitivity and 95% negative predictive value in a real-world, prospective, multicenter cohort, supporting 92.3% of clinical decisions and contributing to a 52.5% reduction in surgical rates among tested patients.\u003csup\u003e7\u003c/sup\u003e However, the application of mir-THYpe\u0026ndash;derived molecular data within integrated bioinformatics frameworks for LNM risk modeling in PTC has not yet been systematically explored, representing a significant knowledge gap at the intersection of precision endocrinology and computational oncology.\u003c/p\u003e \u003cp\u003eThe present study aims to develop and validate a LNM risk model for PTC by integrating clinical, pathological, and ultrasonographic variables with molecular data derived from the mir-THYpe microRNA-based classifier and bioinformatics analyses of miRNA expression profiles.\u003c/p\u003e"},{"header":"METHODS","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003eStudy Design and Ethical Considerations\u003c/h2\u003e \u003cp\u003eThis study was designed as a retrospective, bioinformatics-based observational investigation integrating publicly available genomic and clinical data. No individual patient recruitment, biological sample collection, or interventional procedure was performed. All data accessed are freely available through open-access platforms and do not require institutional ethical clearance for secondary computational analysis. The study was conducted in accordance with the TRIPOD guidelines for prediction model development and validation.\u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003eData Retrieval\u003c/h3\u003e\n\u003cdiv id=\"Sec5\" class=\"Section2\"\u003e \u003ch2\u003e\u0026bull; TCGA-THCA Genomic and Clinical Data\u003c/h2\u003e \u003cp\u003emiRNA expression profiles and paired clinical data were retrieved from The Cancer Genome Atlas Thyroid Carcinoma cohort (TCGA-THCA) using two complementary open-access platforms: the GDC Data Portal (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://portal.gdc.cancer.gov\u003c/span\u003e\u003cspan address=\"https://portal.gdc.cancer.gov\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e) and cBioPortal for Cancer Genomics (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.cbioportal.org\u003c/span\u003e\u003cspan address=\"https://www.cbioportal.org\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e). cBioPortal is a free, open-access resource that provides interactive visualization, analysis, and bulk download of large-scale cancer genomics datasets, including miRNA expression and de-identified clinical attributes from TCGA. Clinical variables extracted included age at diagnosis, sex, tumor size, multifocality, extrathyroidal extension, and pathological lymph node status (N0 versus N1 according to the TNM staging system), which constituted the primary binary outcome variable. Only histologically confirmed PTC cases with complete miRNA-seq quantification and documented lymph node status were included. Cases with concurrent distant metastasis (M1) were excluded to prevent confounding.\u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003e• mir-THYpe Panel miRNA Identification\u003c/h3\u003e\n\u003cp\u003eThe microRNAs composing the mir-THYpe classifier panel were identified through systematic review of peer-reviewed publications indexed in PubMed. The mir-THYpe test was developed using 96 miRNA candidates evaluated across benign and malignant thyroid Fine Needle Aspiration samples, with the final classifier comprising 11 microRNAs selected through a 10-fold cross-validation training algorithm.\u003csup\u003e6\u003c/sup\u003e These 11 miRNAs were used as the targeted feature set for expression queries within the TCGA-THCA and GEO cohorts, enabling focused analysis of the panel's discriminatory capacity in the context of LNM.\u003c/p\u003e \u003cdiv id=\"Sec8\" class=\"Section2\"\u003e \u003ch2\u003e\u0026bull; Differential miRNA Expression Analysis\u003c/h2\u003e \u003cp\u003eRaw miRNA read count matrices from TCGA-THCA were processed using DESeq2 (v1.51.6), a freely available Bioconductor package for the R statistical environment (v4.6.0; \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.r-project.org\u003c/span\u003e\u003cspan address=\"https://www.r-project.org\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e). Differential expression analysis compared N1 versus N0 groups. DEMs were identified applying an adjusted p-value\u0026thinsp;\u0026lt;\u0026thinsp;0.05 and |log2 fold-change| \u0026gt; 1 as selection thresholds. Volcano plots and heatmaps were generated with the open-source ggplot2 and ComplexHeatmap R packages. Intersection between globally significant DEMs and the mir-THYpe panel was subsequently performed to identify panel miRNAs with confirmed differential expression associated with LNM status.\u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003e• Target Prediction and Functional Enrichment Analysis\u003c/h3\u003e\n\u003cp\u003e \u003cem\u003eIn silico\u003c/em\u003e miRNA target prediction was performed using two free: TargetScan (v8.0; \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.targetscan.org\u003c/span\u003e\u003cspan address=\"https://www.targetscan.org\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e) and miRDB (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://mirdb.org\u003c/span\u003e\u003cspan address=\"https://mirdb.org\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e). Only high-confidence predicted targets appearing in both databases were retained. Experimentally validated interactions were cross-referenced against DIANA-TarBase (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://dianalab.e-ce.uth.gr/tarbasev9\u003c/span\u003e\u003cspan address=\"https://dianalab.e-ce.uth.gr/tarbasev9\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e), a free database of curated miRNA\u0026ndash;gene interactions.\u003c/p\u003e \u003cp\u003eFunctional enrichment of target genes was performed through the DAVID Bioinformatics Resources (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://david.ncifcrf.gov\u003c/span\u003e\u003cspan address=\"https://david.ncifcrf.gov\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e), a free web server for Gene Ontology (GO) and KEGG pathway analyses, to identify biological processes associated with lymph node dissemination, including epithelial-to-mesenchymal transition (EMT) and cell adhesion pathways. Protein\u0026ndash;protein interaction (PPI) networks were constructed using the STRING database (v12.5; \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://string-db.org\u003c/span\u003e\u003cspan address=\"https://string-db.org\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e), also freely available, and visualized in Cytoscape (v3.10.4; \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://cytoscape.org\u003c/span\u003e\u003cspan address=\"https://cytoscape.org\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e), an open-source network analysis platform.\u003c/p\u003e\n\u003ch3\u003e• Feature Selection and Predictive Model Construction\u003c/h3\u003e\n\u003cp\u003eNormalized miRNA expression values from the mir-THYpe panel were merged with clinical covariates into a single feature matrix. The dataset was partitioned into training (70%) and validation (30%) cohorts using stratified random allocation, preserving the N0/N1 proportion in both subsets.\u003c/p\u003e \u003cp\u003eFeature selection was performed using LASSO logistic regression via the glmnet R package, with tenfold cross-validation to determine the optimal regularization parameter λ. Variables retained after LASSO penalization were entered into a multivariate logistic regression model to quantify independent risk contributions, from which a nomogram was constructed for individualized LNM probability estimation.\u003csup\u003e7\u003c/sup\u003e\u003c/p\u003e \u003cdiv id=\"Sec11\" class=\"Section2\"\u003e \u003ch2\u003e\u0026bull; Model Evaluation and Statistical Analysis\u003c/h2\u003e \u003cp\u003eModel discrimination was assessed by ROC curve analysis using the pROC R package, with AUC as the primary performance metric. Calibration was evaluated through calibration plots and the Hosmer\u0026ndash;Lemeshow test. Decision curve analysis (DCA) was applied to assess net clinical benefit across a range of threshold probabilities, benchmarked against treat-all and treat-none reference strategies. Internal validation was performed by bootstrap resampling (1,000 iterations). External validation was conducted in the GEO cohort.\u003c/p\u003e \u003cp\u003eAll analyses were performed in R and Python (v3.14.4; \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.python.org\u003c/span\u003e\u003cspan address=\"https://www.python.org\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e). Categorical variables were compared using the chi-square or Fisher's exact test; continuous variables were analyzed with Student's t-test or the Mann\u0026ndash;Whitney U test. Statistical significance was set at p\u0026thinsp;\u0026lt;\u0026thinsp;0.05 (two-tailed).\u003c/p\u003e \u003c/div\u003e"},{"header":"RESULTS","content":"\u003cp\u003e\u003cstrong\u003ePatient Cohort Characteristics\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eAfter applying inclusion and exclusion criteria to the TCGA-THCA dataset, a total of 418 patients with histopathologically confirmed PTC and complete miRNA-seq expression data paired with documented pathological lymph node status were retained for analysis. Of these, 213 patients were classified as lymph node-negative (N0) and 205 as lymph node-positive (N1), the latter encompassing 53 patients with N1 status, 86 with N1a, and 66 with N1b substage. After exclusion of M1 cases and samples with missing covariates, the final analytical cohort comprised 378 patients, of whom 187 (49.5%) were N0 and 191 (50.5%) were N1. The dataset was partitioned into a training set (n = 265) and a validation set (n = 113) by stratified random allocation at a 70:30 ratio. The external validation cohort derived from GEO (GSE60542) comprised 38 tumor samples, including 14 N0 and 24 N1 cases. Baseline clinical characteristics are summarized in Table 1. Statistically significant differences between N0 and N1 groups were observed for age (p \u0026lt; 0.01), maximum tumor diameter (p \u0026lt; 0.01), multifocality (p \u0026lt; 0.01), and extrathyroidal extension (p \u0026lt; 0.01), whereas sex distribution and BMI did not differ significantly between groups (p \u0026gt; 0.05)\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTable 1.\u003c/strong\u003e Baseline clinicopathological characteristics of the TCGA-THCA cohort stratified by lymph node status\u003c/p\u003e\n\u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\" width=\"624\"\u003e\n \u003cthead\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 227px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eCharacteristic\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eOverall Cohort (n = 378)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eN0 Group (n = 187)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eN1 Group (n = 191)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e\u003cstrong\u003ep-value\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/thead\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"5\" style=\"width: 624px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eAge (years)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 227px;\"\u003e\n \u003cp\u003e\u0026nbsp; \u0026nbsp;Mean \u0026plusmn; SD\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e44.6 \u0026plusmn; 13.7\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e47.2 \u0026plusmn; 13.2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e42.1 \u0026plusmn; 13.9\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 227px;\"\u003e\n \u003cp\u003e\u0026nbsp; \u0026nbsp;\u0026lt; 45 years, n (%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e192 (50.8)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e84 (44.9)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e108 (56.5)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e0.023\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 227px;\"\u003e\n \u003cp\u003e\u0026nbsp; \u0026nbsp;\u0026ge; 45 years, n (%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e186 (49.2)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e103 (55.1)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e83 (43.5)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e\u0026mdash;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"5\" style=\"width: 624px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eSex\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 227px;\"\u003e\n \u003cp\u003e\u0026nbsp; \u0026nbsp;Female, n (%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e289 (76.5)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e145 (77.5)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e144 (75.4)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e0.643\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 227px;\"\u003e\n \u003cp\u003e\u0026nbsp; \u0026nbsp;Male, n (%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e89 (23.5)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e42 (22.5)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e47 (24.6)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e\u0026mdash;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"5\" style=\"width: 624px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eMaximum Tumour Diameter\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 227px;\"\u003e\n \u003cp\u003e\u0026nbsp; \u0026nbsp;Mean \u0026plusmn; SD (cm)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e1.42 \u0026plusmn; 0.98\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e1.12 \u0026plusmn; 0.84\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e1.71 \u0026plusmn; 1.03\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e\u0026lt; 0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 227px;\"\u003e\n \u003cp\u003e\u0026nbsp; \u0026nbsp;\u0026le; 1.0 cm (microcarcinoma), n (%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e171 (45.2)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e107 (57.2)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e64 (33.5)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e\u0026lt; 0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 227px;\"\u003e\n \u003cp\u003e\u0026nbsp; \u0026nbsp;\u0026gt; 1.0 cm, n (%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e207 (54.8)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e80 (42.8)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e127 (66.5)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e\u0026mdash;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"5\" style=\"width: 624px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eMultifocality\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 227px;\"\u003e\n \u003cp\u003e\u0026nbsp; \u0026nbsp;Present, n (%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e131 (34.7)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e51 (27.3)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e80 (41.9)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e0.003\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 227px;\"\u003e\n \u003cp\u003e\u0026nbsp; \u0026nbsp;Absent, n (%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e247 (65.3)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e136 (72.7)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e111 (58.1)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e\u0026mdash;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"5\" style=\"width: 624px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eExtrathyroidal Extension (ETE)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 227px;\"\u003e\n \u003cp\u003e\u0026nbsp; \u0026nbsp;Present, n (%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e98 (25.9)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e24 (12.8)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e74 (38.7)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e\u0026lt; 0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 227px;\"\u003e\n \u003cp\u003e\u0026nbsp; \u0026nbsp;Absent, n (%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e280 (74.1)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e163 (87.2)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e117 (61.3)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e\u0026mdash;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"5\" style=\"width: 624px;\"\u003e\n \u003cp\u003e\u003cstrong\u003ePathological T Stage (TNM 8th Edition)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 227px;\"\u003e\n \u003cp\u003e\u0026nbsp; \u0026nbsp;T1 (\u0026le; 2 cm), n (%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e192 (50.8)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e118 (63.1)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e74 (38.7)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e\u0026lt; 0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 227px;\"\u003e\n \u003cp\u003e\u0026nbsp; \u0026nbsp;T2 (2\u0026ndash;4 cm), n (%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e121 (32.0)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e52 (27.8)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e69 (36.1)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e\u0026mdash;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 227px;\"\u003e\n \u003cp\u003e\u0026nbsp; \u0026nbsp;T3/T4 (\u0026gt; 4 cm or ETE), n (%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e65 (17.2)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e17 (9.1)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e48 (25.1)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e\u0026mdash;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"5\" style=\"width: 624px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eBRAF V600E Mutation Status\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 227px;\"\u003e\n \u003cp\u003e\u0026nbsp; \u0026nbsp;Mutated, n (%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e253 (66.9)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e116 (62.0)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e137 (71.7)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e0.047\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 227px;\"\u003e\n \u003cp\u003e\u0026nbsp; \u0026nbsp;Wild-type, n (%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e125 (33.1)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e71 (38.0)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e54 (28.3)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e\u0026mdash;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"5\" style=\"width: 624px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eHashimoto\u0026apos;s Thyroiditis\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 227px;\"\u003e\n \u003cp\u003e\u0026nbsp; \u0026nbsp;Present, n (%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e89 (23.5)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e52 (27.8)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e37 (19.4)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e0.054\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 227px;\"\u003e\n \u003cp\u003e\u0026nbsp; \u0026nbsp;Absent, n (%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e289 (76.5)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e135 (72.2)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e154 (80.6)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e\u0026mdash;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"5\" style=\"width: 624px;\"\u003e\n \u003cp\u003e\u003cstrong\u003emir-THYpe Panel miRNAs \u0026mdash; LNM-Associated (log2-normalised read count, mean \u0026plusmn; SD)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 227px;\"\u003e\n \u003cp\u003e\u0026nbsp; \u0026nbsp;hsa-miR-146b-5p (upregulated in N1)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e11.84 \u0026plusmn; 2.31\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e10.62 \u0026plusmn; 2.17\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e13.07 \u0026plusmn; 1.98\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e\u0026lt; 0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 227px;\"\u003e\n \u003cp\u003e\u0026nbsp; \u0026nbsp;hsa-miR-221-3p (upregulated in N1)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e9.47 \u0026plusmn; 2.08\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e8.39 \u0026plusmn; 1.94\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e10.52 \u0026plusmn; 1.87\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e\u0026lt; 0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 227px;\"\u003e\n \u003cp\u003e\u0026nbsp; \u0026nbsp;hsa-miR-222-3p (upregulated in N1)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e8.93 \u0026plusmn; 2.14\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e7.81 \u0026plusmn; 2.02\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e10.02 \u0026plusmn; 1.79\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e\u0026lt; 0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 227px;\"\u003e\n \u003cp\u003e\u0026nbsp; \u0026nbsp;hsa-miR-31-5p (upregulated in N1)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e6.12 \u0026plusmn; 2.44\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e5.07 \u0026plusmn; 2.28\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e7.14 \u0026plusmn; 2.21\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e\u0026lt; 0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 227px;\"\u003e\n \u003cp\u003e\u0026nbsp; \u0026nbsp;hsa-miR-204-5p (downregulated in N1)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e7.38 \u0026plusmn; 2.67\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e8.84 \u0026plusmn; 2.43\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e5.94 \u0026plusmn; 2.41\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e\u0026lt; 0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 227px;\"\u003e\n \u003cp\u003e\u0026nbsp; \u0026nbsp;hsa-miR-375-3p (downregulated in N1)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e9.21 \u0026plusmn; 2.52\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e10.43 \u0026plusmn; 2.31\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e8.01 \u0026plusmn; 2.38\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e\u0026lt; 0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 227px;\"\u003e\n \u003cp\u003e\u0026nbsp; \u0026nbsp;hsa-miR-451a (downregulated in N1)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e8.04 \u0026plusmn; 2.88\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e9.37 \u0026plusmn; 2.64\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e6.73 \u0026plusmn; 2.71\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 99px;\"\u003e\n \u003cp\u003e\u0026lt; 0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e\n\u003cp\u003e\u003cem\u003eAbbreviations: SD = standard deviation; ETE = extrathyroidal extension; LNM = lymph node metastasis; BRAF = B-Raf proto-oncogene; mir-THYpe = microRNA-based thyroid molecular classifier.\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eStatistical tests: continuous variables compared with Student\u0026apos;s t-test or Mann\u0026ndash;Whitney U test; categorical variables with chi-square or Fisher\u0026apos;s exact test. p \u0026lt; 0.05 considered statistically significant. \u0026mdash; denotes reference category.\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003e\u003cem\u003emiRNA expression values: log2-normalised read counts from TCGA-THCA miRNA-seq data processed with DESeq2 (v1.51.6, Padj \u0026lt; 0.05, |log2FC| \u0026gt; 1). Upregulated = higher expression in N1 vs. N0; downregulated = lower expression in N1 vs. N0.\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eDifferential miRNA Expression Analysis\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eDifferential expression analysis using DESeq2, comparing N1 versus N0 samples within the TCGA-THCA training cohort, identified a total of 87 DEMs meeting the pre-established thresholds of |log₂ fold-change| \u0026gt; 1 and adjusted p-value (Padj) \u0026lt; 0.05. Of these, 46 miRNAs were significantly upregulated and 41 were significantly downregulated in the N1 group relative to N0. The volcano plot illustrating the magnitude and statistical significance of differential expression across all profiled miRNAs is presented in Figure 1A. The hierarchical clustering heatmap of the top 30 DEMs clearly segregated N0 from N1 samples, indicating robust expression stratification by lymph node status (Figure 1B).\u003c/p\u003e\n\u003cp\u003eAmong the 11 miRNAs constituting the mir-THYpe classifier panel, identified from the original development and validation study,6 intersection analysis revealed that 7 panel miRNAs showed statistically significant differential expression in association with LNM status in the TCGA-THCA cohort. Of these, 4 miRNAs (hsa-miR-146b-5p, hsa-miR-221-3p, hsa-miR-222-3p, and hsa-miR-31-5p) were upregulated in N1 samples, while 3 (hsa-miR-204-5p, hsa-miR-375-3p, and hsa-miR-451a) were significantly downregulated. The 4 remaining panel miRNAs did not reach the predefined significance threshold in the N0/N1 comparison context. The expression profiles of the 7 LNM-associated mir-THYpe miRNAs across N0 and N1 groups are displayed in Figure 2.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003emiRNA Target Prediction and Functional Enrichment\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eTarget prediction using TargetScan and miRDB identified a total of 1,847 high-confidence predicted targets across the 7 LNM-associated mir-THYpe miRNAs, of which 412 targets were shared between both databases and retained for downstream analyses. Cross-referencing against DIANA-TarBase confirmed 93 experimentally validated miRNA\u0026ndash;gene interactions among the shortlisted targets, providing a high-confidence functional gene set for enrichment analysis.\u003c/p\u003e\n\u003cp\u003eGene Ontology analysis via DAVID revealed significant enrichment of biological processes directly implicated in lymph node metastatic behavior, including EMT (GO:0001837; p \u0026lt; 0.001), cell migration (GO:0016477; p \u0026lt; 0.001), extracellular matrix organization (GO:0030198; p \u0026lt; 0.01), and regulation of cell adhesion (GO:0030155; p \u0026lt; 0.01). KEGG pathway analysis identified statistically enriched signaling pathways relevant to PTC metastatic progression, notably the TGF-\u0026beta; signaling pathway (hsa04350; p \u0026lt; 0.001), the PI3K-Akt signaling pathway (hsa04151; p \u0026lt; 0.01), focal adhesion (hsa04510; p \u0026lt; 0.01), and the MAPK signaling pathway (hsa04010; p \u0026lt; 0.05). The top enriched GO biological processes and KEGG pathways are illustrated in Figure 3.\u003c/p\u003e\n\u003cp\u003eThe STRING-derived PPI network encompassing the 93 experimentally validated target genes comprised 81 nodes and 247 protein\u0026ndash;protein interaction edges, with a mean node degree of Cytoscape\u0026ndash;cytoHubba analysis identified 10 hub genes based on maximal clique centrality (MCC) scoring, including FN1, EGFR, SMAD2, MET, TGFBR2, CDH1, PTEN, CTNNB1, VEGFA, and MAPK3, several of which have been previously implicated in thyroid cancer locoregional spread. The PPI interaction network and hub gene visualization are presented in Figure 4.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFeature Selection and Risk Model Construction\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eLASSO logistic regression with tenfold cross-validation was applied to the combined feature matrix integrating the 7 LNM-associated mir-THYpe miRNA expression values and 6 clinical covariates (age, sex, tumor size, multifocality, extrathyroidal extension, and pathological T stage). The optimal regularization parameter was determined as \u0026lambda; = 0.041 (log \u0026lambda; = \u0026minus;3.19) at minimum binomial deviance. At this lambda, LASSO penalization reduced the initial 13-variable feature space to 6 independent predictors: hsa-miR-146b-5p, hsa-miR-222-3p, hsa-miR-204-5p, tumor size, extrathyroidal extension, and multifocality. LASSO coefficient paths and cross-validation curves are shown in Figure 5.\u003c/p\u003e\n\u003cp\u003eMultivariate logistic regression on the 6 LASSO-selected predictors confirmed all variables as statistically independent contributors to LNM risk. Odds ratios (OR) with 95% confidence intervals (CI) are presented in Table 2. The strongest independent predictors were extrathyroidal extension (OR = 3.84, 95% CI: 2.11\u0026ndash;7.02; p \u0026lt; 0.001), hsa-miR-146b-5p upregulation (OR = 3.12, 95% CI: 1.78\u0026ndash;5.46; p \u0026lt; 0.001), and tumor size \u0026gt; 1.0 cm (OR = 2.67, 95% CI: 1.54\u0026ndash;4.62; p \u0026lt; 0.001). hsa-miR-204-5p downregulation (OR = 0.41, 95% CI: 0.23\u0026ndash;0.73; p = 0.003) and hsa-miR-222-3p upregulation (OR = 2.31, 95% CI: 1.29\u0026ndash;4.14; p = 0.005) also reached significance, as did multifocality (OR = 1.89, 95% CI: 1.10\u0026ndash;3.24; p = 0.021). An individualized LNM risk probability nomogram integrating these 6 variables is illustrated in Figure 6.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTable 2\u003c/strong\u003e. Multivariate logistic regression results for the 6 LASSO-selected predictors of lymph node metastasis.\u003c/p\u003e\n\u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\" width=\"624\"\u003e\n \u003cthead\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 207px;\"\u003e\n \u003cp\u003e\u003cstrong\u003ePredictor Variable\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 64px;\"\u003e\n \u003cp\u003e\u003cstrong\u003e\u0026beta; Coeff.\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 60px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eStd. Error\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 64px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eOR\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 136px;\"\u003e\n \u003cp\u003e\u003cstrong\u003e95% Confidence Interval\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u003cstrong\u003ep-value\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/thead\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"6\" style=\"width: 624px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eMolecular Predictors \u0026mdash; mir-THYpe Panel miRNAs\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 207px;\"\u003e\n \u003cp\u003e\u0026nbsp; \u0026nbsp;hsa-miR-146b-5p (upregulation)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 64px;\"\u003e\n \u003cp\u003e1.139\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 60px;\"\u003e\n \u003cp\u003e0.287\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 64px;\"\u003e\n \u003cp\u003e3.12\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 136px;\"\u003e\n \u003cp\u003e1.78 \u0026ndash; 5.46\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026lt; 0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 207px;\"\u003e\n \u003cp\u003e\u0026nbsp; \u0026nbsp;hsa-miR-222-3p (upregulation)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 64px;\"\u003e\n \u003cp\u003e0.837\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 60px;\"\u003e\n \u003cp\u003e0.296\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 64px;\"\u003e\n \u003cp\u003e2.31\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 136px;\"\u003e\n \u003cp\u003e1.29 \u0026ndash; 4.14\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e0.005\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 207px;\"\u003e\n \u003cp\u003e\u0026nbsp; \u0026nbsp;hsa-miR-204-5p (downregulation)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 64px;\"\u003e\n \u003cp\u003e\u0026minus;0.891\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 60px;\"\u003e\n \u003cp\u003e0.298\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 64px;\"\u003e\n \u003cp\u003e0.41\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 136px;\"\u003e\n \u003cp\u003e0.23 \u0026ndash; 0.73\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e0.003\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"6\" style=\"width: 624px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eClinical and Pathological Predictors\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 207px;\"\u003e\n \u003cp\u003e\u0026nbsp; \u0026nbsp;Extrathyroidal Extension (present vs. absent)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 64px;\"\u003e\n \u003cp\u003e1.345\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 60px;\"\u003e\n \u003cp\u003e0.306\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 64px;\"\u003e\n \u003cp\u003e3.84\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 136px;\"\u003e\n \u003cp\u003e2.11 \u0026ndash; 7.02\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026lt; 0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 207px;\"\u003e\n \u003cp\u003e\u0026nbsp; \u0026nbsp;Tumour Size \u0026gt; 1.0 cm (vs. \u0026le; 1.0 cm)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 64px;\"\u003e\n \u003cp\u003e0.982\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 60px;\"\u003e\n \u003cp\u003e0.278\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 64px;\"\u003e\n \u003cp\u003e2.67\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 136px;\"\u003e\n \u003cp\u003e1.54 \u0026ndash; 4.62\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026lt; 0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 207px;\"\u003e\n \u003cp\u003e\u0026nbsp; \u0026nbsp;Multifocality (present vs. absent)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 64px;\"\u003e\n \u003cp\u003e0.637\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 60px;\"\u003e\n \u003cp\u003e0.275\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 64px;\"\u003e\n \u003cp\u003e1.89\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 136px;\"\u003e\n \u003cp\u003e1.10 \u0026ndash; 3.24\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e0.021\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"6\" style=\"width: 624px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eIntegrated Model Performance Metrics\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 207px;\"\u003e\n \u003cp\u003eAUC \u0026mdash; Training Set\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"5\" style=\"width: 417px;\"\u003e\n \u003cp\u003eAUC = 0.841 \u0026nbsp;(95% CI: 0.793\u0026ndash;0.889)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 207px;\"\u003e\n \u003cp\u003eAUC \u0026mdash; Internal Validation Set\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"5\" style=\"width: 417px;\"\u003e\n \u003cp\u003eAUC = 0.812 \u0026nbsp;(95% CI: 0.738\u0026ndash;0.886)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 207px;\"\u003e\n \u003cp\u003eAUC \u0026mdash; External GEO Cohort\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"5\" style=\"width: 417px;\"\u003e\n \u003cp\u003eAUC = 0.786 \u0026nbsp;(95% CI: 0.637\u0026ndash;0.936)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 207px;\"\u003e\n \u003cp\u003eSensitivity / Specificity (training)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"5\" style=\"width: 417px;\"\u003e\n \u003cp\u003e79.3% \u0026nbsp;/ \u0026nbsp;77.6%\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 207px;\"\u003e\n \u003cp\u003ePPV / NPV (training)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"5\" style=\"width: 417px;\"\u003e\n \u003cp\u003e76.8% \u0026nbsp;/ \u0026nbsp;80.1%\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 207px;\"\u003e\n \u003cp\u003eH\u0026ndash;L \u0026chi;\u0026sup2; (training)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"5\" style=\"width: 417px;\"\u003e\n \u003cp\u003e\u0026chi;\u0026sup2; = 7.31; \u0026nbsp;p = 0.504 \u0026nbsp;(adequate calibration)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 207px;\"\u003e\n \u003cp\u003eH\u0026ndash;L \u0026chi;\u0026sup2; (validation)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"5\" style=\"width: 417px;\"\u003e\n \u003cp\u003e\u0026chi;\u0026sup2; = 9.14; \u0026nbsp;p = 0.330 \u0026nbsp;(adequate calibration)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e\n\u003cp\u003e\u003cem\u003eAbbreviations: \u0026beta; = logistic regression coefficient; SE = standard error; OR = odds ratio; CI = confidence interval; AUC = area under the ROC curve; PPV = positive predictive value; NPV = negative predictive value; H\u0026ndash;L = Hosmer\u0026ndash;Lemeshow.\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eFeature selection: LASSO logistic regression with tenfold cross-validation (optimal \u0026lambda; = 0.041; glmnet R package). Six predictors with non-zero coefficients entered into the final multivariate model.\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eInterpretation: OR \u0026gt; 1 = positive association with LNM; OR \u0026lt; 1 = inverse (protective) association. Calibration assessed by Hosmer\u0026ndash;Lemeshow test; non-significant result (p \u0026gt; 0.05) indicates adequate calibration.\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eValidation: internal validation by bootstrap resampling (n = 1,000 iterations); external validation in GEO cohort GSE60542 (n = 38 tumour samples, 14 N0 and 24 N1).\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eModel Performance and Validation\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eIn the internal training set, the integrated mir-THYpe bioinformatics risk model achieved an AUC of 0.841 (95% CI: 0.793\u0026ndash;0.889), with sensitivity of 79.3%, specificity of 77.6%, positive predictive value (PPV) of 76.8%, and negative predictive value (NPV) of 80.1% at the optimal Youden index\u0026ndash;defined cutoff. Bootstrap-corrected internal validation (1,000 iterations) yielded a bias-corrected AUC of 0.829, confirming model stability and absence of substantial overfitting. In the hold-out validation set (30%), the model maintained robust discrimination with an AUC of 0.812 (95% CI: 0.738\u0026ndash;0.886). External validation in the GEO cohort (GSE60542, n = 38) demonstrated a sustained AUC of 0.786 (95% CI: 0.637\u0026ndash;0.936). ROC curves for training, internal validation, and external validation cohorts are presented in Figure 7.\u003c/p\u003e\n\u003cp\u003eCalibration analysis demonstrated satisfactory agreement between model-predicted LNM probabilities and observed frequencies. The Hosmer\u0026ndash;Lemeshow test was non-significant in both the training (\u0026chi;\u0026sup2; = 7.31; p = 0.504) and validation cohorts (\u0026chi;\u0026sup2; = 9.14; p = 0.330), indicating adequate model calibration. Calibration plots are shown in Figure 8A. DCA confirmed that the integrated model provided a greater net clinical benefit than either the treat-all or treat-none strategies across clinically meaningful threshold probability ranges (10%\u0026ndash;80%), and consistently outperformed the clinical-only model (excluding miRNA features) across the full probability range (Figure 8B).\u003c/p\u003e"},{"header":"DISCUSSION","content":"\u003cp\u003eThe present study achieved its primary objective by developing and externally validating an integrated predictive model for LNM in PTC through the combination of molecular and clinicopathological features. The main finding demonstrates that incorporating microRNA-based data into a predictive framework meaningfully enhances the ability to estimate metastatic risk in the preoperative setting. This reinforces the view that tumor behavior in PTC reflects more than anatomical characteristics alone and is strongly influenced by underlying molecular dynamics.\u003c/p\u003e\n\u003cp\u003eOur findings are consistent with recent literature highlighting the importance of extrathyroidal extension and tumor burden as key determinants of lymphatic dissemination.\u003csup\u003e8,9\u003c/sup\u003e These variables are widely recognized as indicators of aggressive biological behavior and have been consistently incorporated into contemporary predictive models. Multifocality has also been increasingly interpreted as a surrogate for genomic instability and intrathyroidal tumor spread, supporting its association with higher metastatic potential.\u003csup\u003e10,11\u003c/sup\u003e The integration of these established clinical factors with molecular variables in our model provides a more comprehensive representation of disease biology.\u003c/p\u003e\n\u003cp\u003eGenomic alterations such as BRAF mutations were not retained in the final model. This observation aligns with emerging evidence suggesting that the prognostic relevance of single-gene mutations may diminish when broader molecular signatures are considered.\u003csup\u003e11,12\u003c/sup\u003e This supports the notion that complex regulatory networks, rather than isolated mutations, more accurately reflect metastatic potential.\u003c/p\u003e\n\u003cp\u003eThe selected microRNAs are strongly supported by biological evidence. miR-146b-5p has been consistently associated with aggressive tumor behavior and is known to promote epithelial\u0026ndash;mesenchymal transition and invasive capacity through modulation of key signaling pathways.\u003csup\u003e13-15\u003c/sup\u003e Similarly, miR-222-3p contributes to tumor progression by regulating cell proliferation and survival mechanisms.\u003csup\u003e16-17\u003c/sup\u003e In contrast, miR-204-5p functions as a tumor suppressor, and its reduced expression has been linked to enhanced metastatic behavior through deregulation of cellular adhesion and migration processes.\u003csup\u003e18,19\u003c/sup\u003e\u003c/p\u003e\n\u003cp\u003eFunctional enrichment analyses provide further biological support for these findings. The identified pathways are closely related to cellular adhesion, extracellular matrix remodeling, and intracellular signaling, all of which are fundamental for metastatic progression.\u003csup\u003e20,21\u003c/sup\u003e The convergence of these pathways suggests that the selected microRNAs act as upstream regulators orchestrating coordinated biological responses that favor tumor dissemination.\u003c/p\u003e\n\u003cp\u003eFrom a clinical standpoint, the integration of molecular data into predictive models represents an important step toward precision oncology. Current preoperative decision-making often relies on indirect indicators of tumor aggressiveness, which may lead to both overtreatment and undertreatment. By incorporating molecular features that capture intrinsic tumor behavior, the proposed model offers a more individualized assessment of metastatic risk.\u003csup\u003e22,23\u003c/sup\u003e\u003c/p\u003e\n\u003cp\u003eNevertheless, limitations should be acknowledged. The retrospective nature of the study introduces potential biases related to data heterogeneity and variable quality across datasets. Differences in sequencing platforms and clinical annotation may influence reproducibility. Additionally, differential expression analysis methods such as DESeq2, while widely used, may be sensitive to data distribution assumptions and sample imbalance.\u003c/p\u003e\n\u003cp\u003ePPI analyses based on databases such as STRING also present inherent limitations, as they rely on aggregated evidence from multiple sources with varying levels of validation.\u003csup\u003e24\u003c/sup\u003e Furthermore, the size of the external validation cohort may limit the generalizability of the findings. These aspects highlight the importance of prospective validation in well-controlled clinical settings.\u003c/p\u003e\n\u003cp\u003eAnother important consideration is that the mir-THYpe classifier was originally designed for malignancy assessment rather than metastatic prediction. Its application in this context, although biologically plausible, requires further validation to confirm its clinical utility in this specific setting.\u003csup\u003e25\u003c/sup\u003e\u003c/p\u003e"},{"header":"CONCLUSION","content":"\u003cp\u003eThis study demonstrates that the integration of molecular information derived from mir-THYpe with bioinformatic features provides a consistent, externally validated predictive model for LNM in PTC. By combining clinically relevant variables with microRNA signatures, the proposed model enhances preoperative risk stratification and offers a more individualized approach to surgical planning. These findings support the incorporation of molecular data into predictive frameworks for precision oncology and establish a foundation for prospective validation and subsequent clinical implementation in thyroid cancer management.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eConflict of Interest:\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare that they have no conflict of interest related to the publication of this manuscript.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eLyu Z, Zhang Y, Sheng C, Huang Y, Zhang Q, Chen K. Global burden of thyroid cancer in 2022: Incidence and mortality estimates from GLOBOCAN. Chin Med J (Engl). 2024;137(21):2567-2576.\u003c/li\u003e\n\u003cli\u003eWeller S, Chu C, Lam AK. Assessing the Rise in Papillary Thyroid Cancer Incidence: A 38-Year Australian Study Investigating WHO Classification Influence. J Epidemiol Glob Health. 2025;15(1):9.\u003c/li\u003e\n\u003cli\u003eSaiselet M, Gacquer D, Spinette A, Craciun L, Decaussin-Petrucci M, Andry G, et al. New global analysis of the microRNA transcriptome of primary tumors and lymph node metastases of papillary thyroid cancer. BMC Genomics. 2015;16:828.\u003c/li\u003e\n\u003cli\u003eChen H, Zhu L, Zhuang Y, Ye X, Chen F, Zeng J. Prediction Model of Cervical Lymph Node Metastasis in Papillary Thyroid Carcinoma. Cancer Control. 2024;31:10732748241295347.\u003c/li\u003e\n\u003cli\u003eDeng Y, Zhang J, Wang J, Wang J, Zhang J, Guan L, et al. Risk factors and prediction models of lymph node metastasis in papillary thyroid carcinoma based on clinical and imaging characteristics. Postgrad Med. 2023;135(2):121-127.\u003c/li\u003e\n\u003cli\u003eSantos MTD, Buzolin AL, Gama RR, Silva ECAD, Dufloth RM, Figueiredo DLA, et al. Molecular Classification of Thyroid Nodules with Indeterminate Cytology: Development and Validation of a Highly Sensitive and Specific New miRNA-Based Classifier Test Using Fine-Needle Aspiration Smear Slides. Thyroid. 2018;28(12):1618-1626.\u003c/li\u003e\n\u003cli\u003eZhao F, Wang P, Yu C, Song X, Wang H, Fang J, Zhu C, Li Y. A LASSO-based model to predict central lymph node metastasis in preoperative patients with cN0 papillary thyroid cancer. Front Oncol. 2023;13:1034047.\u003c/li\u003e\n\u003cli\u003eZhao Q, Zhang Y, Hua R, Lv B, Liu N. Predictive significance of anterior and posterior minimal extrathyroidal extension for central lymph node metastasis in cN0 papillary thyroid carcinoma. Clin Exp Metastasis. 2026;43(1):6.\u003c/li\u003e\n\u003cli\u003eFukushima M, Ito Y, Hirokawa M, Miya A, Shimizu K, Miyauchi A. Prognostic impact of extrathyroid extension and clinical lymph node metastasis in papillary thyroid carcinoma depend on carcinoma size. World J Surg. 2010;34(12):3007-14.\u003c/li\u003e\n\u003cli\u003eKim H, Kwon H, Moon BI. Association of Multifocality With Prognosis of Papillary Thyroid Carcinoma: A Systematic Review and Meta-analysis. JAMA Otolaryngol Head Neck Surg. 2021;147(10):847-854.\u003c/li\u003e\n\u003cli\u003eCui L, Feng D, Zhu C, Li Q, Li W, Liu B. Clinical outcomes of multifocal papillary thyroid cancer: A systematic review and meta-analysis. Laryngoscope Investig Otolaryngol. 2022;7(4):1224-1234.\u003c/li\u003e\n\u003cli\u003eBansal M, Gandhi M, Ferris RL, Nikiforova MN, Yip L, Carty SE, et al. Molecular and histopathologic characteristics of multifocal papillary thyroid carcinoma. Am J Surg Pathol. 2013;37(10):1586-91.\u003c/li\u003e\n\u003cli\u003eDeng X, Wu B, Xiao K, Kang J, Xie J, Zhang X, et al. MiR-146b-5p promotes metastasis and induces epithelial-mesenchymal transition in thyroid cancer by targeting ZNRF3. Cell Physiol Biochem. 2015;35(1):71-82.\u003c/li\u003e\n\u003cli\u003eFerraz C, Cunha GB, de Oliveira MMB, Ten\u0026oacute;rio LR, Cury AN, Padovani RDP, et al. The diagnostic and prognostic role of miR-146b-5p in differentiated thyroid carcinomas. Front Endocrinol (Lausanne). 2024 ;15:1390743..\u003c/li\u003e\n\u003cli\u003eLima CR, Geraldo MV, Fuziwara CS, Kimura ET, Santos MF. MiRNA-146b-5p upregulates migration and invasion of different Papillary Thyroid Carcinoma cells. BMC Cancer. 2016;16:108.\u003c/li\u003e\n\u003cli\u003eGuo Z, Hardin H, Lloyd RV. Cancer stem-like cells and thyroid cancer. Endocr Relat Cancer. 2014;21(5):T285-300.\u003c/li\u003e\n\u003cli\u003eChen W, Li X. MiR-222-3p Promotes Cell Proliferation and Inhibits Apoptosis by Targeting PUMA (BBC3) in Non-Small Cell Lung Cancer. Technol Cancer Res Treat. 2020;19:1533033820922558.\u003c/li\u003e\n\u003cli\u003eXia F, Wang W, Jiang B, Chen Y, Li X. DNA methylation-mediated silencing of miR-204 is a potential prognostic marker for papillary thyroid carcinoma. Cancer Manag Res. 2019;11:1249-1262.\u003c/li\u003e\n\u003cli\u003eYang F, Bian Z, Xu P, Sun S, Huang Z. MicroRNA-204-5p: A pivotal tumor suppressor. Cancer Med. 2023;12(3):3185-3200.\u003c/li\u003e\n\u003cli\u003eWinkler J, Abisoye-Ogunniyan A, Metcalf KJ, Werb Z. Concepts of extracellular matrix remodelling in tumour progression and metastasis. Nat Commun. 2020;11(1):5120.\u003c/li\u003e\n\u003cli\u003eYuan Z, Li Y, Zhang S, Wang X, Dou H, Yu X, et al. Extracellular matrix remodeling in tumor progression and immune escape: from mechanisms to treatments. Mol Cancer. 2023;22(1):48.\u003c/li\u003e\n\u003cli\u003eZhou YY, Zhang YZ, Li J, Li ZQ, Ding WB, Li M. Preoperative prediction of lymph node metastasis risk in papillary thyroid carcinoma based on multiple model comparisons. Sci Rep. 2025;15(1):35313.\u003c/li\u003e\n\u003cli\u003eBoehm KM, Khosravi P, Vanguri R, Gao J, Shah SP. Harnessing multimodal data integration to advance precision oncology. Nat Rev Cancer. 2022;22(2):114-126.\u003c/li\u003e\n\u003cli\u003eSzklarczyk D, Gable AL, Lyon D, Junge A, Wyder S, Huerta-Cepas J, et al. STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 2019;47(D1):D607-D613.\u003c/li\u003e\n\u003cli\u003eSantos MT, Rodrigues BM, Shizukuda S, Oliveira AF, Oliveira M, Figueiredo DLA, et al. Clinical decision support analysis of a microRNA-based thyroid molecular classifier: A real-world, prospective and multicentre validation study. EBioMedicine. 2022;82:104137.\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":true,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"Papillary thyroid carcinoma, lymph node metastasis, microRNA, predictive model","lastPublishedDoi":"10.21203/rs.3.rs-9409171/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-9409171/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003e\u003cstrong\u003eIntroduction: \u003c/strong\u003ePapillary thyroid carcinoma (PTC) is the most prevalent endocrine malignancy worldwide, with lymph node metastasis (LNM) occurring in 40–60% of patients and representing a critical determinant of disease recurrence. Conventional preoperative risk stratification tools, relying on clinical and ultrasonographic variables, demonstrate insufficient discriminatory performance, particularly in the context of indeterminate cytological findings. Integration of microRNA-based molecular classifiers with bioinformatics frameworks represents a promising but underexplored strategy for improving LNM prediction accuracy.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eObjective:\u003c/strong\u003e To develop and validate an integrated LNM risk model for PTC by combining molecular data from the mir-THYpe microRNA classifier with bioinformatics analyses of the TCGA-THCA cohort.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eMethods: \u003c/strong\u003emiRNA-seq expression profiles and clinical data from 378 histopathologically confirmed PTC cases (TCGA-THCA) were analyzed using DESeq2. Seven mir-THYpe panel miRNAs differentially expressed between N0 and N1 groups were identified. Target prediction (TargetScan, miRDB, DIANA-TarBase), functional enrichment (DAVID, KEGG, GO), and protein–protein interaction network analyses (STRING, Cytoscape) were performed. LASSO logistic regression with tenfold cross-validation selected six independent predictors, which were incorporated into a multivariate model and nomogram. Model performance was assessed by ROC analysis, Hosmer–Lemeshow calibration, and decision curve analysis, with external validation in GEO cohort GSE60542.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eResults:\u003c/strong\u003e The integrated model achieved AUC = 0.841 (training), 0.812 (internal validation), and 0.786 (external validation). The strongest predictors were extrathyroidal extension (OR = 3.84), hsa-miR-146b-5p upregulation (OR = 3.12), and tumor size \u0026gt;1.0 cm (OR = 2.67). Decision curve analysis confirmed superior net clinical benefit over clinical-only and treat-all strategies.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConclusion:\u003c/strong\u003e Integration of mir-THYpe molecular data with bioinformatics-derived features yielded a well-calibrated, externally validated LNM risk model that outperforms conventional clinical predictors, offering a precision oncology tool for individualized preoperative surgical decision-making in PTC.\u003c/p\u003e","manuscriptTitle":"An Externally Validated Predictive Model for Lymph Node Metastasis in Papillary Thyroid Carcinoma Integrating mir-THYpe MicroRNA Signatures and Bioinformatics Features","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2026-04-15 07:15:01","doi":"10.21203/rs.3.rs-9409171/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"ab6e42e9-7d0a-4b1a-b6b5-00f76d505028","owner":[],"postedDate":"April 15th, 2026","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[{"id":66255702,"name":"Bioinformatics"},{"id":66255703,"name":"Endocrinology \u0026 Metabolism"}],"tags":[],"updatedAt":"2026-04-15T07:15:01+00:00","versionOfRecord":[],"versionCreatedAt":"2026-04-15 07:15:01","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-9409171","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-9409171","identity":"rs-9409171","version":["v1"]},"buildId":"XKTyCvWXoU3ODBz1xrDgd","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}

Text is read by the "Ask this paper" AI Q&A widget below. Extraction quality varies by source — PMC NXML preserves structure cleanly, OA-HTML may include some navigation residue, and OA-PDF can have broken hyphenation. The publisher copy (via DOI) is the canonical version.

My notes (saved in your browser only)

Ask this paper AI returns verbatim quotes from the full text · source: preprint-html

Answers must be backed by verbatim quotes from this paper's full text. Hallucinated quotes are dropped automatically; if no verbatim passage answers the question, we say so. How this works

Citation neighborhood (no data yet)

We don't have any in-corpus citations linked to this paper yet. This is a recent paper (2026) — citers typically take a year or two to land, and the OpenAlex reference graph may still be filling in.

Source provenance

europepmc
last seen: 2026-05-20T01:45:00.602351+00:00