Genomic insights into Indian wheat stripe rust pathotypes from long-read hybrid assemblies | Research Square

Research Article

Anurag Saharan, Deepak Singla, Ramesh Gutha, Om Prakash Gangwar, and 5 more

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-7535284/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted (Version 1)

Abstract

Background

Stripe rust, caused by Puccinia striiformis f. sp. tritici (Pst), poses a significant threat to global wheat production. Resistance in wheat cultivars is frequently overcome due to rapid evolution of pathogen virulence. Until recently, genome assemblies of Indian Pst pathotypes were based exclusively on short-read sequencing, which is limited in resolving the highly repetitive and heterozygous dikaryotic genomes of rust fungi.
Results

We generated hybrid genome assemblies for five Indian Pst pathotypes (110S119, 238S119, 46S119, 110S84, and 78S84) using high-coverage PacBio and Illumina sequencing. Assembly with the Maryland Super-Read Celera Assembler (MaSuRCA) resulted in genome sizes ranging from 75.21 Mb (110S119) to 83.03 Mb (78S84), with contig counts ranging from 286 to 877. All assemblies exhibited GC content > 44%, and completeness based on Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis exceeded 90% in four assemblies and was 89.8% in 110S84, indicating high assembly quality. Gene prediction with Funannotate identified 14,559 to 15,283 genes per pathotype. Functional classification of predicted proteins was performed using InterProScan. Phylogenetic analysis based on single-copy orthologs clustered the five Indian pathotypes into a single clade, with 78S84 and 238S119 forming one subgroup, and 110S119 and 46S119 another.

Conclusions

These high-quality genome assemblies represent the first long-read-based resources for Indian Pst pathotypes and provide valuable genomic insights into stripe rust diversity and evolution. They will serve as a foundation for rust surveillance, evolutionary studies, and the development of durable resistance in wheat.

Keywords: Genome assembly; Gene annotation; Puccinia striiformis f. sp. tritici; Pathotype; Phylogenomics; Stripe rust

Background

Wheat (Triticum aestivum L.) is one of the most important cereal crops globally and serves as a staple food for millions of people across continents [1]. Despite the development and widespread cultivation of improved wheat varieties, the crop remains vulnerable to various biotic and abiotic stresses that significantly impact productivity [2]. Among biotic stresses, fungal diseases account for 15–20% of annual economic losses in wheat worldwide [3]. Stripe rust, caused by Puccinia striiformis f. sp.
tritici (Pst), is one of the most destructive foliar diseases of wheat, capable of causing complete crop failure under conducive environmental conditions [4]. The disease has a global distribution and is reported in over 60 countries, with frequent epidemics observed in Ethiopia, the United States, Australia, and China [5]. In India, stripe rust poses a recurrent threat, especially in the North-Western Plains Zone (NWPZ), where early disease onset during December and January coincides with favourable microclimatic conditions. For instance, severe epidemics occurred due to the emergence of the virulent pathotype 78S84, which overcame Yr27 resistance in the widely cultivated variety PBW343, resulting in significant yield losses [6].

Genetic resistance is the most effective and sustainable strategy for managing stripe rust. However, resistance is often short-lived due to the rapid evolution of Pst pathotypes. Despite its predominantly clonal reproduction, Pst exhibits remarkable adaptability, quickly overcoming newly deployed resistance genes [7]. Historical examples include the breakdown of Yr2 resistance in the 1970s, Yr9 in the 1990s, and Yr27 in recent decades [5]. Notably, global Pst populations have shown increasing virulence complexity and aggressiveness in regions such as Europe, North America, and Asia [8].

In India, continuous efforts are made to develop and deploy wheat varieties with diverse and durable resistance genes. However, the rapid evolution of Pst often renders resistance ineffective over time, emphasizing the need for real-time monitoring and molecular characterization of pathogen populations. Understanding the genetic structure and diversity of the pathogen is critical for deploying effective resistance genes and forecasting potential outbreaks. With the advent of next-generation sequencing technologies, genome-wide analyses of rust pathogens have become feasible.
Yet, earlier genome assemblies of Indian Pst pathotypes were primarily based on short-read sequencing, which failed to resolve the complexity of the dikaryotic rust genomes due to high heterozygosity and repetitive content [9]. In the present study, we employed a hybrid sequencing approach combining PacBio long-read and Illumina short-read technologies to generate high-quality genome assemblies for five Indian Pst pathotypes: 110S119, 238S119, 46S119, 110S84, and 78S84. We aimed to investigate their genomic features, assess evolutionary relationships with globally sequenced Pst isolates, and provide valuable insights to support effective and durable stripe rust management in Indian wheat production systems.

Results

Whole-genome sequencing of prevalent Indian Puccinia striiformis f. sp. tritici pathotypes

The primary objective of this study was to generate high-quality reference genome assemblies for prevalent Indian P. striiformis f. sp. tritici (Pst) pathotypes with distinct virulence profiles (Supplementary Table S1). High-molecular-weight DNA was extracted from five Indian Pst pathotypes (110S119, 238S119, 46S119, 110S84, and 78S84) and sequenced using both Illumina HiSeq (short-read) and PacBio (long-read) platforms. Paired-end Illumina sequencing generated 7.24, 10.06, 7.21, 7.23, and 9.98 Gb of data, while PacBio sequencing yielded 5.20, 5.61, 6.21, 5.56, and 5.86 Gb for 110S119, 238S119, 46S119, 110S84, and 78S84, respectively (Table 1). The combined sequencing coverage ranged from 155X to 305X, enabling robust hybrid assemblies.

Table 1 Raw read and assembly statistics of five Pst pathotypes
(Raw read statistics: Illumina reads, PacBio reads, total coverage. Assembly statistics: all remaining columns.)

Pathotype | Illumina reads (Gb) | PacBio reads (Gb) | Total coverage (X) | Genome size (Mb) | Largest contig (bp) | Av. contig length (bp) | Total contigs | N50 (bp) | GC (%) | L50 | N90 (bp) | L90
110S119 | 7.21 | 6.21 | 167X | 75.21 | 2414684 | 207781 | 362 | 387176 | 44.43 | 52 | 91560 | 207
238S119 | 9.98 | 5.86 | 219X | 77.56 | 1948682 | 233630 | 332 | 478465 | 44.42 | 48 | 110518 | 182
46S119 | 7.24 | 5.20 | 168X | 75.43 | 5465759 | 247327 | 305 | 465309 | 44.42 | 42 | 107877 | 172
110S84 | 10.06 | 5.61 | 305X | 75.91 | 2098857 | 265439 | 286 | 435920 | 44.44 | 48 | 134267 | 171
78S84 | 7.23 | 5.56 | 155X | 83.03 | 1058682 | 94676 | 877 | 182087 | 44.39 | 129 | 42772 | 484

Hybrid de novo genome assembly

Genome assembly using MaSuRCA v4.1.0 produced assemblies with 286 to 877 contigs, and genome sizes ranged from 75.21 Mb (110S119) to 83.03 Mb (78S84). Largest contig lengths varied from 1.06 to 5.47 Mb, with mean contig sizes ranging from 94.7 kb to 265.4 kb. N50 values ranged from 182.1 kb in 78S84 to 478.5 kb in 238S119, and GC content was about 44% in all assemblies (Table 1). BUSCO completeness scores exceeded 90% in all pathotypes except 110S84 (89.8%), suggesting good assembly quality and representation of conserved fungal genes in the present assemblies (Fig. 1). These long-read hybrid assemblies offer the most comprehensive genomic representation of Indian Pst pathotypes to date and demonstrate improved coverage compared to earlier short-read-based assemblies.

Gene prediction and ortholog identification

Gene prediction using Funannotate identified 14,559 to 15,283 genes and 14,474 to 15,925 predicted proteins across the five Pst genomes, consistent with prior estimates for Pst (15,000–25,000 genes). The maximum number of predicted proteins (15,925) was observed in 110S119, and the fewest (14,474) in 238S119. Predicted tRNA genes ranged from 485 to 545. Average gene lengths spanned 1.62 to 1.82 kb (Table 2).
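The contiguity metrics reported in Table 1 (N50, L50, N90, L90) follow a standard definition: sort contigs from longest to shortest and accumulate lengths until a given percentage of the assembly is covered. The following is a minimal sketch of that definition, not the code used in this study; the function name `nx_lx` and the toy contig lengths are hypothetical and chosen only for illustration.

```python
def nx_lx(contig_lengths, percent=50):
    """Return (Nx, Lx) for a list of contig lengths in bp.

    Nx is the length of the shortest contig in the smallest set of longest
    contigs that together cover at least `percent` of the assembly.
    Lx is the number of contigs in that set.
    """
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    ordered = sorted(contig_lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= percent * total:
            return length, count


# Hypothetical toy assembly (not data from this study): total length 300.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = nx_lx(toy_contigs, 50)
n90, l90 = nx_lx(toy_contigs, 90)
print(n50, l50, n90, l90)
```

Read this way, a higher N50 together with a lower L50, as for 238S119 relative to 78S84 in Table 1, indicates that fewer and longer contigs account for half of the assembly.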
Table 2 Gene prediction and gene structure features in five Indian Pst pathotypes

Feature | 110S119 | 238S119 | 46S119 | 110S84 | 78S84
Number of genes predicted | 14843 | 14559 | 14985 | 14930 | 15283
Number of proteins | 15925 | 14474 | 15573 | 15368 | 15400
Number of tRNAs | 490 | 519 | 509 | 485 | 545
Number of ncRNAs | 0 | 0 | 0 | 0 | 0
Number of rRNAs | 0 | 0 | 0 | 0 | 0
Average gene length (bp) | 1619.77 | 1653.29 | 1764.48 | 1822.33 | 1654.08
CDS transcripts | 15925 | 14474 | 15573 | 15368 | 15400
CDS with 3' UTRs | 1597 | 2334 | 1994 | 3191 | 2852
CDS with no UTR | 7502 | 9732 | 7907 | 6337 | 9785
CDS with 5' and 3' UTRs | 6826 | 2408 | 5672 | 5840 | 2763
CDS complete | 15711 | 14231 | 15358 | 15200 | 14893
Total number of exons | 97566 | 76047 | 91967 | 90418 | 82222
Total CDS exons | 76016 | 66647 | 73169 | 71359 | 70625
Multiple exon transcripts | 14429 | 13254 | 14396 | 15012 | 14062
Single exon transcripts | 1496 | 1220 | 1177 | 356 | 1338
Average exon length (bp) | 311 | 285 | 301 | 321 | 288
Average protein length (AA) | 442 | 433 | 436 | 422 | 432

Coding sequence transcripts ranged from 14,474 to 15,925. UTR and exon features varied: 110S84 had the highest number of coding transcripts with 3' UTRs (3,191), and 110S119 had the highest number with both 5' and 3' UTRs (6,826). Total CDS exons ranged from 66,647 to 76,016 (Table 2). Ortholog analysis showed that 1,128 to 1,667 proteins were unique to individual pathotypes, with the highest number of unique proteins in 110S119. Proteins with at least one ortholog ranged from 12,867 to 13,898 in these pathotypes (Table 3).

Functional annotation of predicted proteins

Comprehensive functional annotation of predicted proteins from the five Pst pathotypes was performed using InterProScan, which integrates domain-level annotations from databases such as SUPERFAMILY, EggNOG, PFAM, and CDD (Table 3). Among the pathotypes, 110S119 exhibited the highest number of annotated proteins across databases, including 8,696 InterProScan hits, 11,202 EggNOG annotations, and 6,965 Pfam domain matches, indicating a broader functional repertoire (Table 3).
In contrast, 238S119 had comparatively fewer annotations across all datasets.

Table 3 Functional annotation of predicted genes

Annotation category | 110S119 | 238S119 | 46S119 | 110S84 | 78S84
GO terms (total) | 7269 | 6328 | 6941 | 6686 | 6860
InterProScan hits | 8696 | 7706 | 8385 | 8122 | 8314
EggNOG annotations | 11202 | 10172 | 10926 | 10666 | 10915
Pfam domains | 6965 | 6130 | 6685 | 6417 | 6546
CAZy enzyme entries | 338 | 305 | 316 | 328 | 329
MEROPS protease entries | 285 | 242 | 291 | 273 | 274
Proteins with ≥ 1 ortholog | 13610 | 12867 | 13898 | 13231 | 13314
Unique proteins | 1667 | 1128 | 1166 | 1552 | 1449
Exons with protein evidence (%) | 78.95 | 83.05 | 80.82 | 81.58 | 82.21
Exons with transcript evidence (%) | 0.66 | 0.18 | 0.25 | 49.82 | 0.18

The InterPro domain distribution profile (Fig. 2) revealed the P-loop containing nucleoside triphosphate hydrolase (IPR027417) as the most abundant domain across all pathotypes, underscoring its central role in ATP/GTP binding and hydrolysis, which are fundamental to numerous cellular processes including signal transduction and transport. Other prevalent domains included the Protein kinase-like domain (IPR011009) and the WD40 repeat (IPR017986), which are associated with protein phosphorylation and multi-protein complex assembly, respectively.

Further classification based on transcription factor (TF)-related InterPro domains (Fig. S1) showed that the 78S84 and 46S119 pathotypes harbored higher counts of TF-associated domains. Among these, the C2H2-type zinc finger (IPR007087), bZIP (IPR004827), and Homeobox domain (IPR001356) were frequently represented, indicating conserved regulatory mechanisms. Subtle differences in TF domain composition among pathotypes may reflect pathotype-specific transcriptional regulation influencing virulence or adaptation.

Gene ontology

GO classification assigned genes to three major categories: Biological Process (BP), Molecular Function (MF), and Cellular Component (CC). BP terms were most represented (1,523–1,538 genes), followed by MF (1,388–1,407 genes), and CC (609–615 genes).
Pathotype 110S84 showed the highest annotation counts for both BP and MF terms, while 238S119 had the lowest (Table 4, Fig. 3).

Table 4 Summary of functional annotations (GO and CAZy) across Pst pathotypes

Functional category | Pst 110S119 | Pst 238S119 | Pst 46S119 | Pst 110S84 | Pst 78S84
GO terms
BP (Biological Process) | 1532 | 1523 | 1524 | 1538 | 1534
MF (Molecular Function) | 1404 | 1388 | 1391 | 1407 | 1406
CC (Cellular Component) | 615 | 615 | 609 | 614 | 614
CAZy families
GH (Glycoside Hydrolase) | 170 | 157 | 157 | 164 | 166
GT (Glycosyl Transferase) | 74 | 68 | 72 | 73 | 72
CE (Carbohydrate Esterase) | 47 | 45 | 47 | 48 | 47
AA (Auxiliary Activity) | 47 | 35 | 39 | 42 | 42
CBM (Carbohydrate-Binding Module) | 1 | 1 | 1 | 1 | 1

CAZyme and protease profiling

Carbohydrate-active enzymes (CAZymes) play key roles in the degradation, modification, and synthesis of carbohydrates and glycoconjugates, and are essential for pathogenic fungi in host infection and nutrient acquisition. The CAZy database classifies these enzymes into six major classes: glycoside hydrolases (GHs), glycosyltransferases (GTs), carbohydrate esterases (CEs), polysaccharide lyases (PLs), auxiliary activities (AAs), and carbohydrate-binding modules (CBMs). A comprehensive analysis of CAZy families across the five P. striiformis pathotypes (110S119, 238S119, 46S119, 110S84, and 78S84) identified a total of 138 CAZy gene families, with varying representation among the classes (Table 4, Table S2).

A total of 29 GT families were identified across the pathotypes. All 29 GT families were present in pathotypes 110S119, 46S119, 78S84, and 110S84, while 238S119 lacked GT66 and GT15. Because GT66 was consistently present in the other four pathotypes, its absence suggests possible gene loss or reduced selection pressure in 238S119 (Table 4, Table S2). GHs constituted the most abundant class among the CAZymes. Pathotype 110S119 had the highest number of GH genes (170), followed closely by 78S84 (166), while 238S119 and 46S119 had 157 genes each.
GH5 was more highly represented in pathotypes 110S119 and 46S119, whereas GH18 was more abundant in 110S84 and 78S84. GH47 showed increased representation in 110S119 (Table 4, Table S2). Eight AA families were identified across the pathotypes. The families AA1 to AA7 and AA9 were consistently present in all five Pst pathotypes. Interestingly, AA16 was uniquely present in 110S119. Pathotype 110S119 also harboured the maximum number of AA genes (47), while the minimum (35) was found in 238S119. Five CE families (CE4, CE5, CE8, CE10, and CE16) were identified. The total number of CE genes ranged from 45 in 238S119 to 48 in 110S84, with the remaining pathotypes harbouring 47 genes each. CE4 showed higher representation in 238S119 and 110S84 (Table 4, Table S2). Only one CBM family, CBM21, was detected, represented by a single gene in each pathotype. Among the 44 PL families in the CAZy database, only PL1 and PL35 were detected, with two to three PL genes per pathotype: 110S119 had only two PL genes, while the others had three.

Detailed data on the distribution of individual CAZy families across the five pathotypes are summarised in Table S2, revealing patterns of specialization and possible adaptation. These variations may reflect differential carbohydrate metabolism and host adaptation strategies among Indian Pst pathotypes.

Clusters of orthologous groups (COG) functional categorization

A total of 25 functional categories were identified, spanning essential cellular and metabolic processes. The heatmap (Fig. 4) depicts the distribution and relative abundance of gene counts per COG category among pathotypes 110S119, 238S119, 46S119, 110S84, and 78S84. The most prominent categories across all pathotypes were "Posttranslational modification, protein turnover, chaperones" (COG category O) and "Replication, recombination and repair" (COG category L).
This was followed by “Translation, ribosomal structure and biogenesis” (category J), “Post-translational modification, protein turnover, and chaperones” (category O), and “Amino acid transport and metabolism” (category E), indicating the active involvement of these fungi in protein synthesis, folding, and metabolic regulation. Categories associated with replication and repair (L), transcription (K), and carbohydrate metabolism (G) were also well represented, highlighting core biological functions essential for fungal survival and host colonization. Relatively fewer genes were assigned to signal transduction (T), secondary metabolite biosynthesis (Q), defense-related pathways, and categories such as cell motility (N) and nuclear structure (Y), indicating specialized but limited roles. The “Function unknown” category (S) had considerable representation in all pathotypes, underscoring the presence of hypothetical proteins or yet-to-be-characterized functions in the P. striiformis genome. Overall, the COG functional distribution revealed a conserved profile among pathotypes with subtle differences that may reflect pathotype-specific adaptations or evolutionary divergence.

MEROPS protease family profiling

MEROPS-based analysis revealed the presence of five major catalytic classes of peptidases, namely serine (S), metallopeptidases (M), cysteine (C), threonine (T), and aspartic (A), across all five Pst pathotypes. Serine proteases were the most abundant class, dominated by members of the S09 and S10 families, followed by metallopeptidases (notably M28 and M20) and cysteine proteases (particularly the C1 and C13 families) (Fig. 5). Subclass-level analysis (Fig. S2) highlighted a consistent and conserved distribution of key proteolytic subfamilies across pathotypes, suggesting their crucial role in fungal development, nutrient acquisition, and host-pathogen interactions.
The relative enrichment of secreted serine and cysteine peptidases points toward their involvement in virulence and host tissue degradation.

Identification and characterization of repetitive elements in P. striiformis

Repetitive elements constituted a substantial portion of the genomes of the five Indian P. striiformis pathotypes, ranging from 33.97% in 46S119 to 37.72% in 78S84 (Table 5; Fig. 6). The proportion of interspersed repeats varied between 33.06% and 36.81%, accounting for the bulk of the repetitive fraction, while smaller contributions came from simple repeats (0.64–0.68%), small RNAs (0.03–0.08%), satellites (≤ 0.03%), and low-complexity sequences (~0.17–0.18%).

Class I retroelements comprised 6.86% (46S119) to 8.99% (78S84) of the genome. The majority belonged to LTR elements (6.73–8.82%), dominated by Gypsy/DIRS1 (4.66–6.56%), followed by Ty1/Copia (1.79–2.08%). LINE elements were present at low levels (~0.12–0.28%), while SINEs and L1/CIN4 were negligible (< 0.05%).

Class II DNA transposons contributed 4.35–5.23% across the five pathotypes. Within this class, the hobo-Activator family was the most abundant (1.31–1.67%). Rolling-circle elements occurred in minor proportions (0.18–0.31%). A notable fraction of the repeatome remained unclassified (21.23–22.58%), representing the single largest category of repeats in all pathotypes.

Overall, the five pathotypes showed broadly similar repeat compositions, with 78S84 harbouring the highest repeat content (37.72%) and 46S119 the lowest (33.97%). The data indicate that while both Class I and Class II elements contribute significantly, the repeat landscape is dominated by unclassified elements and LTR retrotransposons, particularly Gypsy/DIRS1.
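The repeat totals reported in Table 5 can be cross-checked with simple arithmetic: for each pathotype, total interspersed repeats plus small RNA, satellites, simple repeats, and low-complexity sequence should add up to the reported total. The sketch below uses the percentages from Table 5; the dictionary layout, the function names, and the 0.01 percentage-point tolerance (to absorb two-decimal rounding) are assumptions made for illustration.

```python
import math

# Percent-of-genome values copied from Table 5.
TABLE5 = {
    "110S119": {"interspersed": 33.57, "small_rna": 0.06, "satellites": 0.00,
                "simple": 0.67, "low_complexity": 0.17, "total": 34.47},
    "238S119": {"interspersed": 35.30, "small_rna": 0.03, "satellites": 0.03,
                "simple": 0.66, "low_complexity": 0.18, "total": 36.20},
    "46S119": {"interspersed": 33.06, "small_rna": 0.04, "satellites": 0.01,
               "simple": 0.68, "low_complexity": 0.18, "total": 33.97},
    "110S84": {"interspersed": 33.90, "small_rna": 0.03, "satellites": 0.00,
               "simple": 0.65, "low_complexity": 0.17, "total": 34.75},
    "78S84": {"interspersed": 36.81, "small_rna": 0.08, "satellites": 0.02,
              "simple": 0.64, "low_complexity": 0.17, "total": 37.72},
}


def component_sum(row):
    """Sum every repeat component except the reported total."""
    return sum(value for key, value in row.items() if key != "total")


def totals_consistent(table, tol=0.01):
    """Map each pathotype to True if its components add up to its total."""
    return {
        name: math.isclose(component_sum(row), row["total"], abs_tol=tol)
        for name, row in table.items()
    }


for name, ok in totals_consistent(TABLE5).items():
    print(name, round(component_sum(TABLE5[name]), 2), ok)
```

With these values the component sums match the reported totals for all five pathotypes, which is why 78S84 (37.72%) and 46S119 (33.97%) bracket the range quoted in the text.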
Table 5 Repeat content (% of genome) of isolates Pst110S119, Pst238S119, Pst46S119, Pst110S84 and Pst78S84 identified using de novo and homology-based repeat-finding methods

Item | 110S119 | 238S119 | 46S119 | 110S84 | 78S84
Bases masked | 34.64 | 36.51 | 34.28 | 34.99 | 37.92
Retroelements | 7.61 | 8.56 | 6.86 | 7.70 | 8.99
SINEs: 0.03, 0, 0.01, 0.03 (only four values are given; their column assignment is unclear)
LINEs | 0.13 | 0.18 | 0.12 | 0.28 | 0.14
L1/CIN4 | 0.03 | 0 | 0 | 0 | 0
LTR elements | 7.46 | 8.38 | 6.73 | 7.41 | 8.82
BEL/Pao | 0.01 | 0 | 0.01 | 0 | 0
Ty1/Copia | 1.89 | 1.97 | 1.80 | 1.79 | 2.08
Gypsy/DIRS1 | 5.38 | 6.15 | 4.66 | 5.55 | 6.56
DNA transposons | 4.35 | 4.97 | 4.44 | 4.97 | 5.23
hobo-Activator | 1.52 | 1.31 | 1.41 | 1.67 | 1.63
Tc1-IS630-Pogo | 0.17 | 0.33 | 0.13 | 0.27 | 0.19
MULE-MuDR | 0.78 | 0.78 | 0.64 | 0.82 | 0.93
Tourist/Harbinger | 0.44 | 0.41 | 0.47 | 0.59 | 0.70
Rolling-circles | 0.18 | 0.31 | 0.31 | 0.24 | 0.20
Unclassified | 21.60 | 21.77 | 21.76 | 21.23 | 22.58
Total interspersed repeats | 33.57 | 35.30 | 33.06 | 33.90 | 36.81
Small RNA | 0.06 | 0.03 | 0.04 | 0.03 | 0.08
Satellites | 0 | 0.03 | 0.01 | 0 | 0.02
Simple repeats | 0.67 | 0.66 | 0.68 | 0.65 | 0.64
Low complexity | 0.17 | 0.18 | 0.18 | 0.17 | 0.17
Total | 34.47 | 36.20 | 33.97 | 34.75 | 37.72

Phylogenomic analysis and diversity among Indian Pst pathotypes

Phylogenetic analysis based on single-copy orthologs across 15 rust genomes, including five Indian Pst pathotypes, eight other Pst isolates, P. graminis, and P. triticina, was conducted using OrthoFinder (Fig. 7). Among 242,664 total genes, 97% were grouped into 16,990 orthogroups. Of these, 4,204 were shared by all species, and 636 were single-copy orthologs. All five Indian pathotypes formed a monophyletic clade, while P. graminis and P. triticina served as outgroups. Pathotypes 78S84 and 238S119 formed one subgroup; 110S119 and 46S119 formed another. Pathotype 110S84 displayed the greatest divergence and grouped more closely with Australian race 134E36. Indian Pst pathotypes were genetically distinct from aggressive Western U.S. races such as PST-130 and PST-78, which showed lower divergence with CYR34 (China) and Australian races.
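The distinction drawn above between orthogroups shared by all species and single-copy orthologs comes down to per-species gene counts: a shared orthogroup has at least one gene in every genome, and a single-copy orthogroup has exactly one. The sketch below illustrates that filter on a hypothetical count table; the species labels, orthogroup names, and the function `classify_orthogroups` are invented for illustration and are not OrthoFinder output or the study's code.

```python
def classify_orthogroups(counts, species):
    """Split orthogroups into those shared by all species and single-copy ones.

    counts maps orthogroup name -> {species name: number of genes}.
    A missing species entry is treated as zero genes.
    """
    shared, single_copy = [], []
    for orthogroup, per_species in counts.items():
        genes = [per_species.get(name, 0) for name in species]
        if all(n >= 1 for n in genes):
            shared.append(orthogroup)
            # Single-copy orthogroups are the subset with exactly one gene each.
            if all(n == 1 for n in genes):
                single_copy.append(orthogroup)
    return shared, single_copy


# Hypothetical toy table with three genomes (not data from this study).
species = ["genome_A", "genome_B", "genome_C"]
counts = {
    "OG1": {"genome_A": 1, "genome_B": 1, "genome_C": 1},
    "OG2": {"genome_A": 2, "genome_B": 1, "genome_C": 1},
    "OG3": {"genome_A": 1, "genome_B": 1},
    "OG4": {"genome_A": 1, "genome_B": 1, "genome_C": 1},
}
shared, single_copy = classify_orthogroups(counts, species)
print(shared)
print(single_copy)
```

Because every single-copy orthogroup is also a shared one, the single-copy set is necessarily the smaller of the two, as with the 636 single-copy orthologs among the 4,204 shared orthogroups reported here.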
Phylogenetic analysis also suggests a more recent origin of the Indian races compared to other pathotypes.

Discussion

Advances in genome assembly of Indian Pst pathotypes

Previous genome assemblies of P. striiformis f. sp. tritici (Pst) relied heavily on short-read sequencing technologies, which led to highly fragmented and incomplete genomes. For example, the genome of PST-130 generated using Illumina short reads resulted in a 65 Mb assembly with over 29,000 contigs and an N50 of just 5 kb [10]. Subsequent studies on other Pst races also produced assemblies highlighting the limitations of short-read sequencing [11–15]. In contrast, our current study utilized long-read PacBio sequencing in combination with Illumina short reads to generate hybrid genome assemblies for five Indian Pst pathotypes. These assemblies showed substantial improvements in genome contiguity, with contig numbers reduced to 286–877 and N50 values exceeding 180 kb. All assemblies had genome sizes between 75 and 83 Mb, and BUSCO completeness scores were at least 89.8%, indicating highly reliable and complete genome representations [16].

Gene prediction and functional annotation

Gene prediction revealed between 14,559 and 15,283 genes per pathotype, consistent with previous studies that estimated gene counts in Pst between 15,000 and 25,000 [9, 15, 17]. Functional annotation via InterProScan revealed conserved protein domains such as P-loop NTP hydrolase, kinase-like domains, and WD40 repeats, which are critical to signalling, metabolism, and cellular assembly. Pathotype 110S119 consistently showed the highest number of annotated domains, indicating potential functional enrichment. The relative depletion of domain hits in 238S119 might reflect evolutionary gene loss or divergence.

Transcription factor diversity and gene ontology enrichment

Transcription factor (TF) domain profiling highlighted differences in regulatory potential among pathotypes.
Notably, 46S119 and 78S84 carried more TF-related domains including zinc-finger (C2H2), bZIP, and Homeobox, which may drive differential gene regulation and virulence expression. Gene Ontology classification showed that Biological Processes were most enriched, particularly in 110S84, with comparatively fewer annotations in 238S119. These patterns reinforce the biological complexity and variability among Indian Pst isolates.

CAZy profiles reveal pathotype-specific carbohydrate metabolism

The analysis of carbohydrate-active enzymes (CAZymes) provided insight into pathogenicity-related functions. CAZyme analysis revealed conserved representation of core glycoside hydrolase (GH), glycosyltransferase (GT), and auxiliary activity (AA) families essential for plant cell wall degradation [18–20]. Pathotype 110S119 showed the most diverse and abundant CAZyme repertoire, including a unique presence of AA16 and higher gene counts in GH5, GH47, and GT66. By contrast, 238S119 lacked GT15 and GT66, suggesting lineage-specific gene loss or differential selection. These CAZy variations imply functional adaptations influencing host colonization efficiency and virulence strategies.

Conserved protease machinery with divergent specificity

MEROPS profiling uncovered all five major protease classes across the pathotypes, with serine and metallopeptidases dominating the profiles. Serine proteases (S09, S10) and cysteine proteases (C1, C13) were notably abundant in 110S119 and 46S119, hinting at enhanced proteolytic capabilities for host tissue penetration and immune evasion. Despite general conservation, subtle differences in protease subfamily distribution may reflect differential virulence strategies or stage-specific roles in fungal development [21].

Identification and characterization of repetitive elements in P. striiformis

In the present study, the total transposable element (TE) content across the five Indian P. striiformis pathotypes ranged from 33.97% in Pst46S119 to 37.72% in Pst78S84. These values fall within the range reported earlier for P. striiformis, where TE content has been shown to vary widely from 31% to 48% depending on the isolate and methodology used [9, 15, 17, 22, 23]. Such variability is consistent with the highly dynamic nature of the repeatome in this pathogen.

When compared across rust fungi, the TE proportions observed here are intermediate. In cereal rust pathogens, TE content has been reported to span a much broader spectrum, ranging from 17.8% to as high as 85% of the genome [24–28]. This suggests that while P. striiformis harbors a moderately repeat-rich genome relative to other rust fungi, it still exhibits considerable repeat-driven plasticity.

At the pathotype level, the data revealed notable but modest variation in repeat proportions. Pst78S84 exhibited the highest repeat content (37.72%), whereas Pst46S119 contained the lowest (33.97%). Such differences may reflect lineage-specific proliferation or loss of particular TE families, potentially contributing to genome size variability and differential adaptability. In particular, the enrichment of repeats in Pst78S84 could be indicative of recent TE amplification events, which may increase genomic plasticity and provide a substrate for rapid evolution under host-imposed selection pressures. Conversely, the relatively lower repeat fraction in Pst46S119 may point to either historical TE silencing or a more stabilized genome structure.

The predominance of LTR retrotransposons, especially Gypsy/DIRS1 elements, along with a substantial fraction of unclassified repeats (~21–23%), indicates that a significant portion of the P. striiformis genome remains poorly resolved in terms of repeat family identity. This highlights both the dynamic activity of TEs and the limitations of current repeat libraries in fully capturing the diversity of fungal repetitive elements.
Given that TEs are often implicated in structural rearrangements, gene regulation, and effector diversification, their abundance and variability across pathotypes may contribute directly to the rapid emergence of new virulence traits and host resistance breakdown in P. striiformis.

Functional categorization and hypothetical gene enrichment

COG analysis grouped genes into 25 functional categories. "General function prediction only" was the most represented, followed by roles in translation, protein turnover, and amino acid metabolism. Interestingly, category “S” (Function unknown) retained considerable representation, pointing toward a large pool of hypothetical proteins that merit further experimental validation for roles in pathogenicity or adaptation [29].

Phylogenetic positioning and evolutionary relationships

Phylogenomic analysis using single-copy orthologs reaffirmed the distinctiveness of Indian pathotypes from global lineages. The formation of a monophyletic clade among Indian isolates suggests common ancestry, while subgroup divergence (e.g., 78S84-238S119 vs. 110S119-46S119) indicates recent evolutionary branching possibly shaped by agroecological or host selection pressures. The unexpected proximity of 110S84 to Australian race 134E36 may suggest ancient genetic introgression or shared selection history. These findings align well with earlier virulence phenotyping and SSR-based diversity studies, adding a high-resolution phylogenomic dimension to Indian Pst pathotype evolution [30]. The inclusion of P. graminis and P. triticina reference genomes helped establish evolutionary divergence among rust species, reaffirming earlier findings of limited synteny and high heterozygosity in Pst [31–33].

Significance of long-read sequencing in Pst genomics

Our results underscore the value of long-read sequencing in resolving complex dikaryotic genomes like those of rust fungi.
Compared to earlier short-read-based Indian genome assemblies, which yielded 24,000–32,000 contigs [9], the current assemblies dramatically improved genome contiguity and accuracy. This also facilitated more accurate identification of orthologs, effectors, and other pathogenicity determinants.

Implications for pathogen surveillance and wheat breeding

The high-quality genome assemblies presented in this study provide a valuable resource for pathogen surveillance, effector prediction, and diagnostic development. Differences in CAZymes, proteases, and ortholog content across Indian pathotypes underscore their adaptive potential and support the need for continuous genome-based monitoring. These genomic insights can inform resistance breeding programs by enabling targeted gene deployment and facilitating the development of durable rust-resistant wheat varieties [1, 2, 5, 8].

Conclusions

The present study delivers the first long-read-based high-quality genome assemblies of five prevalent Indian P. striiformis pathotypes. These genomes offer improved completeness, gene annotation, and comparative insights, surpassing previously available draft genomes. Functional classification of genes, proteases, and CAZymes highlights critical pathogenicity mechanisms and inter-pathotype variability. Phylogenetic analysis further reveals distinct evolutionary groupings among Indian and global isolates. Collectively, these genomic resources will serve as a critical platform for molecular diagnostics, pathogen surveillance, and genomics-assisted wheat improvement for stripe rust resistance.

Materials and methods

Genomic DNA isolation

Genomic DNA was extracted from dried urediniospores of five P. striiformis f. sp. tritici (Pst) pathotypes (46S119, 78S84, 110S84, 110S119, and 238S119) using the protocol of Schwessinger et al. [34]. DNA quality and quantity were assessed via agarose gel electrophoresis [35] and a NanoDrop™ 1000 spectrophotometer (Thermo Scientific, Wilmington, USA).

Genome sequencing
Paired-end sequencing libraries were prepared for the five Pst pathotypes and sequenced on the Illumina NovaSeq 6000 platform. Raw reads were demultiplexed and adapter-trimmed using CASAVA v1.8.2 (Illumina Inc.), with a Phred quality threshold of ≥ 30 and a minimum read length of 50 bp. For long-read sequencing, PacBio libraries were constructed from 5 µg of genomic DNA. After Exo III and Exo VII digestion, libraries underwent dual AMPure purification and BluePippin size selection. Approximately 20 picomoles of the final library were loaded onto a SMRT cell and sequenced on the PacBio Sequel System with the Sequel Sequencing Kit 2.0.

De novo hybrid whole genome assembly
Hybrid genome assembly was performed using MaSuRCA v4.1.0 [36], which integrates short Illumina and long PacBio reads. All parameters were kept at their defaults, except that the Jellyfish hash size was set to 20× the genome size estimated with GenomeScope. Genome completeness was evaluated using BUSCO v5.5.0 [37] with the Basidiomycota dataset.

Genome annotation and gene prediction
Genome annotation was carried out for the five P. striiformis pathotypes (46S119, 78S84, 110S84, 110S119, and 238S119) using the Funannotate pipeline v1.8.17, which integrates repeat masking, ab initio gene prediction, and structural and functional annotation. Repeat masking was performed using Tantan. A combination of gene predictors (Augustus, GeneMark-ES, SNAP, GlimmerHMM, and CodingQuarry) was used for coding sequence identification. High-quality RNA-seq data [38] were used to train the predictors, and gene models were refined with PASA v2.5.2 for UTR annotation and structural corrections.

Functional annotation and comparative analysis
Functional annotation was performed using InterProScan to assign conserved domains (Pfam, SMART, TIGRFAMs) and Gene Ontology (GO) terms. EggNOG-mapper was used for GO and COG classification.
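To make the domain-level annotation step concrete, the following is a minimal Python sketch of how InterProScan output could be tallied for one pathotype, for example to rank its most frequent domains. It is an illustration under stated assumptions, not the pipeline used in this study: it assumes InterProScan's standard tab-separated layout (protein accession in column 1, InterPro accession and description in columns 12 and 13), and the function name, the helper, the gene IDs, and the toy rows are all hypothetical.

```python
import csv
import io
from collections import Counter


def count_interpro_entries(tsv_text):
    """Return a Counter mapping (InterPro accession, description) to the
    number of distinct proteins annotated with that entry."""
    proteins_per_entry = {}
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 13:
            continue  # line carries no InterPro accession/description columns
        protein_id, accession, description = row[0], row[11], row[12]
        if accession in ("", "-"):
            continue  # signature hit that is not integrated into InterPro
        # A set ensures a protein with several hits to one entry counts once.
        proteins_per_entry.setdefault((accession, description), set()).add(protein_id)
    return Counter({entry: len(ids) for entry, ids in proteins_per_entry.items()})


def _toy_row(protein_id, signature, accession, description):
    # Hypothetical helper: pads a made-up hit out to the 13 assumed columns.
    columns = [protein_id, "md5", "300", "Pfam", signature, "signature name",
               "10", "120", "1e-20", "T", "01-01-2025", accession, description]
    return "\t".join(columns)


# Toy input (not data from this study): gene1 has two hits to the same entry.
EXAMPLE_TSV = "\n".join([
    _toy_row("gene1", "PF00069", "IPR000719", "Protein kinase domain"),
    _toy_row("gene1", "PF00069", "IPR000719", "Protein kinase domain"),
    _toy_row("gene2", "PF00069", "IPR000719", "Protein kinase domain"),
    _toy_row("gene3", "PF07690", "IPR011701", "Major facilitator superfamily"),
    _toy_row("gene4", "PF99999", "-", "-"),
])

if __name__ == "__main__":
    ranked = count_interpro_entries(EXAMPLE_TSV).most_common(20)
    for (accession, description), n_proteins in ranked:
        print(accession, description, n_proteins)
```

Counting distinct proteins rather than raw hits is a design choice of this sketch; counting every hit instead would weight multi-domain proteins more heavily. Running the same function on each pathotype's output would give one column of a domain-by-pathotype table.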
Additional annotations included identification of CAZymes, peptidases, and signal peptides. Funannotate’s comparative module clustered protein sequences into orthogroups to identify core, accessory, and unique gene families. Outputs included summary statistics, heatmaps, bar plots, and NMDS plots.

Identification and characterization of repetitive elements in P. striiformis
Repetitive elements were detected with a hierarchical approach. Genome-specific de novo repeats were first identified in the assembled Pst isolates using RepeatModeler and RepeatClassifier [39], and known, taxon-specific repeats were then searched against the Repbase database [40]. Both de novo and known repeats were masked using RepeatMasker.

Phylogenetic analysis
Single-copy orthologs from OrthoFinder [41] were used to construct a phylogeny comprising the Indian Pst pathotypes (110S119, 238S119, 46S119, 110S84, 78S84), global Pst isolates (PST130, PST134E16, CYR34, etc.), and reference genomes of P. graminis f. sp. tritici (PGT CRL 75-36-700-3) and P. triticina (PT 15). Sequences were aligned using ClustalW, the phylogeny was inferred using RAxML, and the tree was visualized using FigTree.

Abbreviations
CTAB: Cetyl trimethylammonium bromide; Pgt: Puccinia graminis f. sp. tritici; Pst: Puccinia striiformis f. sp. tritici; Pt: Puccinia triticina; SNP: Single nucleotide polymorphism; Yr: Yellow (stripe rust) resistance; MaSuRCA: Maryland Super-Read Celera Assembler; BUSCO: Benchmarking Universal Single-Copy Orthologs

Declarations
Data Availability: The whole-genome sequencing data generated in this study have been deposited in the International Nucleotide Sequence Database Collaboration (INSDC) under BioProject accession number PRJEB97944.
The corresponding BioSample accessions are ERS26890115 (Pst pathotype 110S119), ERS26890116 (Pst pathotype 110S84), ERS26890117 (Pst pathotype 238S119), ERS26890118 (Pst pathotype 46S119), and ERS26890119 (Pst pathotype 78S84). The datasets are currently confidential and will be released automatically upon acceptance of this manuscript, as per repository policies.

Acknowledgements
The first author is grateful to SERB (DST), CII, and NGB Diagnostics Pvt. Ltd. for the Prime Minister Fellowship, and to the Indian Institute of Wheat and Barley Research, Karnal, for providing the stripe rust pathotypes.

Funding Information
No dedicated funding was available for this research work.

Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.

Competing interests
The authors declare that they have no competing interests.

Authors’ contributions
AS performed genome assembly, annotation, and drafted the manuscript. DS and RG contributed to genome annotation and repeat analysis. OPG and JK carried out pathotype multiplication, virulence testing, and DNA isolation. MC and ISY assisted in data analysis and phylogenomic interpretation. SK contributed to data acquisition, provided supervision, and critical inputs. PC conceived and supervised the study, coordinated data acquisition and research activities, and finalized the manuscript. All authors read and approved the final manuscript.

References
1. Curtis BC, Rajaram S, Gomez Macpherson H. Bread Wheat: Improvement and Production. FAO Plant Production and Protection Series. Volume 30. Rome: FAO; 2002.
2. Duveiller E, Singh RP, Nicol JM. The challenges of maintaining wheat productivity: pests, diseases, and potential epidemics. Euphytica. 2007;157(3):417–30.
3. Figueroa M, Hammond-Kosack KE, Solomon PS. A review of wheat diseases: a field perspective. Mol Plant Pathol. 2018;19:1523–36.
4. Chen X, Kang Z, editors. Stripe Rust. Dordrecht, Netherlands: Springer; 2017.
5. Wellings CR. Global status of stripe rust: a review of historical and current threats. Euphytica. 2011;179:129–44.
6. Pannu PPS, Mohan C, Gurdeep SG, Jaspal K. Occurrence of yellow rust of wheat, its impact on yield viz-a-viz its management. Plant Dis Res. 2010;25:144–50.
7. Bayles R, Flath K, Hovmoller M, de Vallavieille-Pope C. Breakdown of the Yr17 resistance to yellow rust of wheat in Northern Europe. Agronomie. 2000;20:805–11.
8. Liu W, Maccaferri M, Rynearson S, Letta T, Zegeye H, Tuberosa R, Chen X, Pumphrey M. Novel sources of stripe rust resistance identified by genome-wide association mapping in Ethiopian durum wheat (Triticum turgidum ssp. durum). Front Plant Sci. 2017;8:774. https://doi.org/10.3389/fpls.2017.00774
9. Kiran K, Rawal HC, Dubey H, Jaswal R, Bhardwaj SC, Prasad P, Pal D, Devanna BN, Sharma TR. Dissection of genomic features and variations of three pathotypes of Puccinia striiformis through whole genome sequencing. Sci Rep. 2017;7:1–16.
10. Cantu D, Govindarajulu M, Kozik A, Wang M, Chen X, Kojima KK, Jurka J, Michelmore RW, Dubcovsky J. Next generation sequencing provides rapid access to the genome of Puccinia striiformis f. sp. tritici, the causal agent of wheat stripe rust. PLoS ONE. 2011;6:e24230.
11. Schwessinger B, Rathjen JP. Extraction of high molecular weight DNA from fungal rust spores for long read sequencing. In: Wheat Rust Diseases. Methods Mol Biol. New York: Humana; 2017. pp. 49–57.
12. Schwessinger B, Chen YJ, Tien R, Vogt JK, Sperschneider J, Nagar R, McMullan M, Sicheritz-Ponten T, Sorensen CK, Hovmoller MS, Rathjen JP, Justesen AF. Distinct life histories impact dikaryotic genome evolution in the rust fungus Puccinia striiformis causing stripe rust in wheat. Genome Biol Evol. 2020;12:597–617.
13. Vasquez-Gross H, Kaur S, Epstein L, Dubcovsky J. A haplotype-phased genome of wheat stripe rust pathogen Puccinia striiformis f. sp. tritici, race PST-130 from the Western USA. PLoS ONE. 2020;15(11):e0238611.
14. Li C, Qiao L, Lu Y, et al. Gapless genome assembly of Puccinia triticina provides insights into chromosome evolution in Pucciniales. Microbiol Spectr. 2023;11:e0282822.
15. Yadav IS, Bhardwaj SC, Kaur J, Singla D, Kaur S, Kaur H, et al. Whole genome resequencing and comparative genome analysis of three Puccinia striiformis f. sp. tritici pathotypes prevalent in India. PLoS ONE. 2022;17(11):e0262697.
16. Cisse OH, Stajich JE. FGMP: assessing fungal genome completeness. BMC Bioinformatics. 2019;20:184. https://doi.org/10.1186/s12859-019-2782-9
17. Zheng W, Huang L, Kang Z, et al. High genome heterozygosity and endemic genetic recombination in the wheat stripe rust fungus. Nat Commun. 2013;4:2673.
18. Van Den Brink J, De Vries RP. Fungal enzyme sets for plant polysaccharide degradation. Appl Microbiol Biotechnol. 2011;91:1477–92. https://doi.org/10.1007/s00253-011-3473-2
19. Xia C, Qiu A, Wang M, Liu T, Chen W, Chen X. Current status and future perspectives of genomics research in the rust fungi. Int J Mol Sci. 2022;23:9629.
20. Park BH, Karpinets TV, Syed MH, Leuze MR, Uberbacher EC. CAZymes Analysis Toolkit (CAT): web service for searching and analyzing carbohydrate-active enzymes in a newly sequenced organism using CAZy database. Glycobiology. 2010;20:1574–84. https://doi.org/10.1093/glycob/cwq106
21. Moolhuijzen P, See PT, Hane JK, Shi G, Liu Z, Oliver RP, et al. Comparative genomics of the wheat fungal pathogen Pyrenophora tritici-repentis reveals chromosomal variations and genome plasticity. BMC Genomics. 2018;19:1–23. https://doi.org/10.1186/s12864-018-4680-3
22. Cantu D, Segovia V, MacLean D, Bayles R, Chen X, Kamoun S, et al. Genome analyses of the wheat yellow (stripe) rust pathogen Puccinia striiformis f. sp. tritici reveal polymorphic and haustorial expressed secreted proteins as candidate effectors. BMC Genomics. 2013;14:270. https://doi.org/10.1186/1471-2164-14-270
23. Xia C, Wang M, Yin C, Cornejo OE, Hulbert SH, Chen X. Genome sequence resources for the wheat stripe rust pathogen (Puccinia striiformis f. sp. tritici) and the barley stripe rust pathogen (Puccinia striiformis f. sp. hordei). MPMI. 2018;31:1117–20. https://doi.org/10.1094/MPMI-04-18-0107-A
24. Aime MC, McTaggart AR, Mondo SJ, Duplessis S. Phylogenetics and phylogenomics of rust fungi. Adv Genet. 2017;100:267–307.
25. Krishnan P, Meile L, Plissonneau C, et al. Transposable element insertions shape gene regulation and melanin production in a fungal pathogen of wheat. BMC Biol. 2018;16:78.
26. Raffaele S, Kamoun S. Genome evolution in filamentous plant pathogens: why bigger can be better. Nat Rev Microbiol. 2012;10:417–30.
27. Li C, Qiao L, Lu Y, et al. Gapless genome assembly of Puccinia triticina provides insights into chromosome evolution in Pucciniales. Microbiol Spectr. 2023;11:e0282822.
28. Lian J, Li Y, Dodds PN, et al. Haplotype-phased and chromosome-level genome assembly of Puccinia polysora, a giga-scale fungal pathogen causing Southern corn rust. Mol Ecol Resour. 2023;23:601–20.
29. Liu G, Wu Z, Peng Y, Shang X, Gao L. Integrated transcriptome and proteome analysis provide insight into the ribosome inactivating proteins in Plukenetia volubilis seeds. Int J Mol Sci. 2022;23(17):9562.
30. Singh H, Kaur J, Bala R, Srivastava P, Bains NS. Virulence and genetic diversity of Puccinia striiformis f. sp. tritici isolates in sub-mountainous area of Punjab, India. Phytoparasitica. 2020;48:383–95.
31. Duplessis S, Cuomo CA, Lin YC, et al. Obligate biotrophy features unravelled by the genomic analysis of rust fungi. Proc Natl Acad Sci USA. 2011;108(22):9166–71.
32. Duplessis S, Bakkeren G, Hamelin R. Advancing knowledge on biology of rust fungi through genomics. Adv Bot Res. 2014;70:173–209.
33. Cuomo CA, Bakkeren G, Khalil HB, Panwar V, Joly D, Linning R, Sakthikumar S, Song X, Adiconis X, Fan L, Goldberg JM, Levin JZ, Young S, Zeng Q, Anikster Y, Bruce M, Wang M, Yin C, McCallum B, Szabo LJ, Hulbert S, Chen X, Fellers JP. Comparative analysis highlights variable genome content of wheat rusts and divergence of the mating loci. G3 Genes Genomes Genet. 2017;7:361–76.
34. Schwessinger B, Sperschneider J, Cuddy WS, Garnica DP, Miller ME, Taylor JM, Dodds PN, Figueroa M, Park RF, Rathjen JP. A near-complete haplotype-phased genome of the dikaryotic wheat stripe rust fungus Puccinia striiformis f. sp. tritici reveals high inter-haplotype diversity. mBio. 2018;9:e02275-17.
35. Green MR, Sambrook J. Analysis of DNA by agarose gel electrophoresis. Cold Spring Harb Protoc. 2019;2019(5):pdb.top100388. https://doi.org/10.1101/pdb.top100388
36. Zimin AV, Marçais G, Puiu D, Roberts M, Salzberg SL, Yorke JA. The MaSuRCA genome assembler. Bioinformatics. 2013;29(21):2669–77.
37. Manni M, Berkeley MR, Seppey M, Simão FA, Zdobnov EM. BUSCO update: novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. Mol Biol Evol. 2021;38(10):4647–54.
38. Gutha GV, Kaur J, Singla D, Chhuneja P, Saharan A, Gangwar OP, Bala R, Mir RR, Tak PS. Use of field pathogenomics approach for Puccinia striiformis f. sp. tritici race identification and phylogenomic delineation in north India. World J Microbiol Biotechnol. 2025;41(5):166.
39. Tarailo-Graovac M, Chen N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr Protoc Bioinf. 2009;1–14. https://doi.org/10.1002/0471250953.bi0410s25
40. Bao W, Kojima KK, Kohany O. Repbase Update, a database of repetitive elements in eukaryotic genomes. Mob DNA. 2015;6:4–9. https://doi.org/10.1186/s13100-015-0041-9
41. Emms DM, Kelly S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 2019;20:238.

Additional Declarations
No competing interests reported.

Supplementary Files
SupplementaryInformation.docx
Figure legends
Figure 1. BUSCO genome completeness assessment of five Indian Puccinia striiformis f. sp. tritici (Pst) pathotypes. The bar graph displays the percentage of Benchmarking Universal Single-Copy Orthologs (BUSCOs) classified into four categories: Complete and Single-Copy (S), Complete and Duplicated (D), Fragmented (F), and Missing (M), based on the Basidiomycota lineage dataset.
Figure 2. Distribution of top 20 InterPro protein domains across five Indian Pst pathotypes. Heatmap shows the relative abundance of each domain (rows) in the pathotypes 110S119, 238S119, 46S119, 110S84, and 78S84 (columns). Color intensity represents the domain count based on InterProScan annotation, highlighting conserved and variable protein domain families across the pathotypes.
Figure 3. Bar plot showing the distribution of annotated Gene Ontology (GO) terms under three categories: Biological Process (BP), Molecular Function (MF), and Cellular Component (CC) across five Puccinia striiformis f. sp. tritici (Pst) pathotypes (110S119, 238S119, 46S119, 110S84, and 78S84). The figure highlights conserved GO annotation patterns across all pathotypes, with BP representing the highest annotation count, followed by MF and CC.
Figure 4. Heatmap showing the distribution of Cluster of Orthologous Groups (COGs) of proteins across five Puccinia striiformis f. sp. tritici (Pst) pathotypes. The rows represent functional COG categories, and the columns represent the individual Pst pathotypes (110S119, 238S119, 46S119, 110S84, 78S84). Color intensity indicates the number of proteins assigned to each COG category per pathotype. This visualization highlights conserved and variable functional capabilities among Indian stripe rust pathotypes, particularly in categories related to transcription, signal transduction, carbohydrate metabolism, and replication.
Figure 5. Distribution of MEROPS protease families across five Puccinia striiformis f. sp. tritici (Pst) pathotypes. Bar plot illustrates gene counts from each major MEROPS family, including aspartic (A), cysteine (C), metallo (M), serine (S), and threonine (T) peptidases and protease inhibitors (I), in pathotypes 110S119, 238S119, 46S119, 110S84, and 78S84.
Figure 6. Distribution of repetitive elements in five Puccinia striiformis f. sp. tritici pathotypes. Retroelements (blue) and DNA transposons (orange) together contributed ~11–14% of the genome, while unclassified repeats (green) represented the largest fraction (~21–23%).
Figure 7. Phylogenetic relationship of five Indian Puccinia striiformis f. sp. tritici (Pst) pathotypes (highlighted in orange) with global reference isolates. The tree was constructed based on single-copy orthologous genes. Indian isolates clustered into a distinct clade, indicating shared ancestry and regional adaptation. Puccinia triticina (PT15) and Puccinia graminis (PGT CRL 75-36-700) were used as outgroups.
Among biotic stresses, fungal diseases account for 15\u0026ndash;20% of annual economic losses in wheat worldwide [\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e]. Stripe rust, caused by \u003cem\u003ePuccinia striiformis\u003c/em\u003e f. sp. \u003cem\u003etritici\u003c/em\u003e (Pst), is one of the most destructive foliar diseases of wheat, capable of causing complete crop failure under conducive environmental conditions [\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e].\u003c/p\u003e\u003cp\u003eThe disease has a global distribution and is reported in over 60 countries, with frequent epidemics observed in Ethiopia, the United States, Australia, and China [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e]. In India, stripe rust poses a recurrent threat, especially in the North-Western Plains Zone (NWPZ), where early disease onset during December and January coincides with favourable microclimatic conditions. For instance, severe epidemics occurred due to the emergence of the virulent pathotype 78S84, which overcame \u003cem\u003eYr27\u003c/em\u003e resistance in the widely cultivated variety PBW343, resulting in significant yield losses [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e].\u003c/p\u003e\u003cp\u003eGenetic resistance is the most effective and sustainable strategy for managing stripe rust. However, resistance is often short-lived due to the rapid evolution of Pst pathotypes. Despite its predominantly clonal reproduction, Pst exhibits remarkable adaptability, quickly overcoming newly deployed resistance genes [\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e]. Historical examples include the breakdown of \u003cem\u003eYr2\u003c/em\u003e resistance in the 1970s, \u003cem\u003eYr9\u003c/em\u003e in the 1990s, and \u003cem\u003eYr27\u003c/em\u003e in recent decades [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e]. 
Notably, global Pst populations have shown increasing virulence complexity and aggressiveness in regions such as Europe, North America, and Asia [\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e].\u003c/p\u003e\u003cp\u003eIn India, continuous efforts are made to develop and deploy wheat varieties with diverse and durable resistance genes. However, the rapid evolution of Pst often renders resistance ineffective over time, emphasizing the need for real-time monitoring and molecular characterization of pathogen populations. Understanding the genetic structure and diversity of the pathogen is critical for deploying effective resistance genes and forecasting potential outbreaks.\u003c/p\u003e\u003cp\u003eWith the advent of next-generation sequencing technologies, genome-wide analyses of rust pathogens have become feasible. Yet, earlier genome assemblies of Indian Pst pathotypes were primarily based on short-read sequencing, which failed to resolve the complexity of the dikaryotic rust genomes due to high heterozygosity and repetitive content [\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e]. In the present study, we employed a hybrid sequencing approach combining PacBio long-read and Illumina short-read technologies to generate high-quality genome assemblies for five Indian Pst pathotypes: 110S119, 238S119, 46S119, 110S84, and 78S84. We aimed to investigate their genomic features, assess evolutionary relationships with globally sequenced Pst isolates, and provide valuable insights to support effective and durable stripe rust management in Indian wheat production systems.\u003c/p\u003e"},{"header":"Results","content":"\u003cp\u003e\u003cb\u003eWhole-genome sequencing of prevalent Indian\u003c/b\u003e \u003cb\u003ePuccinia striiformis\u003c/b\u003e \u003cb\u003ef. 
sp.\u003c/b\u003e \u003cb\u003etritici\u003c/b\u003e \u003cb\u003epathotypes\u003c/b\u003e\u003c/p\u003e\u003cp\u003eThe primary objective of this study was to generate high-quality reference genome assemblies for prevalent Indian \u003cem\u003eP. striiformis\u003c/em\u003e f. sp. \u003cem\u003etritici\u003c/em\u003e (Pst) pathotypes with distinct virulence profiles (Supplementary Table \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e). High-molecular-weight DNA was extracted from five Indian Pst pathotypes (110S119, 238S119, 46S119, 110S84, and 78S84) and sequenced on both Illumina HiSeq (short-read) and PacBio (long-read) platforms. Paired-end Illumina sequencing generated 7.21, 9.98, 7.24, 10.06, and 7.23 Gb of data, while PacBio sequencing yielded 6.21, 5.86, 5.20, 5.61, and 5.56 Gb for 110S119, 238S119, 46S119, 110S84, and 78S84, respectively (Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e). The combined sequencing coverage ranged from 155X to 305X, enabling robust hybrid assemblies.\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eRaw read and assembly statistics of five Pst pathotypes\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"13\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" 
char=\".\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c7\" colnum=\"7\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c8\" colnum=\"8\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c9\" colnum=\"9\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c10\" colnum=\"10\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c11\" colnum=\"11\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c12\" colnum=\"12\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c13\" colnum=\"13\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\" morerows=\"1\" rowspan=\"2\"\u003e\u003cp\u003ePathotype\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colspan=\"3\" nameend=\"c4\" namest=\"c2\"\u003e\u003cp\u003eRaw Read Statistics\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colspan=\"9\" nameend=\"c13\" namest=\"c5\"\u003e\u003cp\u003eAssembly Statistics\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eIllumina Reads (Gb)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003ePacBio Reads\u003c/p\u003e\u003cp\u003e(Gb)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003eTotal Coverage (X)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003eGenome Size (Mb)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c6\"\u003e\u003cp\u003eLargest Contig (bp)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c7\"\u003e\u003cp\u003eAv. 
Contig Length (bp)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c8\"\u003e\u003cp\u003eTotal Contigs\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c9\"\u003e\u003cp\u003eN50 (bp)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c10\"\u003e\u003cp\u003eGC (%)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c11\"\u003e\u003cp\u003eL50\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c12\"\u003e\u003cp\u003eN90 (bp)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c13\"\u003e\u003cp\u003eL90\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e110S119\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e7.21\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e6.21\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e167X\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e75.21\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e2414684\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e\u003cp\u003e207781\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c8\"\u003e\u003cp\u003e362\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c9\"\u003e\u003cp\u003e387176\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c10\"\u003e\u003cp\u003e44.43\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c11\"\u003e\u003cp\u003e52\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c12\"\u003e\u003cp\u003e91560\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" 
colname=\"c13\"\u003e\u003cp\u003e207\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e238S119\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e9.98\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e5.86\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e219X\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e77.56\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e1948682\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e\u003cp\u003e233630\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c8\"\u003e\u003cp\u003e332\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c9\"\u003e\u003cp\u003e478465\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c10\"\u003e\u003cp\u003e44.42\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c11\"\u003e\u003cp\u003e48\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c12\"\u003e\u003cp\u003e110518\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c13\"\u003e\u003cp\u003e182\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e46S119\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e7.24\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e5.20\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e168X\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e75.43\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" 
colname=\"c6\"\u003e\u003cp\u003e5465759\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e\u003cp\u003e247327\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c8\"\u003e\u003cp\u003e305\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c9\"\u003e\u003cp\u003e465309\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c10\"\u003e\u003cp\u003e44.42\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c11\"\u003e\u003cp\u003e42\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c12\"\u003e\u003cp\u003e107877\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c13\"\u003e\u003cp\u003e172\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e110S84\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e10.06\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e5.61\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e305X\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e75.91\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e2098857\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e\u003cp\u003e265439\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c8\"\u003e\u003cp\u003e286\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c9\"\u003e\u003cp\u003e435920\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c10\"\u003e\u003cp\u003e44.44\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c11\"\u003e\u003cp\u003e48\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" 
colname=\"c12\"\u003e\u003cp\u003e134267\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c13\"\u003e\u003cp\u003e171\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e78S84\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e7.23\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e5.56\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e155X\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e83.03\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e1058682\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e\u003cp\u003e94676\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c8\"\u003e\u003cp\u003e877\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c9\"\u003e\u003cp\u003e182087\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c10\"\u003e\u003cp\u003e44.39\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c11\"\u003e\u003cp\u003e129\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c12\"\u003e\u003cp\u003e42772\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c13\"\u003e\u003cp\u003e484\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003e\u003cb\u003eHybrid\u003c/b\u003e \u003cb\u003ede novo\u003c/b\u003e \u003cb\u003egenome assembly\u003c/b\u003e\u003c/p\u003e\u003cp\u003eGenome assembly using MaSuRCA v4.1.0 produced assemblies with 286 to 877 contigs, and genome sizes ranged from 75.21 Mb (110S119) to 83.03 Mb (78S84). 
Largest contig lengths varied from 1.06 to 5.47 Mb, with mean contig sizes ranging from 94.7 kb to 265.4 kb. N50 values ranged from 182.1 kb in 78S84 to 478.5 kb in 238S119, and GC content was about 44% in all assemblies (Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e). BUSCO completeness scores exceeded 90% in all pathotypes except 110S84 (89.8%), suggesting good assembly quality and representation of conserved fungal genes in the present assemblies (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eThese long-read hybrid assemblies offer the most comprehensive genomic representation of Indian Pst pathotypes to date and demonstrate improved coverage compared to earlier short-read-based assemblies.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e\u003ch2\u003eGene prediction and ortholog identification\u003c/h2\u003e\u003cp\u003eGene prediction using Funannotate identified 14,559 to 15,283 total genes and 14,474 to 15,925 protein-coding genes across the five Pst genomes, consistent with prior estimates for Pst (15,000\u0026ndash;25,000 genes). The maximum number of predicted proteins (15,925) was observed in 110S119, and the fewest (14,474) in 238S119. Predicted tRNA genes ranged from 485 to 545. 
Average gene lengths spanned 1.62 to 1.82 kb (Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e).\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eGene prediction and gene structure features in five Indian Pst pathotypes\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"6\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eFeature\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003e110S119\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003e238S119\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003e46S119\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003e110S84\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c6\"\u003e\u003cp\u003e78S84\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eNumber of genes predicted\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e14843\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c3\"\u003e\u003cp\u003e14559\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e14985\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e14930\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e15283\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eNumber of proteins\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e15925\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e14474\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e15573\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e15368\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e15400\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eNumber of tRNAs\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e490\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e519\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e509\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e485\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e545\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eNumber of ncRNAs\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e0\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e0\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0\u003c/p\u003e\u003c/td\u003e\u003ctd 
align=\"left\" colname=\"c6\"\u003e\u003cp\u003e0\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eNumber of rRNAs\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e0\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e0\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e0\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eAverage gene length (bp)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e1619.77\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e1653.29\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e1764.48\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e1822.33\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e1654.08\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eCDS transcripts\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e15925\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e14474\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e15573\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e15368\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e15400\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eCDS with 3\u0026rsquo; UTRs\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c2\"\u003e\u003cp\u003e1597\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e2334\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e1994\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e3191\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e2852\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eCDS with no UTR\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e7502\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e9732\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e7907\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e6337\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e9785\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eCDS with 5\u0026rsquo; and 3\u0026rsquo; UTRs\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e6826\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e2408\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e5672\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e5840\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e2763\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eCDS complete\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e15711\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e14231\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c4\"\u003e\u003cp\u003e15358\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e15200\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e14893\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eTotal number of exons\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e97566\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e76047\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e91967\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e90418\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e82222\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eTotal CDS exons\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e76016\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e66647\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e73169\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e71359\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e70625\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eMultiple exon transcripts\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e14429\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e13254\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e14396\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e15012\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c6\"\u003e\u003cp\u003e14062\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eSingle exon transcripts\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e1496\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e1220\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e1177\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e356\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e1338\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eAverage exon length (bp)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e311\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e285\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e301\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e321\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e288\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eAverage protein length (AA)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e442\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e433\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e436\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e422\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e432\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003eCoding sequence transcripts ranged from 14,474 to 15,925. 
UTRs and exonic features varied, with 110S84 showing the highest number of genes with 3' UTRs (3,191) and 110S119 showing the highest number of genes with both 5' and 3' UTRs (6,826). Total CDS exons ranged from 66,647 to 76,016 (Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e). Ortholog analysis showed that 1,128 to 1,667 proteins were unique to individual pathotypes, with the highest number in 110S119. Proteins with at least one ortholog ranged from 12,867 to 13,898 in these pathotypes (Table\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e).\u003c/p\u003e\u003c/div\u003e\n\u003ch3\u003eFunctional annotation of predicted proteins\u003c/h3\u003e\n\u003cp\u003eComprehensive functional annotation of predicted proteins from the five Pst pathotypes was performed using InterProScan, which integrates domain-level annotations from databases such as SUPERFAMILY, EggNOG, PFAM, and CDD (Table\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e). Among the pathotypes, 110S119 exhibited the highest number of annotated proteins in most categories, including 8,696 InterProScan hits, 11,202 EggNOG annotations, and 6,965 Pfam domain matches, suggesting a broader functional repertoire (Table\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e). 
In contrast, 238S119 had comparatively fewer annotations across all datasets.\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab3\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 3\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eFunctional annotation of predicted genes\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"6\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eAnnotation Category\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003e110S119\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003e238S119\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003e46S119\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003e110S84\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c6\"\u003e\u003cp\u003e78S84\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eGO terms (total)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e7269\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e6328\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c4\"\u003e\u003cp\u003e6941\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e6686\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e6860\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eInterProScan hits\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e8696\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e7706\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e8385\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e8122\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e8314\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eEggNOG annotations\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e11202\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e10172\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e10926\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e10666\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e10915\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003ePfam domains\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e6965\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e6130\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e6685\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e6417\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c6\"\u003e\u003cp\u003e6546\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eCAZy enzyme entries\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e338\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e305\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e316\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e328\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e329\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eMEROPS protease entries\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e285\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e242\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e291\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e273\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e274\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eProteins with \u0026ge;\u0026thinsp;1 ortholog\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e13610\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e12867\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e13898\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e13231\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e13314\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eUnique proteins\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c2\"\u003e\u003cp\u003e1667\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e1128\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e1166\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e1552\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e1449\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eExons with protein evidence (%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e78.95\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e83.05\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e80.82\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e81.58\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e82.21\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eExons with transcript evidence (%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e0.66\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e0.18\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.25\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e49.82\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e0.18\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003eThe InterPro domain distribution profile (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e) revealed the P-loop containing nucleoside triphosphate hydrolase (IPR027417) as the most abundant domain across all pathotypes, 
underscoring its central role in ATP/GTP binding and hydrolysis, which are fundamental to numerous cellular processes including signal transduction and transport. Other prevalent domains included the Protein kinase-like domain (IPR011009) and the WD40 repeat (IPR017986), which are associated with protein phosphorylation and multi-protein complex assembly, respectively.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eFurther classification based on transcription factor (TF)-related InterPro domains (Fig. \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e) showed that 78S84 and 46S119 pathotypes harbored higher counts of TF-associated domains. Among these, the C2H2-type zinc finger (IPR007087), bZIP (IPR004827), and Homeobox domain (IPR001356) were frequently represented, indicating conserved regulatory mechanisms. Subtle differences in TF domain composition among pathotypes may reflect pathotype-specific transcriptional regulation influencing virulence or adaptation.\u003c/p\u003e\n\u003ch3\u003eGene ontology\u003c/h3\u003e\n\u003cp\u003eGO classification assigned genes to three major categories: Biological Process (BP), Molecular Function (MF), and Cellular Component (CC). BP terms were most represented (1,523\u0026ndash;1,538 genes), followed by MF (1,388\u0026ndash;1,407 genes), and CC (609\u0026ndash;615 genes). 
Pathotype 110S84 showed the highest annotation counts for both BP and MF terms, while 238S119 had the lowest (Table\u0026nbsp;\u003cspan refid=\"Tab4\" class=\"InternalRef\"\u003e4\u003c/span\u003e, Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e).\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab4\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 4\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eSummary of Functional Annotations (GO and CAZy) across Pst Pathotypes\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"6\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eFunctional Category\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003ePst 110S119\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003ePst 238S119\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003ePst 46S119\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003ePst 110S84\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c6\"\u003e\u003cp\u003ePst 
78S84\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eGO Terms\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eBP (Biological Process)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e1532\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e1523\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e1524\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e1538\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e1534\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eMF (Molecular Function)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e1404\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e1388\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e1391\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e1407\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e1406\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eCC (Cellular Component)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" 
char=\".\" colname=\"c2\"\u003e\u003cp\u003e615\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e615\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e609\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e614\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e614\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eCAZy Families\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eGH (Glycoside Hydrolase)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e170\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e157\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e157\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e164\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e166\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eGT (Glycosyl Transferase)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e74\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e68\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" 
colname=\"c4\"\u003e\u003cp\u003e72\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e73\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e72\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eCE (Carbohydrate Esterase)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e47\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e45\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e47\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e48\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e47\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eAA (Auxiliary Activity)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e47\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e35\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e39\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e42\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e42\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eCBM (Carbohydrate-Binding Module)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" 
char=\".\" colname=\"c5\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\n\u003ch3\u003eCAZyme and protease profiling\u003c/h3\u003e\n\u003cp\u003eCarbohydrate-active enzymes (CAZymes) play key roles in the degradation, modification, and synthesis of carbohydrates and glycoconjugates, and are essential for pathogenic fungi in host infection and nutrient acquisition. The CAZy database classifies these enzymes into six major classes: glycoside hydrolases (GHs), glycosyltransferases (GTs), carbohydrate esterases (CEs), polysaccharide lyases (PLs), auxiliary activities (AAs), and carbohydrate-binding modules (CBMs). A comprehensive analysis of CAZy families across the five \u003cem\u003eP. striiformis\u003c/em\u003e pathotypes (110S119, 238S119, 46S119, 110S84, and 78S84) identified a total of 138 CAZy gene families, with varying representation among the classes (Table\u0026nbsp;\u003cspan refid=\"Tab4\" class=\"InternalRef\"\u003e4\u003c/span\u003e, Table S2).\u003c/p\u003e\u003cp\u003eA total of 29 GT families were identified across the pathotypes. All 29 GT families were present in pathotypes 110S119, 46S119, 78S84, and 110S84, while 238S119 lacked GT66 and GT15. Because GT66 was consistently present in the other four pathotypes, its absence suggests possible gene loss or reduced selection pressure in 238S119 (Table\u0026nbsp;\u003cspan refid=\"Tab4\" class=\"InternalRef\"\u003e4\u003c/span\u003e, Table S2).\u003c/p\u003e\u003cp\u003eGHs constituted the most abundant class among the CAZymes. Pathotype 110S119 had the highest number of GH genes (170), followed closely by 78S84 (166), while 238S119 and 46S119 had 157 genes each. GH5 was highly represented in pathotypes 110S119 and 46S119, whereas GH18 was more abundant in 110S84 and 78S84. 
GH47 showed increased representation in 110S119 (Table\u0026nbsp;\u003cspan refid=\"Tab4\" class=\"InternalRef\"\u003e4\u003c/span\u003e, Table S2).\u003c/p\u003e\u003cp\u003eEight AA families were identified across the pathotypes. The families AA1 to AA7 and AA9 were consistently present in all five Pst pathotypes. Interestingly, AA16 was uniquely present in 110S119. Pathotype 110S119 also harboured the maximum number of AA genes (47), while the minimum (35) was found in 238S119.\u003c/p\u003e\u003cp\u003eFive CE families (CE4, CE5, CE8, CE10, and CE16) were identified. The total number of CE genes ranged from 45 in 238S119 to 48 in 110S84, with the remaining pathotypes harbouring 47 genes each. CE4 showed higher representation in 238S119 and 110S84 (Table\u0026nbsp;\u003cspan refid=\"Tab4\" class=\"InternalRef\"\u003e4\u003c/span\u003e, Table S2). Only one CBM family, CBM21, was detected, represented by a single gene in each pathotype. Among the 44 PL families in the CAZy database, only PL1 and PL35 were detected, with two to three PL genes per pathotype. Pathotype 110S119 had only two PL genes, while the others had three.\u003c/p\u003e\u003cp\u003eDetailed data on the distribution of individual CAZy families across the five pathotypes are summarised in Table S2, revealing patterns of specialization and possible adaptation. These variations may reflect differential carbohydrate metabolism and host adaptation strategies among Indian Pst pathotypes.\u003c/p\u003e\n\u003ch3\u003eClusters of Orthologous Groups (COG) functional categorization\u003c/h3\u003e\n\u003cp\u003eA total of 25 functional categories were identified, spanning essential cellular and metabolic processes. 
The heatmap (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e) depicts the distribution and relative abundance of gene counts per COG category among pathotypes 110S119, 238S119, 46S119, 110S84, and 78S84.\u003c/p\u003e\u003cp\u003eThe most prominent categories across all pathotypes were \u0026ldquo;Posttranslational modification, protein turnover, chaperones\u0026rdquo; (category O) and \u0026ldquo;Replication, recombination and repair\u0026rdquo; (category L). These were followed by \u0026ldquo;Translation, ribosomal structure and biogenesis\u0026rdquo; (category J) and \u0026ldquo;Amino acid transport and metabolism\u0026rdquo; (category E), indicating the active involvement of these fungi in protein synthesis, folding, and metabolic regulation. Categories associated with transcription (K) and carbohydrate metabolism (G) were also well represented, highlighting core biological functions essential for fungal survival and host colonization.\u003c/p\u003e\u003cp\u003eRelatively fewer genes were assigned to signal transduction (T), secondary metabolite biosynthesis (Q), defense mechanisms, cell motility (N), and nuclear structure (Y), indicating specialized but limited roles. The \u0026ldquo;Function unknown\u0026rdquo; category (S) was considerably represented in all pathotypes, underscoring the presence of hypothetical proteins or yet-to-be-characterized functions in the \u003cem\u003eP. 
striiformis\u003c/em\u003e genome.\u003c/p\u003e\u003cp\u003eOverall, the COG functional distribution revealed a conserved profile among pathotypes with subtle differences that may reflect pathotype-specific adaptations or evolutionary divergence.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cdiv id=\"Sec8\" class=\"Section2\"\u003e\u003ch2\u003eMEROPS protease family profiling\u003c/h2\u003e\u003cp\u003eMEROPS-based analysis revealed five major catalytic classes of peptidases (serine [S], metallo [M], cysteine [C], threonine [T], and aspartic [A]) across all five Pst pathotypes. Serine proteases were the most abundant class, dominated by members of the S09 and S10 families, followed by metallopeptidases (notably M28 and M20) and cysteine proteases (particularly C1 and C13 families) (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e). Subclass-level analysis (Fig. S2) highlighted a consistent and conserved distribution of key proteolytic subfamilies across pathotypes, suggesting their crucial role in fungal development, nutrient acquisition, and host-pathogen interactions. The relative enrichment of secreted serine and cysteine peptidases points toward their involvement in virulence and host tissue degradation.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003e\u003cb\u003eIdentification and characterization of repetitive elements in\u003c/b\u003e \u003cb\u003eP. striiformis\u003c/b\u003e\u003c/p\u003e\u003cp\u003eRepetitive elements constituted a substantial portion of the genomes of the five Indian \u003cem\u003eP. striiformis\u003c/em\u003e pathotypes, ranging from 33.97% in 46S119 to 37.72% in 78S84 (Table\u0026nbsp;\u003cspan refid=\"Tab5\" class=\"InternalRef\"\u003e5\u003c/span\u003e; Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e). 
The proportion of interspersed repeats varied between 33.06% and 36.81%, accounting for the bulk of the repetitive fraction, while smaller contributions came from simple repeats (0.64\u0026ndash;0.68%), small RNAs (0.03\u0026ndash;0.08%), satellites (\u0026le;\u0026thinsp;0.03%), and low-complexity sequences (~\u0026thinsp;0.17\u0026ndash;0.18%). Class I retroelements comprised 6.86% (46S119) to 8.99% (78S84) of the genome. Most of these were LTR elements (6.73\u0026ndash;8.82%), dominated by Gypsy/DIRS1 (4.66\u0026ndash;6.56%), followed by Ty1/Copia (1.79\u0026ndash;2.08%). LINE elements were present at low levels (~\u0026thinsp;0.12\u0026ndash;0.28%), while SINEs and L1/CIN4 were negligible (\u0026lt;\u0026thinsp;0.05%). Class II DNA transposons contributed 4.35\u0026ndash;5.23% across the five pathotypes. Within this class, the hobo-Activator family was the most abundant (1.31\u0026ndash;1.67%). Rolling-circle elements occurred in minor proportions (0.18\u0026ndash;0.31%). A notable fraction of the repeatome remained unclassified (21.23\u0026ndash;22.58%), representing the single largest category of repeats in all pathotypes.\u003c/p\u003e\u003cp\u003eOverall, the five pathotypes showed broadly similar repeat compositions, with 78S84 harbouring the highest repeat content (37.72%) and 46S119 the lowest (33.97%). 
The data indicate that while both Class I and Class II elements contribute significantly, the repeat landscape is dominated by unclassified elements and LTR retrotransposons, particularly Gypsy/DIRS1.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab5\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 5\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eRepeat content (% of genome) of the isolates Pst110S119, Pst238S119, Pst46S119, Pst110S84 and Pst78S84 identified using \u003cem\u003ede novo\u003c/em\u003e and homology-based repeat finding methods\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"6\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eItem\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003e110S119\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003e238S119\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003e46S119\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003e110S84\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c6\"\u003e\u003cp\u003e78S84\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" 
colname=\"c1\"\u003e\u003cp\u003eBases masked\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e34.64\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e36.51\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e34.28\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e34.99\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e37.92\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eRetroelements\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e7.61\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e8.56\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e6.86\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e7.70\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e8.99\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eSINEs\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e0.03\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0.01\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e0.03\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eLINEs\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e0.13\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e0.18\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c4\"\u003e\u003cp\u003e0.12\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0.28\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e0.14\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eL1/CIN4\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e0.03\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e0\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e0\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eLTR elements\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e7.46\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e8.38\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e6.73\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e7.41\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e8.82\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eBEL/Pao\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e0.01\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e0\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.01\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e0\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" 
colname=\"c1\"\u003e\u003cp\u003eTy1/Copia\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e1.89\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e1.97\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e1.80\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e1.79\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e2.08\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eGypsy/DIRS1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e5.38\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e6.15\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e4.66\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e5.55\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e6.56\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eDNA transposons\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e4.35\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e4.97\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e4.44\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e4.97\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e5.23\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003ehobo-Activator\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e1.52\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e1.31\u003c/p\u003e\u003c/td\u003e\u003ctd 
align=\"left\" colname=\"c4\"\u003e\u003cp\u003e1.41\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e1.67\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e1.63\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eTc1-IS630-Pogo\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e0.17\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e0.33\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.13\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0.27\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e0.19\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eMULE-MuDR\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e0.78\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e0.78\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.64\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0.82\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e0.93\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eTourist/Harbinger\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e0.44\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e0.41\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.47\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0.59\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c6\"\u003e\u003cp\u003e0.70\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eRolling-circles\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e0.18\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e0.31\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.31\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0.24\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e0.20\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eUnclassified\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e21.60\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e21.77\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e21.76\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e21.23\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e22.58\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eTotal interspersed repeats\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e33.57\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e35.30\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e33.06\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e33.90\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e36.81\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eSmall RNA\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c2\"\u003e\u003cp\u003e0.06\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e0.03\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.04\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0.03\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e0.08\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eSatellites\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e0\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e0.03\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.01\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e0.02\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eSimple repeats\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e0.67\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e0.66\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.68\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0.65\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e0.64\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eLow complexity\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e0.17\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e0.18\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.18\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c5\"\u003e\u003cp\u003e0.17\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e0.17\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eTotal\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e34.47\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e36.20\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e33.97\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e34.75\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e37.72\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003c/div\u003e\n\u003ch3\u003ePhylogenomic analysis and diversity among Indian Pst pathotypes\u003c/h3\u003e\n\u003cp\u003ePhylogenetic analysis based on single-copy orthologs across 15 rust genomes (five Indian Pst pathotypes, eight other Pst isolates, \u003cem\u003eP. graminis\u003c/em\u003e, and \u003cem\u003eP. triticina\u003c/em\u003e) was conducted using OrthoFinder (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003e). Among 242,664 total genes, 97% were grouped into 16,990 orthogroups. Of these, 4,204 were shared by all species, and 636 were single-copy orthologs.\u003c/p\u003e\u003cp\u003eAll five Indian pathotypes formed a monophyletic clade, while \u003cem\u003eP. graminis\u003c/em\u003e and \u003cem\u003eP. triticina\u003c/em\u003e served as outgroups. Pathotypes 78S84 and 238S119 formed one subgroup; 110S119 and 46S119 formed another. Pathotype 110S84 displayed the greatest divergence and grouped more closely with Australian race 134E36. Indian Pst pathotypes were genetically distinct from aggressive Western U.S. 
races such as PST-130 and PST-78, which showed lower divergence with CYR34 (China) and Australian races. Phylogenetic analysis also reveals the recent origin of Indian races compared to other pathotypes.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e"},{"header":"Discussion","content":"\u003cdiv id=\"Sec11\" class=\"Section2\"\u003e\u003ch2\u003eAdvances in genome assembly of Indian Pst pathotypes\u003c/h2\u003e\u003cp\u003ePrevious genome assemblies of \u003cem\u003eP. striiformis\u003c/em\u003e f. sp. \u003cem\u003etritici\u003c/em\u003e (Pst) relied heavily on short-read sequencing technologies, which led to highly fragmented and incomplete genomes. For example, the genome of PST-130 generated using Illumina short reads resulted in a 65 Mb assembly with over 29,000 contigs and an N50 of just 5 kb [\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e]. Subsequent studies on other Pst races also produced assemblies highlighting the limitations of short-read sequencing [\u003cspan additionalcitationids=\"CR12 CR13 CR14\" citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e].\u003c/p\u003e\u003cp\u003eIn contrast, our current study utilized long-read PacBio sequencing in combination with Illumina short reads to generate hybrid genome assemblies for five Indian Pst pathotypes. These assemblies showed substantial improvements in genome contiguity, with contig numbers reduced to 286\u0026ndash;877 and N50 values exceeding 180 Kb. 
All assemblies had genome sizes between 75 and 83 Mb, and completeness scores based on BUSCO exceeded 89.8%, indicating highly reliable and complete genome representations [\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e].\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec12\" class=\"Section2\"\u003e\u003ch2\u003eGene prediction and functional annotation\u003c/h2\u003e\u003cp\u003eGene prediction identified between 14,559 and 15,283 protein-coding genes per pathotype, broadly consistent with previous studies that estimated gene counts in Pst between 15,000 and 25,000 [\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e, \u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e, \u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e]. Functional annotation via InterProScan revealed conserved protein domains such as P-loop NTP hydrolase, kinase-like domains, and WD40 repeats, all critical to signalling, metabolism, and cellular assembly. Pathotype 110S119 consistently showed the highest number of annotated domains, indicating potential functional enrichment. The relative depletion of domain hits in 238S119 might reflect evolutionary gene loss or divergence.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec13\" class=\"Section2\"\u003e\u003ch2\u003eTranscription factor diversity and gene ontology enrichment\u003c/h2\u003e\u003cp\u003eTranscription factor (TF) domain profiling highlighted differences in regulatory potential among pathotypes. Notably, 46S119 and 78S84 carried more TF-related domains, including zinc-finger (C2H2), bZIP, and Homeobox, which may drive differential gene regulation and virulence expression. Gene Ontology classification showed that Biological Processes were most enriched, particularly in 110S84, with comparatively fewer annotations in 238S119. 
These patterns reinforce the biological complexity and variability among Indian Pst isolates.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec14\" class=\"Section2\"\u003e\u003ch2\u003eCAZy profiles reveal pathotype-specific carbohydrate metabolism\u003c/h2\u003e\u003cp\u003eThe analysis of carbohydrate-active enzymes (CAZymes) provided insight into pathogenicity-related functions. CAZyme analysis revealed conserved representation of core glycoside hydrolase (GH), glycosyltransferase (GT), and auxiliary activity (AA) families essential for plant cell wall degradation [\u003cspan additionalcitationids=\"CR19\" citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e]. Pathotype 110S119 showed the most diverse and abundant CAZyme repertoire, including a unique presence of AA16 and higher gene counts in GH5, GH47, and GT66. By contrast, 238S119 lacked GT15 and GT66, suggesting lineage-specific gene loss or differential selection. These CAZy variations imply functional adaptations influencing host colonization efficiency and virulence strategies.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec15\" class=\"Section2\"\u003e\u003ch2\u003eConserved protease machinery with divergent specificity\u003c/h2\u003e\u003cp\u003eMEROPS profiling uncovered all five major protease classes across the pathotypes, with serine and metallopeptidases dominating the profiles. Serine proteases (S09, S10) and cysteine proteases (C1, C13) were notably abundant in 110S119 and 46S119, hinting at enhanced proteolytic capabilities for host tissue penetration and immune evasion. 
Despite general conservation, subtle differences in protease subfamily distribution reflect differential virulence strategies or stage-specific roles in fungal development [\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e].\u003c/p\u003e\u003cp\u003e\u003cb\u003eIdentification and characterization of repetitive elements in\u003c/b\u003e \u003cb\u003eP. striiformis\u003c/b\u003e\u003c/p\u003e\u003cp\u003eIn the present study, the total transposable element (TE) content across the five Indian \u003cem\u003eP. striiformis\u003c/em\u003e pathotypes ranged from 33.97% in Pst46S119 to 37.72% in Pst78S84. These values fall within the range reported earlier for \u003cem\u003eP. striiformis\u003c/em\u003e, where TE content has been shown to vary widely from 31% to 48% depending on the isolate and methodology used [\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e, \u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e, \u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e, \u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e, \u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e]. Such variability is consistent with the highly dynamic nature of the repeatome in this pathogen.\u003c/p\u003e\u003cp\u003eWhen compared across rust fungi, the TE proportions observed here are intermediate. In cereal rust pathogens, TE content has been reported to span a much broader spectrum, ranging from 17.8% to as high as 85% of the genome [\u003cspan additionalcitationids=\"CR25 CR26 CR27\" citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e]. This suggests that while \u003cem\u003eP. 
striiformis\u003c/em\u003e harbors a moderately repeat-rich genome relative to other rust fungi, it still exhibits considerable repeat-driven plasticity.\u003c/p\u003e\u003cp\u003eAt the pathotype level, the data revealed modest variation in repeat proportions. Pst78S84 exhibited the highest repeat content (37.72%), whereas Pst46S119 contained the lowest (33.97%). Such differences may reflect lineage-specific proliferation or loss of particular TE families, potentially contributing to genome size variability and differential adaptability. In particular, the enrichment of repeats in Pst78S84 could be indicative of recent TE amplification events, which may increase genomic plasticity and provide a substrate for rapid evolution under host-imposed selection pressures. Conversely, the relatively lower repeat fraction in Pst46S119 may point to either historical TE silencing or a more stabilized genome structure.\u003c/p\u003e\u003cp\u003eThe predominance of LTR retrotransposons, especially Gypsy/DIRS1 elements, along with a substantial fraction of unclassified repeats (~\u0026thinsp;21\u0026ndash;23%), indicates that a significant portion of the \u003cem\u003eP. striiformis\u003c/em\u003e genome remains poorly resolved in terms of repeat family identity. This highlights both the dynamic activity of TEs and the limitations of current repeat libraries in fully capturing the diversity of fungal repetitive elements. Given that TEs are often implicated in structural rearrangements, gene regulation, and effector diversification, their abundance and variability across pathotypes may contribute to the rapid emergence of new virulence traits and host resistance breakdown in \u003cem\u003eP. striiformis\u003c/em\u003e.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec16\" class=\"Section2\"\u003e\u003ch2\u003eFunctional categorization and hypothetical gene enrichment\u003c/h2\u003e\u003cp\u003eCOG analysis grouped genes into 25 functional categories. 
\"General function prediction only\" was the most represented, followed by roles in translation, protein turnover, and amino acid metabolism. Interestingly, category \u0026ldquo;S\u0026rdquo; (Function unknown) retained considerable representation, pointing toward a large pool of hypothetical proteins that merit further experimental validation for roles in pathogenicity or adaptation [\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e].\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec17\" class=\"Section2\"\u003e\u003ch2\u003ePhylogenetic positioning and evolutionary relationships\u003c/h2\u003e\u003cp\u003ePhylogenomic analysis using single-copy orthologs reaffirmed the distinctiveness of Indian pathotypes from global lineages. The formation of a monophyletic clade among Indian isolates suggests common ancestry, while subgroup divergence (e.g., 78S84-238S119 vs. 110S119-46S119) indicates recent evolutionary branching possibly shaped by agroecological or host selection pressures. 110S84\u0026rsquo;s unexpected proximity to Australian race 134E36 may suggest ancient genetic introgression or shared selection history. These findings align well with earlier virulence phenotyping and SSR-based diversity studies, adding a high-resolution phylogenomic dimension to Indian Pst pathotype evolution [\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e]. The inclusion of \u003cem\u003eP. graminis\u003c/em\u003e and \u003cem\u003eP. 
triticina\u003c/em\u003e reference genomes helped establish evolutionary divergence among rust species, reaffirming earlier findings of limited synteny and high heterozygosity in Pst [\u003cspan additionalcitationids=\"CR32\" citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e].\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec18\" class=\"Section2\"\u003e\u003ch2\u003eSignificance of long-read sequencing in Pst genomics\u003c/h2\u003e\u003cp\u003eOur results underscore the value of long-read sequencing in resolving complex dikaryotic genomes like those of rust fungi. Compared to earlier short-read-based Indian genome assemblies, which yielded 24,000\u0026ndash;32,000 contigs [\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e], the current assemblies dramatically improved genome contiguity and accuracy. This also facilitated more accurate identification of orthologs, effectors, and other pathogenicity determinants.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec19\" class=\"Section2\"\u003e\u003ch2\u003eImplications for pathogen surveillance and wheat breeding\u003c/h2\u003e\u003cp\u003eThe high-quality genome assemblies presented in this study provide a valuable resource for pathogen surveillance, effector prediction, and diagnostic development. Differences in CAZymes, proteases, and ortholog content across Indian pathotypes underscore their adaptive potential and support the need for continuous genome-based monitoring. 
These genomic insights can inform resistance breeding programs by enabling targeted gene deployment and facilitating the development of durable rust-resistant wheat varieties [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e, \u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e, \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e, \u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e].\u003c/p\u003e\u003c/div\u003e"},{"header":"Conclusions","content":"\u003cp\u003eThe present study delivers the first long-read-based high-quality genome assemblies of five prevalent Indian \u003cem\u003eP. striiformis\u003c/em\u003e pathotypes. These genomes offer improved completeness, gene annotation, and comparative insights, surpassing previously available draft genomes. Functional classification of genes, proteases, and CAZymes highlights critical pathogenicity mechanisms and inter-pathotype variability. Phylogenetic analysis further reveals distinct evolutionary groupings among Indian and global isolates. Collectively, these genomic resources will serve as a critical platform for molecular diagnostics, pathogen surveillance, and genomics-assisted wheat improvement for stripe rust resistance.\u003c/p\u003e"},{"header":"Materials and methods","content":"\u003cdiv id=\"Sec22\" class=\"Section2\"\u003e\u003ch2\u003eGenomic DNA isolation\u003c/h2\u003e\u003cp\u003eGenomic DNA was extracted from dried urediniospores of five \u003cem\u003eP. striiformis\u003c/em\u003e f. sp. \u003cem\u003etritici\u003c/em\u003e (Pst) pathotypes\u0026mdash;46S119, 78S84, 110S84, 110S119, and 238S119\u0026mdash;using the protocol of Schwessinger et al. [\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e]. 
DNA quality and quantity were assessed via agarose gel electrophoresis [\u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e] and a NanoDrop\u0026trade; 1000 spectrophotometer (Thermo Scientific, Wilmington, USA).\u003c/p\u003e\u003cdiv id=\"Sec23\" class=\"Section3\"\u003e\u003ch2\u003eGenome sequencing\u003c/h2\u003e\u003cp\u003ePaired-end sequencing libraries were prepared for the five Pst pathotypes and sequenced using the Illumina NovaSeq 6000 platform. Raw reads were processed by demultiplexing and adapter trimming at a Phred score\u0026thinsp;\u0026ge;\u0026thinsp;30 with a minimum read length of 50 bp using CASAVA v1.8.2 (Illumina Inc.). For long-read sequencing, PacBio libraries were constructed using 5 \u0026micro;g of genomic DNA. After Exo III and Exo VII digestion, libraries underwent dual AMPure purification and BluePippin size selection. Approximately 20 picomoles of the final library were loaded onto a SMRT cell and sequenced using the PacBio Sequel System and Sequel Sequencing Kit 2.0.\u003c/p\u003e\u003cp\u003e\u003cb\u003eDe novo\u003c/b\u003e \u003cb\u003ehybrid whole genome assembly\u003c/b\u003e\u003c/p\u003e\u003cp\u003eHybrid genome assembly was performed using MaSuRCA v4.1.0 [\u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e], which integrated short Illumina and long PacBio reads. All parameters were kept at their defaults, except that the Jellyfish hash size was set to 20 times the genome size estimated with GenomeScope. Genome completeness was evaluated using BUSCO v5.5.0 [\u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e] with the Basidiomycota dataset.\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv id=\"Sec24\" class=\"Section2\"\u003e\u003ch2\u003eGenome annotation and gene prediction\u003c/h2\u003e\u003cp\u003eGenome annotation was carried out for \u003cem\u003eP. 
striiformis\u003c/em\u003e pathotypes (46S119, 78S84, 110S84, 110S119, and 238S119) using the Funannotate pipeline v1.8.17, which integrates repeat masking, ab initio gene prediction, and structural and functional annotation. Repeat masking was performed using tantan. A combination of gene predictors (Augustus, GeneMark-ES, SNAP, GlimmerHMM, and CodingQuarry) was used for coding sequence identification. High-quality RNA-seq data [\u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e] were used to train the predictors and to refine gene models using PASA v2.5.2 for UTR annotation and structural corrections.\u003c/p\u003e\u003cdiv id=\"Sec25\" class=\"Section3\"\u003e\u003ch2\u003eFunctional annotation and comparative analysis\u003c/h2\u003e\u003cp\u003eFunctional annotation was achieved using InterProScan to assign conserved domains (Pfam, SMART, TIGRFAMs) and Gene Ontology (GO) terms. EggNOG-mapper was used for GO and COG classification. Additional annotations included identification of CAZymes, peptidases, and signal peptides. Funannotate\u0026rsquo;s comparative module clustered protein sequences into orthogroups to identify core, accessory, and unique gene families. Outputs included summary statistics, heatmaps, bar plots, and NMDS plots.\u003c/p\u003e\u003cp\u003e\u003cb\u003eIdentification and characterization of repetitive elements in\u003c/b\u003e \u003cb\u003eP. striiformis\u003c/b\u003e\u003c/p\u003e\u003cp\u003eRepeatModeler and RepeatClassifier were used to identify genome-specific repeats in the assembled \u003cem\u003ePst\u003c/em\u003e isolates [\u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e]. Known repeats were searched against the Repbase database [\u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e]. Both \u003cem\u003ede novo\u003c/em\u003e and known repeats were masked using RepeatMasker. 
Repetitive elements were detected hierarchically: \u003cem\u003ede novo\u003c/em\u003e repeats were first identified using RepeatModeler and RepeatClassifier, and taxon-specific repeats were then searched against Repbase.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec26\" class=\"Section3\"\u003e\u003ch2\u003ePhylogenetic analysis\u003c/h2\u003e\u003cp\u003eSingle-copy orthologs from OrthoFinder [\u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e] were used to construct a phylogeny comprising Indian Pst pathotypes (110S119, 238S119, 46S119, 110S84, 78S84), global Pst isolates (PST130, PST134E16, CYR34, etc.), and reference genomes of \u003cem\u003eP. graminis\u003c/em\u003e f. sp. \u003cem\u003etritici\u003c/em\u003e (PGT CRL 75-36-700-3) and \u003cem\u003eP. triticina\u003c/em\u003e (PT 15). Alignments were generated using ClustalW, and the phylogeny was inferred using RAxML. Visualization was performed using FigTree.\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e"},{"header":"Abbreviations","content":"\u003cp\u003eCTAB: Cetyl trimethylammonium bromide; Pgt: \u003cem\u003ePuccinia graminis\u003c/em\u003e f. sp. \u003cem\u003etritici\u003c/em\u003e; Pst: \u003cem\u003ePuccinia striiformis\u003c/em\u003e f. sp. \u003cem\u003etritici\u003c/em\u003e; Pt: \u003cem\u003ePuccinia triticina\u003c/em\u003e; SNP: Single nucleotide polymorphism; \u003cem\u003eYr\u003c/em\u003e: Yellow (stripe rust) resistance; MaSuRCA: Maryland Super-Read Celera Assembler; BUSCO: Benchmarking Universal Single-Copy Orthologs\u0026nbsp;\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eData Availability:\u003c/strong\u003e The whole-genome sequencing data generated in this study have been deposited in the International Nucleotide Sequence Database Collaboration (INSDC) under BioProject accession number PRJEB97944. 
The corresponding BioSample accessions are ERS26890115 (\u003cem\u003ePst\u003c/em\u003e pathotype 110S119), ERS26890116 (\u003cem\u003ePst\u003c/em\u003e pathotype 110S84), ERS26890117 (\u003cem\u003ePst\u003c/em\u003e pathotype 238S119), ERS26890118 (\u003cem\u003ePst\u003c/em\u003e pathotype 46S119), and ERS26890119 (\u003cem\u003ePst\u003c/em\u003e pathotype 78S84).\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e(The datasets are currently confidential and will be released automatically upon acceptance of this manuscript as per repository policies.)\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAcknowledgements\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe first author is grateful to SERB, DST; CII; and NGB Diagnostics Pvt. Ltd. for the Prime Minister Fellowship, and to the Indian Institute of Wheat and Barley Research, Karnal, for providing the stripe rust pathotypes.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding Information\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNo dedicated funding was available for this research work.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eEthics approval and consent to participate\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConsent for publication\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting interests\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare that they have no competing interests.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthors’ contributions\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eAS performed genome assembly and annotation and drafted the manuscript. 
DS and RG contributed to genome annotation and repeat analysis. OPG and JK carried out pathotype multiplication, virulence testing, and DNA isolation. MC and ISY assisted in data analysis and phylogenomic interpretation. SK contributed to data acquisition, provided supervision, and critical inputs. PC conceived and supervised the study, coordinated data acquisition and research activities, and finalized the manuscript. All authors read and approved the final manuscript.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eCurtis BC, Rajaram S, Gomez Macpherson H. Bread Wheat; Improvement and Production. FAO Plant Production and Protection Series. Volume 30. Rome: FAO; 2002.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDuveiller E, Singh RP, Nicol JM. The challenges of maintaining wheat productivity: pests, diseases, and potential epidemics. Euphytica. 2007;157(3):417\u0026ndash;30.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eFigueroa M, Hammond-Kosack KE, Solomon PS. A review of wheat diseases-field perspective. Mol Plant Pathol. 2018;19:1523\u0026ndash;26.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eChen X, Kang Z, editors. Stripe Rust. Netherlands, Dordrecht: Springer; 2017.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWellings CR. Global status of stripe rust: a review of historical and current threats. Euphytica. 2011;179:129\u0026ndash;44.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePannu PPS, Mohan C, Gurdeep SG, Jaspal K. Occurrence of yellow rust of wheat, its impact on yield viz-a-viz its management. Plant Dis Res. 2010;25:144\u0026ndash;50.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBayles R, Flath K, Hovmoller M, de Vallavieille-Pope C. Breakdown of the \u003cem\u003eYr17\u003c/em\u003e resistance to yellow rust of wheat in Northern Europe. Agronomie. 
2000;20:805\u0026ndash;11.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLiu W, Maccaferri M, Rynearson S, Letta T, Zegeye H, Tuberosa R, Chen X, Pumphrey M. Novel sources of stripe rust resistance identified by genome-wide association mapping in Ethiopian durum wheat (\u003cem\u003eTriticum turgidum\u003c/em\u003e ssp. \u003cem\u003edurum\u003c/em\u003e). Front Plant Sci. 2017;8:774. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3389/fpls.2017.00774\u003c/span\u003e\u003cspan address=\"10.3389/fpls.2017.00774\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKiran K, Rawal HC, Dubey H, Jaswal R, Bhardwaj SC, Prasad P, Pal D, Devanna BN, Sharma TR. Dissection of genomic features and variations of three pathotypes of \u003cem\u003ePuccinia striiformis\u003c/em\u003e through whole genome sequencing. Sci Rep. 2017;7:1\u0026ndash;16.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eCantu D, Govindarajulu M, Kozik A, Wang M, Chen X, Kojima KK, Jurka J, Michelmore RW, Dubcovsky J. Next generation sequencing provides rapid access to the genome of \u003cem\u003ePuccinia striiformis\u003c/em\u003e f. sp. \u003cem\u003etritici\u003c/em\u003e, the causal agent of wheat stripe rust. PLoS ONE. 2011;6:e24230.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSchwessinger B, Rathjen JP. Extraction of high molecular weight DNA from fungal rust spores for long read sequencing. In: Wheat Rust Diseases. Methods Mol Biol. New York: Humana; 2017. pp. 49\u0026ndash;57.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSchwessinger B, Chen YJ, Tien R, Vogt JK, Sperschneider J, Nagar R, McMullan M, Sicheritz-Ponten T, Sorensen CK, Hovmoller MS, Rathjen JP, Justesen AF. 
Distinct life histories impact dikaryotic genome evolution in the rust fungus \u003cem\u003ePuccinia striiformis\u003c/em\u003e causing stripe rust in wheat. Genome Biol Evol. 2020;12:597\u0026ndash;617.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eVasquez-Gross H, Kaur S, Epstein L, Dubcovsky J. A haplotype-phased genome of wheat stripe rust pathogen \u003cem\u003ePuccinia striiformis\u003c/em\u003e f. sp. \u003cem\u003etritici\u003c/em\u003e, race PST-130 from the Western USA. PLoS ONE. 2020;15(11):e0238611.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLi C, Qiao L, Lu Y, et al. Gapless genome assembly of \u003cem\u003ePuccinia triticina\u003c/em\u003e provides insights into chromosome evolution in Pucciniales. Microbiol Spectr. 2023;11:e0282822.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eYadav IS, Bhardwaj SC, Kaur J, Singla D, Kaur S, Kaur H, et al. Whole genome resequencing and comparative genome analysis of three \u003cem\u003ePuccinia striiformis\u003c/em\u003e f. sp. \u003cem\u003etritici\u003c/em\u003e pathotypes prevalent in India. PLoS ONE. 2022;17(11):e0262697.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eCisse OH, Stajich JE. FGMP: assessing fungal genome completeness. BMC Bioinformatics. 2019;20:184. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1186/s12859-019-2782-89\u003c/span\u003e\u003cspan address=\"10.1186/s12859-019-2782-89\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eZheng W, Huang L, Kang Z, et al. High genome heterozygosity and endemic genetic recombination in the wheat stripe rust fungus. Nat Commun. 2013;4:2673.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eVan Den Brink J, De Vries RP. Fungal enzyme sets for plant polysaccharide degradation. Appl Microbiol Biotechnol. 2011;91:1477\u0026ndash;92. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1007/s00253-011-3473-2\u003c/span\u003e\u003cspan address=\"10.1007/s00253-011-3473-2\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eXia C, Qiu A, Wang M, Liu T, Chen W, Chen X. Current status and future perspectives of genomics research in the rust fungi. Int J Mol Sci. 2022;23:9629.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePark BH, Karpinets TV, Syed MH, Leuze MR, Uberbacher EC. CAZymes Analysis Toolkit (CAT): web service for searching and analyzing carbohydrate-active enzymes in a newly sequenced organism using CAZy database. Glycobiology. 2010;20:1574\u0026ndash;84. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1093/glycob/cwq106\u003c/span\u003e\u003cspan address=\"10.1093/glycob/cwq106\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMoolhuijzen P, See PT, Hane JK, Shi G, Liu Z, Oliver RP, et al. Comparative genomics of the wheat fungal pathogen \u003cem\u003ePyrenophora tritici-repentis\u003c/em\u003e reveals chromosomal variations and genome plasticity. BMC Genomics. 2018;19:1\u0026ndash;23. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1186/s12864-018-4680-3\u003c/span\u003e\u003cspan address=\"10.1186/s12864-018-4680-3\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eCantu D, Segovia V, MacLean D, Bayles R, Chen X, Kamoun S, et al. Genome analyses of the wheat yellow (stripe) rust pathogen \u003cem\u003ePuccinia striiformis\u003c/em\u003e f. sp. \u003cem\u003etritici\u003c/em\u003e reveal polymorphic and haustorial expressed secreted proteins as candidate effectors. BMC Genomics. 2013;14. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1186/1471-2164-14-270\u003c/span\u003e\u003cspan address=\"10.1186/1471-2164-14-270\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eXia C, Wang M, Yin C, Cornejo OE, Hulbert SH, Chen X. Genome sequence resources for the wheat stripe rust pathogen (\u003cem\u003ePuccinia striiformis\u003c/em\u003e f. sp. \u003cem\u003etritici\u003c/em\u003e) and the barley stripe rust pathogen (\u003cem\u003ePuccinia striiformis\u003c/em\u003e f. sp. \u003cem\u003ehordei\u003c/em\u003e). MPMI. 2018;31:1117\u0026ndash;20. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1094/MPMI-04-18-0107-A\u003c/span\u003e\u003cspan address=\"10.1094/MPMI-04-18-0107-A\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eAime MC, McTaggart AR, Mondo SJ, Duplessis S. Phylogenetics and phylogenomics of rust fungi. Adv Genet. 2017;100:267\u0026ndash;307.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKrishnan P, Meile L, Plissonneau C, et al. Transposable element insertions shape gene regulation and melanin production in a fungal pathogen of wheat. BMC Biol. 2018;16:78.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRaffaele S, Kamoun S. Genome evolution in filamentous plant pathogens: Why bigger can be better. Nat Rev Microbiol. 2012;10:417\u0026ndash;30.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLi C, Qiao L, Lu Y, et al. Gapless genome assembly of \u003cem\u003ePuccinia triticina\u003c/em\u003e provides insights into chromosome evolution in Pucciniales. Microbiol Spectr. 2023;11:e0282822.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLian J, Li Y, Dodds PN, et al. 
Haplotype- Phased and chromosome-level genome assembly of \u003cem\u003ePuccinia polysora\u003c/em\u003e, a giga-scale fungal pathogen causing Southern corn rust. Mol Ecol Resour. 2023;23:601\u0026ndash;20.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLiu G, Wu Z, Peng Y, Shang X, Gao L. Integrated transcriptome and proteome analysis provide insight into the ribosome inactivating proteins in \u003cem\u003ePlukenetia volubilis\u003c/em\u003e seeds. Int J Mol Sci. 2022;23(17):9562.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSingh H, Kaur J, Bala R, Srivastava P, Bains NS. 2020. Virulence and genetic diversity of \u003cem\u003ePuccinia striiformis\u003c/em\u003e f. sp. \u003cem\u003etritici\u003c/em\u003e isolates in sub-mountainous area of Punjab, India. Phytoparasitica 2020; 48:383\u0026ndash;395.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDuplessis S, Cuomo CA, Lin YC, et al. Obligate biotrophy features unravelled by the genomic analysis of rust fungi. Proc Natl Acad Sci USA. 2011;108(22):9166\u0026ndash;71.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDuplessis S, Bakkeren G, Hamelin R. Advancing knowledge on biology of rust fungi through genomics. Adv Bot Res. 2014;70:173\u0026ndash;209.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eCuomo CA, Bakkeren G, Khalil HB, Panwar V, Joly D, Linning R, Sakthikumar S, Song X, Adiconis X, Fan L, Goldberg JM, Levin JZ, Young S, Zeng Q, Anikster Y, Bruce M, Wang M, Yin C, McCallum B, Szabo LJ, Hulbert S, Chen X, Fellers JP. Comparative analysis highlights variable genome content of wheat rusts and divergence of the mating loci G3 genes genomes. Genet. 2017;7:361\u0026ndash;76.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSchwessinger B, Sperschneider J, Cuddy WS, Garnica DP, Miller ME, Taylor JM, Dodds PN, Figueroa M, Park RF, Rathjen JP. 2018. 
A near-complete haplotype-phased genome of the dikaryotic wheat stripe rust fungus \u003cem\u003ePuccinia striiformis\u003c/em\u003e f. sp. \u003cem\u003etritici\u003c/em\u003e reveals high Inter-haplotype diversity. mBio. 2009;9: e02275-17.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGreen MR, Sambrook J. Analysis of DNA by agarose gel electrophoresis. Cold Spring Harbor Protocols, 2019 (5), pdb. Top 100388. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1101/pdb.top100388\u003c/span\u003e\u003cspan address=\"10.1101/pdb.top100388\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eZimin AV, Mar\u0026ccedil;ais G, Puiu D, Roberts M, Salzberg SL, Yorke JA. MaSuRCA Genome Assembler Bioinform. 2013;29(21):2669\u0026ndash;77.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eManni M, Berkele MR, Seppey M, Sim\u0026atilde;o FA, Zdobnov EM. Mol Biol Evol. 2021;38(10):4647\u0026ndash;54.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGutha GV, Kaur J, Singla D, Chhuneja P, Saharan A, Gangwar OP, Bala R, Mir RR, Tak PS. 2025. Use of field pathogenomics approach for \u003cem\u003ePuccinia striiformis\u003c/em\u003e f. sp. \u003cem\u003etritici\u003c/em\u003e race identification and phylogenomic delineation in north India. World J Microbiol Biotechnol. 2025;41(5): 166.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eTarailo-Graovac M, Chen N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr Protoc Bioinf. 2009;1\u0026ndash;14. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1002/0471250953.bi0410s25\u003c/span\u003e\u003cspan address=\"10.1002/0471250953.bi0410s25\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBao W, Kojima KK, Kohany O. Repbase update, a database of repetitive elements in eukaryotic genomes. Mob DNA. 2015;6:4\u0026ndash;9. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1186/s13100-015-0041-9\u003c/span\u003e\u003cspan address=\"10.1186/s13100-015-0041-9\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eEmms DM, Kelly S. Ortho Finder: phylogenetic orthology inference for comparative genomics. Genome Biol. 2019;20:238.\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"