Insights into Genetic Heterogeneity and Drug Resistance in Leishmania donovani of the Indian sub-continent from genomic data | Research Square window.SnipcartSettings = { analytics: { enabled: false } }; (function() { var accessVector = localStorage.getItem('access_vector') || ''; window.dataLayer = window.dataLayer || []; if (accessVector) { window.dataLayer.push({ user: { profile: { profileInfo: { snid: accessVector } } } }); } })(); (function(w,d,s,l,i){w[l]=w[l]||[];w[l].push({'gtm.start':new Date().getTime(),event:'gtm.js'});var f=d.getElementsByTagName(s)[0],j=d.createElement(s),dl=l!='dataLayer'?'&l='+l:'';j.async=true;j.src='https://www.googletagmanager.com/gtm.js?id='+i+dl;f.parentNode.insertBefore(j,f);})(window,document,'script','dataLayer','GTM-K279D39R'); Browse Preprints In Review Journals COVID-19 Preprints AJE Video Bytes Research Tools Research Promotion AJE Professional Editing AJE Rubriq About Preprint Platform In Review Editorial Policies Our Team Advisory Board Help Center Sign In Submit a Preprint Cite Share Download PDF Research Article Insights into Genetic Heterogeneity and Drug Resistance in Leishmania donovani of the Indian sub-continent from genomic data Aroop Verma, Indu Kumari, Anupam Nath Jha This is a preprint; it has not been peer reviewed by a journal. https://doi.org/ 10.21203/rs.3.rs-7267613/v1 This work is licensed under a CC BY 4.0 License Status: Published Journal Publication published 19 Nov, 2025 Read the published version in Antonie van Leeuwenhoek → Version 1 posted 9 You are reading this latest preprint version Abstract Leishmaniasis is one of the neglected tropical diseases that is endemic to over 90 countries and it’s cases are being reported from non-endemic countries as well like Austria. The cases Visceral Leishmaniasis (VL) caused by Leishmania donovani are concentrated on the Indian Subcontinent. Several studies on the genetic heterogeneity, aneuploidy, and drug resistance emergence reported from Indian subcontinent as well as globally. However, no research has yet inspected the genomic data from the Indian subcontinent to depict a larger microscopic investigation at the genome level. We have considered whole genome sequence (WGS) data from the publicly available database i.e. ENA. Analysis has shown that there is tetraploidy in the chromosome 31, trisomy in chromosome 2 and 8 and trisomy was observed among chromosome 6 and 15 among some samples. This aneuploidy pattern is evolving over time observed in the present study. The pattern of aneuploidy variations of the Indian subcontinent sample differ from from other continents. Further, ATP-binding cassette (ABC) family, Amastin-like surface proteins, A2 genes, amino acid permeases, heat shock 70-related protein 1, mitochondrial precursor and sodium stibogluconate resistance protein were the proteins that exhibited maximum number of mutations among all the analyzed samples. The proteins showing highest number of mutations belong to membrane prtoeins that are involved in drug resistance mechanism. Most of these proteins are involved in the virulence and drug resistant mechanism. The present study provides the possible candidates which can be targeted to disarm the virulence of the protozoan and drug candidates for therapeutic interventions. Whole Genome Sequences Leishmania Indian sub-continent Drug Resistance Single Nucleotide Polymorphism Mosaic Aneuploidy Figures Figure 1 Figure 2 Figure 3 Figure 4 Figure 5 Figure 6 Figure 7 Figure 8 Figure 9 Figure 10 Figure 11 Introduction Leishmaniasis is caused by vector-borne, obligate intracellular, kinetoplastid parasites belonging to the genus Leishmania (Family Trypanosomatidae) (Torres-Guerrero, Quintanilla-Cedillo et al. 2017 ). Transmission is through the bite of an infected female phlebotomine sandfly(Schlein 1993 ). Leishmaniasis has three main types of clinical manifestations: Cutaneous Leishmaniasis, Mucosal Leishmaniasis, and Visceral Leishmaniasis. Disease manifestation and severity depends upon the host immune response, the causal pathogen and/or any comorbidities(Alvarenga, Escalda et al. 2010 ) (Awasthi, Mathur et al. 2004 ). The World Health Organization (WHO) has declared Leishmaniasis as a Neglected Tropical Disease (NTD) (Kumari, Lakhanpal et al. 2023 ). It is most prevalent in the subtropical and tropical regions of the globe; mainly Europe, Northern Africa, the Middle East, Asia and part of South America (Martínez, Guerrero et al. 2011 ). According to the CDC, current estimates of disease incidence for Cutaneous Leishmaniasis have ranged from 700,000 to 1.2 million cases per year and it's been estimated that the number of new cases per year for Visceral Leishmaniasis has decreased from 400,000 to less than 100,000 (Prevention, 2020). Visceral Leishmaniasis (VL), caused by the Leishmania donovani complex which includes L. donovani and L. infantum , is the most lethal form of Leishmaniasis. Majority of disease burden (90%) has been reported in the following countries; India, Eiteria, Iraq, Somalia, Brazil, Ethiopia, Kenya, South Sudan, Nepal, China, and Sudan (Patil and Chatterjee 2023 , Saha, Borah et al. 2023 ). India contributes 18% of the total reported VL cases. VL is endemic to 54 districts scattered in four states — Bihar (33 out of 38 districts), Jharkhand (4 out of 24 districts), Uttar Pradesh (6 out of 75 districts) and West Bengal (11 out of 23 districts). In recent time, the vector of leishmaniasis has been detected in the Western Himalayan region of India like Jammu and Kashmir, Gujarat, Assam, Himachal Pradesh, Haryana, Tamil Nadu, Puducherry, Madhya Pradesh, Kerala, Sikkim and Uttaranchal which raises a concern (Patil and Chatterjee 2023 , Sofi, Godara et al. 2023 ). With the unconventional spread of Leishmaniasis to Himalaya, there is a need for the reassessment of Leishmaniasis endemicity in the Himalayan region of North India. Pentamidine, Antimonial [Sodium stibogluconate (SSG) and meglumine antimoniate (MA)], Miltefosine, Amphotericin B and Paromomycin are the anti-leishmaniasis agents used in the first line and second line therapy (Bharadava et al., 2024 ; (Mann, Frasca et al. 2021 )). Beauvericin (BEA) is a fungal secondary metabolite synthesized by Beauveria bassiana which is potent ABC transporter inhibitor. Its application along with anti-leishmanial drug may enhance therapeutic index of anti-leishmanial agents (Al Khoury et al., 2024 ). Emergence of drug-resistance strains has rendered traditionally employed treatment methods ineffective. The use of chemotherapy remains viable but it is expensive and has toxic side effects (Lira, Sundar et al. 1999 ). Hence there is an urgent need for novel drugs to combat the emerging peril of drug-resistant strains and effectively decrease the virulent effect of Leishmania (Bora and Jha 2020 ). Other mode of therapy includes thermotherapy, electric field stimulation and diverse topical treatment methods that are in clinical or preclinical trials, such as formulations containing paromomycin, imiquimod, miltefosine, morphine, and buparvaquone components (Pinheiro et al., 2022). Currently new chemical classes which are in clinical phase 1 like anti-leishmanial classes nitroimidazoles and benzoxaboroles, with DNDI-0690 and DNDI-6148 as lead compounds, respectively, the pyrazolopyrimidine scaffold recognized as CRK-12 kinase inhibitors. Advancements in high throughput sequencing technologies have greatly reduced the time and capital required to sequence whole genomes of organisms, making it possible to analyse complete genome sequences and investigate the biology in greater detail (Reuter, Spacek et al. 2015 ). WGS has enabled the genetic characterization of novel spp. of Leishmania ellisi from USA (Sapp, Low et al. 2024 ). The meta-analysis of transcriptomic data of Leishmania spp. has been carried out to understand the immune system’s response to its infection (Rezaei, Tahmasebi et al. 2024 ). Recent changes have been observed in the epidemiology of leishmaniasis with reports of it emerging in the non-endemic country such as Austria. The development highlights the impact of climate and habitat changes, urbanization, and globalization on the spread of the disease in regions previously unaffected (Bora and Nath Jha 2019 , Riebenbauer, Czerny et al. 2024 ). There are reports of VL from certain villages of Nepal, where the patients did not have any travel history outside of their village. However, it is likely that traders and visitors might have contributed to the spread of the disease. The reported incidences of infection in new loci of Nepal can be a threat to the elimination effort of Leishmaniasis (Joshi, Banjara et al. 2024). L. donovani offspring obtained from a cross between two sand fly species has resulted in genetic diversity in HASPB antigen which has major impact on the therapeutic vaccine design (Kumari, Mamud et al. 2023 , Sádlová, Yeo et al. 2024 ). The genomic studies of Leishmania spp. has explored the genetic plasticity and constantly ongoing rearrangement of the genome that leads to variation in copy number, gene clusters or the chromosome (Santi and Murta 2022 ). Several studies indicate that the clinical isolates of Leishmania spp. has shown increased resistance against pentavalent antimony (Santi and Murta 2022 ). In Indian samples, it has been observed that high levels of As(III) in the water has led to the selection of Sb-resistant parasites. Further, change in aneuploidy has been triggered by the drug resistance against antimony (Dumetz, Imamura et al. 2017 ). The changing epidemiology of leishmaniasis globally and in India is a matter of concern. To understand the genetic constitution of Leishmania spp., we have retrieved publicly available Leishmania donovani whole genome sequencing (WGS) data specifically isolated from the Indian subcontinent. We analyzed the genomes of the various strains and reported the variants which can be targeted for the drug development, aiming to reduce the severity of the disease. We tracked and reported the variants located within the functional domains of proteins. The study will help to explore the genome of Leishmania under different stress, environmental conditions and drug pressure. Methodology Retrieval of WGS data of Leishmania donovani and alignment to reference genome A total of 34 sample data of the Indian Subcontinent were downloaded through a combination of text-based and advanced search approach from the European Nucleotide Archive ( https://www.ebi.ac.uk/ena/browser/ ) (Supplementary Table 1). The samples sequencing approach was whole genome sequencing and biological source of the samples was late passage, bone marrow and Atypical cutaneous Leishmania donovani . Fastq files were downloaded and quality check was performed using the FASTQC version: 0.11.9 (Andrews 2010 ). FASTQC generates a visual report of the sequencing files based on several analytical modules. These reports were used to select specific cut-offs for riding sub-quality sequence calls. Base calls with a Q score of less than 20 were filtered out using Trimmomatic Version: 0.39 (Bolger, Lohse et al. 2014 ). Leishmania donovani reference genome assembly ASM22713v2 (Downing, Imamura et al. 2011 ) was utilised and the GenBank (Clark, Karsch-Mizrachi et al. 2016 ) assembly GCA_000227135.2 was retrieved from NCBI. The reference genome was indexed using Burrows-Wheeler aligner (bwa) version: 0.7.17-r1188 ( https://bio-bwa.sourceforge.net/bwa.shtml ) ‘index’ tool while SAMtools Version: 1.10 (using htslib 1.10.2-3ubuntu0.1) (Danecek, Bonfield et al. 2021 ), PICARD version:2.27.5 ( https://github.com/broadinstitute/picard ) were used to create a dictionary of the reference sequence. The ‘faidx’ (Shirley, Ma et al. 2015 ) tool was used to create a bed file. Reads were aligned against the reference sequence using bwa mem that resulted in a sequence alignment map (SAM) files (Li, Handsaker et al. 2009 ). SAM files were sorted, duplicate marked, converted into binary alignment map (BAM) files, and read group information added using PICARD. SAMtools were used to get alignment, depth and coverage information, which were employed to discard low depth and low coverage samples from further analysis. Variant discovery and annotation Genome Analysis Toolkit (GATK v4.3.0.0) (Van der Auwera, Carneiro et al. 2013 ), a suite of tools was used to call variants following the best practices workflow for germline short variants – Single Nucleotide Polymorphisms (SNPs) and Insertion/Deletions (InDels) – discovery. Variant calling was performed using GATK HaplotypeCaller using default settings. The resultant variant call format (VCFs) were filtered for SNPs, removing Insertion/Deletions. The following thresholds were chosen to filter SNPs following the GATK hard filtering recommendations, QD < 20, QUAL < 30, SOR < 3, FS < 60, MQ < 4, MQRankSum < − 12.5 and ReadPosRankSum < − 8, to remove low quality or false positive variants which may contaminate downstream processing, annotation and analysis. Annotation was performed using ANNOVAR (Wang, Li et al. 2010 ) and snpEff (version:5.1d) (Cingolani, Platts et al. 2012 ) separately to obtain common annotated genes. Variant Analysis Multianno files obtained after annotating VCF files using ANNOVAR were imported into R and analyzed using the R package MAFtools from Bioconductor (Mayakonda, Lin et al. 2018 ). VCF files annotated with snpEff were converted to Mutation Annotation Format (MAF) using VCF2maf perl script ( Mskcc/VCF2maf , n.d.) to extract variant information such as gene id, variant, variant location, variant type and effect, HGVS.p. SNPs and InDels were split into different maf files due to quality control and processing differences. Using the MAFtools R package, a summary of the MAF files was plotted and most mutated genes were discovered and reported using the plotmafsummary and oncoplot functions of the package. MAF files containing InDel information were analyzed using shell commands and various information associated with InDels were extracted. InDel affected genes common to all strains were investigated; top mutated genes with more than 15 InDel events were compared and projected with a heatmap using the pheatmap package. Aneuploidy and copy number analysis Aneuploidy and Somy profiles were calculated using a read depth based approach. Samtools was used to extract raw read depth information of the aligned bases from the BAM files. Normalised read depths were estimated by dividing raw depth by the mean raw read depth of specific chromosomes. The haploid depth was defined as the median of the normalised read depths and chromosomal somy was estimated by multiplying the haploid depth by two given the assumption that there exist two copies of chromosomes. Sample ploidy was defined as the median of the chromosomal somies calculated in the previous step. A heatmap was prepared to visualise the different chromosomal profiles using the pheatmap package in R. Copy Number Variations (CNV) were investigated with Control-FREEC (Boeva, Popova et al. 2012 ) using a sliding window approach, window size of 5000 and step size of 250. The following parameters were also included in the config file: breakPointThreshold = 0.8, minExpectedGC = 0.35, maxExpectedGC = 0.65, ploidy = 2, and inputFormat = BAM. Chromosomal Somy and Copy Number Variant estimation Chromosomal Somy was estimated using read depth statistics of the sequenced reads. Raw depth of a position (i Raw depth) and mean read depth (Ch Mean depth) of a chromosome were used to calculate the normalised read depth of a position (i Norm ) and the chromosomal Somy (S) was calculated using the following formula: 2 × Median( i Norm = i Raw depth / Ch Mean depth ). The range of monosomy, disomy, trisomy, tetrasomy, and pentasomy was then used to define the full cell-normalised chromosome depth or S as S < 1.5, 1.5 ≤ S < 2.5, 2.5 ≤ S < 3.5, 3.5 ≤ S < 4.5, and 4.5 ≤ S < 5.5, respectively, as previously described. (Dumetz F, Imamura H, Sanders M, Seblova V, Myskova J, Pescher P, Vanaerschot M, Meehan CJ, Cuypers B, De Muylder G, Späth GF, Bussotti G, Vermeesch JR, Berriman M, Cotton JA, Volf P, Dujardin JC, Domagalska MA. Modulation of Aneuploidy in Leishmania donovani during Adaptation to Different In Vitro and In Vivo Environments and Its Impact on Gene Expression. mBio. 2017 May 23;8(3):e00599-17. doi: 10.1128/mBio.00599-17 . PMID: 28536289; PMCID: PMC5442457. )The results were projected onto a heatmap and dendrograms using the R package ‘pheatmap’. The isolates were clustered into groups based on the euclidean distances between their chromosomal profiles. Chromosomes were also clustered into groups based on the Euclidean distance of chromosomal copy numbers profile of a chromosome across isolates. CONTROL-FREEC (Boeva, Popova et al. 2012 ) was employed to investigate the Copy Number Variation (CNV). Control-FreeC parameters were defined as follows: Window size = 5,000, Step size = 250, breakPointThreshold = 0.8, minExpectedGC = 0.35, maxExpectedGC = 0.65, ploidy = 2, and inputFormat = BAM. Functional domain analysis For functional domain analysis, genes affected by missense mutations which were common to most samples were analyzed. Multiple missense variants were observed in most of the genes but variants which affected the majority of the samples were selected for downstream analysis. 70 genes were annotated using the Uniprot database (The UniProt Consortium, 2017); 19 genes were uncharacterized/ hypothetical and were removed from subsequent analysis. Protein sequence information for 51 remaining genes was retrieved from Uniprot, which included protein name, length, domain(s) and domain location(s). The domain annotations were provided by Interpro (Paysan-Lafosse, Blum et al. 2023 ). Only 30 protein sequences were found to have relevant Interpro domain annotations; variants found inside these domains were reported. Results Sample details and quality check The organism was isolated from human, biological replicate and laboratory-adapted strains. Leishmania sp. cultured under different conditions and drug resistance has shown different mutations and genetic makeup (Supplementary Table 1). The samples were quality checked, processed, and aligned to the reference genome. The Binary Alignment Map (BAM) files were used to assess the alignment and depth statistics of each sample (Fig. 1 ). 2 samples were discarded from further analysis because of insufficient alignment to the reference genome, due to the incompatibility of the read files with BWA alignment software. The 31 remaining samples were used for further analysis. The samples used in this study have been provided in Supplementary Table 2. Aneuploidy change pattern in Leishmania donovani samples from Indian Sub-continent All the analyzed samples have shown tetrasomic state as reported in the earlier studies. In addition to this, trisomy was observed in chromosomes 2 and 8 in BHU 1220, BHU 575, MHOM/IN/1983/AG83, MHOM/IN/2013, MHOM/IN/2014, K133 samples and tetrasomic states in BPK275/0/cl18 samples. Contradictorily, the Nepalese strain BPK275/0/cl18 did not show either trisomy or tetrasomy in chromosomes 2 and 8, and clustered away from the other Nepalese strains, with the K133 strain isolated from India. A few Indian strains along with one Nepalese strain, BPK275/0/cl18, K133PMM/R, K133AS/R, K133WT, MHOM/IN/2010/T3, MHOM/IN/IN/T8, MHOM/IN/2012/T9, MHOM/IN/2014/A2, MHOM/IN/2013/P2, MHOM/IN/2014/H1, BHU 575, showed the disomic state of chromosome 2 and clustered together separate from the other Indian cluster (Fig. 2 ). It should be noted that the strains are clustering into 4 major clusters on the similarity in their aneuploidy profile. These clusters encompass the two distinctly separate groups of Indian strains, a Nepalese cluster and a mixed group with Nepalese and Indian strains. Here, we observed that the Indian cluster of strains isolated from India BHU 1220, MHOM/IN/1983/AG83.i and ii, West Bengal MHOM/IN/2013/M1, Jharkhand MHOM/IN/2014/P6, and MHOM/IN/2014/T10, showed expected karyotypes as most chromosomes were disomic barring the trisomic chromosomes 6 and 15, and the tetrasomic chromosome 31. On the other hand, the second Indian cluster comprising strains isolated from Bihar MHOM/IN/2010/T3, MHOM/IN/2010/T7, MHOM/IN/2012/T9, MHOM/IN/2013/P2, West Bengal MHOM/IN/2012/T8, Jharkhand MHOM/IN/2014/H1, MHOM/IN/2014/A2 and BHU 575 strain showed a remarkable difference in aneuploidy profile; chromosome 31 and 8 showed an increase in copy number which reflected in their tetrasomic and trisomic states, chromosomes 2, 5, 9, 11, 13, 14, 16, 20, 22, 23, 26, 35 showed variable aneuploidy, ranging from disomy to trisomy, which was lacking in the previous Indian cluster. The Nepalese strains grouped together and surprisingly were closer to an Indian cluster due to similarity in aneuploidy profiles, except the chromosome 2 which distinctly showed tetrasomic condition in Nepalese strains BPK275/0/cl18. Finally, the remaining cluster, was a mixed group of one Nepalese strain and three Indian strains. This may have been due to the similarity in tetrasomic state of chromosome 5 along with trisomy observed in chromosomes 6, 9, 13, 15, 20, 23, 26, and 33. Chromosomes 2, 8 and 31 were also observed to be clustered together, possibly due to their peculiar nature of existing in unusually higher copy number states, mostly trisomic and/or tetrasomic with respect to the other chromosomes. The chromosomes 5, 9, 13, 14, 16, 20, 23, 26, 33, and 35 were grouped together, possibly because of their tendency to exhibit variable aneuploidies ranging from disomy to tetrasomy. Chromosomes 6 and 15 show occasional trisomy and hence grouped together. Rest of the chromosomes showed normal copy numbers with minimal deviation from disomy. The overall heatmap of the aneuploidy change in the samples from Indian subcontinent has shown trisomy or tetrasomy in the whole or partially in chromosome 2 and 31. This observation is supported by earlier study that has shown chromosome 31 is superanumerary and tetrasomic (Pilling, Reis-Cunha et al. 2023 ). The aneuploidy changes of Leishmania under the effect of paromomycin can be observed in the samples with BPK* id, most of the samples have shown trisomy in Chromosome 2. WGS samples of late passage has shown tetrasomy in chromosome 31 and trisomy in chromosome 5 and 6. The present analysis show a very high genetic modification in Leishmania donovani sequenced from patients of Indian sub-continent and drug resistance also alter the aneuploidy of Leishmania . Distribution of variants among samples The variant analysis has shown that the sample from the Himachal region of India has recorded the highest number of variants in the coding sequences, gene interval and chromosomal interval (Supplementary Figs. 1 and 2). Rest of the samples have shown variants in the same range i.e. 1000 to 10000 in number. We observed the genotype distribution of SNPs in different strains; Indian strains MHOM/IN/2010, MHOM/IN/2012, MHOM/IN/2013, and MHOM/IN/2014 showed an average SNP frequency of 1900. These strains were isolated from different regions of India mainly from Bihar, Delhi, Jharkhand, and West Bengal. Strains with unknown states as geographical locations were tagged as ‘Indian’. The strain MHOM/IN/2010/T7, isolated from Bihar, had the highest SNP occurrence of 1975 while Jharkhand isolate MHOM/IN/2014/H1 had the lowest SNP occurrence of 1868. Indian strains BHU1220, isolated from Bihar, showed 2089 SNPs, MHOM/IN/1983/AG showed a higher SNP frequency of 2707 and 2719, and BHU575/0/cl2 with 3800 SNPs was the highest. Indian strains K133WT, K133AS-R, and K133PM-R showed abundance with each having 3713, 3281 and 3488 SNPs respectively. We looked at the distribution of genotypes in these samples; 86–91% of SNPs had Heterozygous Reference Alternate (Het RA) genotype, 9–13% had Homozygous Alternate (Hom AA) genotype and a miniscule percentage, around 0–0.002% of SNPs represented the Heterozygous Alternate (Het AA) genotype, which represents two different alternate alleles in place of the reference (Fig. 3 ). The Nepalese strains BPK275/0/cl18 showed a distinctly higher SNP abundance than Indian strains and an average SNP abundance of 3482 SNPs. BPK275/0/cl18.vii, isolated from Koshi province, showed the lowest SNP abundance of 2962 while BPK275/0/cl18 was the highest – 4003, among all strains from both India and Nepal. Most of the Nepalese strains were isolated from the Koshi province of Nepal, barring BPK275/0/cl18 of the study SRR15851294 which did not have an associated geographical location. The variant frequency distribution is similar among all the samples except the one obtained from the Himalayan region. We also analyzed the Insertion/Deletions occurrence frequency in the samples and reported the different genotypes associated (Fig. 4 ). Quite interestingly, the Indian strains MHOM/IN/2010 T3, T7, MHOM/IN/2012/T8, T9, MHOM/IN/2013/M1, P2, MHOM/IN/2014/A2, H1, P6 and T10 had remarkably high insertion/deletion frequency compared to rest of the samples. The highest occurrence was in P2–29760, and the lowest – 18,153 – in T8. Among the strains, there existed unusually high counts of Het ALT/REF alleles nearing 20,000 to 30,000 in a few strains. The average insertion/deletions variants in these strains were 24,060. While the Het REF/ALT alleles were in abundance, the Hom ALT and Het ALT alleles showed sparse distribution. Allele frequency ranged from strain MHOM/IN/2014/T10, from Jharkhand, having the most with 314 Hom AA and 16 Het AA alleles and strain MHOM/IN/2014/A2, from West Bengal, having the least with 165 Hom AA and 8 Het AA alleles. Other Indian strains such as the BHU1220, BHU575/0/cl2, MHOM/IN/1983/AG83 and K133 strains showed markedly higher frequency of Hom AA and Het AA while exhibiting negligible Het RA alleles relative to the strains mentioned previously. The Nepalese strains BPK275/0/cl18 showed limited InDels relative to the Indian strains, while also exhibiting a higher frequency of Hom ALT and Het ALT alleles. The strain BPK275/0/cl18.vii showed lower amounts of insertion/deletions in comparison to other Nepalese strains. The sample of Himachal state of India has around 1 lakh missense and 40000 InDels. To decipher the distribution of different variant types within the strains used in this study, we utilized snpEff for annotation. The method involves identifying and attaching the relevant sequence ontological terms to the variants, as well as predicting their effect types and prospective impacts. Most of the variants appeared to be impacting non-coding genomic regions which included the 5’Flank, 3’Flank, and IGR. We observed an abundance of Frame_Shift_Del events and Missense_Mutation which impacted the coding portion of the genome. Few variants which impacted the coding regions were associated with in-frame deletion or insertion, missense and silent mutations. A tiny proportion of the variants represented intron, nonsense, nonstop, splice region, splice site and targeted region classes (Fig. 5 ). We looked at the distribution of different variant types across strains. The majority of the variants were in 3’ Flank or 5’ Flank across the strains that constituted 75.1% of the total variants. The strains had an average of 7.43% IGR variants. A relatively small share of the variants affected the coding regions of the gene sequences, these were Frame_Shift_Del, Frame_Shift_Ins, and Missense Mutations, contributing to an average of 8.37%, 1.18% and 4.54% respectively (Fig. 6 ). When we compared the average variant proportion between Indian and Nepalese strains we found there to be significant differences between Frame_Shift_Del variants, 3’Flank and IGR regions. We observed that Indian strains MHOM/IN/2010/, MHOM/IN/2012/, MHOM/IN/2013/, and MHOM/IN/2014 exhibited substantial amounts of variants tagged as ‘Frame_Shift_Del’, except for the BHU 1220, BHU575/0/cl2, MHOM/IN/1983/AG83 and K133 strains. This might be due to the presence of a higher number of insertion/deletion events, potentially leading to an increased representation of these variants in the samples. This was supported by the higher frequency of these variants relative to other strains in the population. It ranged from 4364 in MHOM/IN/2013/M1 and 1497 in MHOM/IN/2014/H1. The proportion also remained high at 27 to 18% of the total variants in these samples. Frame_Shift_Del events in other Indian strains remained lower; only occurring as less than 1% of the total variants in the samples. The frequency relative to other Indian strains previously mentioned also remained significantly low i.e. 30, 41, 44, 53 respectively. This might be due to the lesser insertion/deletion events in these strains which could be contributing to remarkably higher frame shift events. Whereas the Nepalese strains exhibited less than 1% of these variants, except strain BPK/275/0/cl18.vii which had 5% of ‘Frame_Shift_Del’ variants, which could be possible due to relatively lower insertion/deletion events in these strains. Nepalese strains also showed a higher mean share of Missense_Mutations, around 5.59%, than the Indian strains which showed 3.78%. IGR also contributed to 9.74% in Nepalese strains while 5.76% in Indian strains. In the Himalayan region sample, downstream and upstream variants comprises of 36% and 35% of the total variants and 9.2% in the exon region. Single nucleotide polymorphisms impact functional domains We used the Mutation Annotation Format (MAF) files generated by the R package MAFtools to investigate the abundance of ‘Missense variants’ in different isolates. We observed a median of 310 missense variants in the isolates affecting an array of genes, mainly imparting function as proteins involved in biosynthesis pathways, transport, anchorage, and as receptor proteins involved in cellular uptake of essential amino acids, interaction with host immune system and drug resistance. Among the strains from the Indian subcontinent those from Nepal showed a relatively high frequency of missense variations. Although the Indian isolate K133WT exhibited the highest missense mutation frequency, also K133 and BHU575/0/cl2 strains had a higher frequency of mutations which could be due to the exposure from Paromomycin (PMM) drugs (Fig. 7 ). Fifty genes were found to be affected by missense variants in the entire population of isolates (Fig. 8 ). These included genes of ATP-binding cassette (ABC) family, Amastin-like surface proteins, A2 genes, amino acid permeases, and others. ABC subfamily A, member 2 (ABCA2) and member 9 (ABCA9) genes showed the most mutations, 503 and 326 respectively. We observed an increase in copy number of ABC subfamily A, member 9 protein (LDBPK_270840) in all the isolates. ABCA9 protein has two ‘ABC transporter domains’ from amino acids (aa) 760–993 and 1547–1777; while none of the domains were affected by variants, several missense variants of ‘Moderate Impact’ were observed in the gene sequence. It has been reported that Leishmania modulates aneuploidy in response to drug pressure, as ABC family has been shown to be present in higher copy number. ABC family of proteins are membrane transporter proteins which have been observed to be involved in drug resistance against antileishmaniasis through efflux (Ouellette, Légaré et al. 1998 ). Three amastin genes that exhibit high mutation count; Amastin-like surface protein (LDBPK_341150), Amastin surface glycofamily protein (LDBPK_044340), and Amastin surface glycofamily protein (LDBPK_140500) were among that constituted high number of mutations. It has been proposed that Amastin-like surface glycoproteins play a crucial role in macrophage-parasite interaction and parasite virulence. As reported by in a study wherein they knocked down the δ-Amastin expression in L. braziliensis using RNA-interference and observed decreased intracellular parasites in amastigotes in-vitro and in-vivo . We observed 2 missense mutations and 3 silent mutations in LDBPK_341150, 2 missense and 1 silent mutation in LDBPK_044340 and 1 missense mutation in LDBPK_140500. The 3’a2rel (LDBPK_220560) and 5’a2rel (LDBPK_220660) related proteins of the A2-A2rel cluster showed an increased missense variant count, the A2rel genes have been predicted to be involved in the visceralization of the leishmaniasis infection, as reported in a study on BALB/c mice by infection of L. major Friedlin V9 strain which was transfected with A2 expressing plasmids .The 3’a2rel sequence showed 7 moderate impact missense variants, 2 silent mutations and the 5’a2rel sequence showed missense, a modifier 5’ Flank and silent mutations. Several surface protein encoding genes showed high frequency of missense mutations (Fig. 9 ) including the Phosphoglycan-ꞵ-1,3-Galactosyltransferase, partial protein (LDBPK_020150), Mannosyltransferase-like protein (LDBPK_311920) and Amastin-like surface protein (LDBPK_341150). Several studies have implicated the Lipophosphoglycan (LPG), which are glycolipid surface proteins, which play a key role in the propagation of the parasite, through the insect vector, by facilitating the interaction of promastigotes with midgut epithelial lining of sandfly. (Ilg T., Sacks D et al.) LPGs are involved in parasite defence against the host innate immune system by avoiding the formation of membrane attack complexes. These glycoconjugates have also been shown to have a role in promoting cellular uptake of Leishmania parasites into the host macrophages, through the complement system, where they take refuge and multiply inside the acidic phagolysosomes. (SM Puentes et al., PJ Green et al.) In response to the selective pressures by the host environment and immune system Leishmania parasites alter surface physiology by altering LPG side chains which contribute to enhanced survivability and pathogenicity. (Roditi I et al.) The geographical distribution of sandfly influences species-specific LPG alterations and contribute to the successful transmission of parasite from sandfly vector to vertebrate host. (Ilg T.) Differential host-species and host-strain interactions could be contributing to the observed mutational frequency in the Galactosyltransferase gene. Mannosyltransferases are a type of galactosyltransferase which act on mannose and produces Mannogen, a unique carbohydrate material soluble in the cytoplasm, a homopolymer of ꞵ-1,2-linked mannose, 4–40 residues long. (Ralton et al., Sernee et al.) Mannogen is constitutively produced by seven Mannosyltransferases/phosphorylases (MTPs). Infective stage amastigotes have accumulated Mannogen, as it has been reported that defective Mannogen biosynthesis can lead to loss of virulence and attenuated growth in macrophages. (Sernee, M. Fleur et al.) There exist 𝛼-mannosyltransferases that catalyse the synthesis of N-linked glycans and glycosylphosphatidylinositol (GPI) glycolipids which promotes anchoring of surface proteins, including LPGs, by GPI. (Hilley et al.) Among the multi-hit top mutated genes were Amino acid permease, putative (LDBPK_270530), Tryparedoxin peroxidase (LDBPK_151140), cAMP specific phosphodiesterases (LDBPK_151450). Arginine is an essential amino acid for Leishmania hence it needs to be sourced from exogenous sources. Arginine uptake and homeostasis is maintained by transmembrane proteins such as amino acid permeases which transport Arginine on depletion in the parasite. (Opperdoes FR et al., Darlyuk, Ilona et al.) Tryparedoxin peroxidase enzyme, which is upregulated in the presence of Reactive Oxygen Species (ROOs), helps mitigate cellular toxicity by eliminating toxic agents responsible for Ca2 + influx, also an increase in Tryparedoxin peroxidases leads to lesser responsiveness to anti-leishmanial Antimony. (Iyer JP et al.) Various membrane and surface proteins have been implicated in the pathogenicity of Leishmania which also showed high frequency of missense variations, such as the ATP-binding cassette protein subfamily A, member 2, putative (LDBPK_111210) and member 9, putative (LDBPK_111210), 3’a2rel-related protein (LDBPK_220560). The A2 gene family is essential for visceralization of the disease as it has been observed to be absent in Leishmania major which causes Mucocutaneous Leishmaniasis (MCL) but is present in L. donovani which results in the visceral manifestation of the disease. (Zhang et al.). Next, we investigated if any of the mutations affected the functional domains of protein products of the gene sequences. We found 20 genes which had a reported functional domain that happened to be impacted by the missense mutations. The findings are reported in the Table 1. Indian and Nepalese L. donovani isolates have distinct mutational profiles We checked for genes which were affected by indels and found 370 genes common to all isolates. Our analysis revealed differential mutation profiles for each gene in different isolates. A cluster analysis of isolates was performed based on Indel mutation profiles and the results are shown in the below. (Fig. 10 ) We observed the clustering of Leishmania donovani strains into three main groups, one of the clusters showed exclusively Indian isolates, cluster A, while the other two showed an intermixing of both Indian and Nepalese isolates, cluster B and C. Cluster A comprised the strains MHOM/IN/2010/T3, MHOM/IN/2010/T7, MHOM/IN/2012/T8, MHOM/IN/2012/T9, MHOM/IN/2013/P2, MHOM/IN/2013/P2/M1, MHOM/IN/2014/A2, MHOM/IN/2014/H1, MHOM/IN/2014/P6, and MHOM/IN/2014/T10. These strains showed increased mutations in Bridge-like lipid transfer protein family member 1 N-terminal domaining containing protein (LDBPK_160180), Mannosyltransferase like-protein (LDBPK_311920), Calpain-like cysteine peptidase, putative (LDBPK_270510, ATP-binding cassette protein subfamily A, member 2, putative (LDBPK_111210), and uncharacterized proteins (LDBPK_333030-LDBPK_333040). Strains M1, P2 and T10 isolated from West Bengal, Bihar and Jharkhand grouped together since they had frequency of variants in the above-mentioned genes. Cluster B comprised Indian and Nepalese isolates; BPK275/0/cl18.i, BPK275/0/cl18.ii, BPK275/0/cl18.iii, BPK275/0/cl18.vi, BPK275/0/cl18.viii, BPK275/0/cl18.ix,BPK275/0/cl18.xi, MHOM/IN/1983/AG83.i, MHOM/IN/1983/AG83.ii, and K133AS-R, K133PMM-R and K133WT. This cluster exhibited higher mutations in Uncharacterized protein (LDBPK_310470), uncharacterized proteins (LDBPK_333030-LDBPK_333040), Calpain-like cysteine peptidase, putative (LDBPK_270500-LDBPK_270510), Phosphoglycan beta 1,3 galactosyltransferase (LDBPK_020150), Amastin-like surface protein, putative (LDBPK_341690), Amino acid permease, putative (LDBPK_272680), Amastin-like surface protein, putative (LDBPK_341720), Amastin surface glycofamily protein-CHR_END (LDBPK_044360-CHR_END), more than 30 variants were seen in these genes. A few more genes were observed that contained mutations, not as many as the above mentioned, these were; pseudo (LDBPK_310470), Uncharacterized protein (LDBPK_120661-LDBPK_120667), ATP-binding cassette protein subfamily A, member 2, putative (LDBPK_111210), Uncharacterized protein (LDBPK_093020), Tyrosine phosphatase family protein (LDBPK_321070), Amastin surface glycofamily protein (LDBPK_044340), and Uncharacterized protein (LDBPK_333030-LDBPK_333040). Cluster C comprised Indian and Nepalese isolates; BHU 1220, BHU 575/0/cl2, BPK275/0/cl18.iv, BPK275/0/cl18.v, BPK275/0/cl18.vi, BPK275/0/cl18.x, BPK275/0/cl18.xii, and BPK275/0/cl18.xiii. Strains showed similar mutation profiles to the strains in cluster B but distinct to the cluster A strains. Isolates showed a peculiar abundance in mutation in the Amastin surface glycoprotein (LDBPK_044360-CHR_END) gene and the Amastin-like surface protein, putative (LDBPK_341720). Indels aggregate in specific regions of the L. donovani genome Here we provide a comprehensive report on the Indel variation in the population of Nepalese and Indian isolates (Table 2). 91 indels were seen to be occurring in all the samples, with different variant classifications, in noncoding and coding regions of the genome. Only 7.6% i.e. 7 indels affected the coding regions. These variants were classified as either ‘In_Frame_Ins’ or ‘In_Frame_Del’, meaning they existed inside the reading frame of the genes and were size multiples of three, thereby preserving the reading frame. Whereas 92%, i.e. 84 indels were in noncoding regions and mostly were size multiples of two. Indels in coding regions may have a severe impact on the structure and function of gene products and hence are negatively selected, which explains the lower frequency of reading frame disrupting indels. Non-coding indels may be better tolerated due to more lenient selection pressure. It has been proposed that non-coding variants may play a role in influencing the gene expression of upstream and downstream genes. 91 variants were spread over coding and non-coding regions of 54 genes. These variants were observed to be occurring in and around specific nucleotide positions of gene sequences. The ATP-binding cassette protein, family A, member 2 (ABCA2) showed 6 frameshift indels, 585_586insTA, 567_568insCTTGC, 557_558insA, 580_581delCA, 572delA, and 559_563delGGTGT, all of which push the reading frame of the coding sequence and are concentrated around nucleotide (nt) 550 to 590. Hypothetical predicted transmembrane protein (LDBPK_240440) had three in-frame indels and two frameshift indels concentrated at nt 1270 to 1290. IGR between two Amastin-like proteins, LDBPK_080760 and LDBPK_080780 had four indels, which could possibly be involved in influencing gene expressions. Indels were seen in the 5’ Flank region of the Histone H2 protein. 3',5'-cyclic nucleotide phosphodiesterase, putative indels were at 5’Flank upstream of the gene sequence. Sodium stibogluconate resistance protein, putative had indels which resulted in reading frame shift which were predicted to be of high impact. This trend continued in the rest of the genes, thereby indicating that indels tend to arise in specific regions of the gene sequences most prone to alterations. Copy number alterations We observed that all of the chromosomes were predicted to have local copy number alterations in them. Chromosomes 36 through 26, 24, 22, 21, 16, 12, 7 and 2 were observed to have developed variations in copy number in all 31 samples. While chromosome 1, 3 and 4 were predicted to have local copy number variations in 5, 14 and 17 samples respectively. We checked genes that were associated with loss in copy number and the frequency of samples they occurred in. We observed that most of the genes showing a loss in copy numbers, that occurred in ≥ 30 samples, existed on chromosomes 2, 3, 7, 8, 9, 16, 22, 26 through 36. We checked the same for genes showing a gain in copy number and observed that the same chromosomes, which showed increased loss in copy number alterations, also had increased gain in copy number events. We found 393 genes with a gain in copy numbers; the highest number of CNV alterations were predicted in the chromosome 8 and 31 which may be because of their increased chromosomal copy number. Discussion In this study, WGS data of the Indian sub-continent has been analyzed to investigate the SNP, CNV and ploidy distribution in this region. There are few reports on Indian samples that deals with the genome-wide analysis of paromomycin (PMM) resistant L. donovani and genetic characterization of L. donovani variant from Himachal Pradesh (Ghosh, Kumar et al. 2022 , Lypaczewski, Thakur et al. 2022 ). The aneuploidy analysis has shown that the samples BPK* that belong to Nepal has shown different aneuploidy patterns as compared to other samples. The possible reason is that L. donovani samples considered for analysis are PMM resistance that act as stimulus for aneuploidy modification. K133WT (Delhi, India) sample that has maximum number of variants has shown triploidy at chromosome 5. BHU.1220 sample collected during 1992 shows tetraploidy at the chromosome 31 only. Therefore, the analysis clearly indicates that Leishmania has evolved over the time i.e. from 1992 to 2019 and there has been change in the ploidy in response to drug resistance and other stress conditions. The samples from Bihar, Jharkhand and West Bengal have shown triploidy in chromosome 31, however two samples have shown triploidy. It is evident that L. donovani has different genetic constitution in Nepal area than Indian states (Fig. 10 ). Drug pressure in K133* sample and BPK* sample may lead to genetic drift in the parasite population that lead to change in ploidy. (Fig. 11 ) L. donovani sample from Himachal has shown highest number of variants and this is attributed to its genetic hybridization between L. donovani parents from ‘Yeti’ clade reported from Nepal highlands. The SNPs that were present in either samples were removed and the novel SNPs present in this sample has been discussed in the following text. The details of the variants of the sample are given in Supplementary Table 3. It was observed that the following proteins were most mutated in the sample of Himachal namely, LdBPK_111210 encoding an ABC protein subfamilyA, LDBPK_060660 protein kinase and putative LdBPK_171090.1, LDBPK_281830, LDBPK_361500, LDBPK_091050, LDBPK_161130, LDBPK_262120, LDBPK_251150, LDBPK_191110 genes encode for hypothetical proteins (Supplementary Fig. 3). It is evident that the proteins which have shown more mutations are hypothetical proteins and need to be characterized. However, this sample follows the same pattern as other samples in case of higher mutation rate in transporter protein. Since this sample organism has been reported as an intraspecies hybrid two distinct clades of L. donovani from Ethiopia can be the reason that there are numerous novel mutations (Lypaczewski, Thakur et al. 2022 ). However in all samples of Indian sub-continent transporter protein, amastin like proteins, protein involved in drug resistance, heat shock proteins and hypothetical proteins have shown maximum variants. The details of the variants present in all samples are given in Supplementary Table 4 and the name of protein encoded by corresponding gene is given in Supplementary Table 5. There is genetic plasticity in Leishmania genome, CNV affects either some genomic region or whole chromosomes as observed for chromosome 8 and 31 that leads to aneuploidy. In recent studies, only chromosome 31 has shown high number of CNVs, however the present analysis shows high number of CNVs in chromosome 8 and 31 (Ghosh, Kumar et al. 2022 , Lypaczewski, Thakur et al. 2022 ). This analysis of Indian sub-continent data is providing a wholesome image of CNV which is different from ealier obsevations. The proteins which have shown maximum missense mutations are transporter protein, proteins involved in carbohydrate metabolism, heat shock protein, amastin proteins, sodium stibogluconate resistant protein and hypothetical proteins. While in the case sample from Delhi, India, mutations were observed in transcription, translation (60S ribosomal protein L22, 60S ribosomal protein L40, RNA binding protein, eukaryotic translation initiation factor 4E-1 and 4E-2), virulence factor (protein-tyrosine phosphatase), and signaling molecule (protein kinase and protein-tyrosine phosphatase) (Ghosh, Kumar et al. 2022 ). 42 ABC genes namely ABCA to ABCH has been reported. In previous studies, membrane/transporter proteins like ABCG1, ABCG2, ABCG3, ABCA2, ABCA7, and ABCA8 has shown missense mutations (Ghosh, Verma et al. 2020 ). However, in the present study, ABC subfamily A, member 2 (ABCA2) and member 9 (ABCA9) has shown maximum missense mutations which are primarily involved in drug efflux and responsible for multi-drug resistance. Two transmembrane glycorotein i.e. Amsatin-like surface protein and amastin surface glycofamily protein has shown maximum mutations after transporter proteins. As reported in earlier studies, amastin-like protein is reported to be an important precursor for the phenotype resistance development against antimonial and paramomycin (Ghosh, Kumar et al. 2022 ). Mannosyltransferase protein has shown high number of missense mutations. The overexpression of Mannosyltransferase protein in promastigote and amastigote forms of L. braziliensis protects from trivalent antimony (Ribeiro, Rocha et al. 2019 ). Heat shock protein 70-related protein 1, putative, mitochondrial has high missense mutations and in the earlier studies, other types of HSP proteins has shown mutations or altered expression and are also linked to miltefosine resistance (Vacchina, Norris-Mullins et al. 2016 , Alcolea, Alonso et al. 2023 ). Leishmaniasis treatment failure to sodium stibogluconate has been described in some studies which are related to expression of γ-Glumatylcysteine synthase, however there is no record of the missense mutations of sodium stibogluconate resistant protein putative. It was observed that sodium stiboglocunate resistant protein putative is among the top five mutated protein in the samples of Indian subcontinent. The mutated proteins observed from WGS data of L. donovani from ISC may act as initial step for further validation of these proteins in the virulence and understanding the drug resistance mechanism of these proteins. CONCLUSION Owing to the changing climate conditions, Indian sub-continent geographical area is more prone to these changes. The analysis of the clinical samples of L. donovani from Indian sub-continent has provided novel insights into the genetic makeup of the protozoan present here. The overall trend of the mutations in the protein of L. donovani from Indian sub-continent shows that there is increasing mutations in the proteins which are involved in the virulence and drug-resistant mechanism. The membrane proteins has shown most of the variants as they throw out the drugs from the pathogen by efflux mechanism. CNV analysis has shown different pattern of aneuploidy in the L. donovani from Indian sub-continent when compared to the earlier studies from other part of the world. Therefore, the proteins and variants obtained can be the founding steps to develop strategic drugs for the treatment of Leishmaniasis. Declarations Declaration of competing interest The authors declare that they have no conflict of interest. Author Contribution AV has curated the data, carried out the WGS analysis, and written the first draft of the paper. IK: Streamlined the analysis workflow, written the results and discussion. ANJ: has checked, edited, and given suggestions in the manuscript. Acknowledgement ANJ gratefully acknowledges the ICMR project for providing computing facilities. References Alcolea, P. J., A. Alonso, F. García-Tabares, J. Larraga, L. T. Martins, F. J. Loayza, S. Ruiz-García and V. Larraga (2023). "An insight into differential protein abundance throughout Leishmania donovani promastigote growth and differentiation." International Microbiology 26 (1): 25-42. Alvarenga, D. G. d., P. M. F. Escalda, A. S. V. d. Costa and M. T. F. D. Monreal (2010). "Leishmaniose visceral: estudo retrospectivo de fatores associados à letalidade." Revista da Sociedade Brasileira de Medicina Tropical 43 : 194-197. Andrews, S. (2010). FastQC: a quality control tool for high throughput sequence data, Babraham Bioinformatics, Babraham Institute, Cambridge, United Kingdom. Awasthi, A., R. K. Mathur and B. Saha (2004). "Immune response to Leishmania infection." Indian Journal of Medical Research 119 : 238-258. Boeva, V., T. Popova, K. Bleakley, P. Chiche, J. Cappo, G. Schleiermacher, I. Janoueix-Lerosey, O. Delattre and E. Barillot (2012). "Control-FREEC: a tool for assessing copy number and allelic content using next-generation sequencing data." Bioinformatics 28 (3): 423-425. Bolger, A. M., M. Lohse and B. Usadel (2014). "Trimmomatic: a flexible trimmer for Illumina sequence data." Bioinformatics 30 (15): 2114-2120. Bora, N. and A. N. Jha (2020). "In silico metabolic pathway analysis identifying target against leishmaniasis–a kinetic modeling approach." Frontiers in Genetics 11 : 179. Bora, N. and A. Nath Jha (2019). "An integrative approach using systems biology, mutational analysis with molecular dynamics simulation to challenge the functionality of a target protein." Chemical biology & drug design 93 (6): 1050-1060. Cingolani, P., A. Platts, L. L. Wang, M. Coon, T. Nguyen, L. Wang, S. J. Land, X. Lu and D. M. Ruden (2012). "A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3." fly 6 (2): 80-92. Clark, K., I. Karsch-Mizrachi, D. J. Lipman, J. Ostell and E. W. Sayers (2016). "GenBank." Nucleic acids research 44 (D1): D67-D72. Danecek, P., J. K. Bonfield, J. Liddle, J. Marshall, V. Ohan, M. O. Pollard, A. Whitwham, T. Keane, S. A. McCarthy and R. M. Davies (2021). "Twelve years of SAMtools and BCFtools." Gigascience 10 (2): giab008. de Paiva, R. M. C., V. Grazielle-Silva, M. S. Cardoso, B. N. Nakagaki, R. P. Mendonca-Neto, A. M. C. Canavaci, N. Souza Melo, P. M. Martinelli, A. P. Fernandes and W. D. daRocha (2015). "Amastin knockdown in Leishmania braziliensis affects parasite-macrophage interaction and results in impaired viability of intracellular amastigotes." PLoS pathogens 11 (12): e1005296. Downing, T., H. Imamura, S. Decuypere, T. G. Clark, G. H. Coombs, J. A. Cotton, J. D. Hilley, S. de Doncker, I. Maes and J. C. Mottram (2011). "Whole genome sequencing of multiple Leishmania donovani clinical isolates provides insights into population structure and mechanisms of drug resistance." Genome research 21 (12): 2143-2156. Dumetz, F., H. Imamura, M. Sanders, V. Seblova, J. Myskova, P. Pescher, M. Vanaerschot, C. J. Meehan, B. Cuypers and G. De Muylder (2017). "Modulation of aneuploidy in Leishmania donovani during adaptation to different in vitro and in vivo environments and its impact on gene expression." MBio 8 (3): 10.1128/mbio. 00599-00517. Ghosh, S., V. Kumar, A. Verma, T. Sharma, D. Pradhan, A. Selvapandiyan, P. Salotra and R. Singh (2022). "Genome-wide analysis reveals allelic variation and chromosome copy number variation in paromomycin-resistant Leishmania donovani." Parasitology Research 121 (11): 3121-3132. Ghosh, S., A. Verma, V. Kumar, D. Pradhan, A. Selvapandiyan, P. Salotra and R. Singh (2020). "Genomic and transcriptomic analysis for identification of genes and interlinked pathways mediating artemisinin resistance in Leishmania donovani." Genes 11 (11): 1362. Joshi, A. B., M. R. Banjara, M. L. Das, P. Ghale, K. R. Pant, U. R. Pyakurel, G. Dahal, K. P. Paudel, C. L. Das, A. J. T. A. J. o. T. M. Kroeger and Hygiene (2024). "Epidemiological, Serological, and Entomological Investigation of New Visceral Leishmaniasis Foci in Nepal." The American Journal of Tropical Medicine and Hygiene 110 (1): 44. Kumari, I., D. Lakhanpal, S. Swargam and A. Nath Jha (2023). "Leishmaniasis: Omics approaches to understand its biology from molecule to cell level." Current Protein and Peptide Science 24 (3): 229-239. Kumari, P., A. Mamud and A. N. Jha (2023). "Review on the Drug Intolerance and Vaccine Development for the Leishmaniasis." Current Drug Targets 24 (13): 1023-1031. Li, H., B. Handsaker, A. Wysoker, T. Fennell, J. Ruan, N. Homer, G. Marth, G. Abecasis, R. Durbin and G. P. D. P. Subgroup (2009). "The sequence alignment/map format and SAMtools." bioinformatics 25 (16): 2078-2079. Lira, R., S. Sundar, A. Makharia, R. Kenney, A. Gam, E. Saraiva and D. Sacks (1999). "Evidence that the high incidence of treatment failures in Indian kala-azar is due to the emergence of antimony-resistant strains of Leishmania donovani." The Journal of infectious diseases 180 (2): 564-567. Lypaczewski, P., L. Thakur, A. Jain, S. Kumari, K. Paulini, G. Matlashewski and M. Jain (2022). "An intraspecies Leishmania donovani hybrid from the Indian subcontinent is associated with an atypical phenotype of cutaneous disease." Iscience 25 (2). Mann, S., K. Frasca, S. Scherrer, A. F. Henao-Martínez, S. Newman, P. Ramanan and J. A. Suarez (2021). "A review of leishmaniasis: current knowledge and future directions." Current tropical medicine reports 8 : 121-132. Martínez, F. V., E. T. Guerrero, R. Arenas and M. Q. Cedillo (2011). "Leishmaniasis en México." Medicina Cutánea Ibero-Latino-Americana 39 (4): 163-183. Mayakonda, A., D.-C. Lin, Y. Assenov, C. Plass and H. P. Koeffler (2018). "Maftools: efficient and comprehensive analysis of somatic variants in cancer." Genome research 28 (11): 1747-1756. Ouellette, M., D. Légaré, A. Haimeur, K. Grondin, G. Roy, C. Brochu and B. Papadopoulou (1998). "ABC transporters in Leishmania and their role in drug resistance." Drug Resistance Updates 1 (1): 43-48. Patil, R. R. and P. K. Chatterjee (2023). "Epidemiology of Visceral Leishmaniasis in India." Paysan-Lafosse, T., M. Blum, S. Chuguransky, T. Grego, B. L. Pinto, G. A. Salazar, M. L. Bileschi, P. Bork, A. Bridge and L. Colwell (2023). "InterPro in 2022." Nucleic acids research 51 (D1): D418-D427. Pilling, O. A., J. L. Reis-Cunha, C. A. Grace, A. S. Berry, M. W. Mitchell, J. A. Yu, C. R. Malekshahi, E. Krespan, C. K. Go and C. Lombana (2023). "Selective whole-genome amplification reveals population genetics of Leishmania braziliensis directly from patient skin biopsies." PLoS Pathogens 19 (3): e1011230. Reuter, J. A., D. V. Spacek and M. P. Snyder (2015). "High-throughput sequencing technologies." Molecular cell 58 (4): 586-597. Rezaei, Z., A. Tahmasebi and B. Pourabbas (2024). "Using meta-analysis and machine learning to investigate the transcriptional response of immune cells to Leishmania infection." PLOS Neglected Tropical Diseases 18 (1): e0011892. Ribeiro, C. V., B. F. B. Rocha, D. d. S. Moreira, V. Peruhype-Magalhães and S. M. F. Murta (2019). "Mannosyltransferase (GPI-14) overexpression protects promastigote and amastigote forms of Leishmania braziliensis against trivalent antimony." Parasites & vectors 12 : 1-7. Riebenbauer, K., S. Czerny, M. Egg, N. Urban, T. Kinaciyan, A. Hampel, L. Fidelsberger, F. Karlhofer, S. Porkert and J. Walochnik (2024). "The changing epidemiology of human leishmaniasis in the non-endemic country of Austria between 2000 to 2021, including a congenital case." PLOS Neglected Tropical Diseases 18 (1): e0011875. Sádlová, J., M. Yeo, D. S. Mateus, J. Phelan, L. A. Hai, T. Bhattacharyya, S. Kurtev, O. Sebesta, J. Myskova and V. Seblova (2024). "Comparative genomics of Leishmania donovani progeny from genetic crosses in two sand fly species and impact on the diversity of diagnostic and vaccine candidates." PLOS Neglected Tropical Diseases 18 (1): e0011920. Saha, D., N. J. Borah and A. N. Jha (2023). "Molecular scaffold recognition of drug molecules against essential genes of Leishmania donovani using biocomputing approach." South African Journal of Botany 162 : 52-63. Santi, A. M. M. and S. M. F. Murta (2022). "Impact of genetic diversity and genome plasticity of Leishmania spp. in treatment and the search for novel chemotherapeutic targets." Frontiers in Cellular and Infection Microbiology 12 : 9. Sapp, S. G., R. Low, G. Nine, F. S. Nascimento, Y. Qvarnstrom and J. L. Barratt (2024). "Genetic characterization and description of Leishmania (Leishmania) ellisi sp. nov.: a new human-infecting species from the USA." Parasitology Research 123 (1): 52. Schlein, Y. (1993). "Leishmania and sandflies: interactions in the life cycle and transmission." Parasitology Today 9 (7): 255-258. Shirley, M. D., Z. Ma, B. S. Pedersen and S. J. Wheelan (2015). Efficient" pythonic" access to FASTA files using pyfaidx, PeerJ PrePrints. Sofi, O. M. U. D., R. Godara, R. Katoch and A. Yadav (2023). "Morphological and molecular evidence of sandfly (Diptera: Psychodidae: Phlebotomine) and its relevance to recent cases of leishmaniasis from Jammu region of North India." Journal of Vector Borne Diseases 60 (3): 328-332. Torres-Guerrero, E., M. R. Quintanilla-Cedillo, J. Ruiz-Esmenjaud and R. Arenas (2017). "Leishmaniasis: a review." FResearch 6 . Vacchina, P., B. Norris-Mullins, E. Carlson and M. Morales (2016). "A mitochondrial HSP70 (HSPA9B) is linked to miltefosine resistance and stress response in Leishmania donovani." Parasites & vectors 9 : 1-15. Van der Auwera, G. A., M. O. Carneiro, C. Hartl, R. Poplin, G. Del Angel, A. Levy‐Moonshine, T. Jordan, K. Shakir, D. Roazen and J. Thibault (2013). "From FastQ data to high‐confidence variant calls: the genome analysis toolkit best practices pipeline." Current protocols in bioinformatics 43 (1): 11.10. 11-11.10. 33. Wang, K., M. Li and H. Hakonarson (2010). "ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data." Nucleic acids research 38 (16): e164-e164. Zhang, W. W. and G. Matlashewski (2001). "Characterization of the A2–A2rel gene cluster in Leishmania donovani: involvement of A2 in visceralization during infection." Molecular microbiology 39 (4): 935-948. Bharadava, K., Upadhyay, T. K., Kaushal, R. S., Ahmad, I., Alraey, Y., Siddiqui, S., & Saeed, M. (2024). Genomic Insight of Leishmania Parasite: In-Depth Review of Drug Resistance Mechanisms and Genetic Mutations. ACS omega, 9(11), 12500-12514. Al Khoury, C., Thoumi, S., Tokajian, S., Sinno, A., Nemer, G., El Beyrouthy, M., & Rahy, K. (2024). ABC transporter inhibition by beauvericin partially overcomes drug resistance in Leishmania tropica. Antimicrobial Agents and Chemotherapy, 68(5), e01368-23. Tables Tables 1 and 2 are available in the Supplementary Files section. Additional Declarations No competing interests reported. Supplementary Files Table1functionaldomains.xlsx Table2Indels.xlsx SupplementaryTable.xlsx SupplementaryFig.docx Cite Share Download PDF Status: Published Journal Publication published 19 Nov, 2025 Read the published version in Antonie van Leeuwenhoek → Version 1 posted Editorial decision: Revision requested 22 Aug, 2025 Reviews received at journal 22 Aug, 2025 Reviews received at journal 07 Aug, 2025 Reviewers agreed at journal 07 Aug, 2025 Reviewers agreed at journal 06 Aug, 2025 Reviewers invited by journal 06 Aug, 2025 Editor assigned by journal 01 Aug, 2025 Submission checks completed at journal 01 Aug, 2025 First submitted to journal 01 Aug, 2025 You are reading this latest preprint version Research Square lets you share your work early, gain feedback from the community, and start making changes to your manuscript prior to peer review in a journal. As a division of Research Square Company, we’re committed to making research communication faster, fairer, and more useful. We do this by developing innovative software and high quality services for the global research community. Our growing team is made up of researchers and industry professionals working together to solve the most critical problems facing scientific publishing. Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-7267613","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":497758079,"identity":"d7a9479c-ce4b-4b15-9a21-811666c085e2","order_by":0,"name":"Aroop Verma","email":"","orcid":"","institution":"Tezpur University","correspondingAuthor":false,"prefix":"","firstName":"Aroop","middleName":"","lastName":"Verma","suffix":""},{"id":497758081,"identity":"a5c98969-ee12-42c8-8ce1-8f86f3faa658","order_by":1,"name":"Indu Kumari","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAA8ElEQVRIiWNgGAWjYBACCTBpwMDDwJDA/CChAshhZm4gWgubwYMzIC2MxGgBgwQGyYdtIAYBLZIzshM/VxQwyJiz5xgYJM6rjeZvB2r5UbENpxZpidzNkmeADrPseWPwIHHb8dwZhxkbGHvO3MapRU4id4NkA1CLwQ2QLduO5TYAtTAztuHVsvknTItE4pxjufMJaQE6bJskQktDTe4GQloke95us2wwkOAxOPOszCDh2IHcjUAtB/H5ReJ47uabDX9s7A2OJ29++KOmLnfe+cMHH/yowK0FphPGOAwmDxBSjwzqSFE8CkbBKBgFIwQAAIMqWc0P0p+OAAAAAElFTkSuQmCC","orcid":"","institution":"Regional Centre for Biotechnology","correspondingAuthor":true,"prefix":"","firstName":"Indu","middleName":"","lastName":"Kumari","suffix":""},{"id":497758084,"identity":"56f8dff1-e56e-4a94-9b80-d76c236160b1","order_by":2,"name":"Anupam Nath Jha","email":"","orcid":"","institution":"Tezpur University","correspondingAuthor":false,"prefix":"","firstName":"Anupam","middleName":"Nath","lastName":"Jha","suffix":""}],"badges":[],"createdAt":"2025-08-01 05:38:17","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-7267613/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-7267613/v1","draftVersion":[],"editorialEvents":[{"content":"https://doi.org/10.1007/s10482-025-02199-1","type":"published","date":"2025-11-19T15:57:04+00:00"}],"editorialNote":"","failedWorkflow":false,"files":[{"id":88778191,"identity":"58ed933c-69b6-4e75-b59c-802ee109b130","added_by":"auto","created_at":"2025-08-11 10:22:41","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":52571,"visible":true,"origin":"","legend":"\u003cp\u003eThe graph above shows coverage percentage of the reads against the reference genome, the graph below shows the average depth for each sample after alignment. Strains are listed at the bottom.\u003c/p\u003e","description":"","filename":"Fig3Biotech.ppt2.png","url":"https://assets-eu.researchsquare.com/files/rs-7267613/v1/a5aa5d4968ad566a8c1b9d3a.png"},{"id":88775778,"identity":"6cc0c856-60fb-4ae8-99d4-972c18b86bb2","added_by":"auto","created_at":"2025-08-11 10:06:40","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":156633,"visible":true,"origin":"","legend":"\u003cp\u003eChromosome numbers are listed on the right-hand side of the heatmap panel and strain names are listed on the bottom. The dendrogram on top shows clustering of strains on the basis of similarity of aneuploidy profiles and the distances and were generated using euclidean distances between them. The dendrogram on the left shows clustering of chromosomes with similar ploidies. The colour scale indicates somy of the chromosomes; disomy: 1.5-2.5, trisomy: 2.5-3.5, tetrasomy: 3.5-4.5.\u003c/p\u003e","description":"","filename":"Fig3Biotech.ppt3.png","url":"https://assets-eu.researchsquare.com/files/rs-7267613/v1/f24bdcb300268d9f2d683ef3.png"},{"id":88777652,"identity":"371fa368-7dc6-4745-a9fd-6811af16017b","added_by":"auto","created_at":"2025-08-11 10:14:40","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":110212,"visible":true,"origin":"","legend":"\u003cp\u003eThe plot displays frequency of SNPs with various genotypes for Indian (North and East India) and Nepalese \u003cem\u003eLeishmania donovani\u003c/em\u003e strains, list of strains is indicated at the bottom, Y axis represents the frequency of SNP variants, and the legend represents the genotype. A – Total SNPs occurring in the sample, B – Stacked bar plot representing the genotype distribution of the total SNP events, C to E represent the frequency of different genotypes associated with the SNPs. The genotypes of variants shown here are Heterozygous REF/ALT, Homozygous ALT, and Heterozygous ALT.\u003c/p\u003e","description":"","filename":"Fig3Biotech.ppt4.png","url":"https://assets-eu.researchsquare.com/files/rs-7267613/v1/13d8ccf28ae60f9c9b3c30c0.png"},{"id":88778192,"identity":"e94a55eb-19f3-457f-9ba3-4310307657f3","added_by":"auto","created_at":"2025-08-11 10:22:41","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":107083,"visible":true,"origin":"","legend":"\u003cp\u003eThe following plot displays frequency of insertion/deletions with various genotypes for Indian and Nepalese \u003cem\u003eLeishmania donovani\u003c/em\u003e strains, list of strains is indicated at the bottom, Y axis represents the frequency of insertion/deletions, and the legend represents the genotype. A – Total insertion/deletions occurring in the sample, B – Stacked bar plot representing the genotype distribution of the total insertion/deletion events, C to E represent the frequency of different genotypes associated with the variants. The genotypes of variants shown here are Heterozygous REF/ALT, Homozygous ALT, and Heterozygous ALT.\u003c/p\u003e","description":"","filename":"Fig3Biotech.ppt5.png","url":"https://assets-eu.researchsquare.com/files/rs-7267613/v1/38bbee9a9a4b70581d601cca.png"},{"id":88775783,"identity":"db4b28e4-f17f-44bf-acbe-cc8ee41a3699","added_by":"auto","created_at":"2025-08-11 10:06:40","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":76595,"visible":true,"origin":"","legend":"\u003cp\u003eA pie chart depicting the proportion of different variant types within the population of samples used in this study. Variant types and their corresponding colours are listed on the right. 61.2% of the variants were classified as 5’ Flank, 12.1% as 3’ Flank and 12.2% as Frame_Shift_Del, 6.43% Intergenic Regions (IGR), 3.95% Missense_Mutation, 2.42% Silent, 1.17% Frame_shift_Ins and rest of the variants comprising less than 1% of the total variants.\u003c/p\u003e","description":"","filename":"Fig3Biotech.ppt6.png","url":"https://assets-eu.researchsquare.com/files/rs-7267613/v1/82024d122b47a404d65d5176.png"},{"id":88775779,"identity":"7991d3a1-91c7-453b-ad8d-17dffb28b1ca","added_by":"auto","created_at":"2025-08-11 10:06:40","extension":"png","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":96843,"visible":true,"origin":"","legend":"\u003cp\u003e(A) Graph showing the frequency of different variant classes across isolates. We can see the Indian isolates showing a significantly higher number of variants in 5` flank of the gene. (B) Graph showing the proportion of different variants of the total variants that occur in the samples. Isolates are listed at the bottom on the x-axis.\u003c/p\u003e","description":"","filename":"Fig3Biotech.ppt7.png","url":"https://assets-eu.researchsquare.com/files/rs-7267613/v1/11b83cc0a5f41451892c8c97.png"},{"id":88775781,"identity":"70ad564b-9810-4c04-90e9-d0fcd3d010fb","added_by":"auto","created_at":"2025-08-11 10:06:40","extension":"png","order_by":7,"title":"Figure 7","display":"","copyAsset":false,"role":"figure","size":76543,"visible":true,"origin":"","legend":"\u003cp\u003eBar chart indicating the frequency of missense mutations which have occurred in different \u003cem\u003eL. donovani \u003c/em\u003estrains.\u003c/p\u003e","description":"","filename":"Fig3Biotech.ppt8.png","url":"https://assets-eu.researchsquare.com/files/rs-7267613/v1/f2335d78b95b0967620e400d.png"},{"id":88775784,"identity":"9de4d059-ba0f-4ef1-9b1c-ea4d28b45b49","added_by":"auto","created_at":"2025-08-11 10:06:40","extension":"png","order_by":8,"title":"Figure 8","display":"","copyAsset":false,"role":"figure","size":245353,"visible":true,"origin":"","legend":"\u003cp\u003eBar chart showing the top 50 genes affected by ‘Missense_Mutation’ variant classification. Genes are listed on the y-axis in decreasing order of mutation events present in the genes. X-axis indicates the cumulative mutations that occurred throughout the population of isolates.\u003c/p\u003e","description":"","filename":"Fig3Biotech.ppt9.png","url":"https://assets-eu.researchsquare.com/files/rs-7267613/v1/859c23b48baaa1ad15542511.png"},{"id":88775792,"identity":"44cba5b8-c693-4685-b6ca-bbcdb3c4fe72","added_by":"auto","created_at":"2025-08-11 10:06:41","extension":"png","order_by":9,"title":"Figure 9","display":"","copyAsset":false,"role":"figure","size":139305,"visible":true,"origin":"","legend":"\u003cp\u003ePlot showing top 20 mutated genes across all samples. Top chart depicts the number of variants per sample (y-axis is log scaled), and the right-hand chart shows the number of samples the specific gene is present in. The rows show different genes and columns are represented by samples; the colour of the cell depicts the type of variant classification associated with the variant which affects the gene.\u003c/p\u003e","description":"","filename":"Fig3Biotech.ppt10.png","url":"https://assets-eu.researchsquare.com/files/rs-7267613/v1/ebb0a4af704a93ebc1c3c0bd.png"},{"id":88775795,"identity":"ff780bf1-c004-49ef-8ce5-d6550c8ca181","added_by":"auto","created_at":"2025-08-11 10:06:41","extension":"png","order_by":10,"title":"Figure 10","display":"","copyAsset":false,"role":"figure","size":175171,"visible":true,"origin":"","legend":"\u003cp\u003eHeatmap created using the pheatmap R package. Strain names are listed on the bottom of the plot, list of genes on the right, dendrograms on top show clustering of strains based on variation in indels, and dendrograms on the left show clustering of genes based on the mutation occurrences in the genes. Colours in the legend reflect the number of variants red indicating the least and dark blue indicating the most.\u003c/p\u003e","description":"","filename":"Fig3Biotech.ppt11.png","url":"https://assets-eu.researchsquare.com/files/rs-7267613/v1/df47a14c5a5d6e4a2b1763f7.png"},{"id":88775789,"identity":"50681e86-e3b8-4658-ab23-80882f070db1","added_by":"auto","created_at":"2025-08-11 10:06:41","extension":"png","order_by":11,"title":"Figure 11","display":"","copyAsset":false,"role":"figure","size":41757,"visible":true,"origin":"","legend":"\u003cp\u003eBar chart depicting the total Copy Number Variants for each chromosome for all isolates as predicted by CONTROL-FREEC. Labels over the bars indicate the CNV event frequency, Y-axis indicates the frequency of CNVs and it is log\u003csub\u003e10\u003c/sub\u003e transformed to accommodate the exceptionally high frequency of CNV events found in the tetrasomic chromosome 31.\u003c/p\u003e","description":"","filename":"Fig3Biotech.ppt12.png","url":"https://assets-eu.researchsquare.com/files/rs-7267613/v1/a70c3b5c9ed65253808e7ed1.png"},{"id":96650020,"identity":"20677129-784a-4138-95b1-c3751521424f","added_by":"auto","created_at":"2025-11-24 16:05:04","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":2157571,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7267613/v1/5ba8a571-2439-4ff5-9b6b-fa4a3c67832f.pdf"},{"id":88775775,"identity":"49790893-6d4a-4560-864d-c779f5ba8b61","added_by":"auto","created_at":"2025-08-11 10:06:40","extension":"xlsx","order_by":0,"title":"","display":"","copyAsset":false,"role":"supplement","size":6755,"visible":true,"origin":"","legend":"","description":"","filename":"Table1functionaldomains.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-7267613/v1/b9d76b4ba98ec419d870e2d0.xlsx"},{"id":88775799,"identity":"8096b4d0-9d18-43d5-a3ee-f8c9d249e819","added_by":"auto","created_at":"2025-08-11 10:06:41","extension":"xlsx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":8135,"visible":true,"origin":"","legend":"","description":"","filename":"Table2Indels.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-7267613/v1/a18a501cbb3d8e9e02f5723a.xlsx"},{"id":88775791,"identity":"79e80b0e-762c-4ac0-9767-0613bd46599e","added_by":"auto","created_at":"2025-08-11 10:06:41","extension":"xlsx","order_by":2,"title":"","display":"","copyAsset":false,"role":"supplement","size":729087,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryTable.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-7267613/v1/5e4ddaf4394402cd7d0ada3a.xlsx"},{"id":88777655,"identity":"e1ef4704-405d-46b0-a766-14010e5b7560","added_by":"auto","created_at":"2025-08-11 10:14:41","extension":"docx","order_by":3,"title":"","display":"","copyAsset":false,"role":"supplement","size":430108,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryFig.docx","url":"https://assets-eu.researchsquare.com/files/rs-7267613/v1/63f4b657cf2aff18cb0aa564.docx"}],"financialInterests":"No competing interests reported.","formattedTitle":"Insights into Genetic Heterogeneity and Drug Resistance in Leishmania donovani of the Indian sub-continent from genomic data","fulltext":[{"header":"Introduction","content":"\u003cp\u003eLeishmaniasis is caused by vector-borne, obligate intracellular, kinetoplastid parasites belonging to the genus \u003cem\u003eLeishmania\u003c/em\u003e (Family Trypanosomatidae) (Torres-Guerrero, Quintanilla-Cedillo et al. \u003cspan citationid=\"CR42\" class=\"CitationRef\"\u003e2017\u003c/span\u003e). Transmission is through the bite of an infected female phlebotomine sandfly(Schlein \u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e1993\u003c/span\u003e). Leishmaniasis has three main types of clinical manifestations: Cutaneous Leishmaniasis, Mucosal Leishmaniasis, and Visceral Leishmaniasis. Disease manifestation and severity depends upon the host immune response, the causal pathogen and/or any comorbidities(Alvarenga, Escalda et al. \u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2010\u003c/span\u003e) (Awasthi, Mathur et al. \u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e2004\u003c/span\u003e). The World Health Organization (WHO) has declared Leishmaniasis as a Neglected Tropical Disease (NTD) (Kumari, Lakhanpal et al. \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e2023\u003c/span\u003e). It is most prevalent in the subtropical and tropical regions of the globe; mainly Europe, Northern Africa, the Middle East, Asia and part of South America (Martínez, Guerrero et al. \u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e2011\u003c/span\u003e). According to the CDC, current estimates of disease incidence for Cutaneous Leishmaniasis have ranged from 700,000 to 1.2\u0026nbsp;million cases per year and it's been estimated that the number of new cases per year for Visceral Leishmaniasis has decreased from 400,000 to less than 100,000 (Prevention, 2020).\u003c/p\u003e\u003cp\u003eVisceral Leishmaniasis (VL), caused by the \u003cem\u003eLeishmania donovani\u003c/em\u003e complex which includes \u003cem\u003eL. donovani\u003c/em\u003e and \u003cem\u003eL. infantum\u003c/em\u003e, is the most lethal form of Leishmaniasis. Majority of disease burden (90%) has been reported in the following countries; India, Eiteria, Iraq, Somalia, Brazil, Ethiopia, Kenya, South Sudan, Nepal, China, and Sudan (Patil and Chatterjee \u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e2023\u003c/span\u003e, Saha, Borah et al. \u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e2023\u003c/span\u003e). India contributes 18% of the total reported VL cases. VL is endemic to 54 districts scattered in four states — Bihar (33 out of 38 districts), Jharkhand (4 out of 24 districts), Uttar Pradesh (6 out of 75 districts) and West Bengal (11 out of 23 districts). In recent time, the vector of leishmaniasis has been detected in the Western Himalayan region of India like Jammu and Kashmir, Gujarat, Assam, Himachal Pradesh, Haryana, Tamil Nadu, Puducherry, Madhya Pradesh, Kerala, Sikkim and Uttaranchal which raises a concern (Patil and Chatterjee \u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e2023\u003c/span\u003e, Sofi, Godara et al. \u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e2023\u003c/span\u003e). With the unconventional spread of Leishmaniasis to Himalaya, there is a need for the reassessment of Leishmaniasis endemicity in the Himalayan region of North India.\u003c/p\u003e\u003cp\u003ePentamidine, Antimonial [Sodium stibogluconate (SSG) and meglumine antimoniate (MA)], Miltefosine, Amphotericin B and Paromomycin are the anti-leishmaniasis agents used in the first line and second line therapy (Bharadava et al., \u003cspan citationid=\"CR47\" class=\"CitationRef\"\u003e2024\u003c/span\u003e; (Mann, Frasca et al. \u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e2021\u003c/span\u003e)). Beauvericin (BEA) is a fungal secondary metabolite synthesized by \u003cem\u003eBeauveria bassiana\u003c/em\u003e which is potent ABC transporter inhibitor. Its application along with anti-leishmanial drug may enhance therapeutic index of anti-leishmanial agents (Al Khoury et al., \u003cspan citationid=\"CR48\" class=\"CitationRef\"\u003e2024\u003c/span\u003e). Emergence of drug-resistance strains has rendered traditionally employed treatment methods ineffective. The use of chemotherapy remains viable but it is expensive and has toxic side effects (Lira, Sundar et al. \u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e1999\u003c/span\u003e). Hence there is an urgent need for novel drugs to combat the emerging peril of drug-resistant strains and effectively decrease the virulent effect of \u003cem\u003eLeishmania\u003c/em\u003e (Bora and Jha \u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e2020\u003c/span\u003e). Other mode of therapy includes thermotherapy, electric field stimulation and diverse topical treatment methods that are in clinical or preclinical trials, such as formulations containing paromomycin, imiquimod, miltefosine, morphine, and buparvaquone components (Pinheiro\u003c/p\u003e\u003cp\u003eet al., 2022). Currently new chemical classes which are in clinical phase 1 like anti-leishmanial classes nitroimidazoles and benzoxaboroles, with DNDI-0690 and DNDI-6148 as lead compounds, respectively, the pyrazolopyrimidine scaffold recognized as CRK-12 kinase inhibitors.\u003c/p\u003e\u003cp\u003eAdvancements in high throughput sequencing technologies have greatly reduced the time and capital required to sequence whole genomes of organisms, making it possible to analyse complete genome sequences and investigate the biology in greater detail (Reuter, Spacek et al. \u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e2015\u003c/span\u003e). WGS has enabled the genetic characterization of novel spp. of \u003cem\u003eLeishmania ellisi\u003c/em\u003e from USA (Sapp, Low et al. \u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e2024\u003c/span\u003e). The meta-analysis of transcriptomic data of \u003cem\u003eLeishmania\u003c/em\u003e spp. has been carried out to understand the immune system’s response to its infection (Rezaei, Tahmasebi et al. \u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e2024\u003c/span\u003e). Recent changes have been observed in the epidemiology of leishmaniasis with reports of it emerging in the non-endemic country such as Austria. The development highlights the impact of climate and habitat changes, urbanization, and globalization on the spread of the disease in regions previously unaffected (Bora and Nath Jha \u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e2019\u003c/span\u003e, Riebenbauer, Czerny et al. \u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e2024\u003c/span\u003e). There are reports of VL from certain villages of Nepal, where the patients did not have any travel history outside of their village. However, it is likely that traders and visitors might have contributed to the spread of the disease. The reported incidences of infection in new loci of Nepal can be a threat to the elimination effort of Leishmaniasis (Joshi, Banjara et al. 2024). \u003cem\u003eL. donovani\u003c/em\u003e offspring obtained from a cross between two sand fly species has resulted in genetic diversity in HASPB antigen which has major impact on the therapeutic vaccine design (Kumari, Mamud et al. \u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e2023\u003c/span\u003e, Sádlová, Yeo et al. \u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e2024\u003c/span\u003e). The genomic studies of \u003cem\u003eLeishmania\u003c/em\u003e spp. has explored the genetic plasticity and constantly ongoing rearrangement of the genome that leads to variation in copy number, gene clusters or the chromosome (Santi and Murta \u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e2022\u003c/span\u003e). Several studies indicate that the clinical isolates of \u003cem\u003eLeishmania\u003c/em\u003e spp. has shown increased resistance against pentavalent antimony (Santi and Murta \u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e2022\u003c/span\u003e). In Indian samples, it has been observed that high levels of As(III) in the water has led to the selection of Sb-resistant parasites. Further, change in aneuploidy has been triggered by the drug resistance against antimony (Dumetz, Imamura et al. \u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e2017\u003c/span\u003e). The changing epidemiology of leishmaniasis globally and in India is a matter of concern. To understand the genetic constitution of \u003cem\u003eLeishmania\u003c/em\u003e spp., we have retrieved publicly available \u003cem\u003eLeishmania donovani\u003c/em\u003e whole genome sequencing (WGS) data specifically isolated from the Indian subcontinent. We analyzed the genomes of the various strains and reported the variants which can be targeted for the drug development, aiming to reduce the severity of the disease. We tracked and reported the variants located within the functional domains of proteins. The study will help to explore the genome of \u003cem\u003eLeishmania\u003c/em\u003e under different stress, environmental conditions and drug pressure.\u003c/p\u003e"},{"header":"Methodology","content":"\u003cp\u003e\u003cb\u003eRetrieval of WGS data of\u003c/b\u003e \u003cb\u003eLeishmania donovani\u003c/b\u003e \u003cb\u003eand alignment to reference genome\u003c/b\u003e\u003c/p\u003e\u003cp\u003eA total of 34 sample data of the Indian Subcontinent were downloaded through a combination of text-based and advanced search approach from the European Nucleotide Archive (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.ebi.ac.uk/ena/browser/\u003c/span\u003e\u003cspan address=\"https://www.ebi.ac.uk/ena/browser/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e) (Supplementary Table\u0026nbsp;1). The samples sequencing approach was whole genome sequencing and biological source of the samples was late passage, bone marrow and Atypical cutaneous \u003cem\u003eLeishmania donovani\u003c/em\u003e. Fastq files were downloaded and quality check was performed using the FASTQC version: 0.11.9 (Andrews \u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e2010\u003c/span\u003e). FASTQC generates a visual report of the sequencing files based on several analytical modules. These reports were used to select specific cut-offs for riding sub-quality sequence calls. Base calls with a Q score of less than 20 were filtered out using Trimmomatic Version: 0.39 (Bolger, Lohse et al. \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e2014\u003c/span\u003e). \u003cem\u003eLeishmania donovani\u003c/em\u003e reference genome assembly ASM22713v2 (Downing, Imamura et al. \u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e2011\u003c/span\u003e) was utilised and the GenBank (Clark, Karsch-Mizrachi et al. \u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e2016\u003c/span\u003e) assembly GCA_000227135.2 was retrieved from NCBI. The reference genome was indexed using Burrows-Wheeler aligner (bwa) version: 0.7.17-r1188 (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://bio-bwa.sourceforge.net/bwa.shtml\u003c/span\u003e\u003cspan address=\"https://bio-bwa.sourceforge.net/bwa.shtml\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003cspan type=\"Underline\" class=\"Underline\" name=\"Emphasis\"\u003e)\u003c/span\u003e ‘index’ tool while SAMtools Version: 1.10 (using htslib 1.10.2-3ubuntu0.1) (Danecek, Bonfield et al. \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e2021\u003c/span\u003e), PICARD version:2.27.5 (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://github.com/broadinstitute/picard\u003c/span\u003e\u003cspan address=\"https://github.com/broadinstitute/picard\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003cspan type=\"Underline\" class=\"Underline\" name=\"Emphasis\"\u003e)\u003c/span\u003e were used to create a dictionary of the reference sequence. The ‘faidx’ (Shirley, Ma et al. \u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e2015\u003c/span\u003e) tool was used to create a bed file. Reads were aligned against the reference sequence using bwa mem that resulted in a sequence alignment map (SAM) files (Li, Handsaker et al. \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e2009\u003c/span\u003e). SAM files were sorted, duplicate marked, converted into binary alignment map (BAM) files, and read group information added using PICARD. SAMtools were used to get alignment, depth and coverage information, which were employed to discard low depth and low coverage samples from further analysis.\u003c/p\u003e\u003cp\u003e\u003cb\u003eVariant discovery and annotation\u003c/b\u003e\u003c/p\u003e\u003cp\u003eGenome Analysis Toolkit (GATK v4.3.0.0) (Van der Auwera, Carneiro et al. \u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e2013\u003c/span\u003e), a suite of tools was used to call variants following the best practices workflow for germline short variants – Single Nucleotide Polymorphisms (SNPs) and Insertion/Deletions (InDels) – discovery. Variant calling was performed using GATK HaplotypeCaller using default settings. The resultant variant call format (VCFs) were filtered for SNPs, removing Insertion/Deletions. The following thresholds were chosen to filter SNPs following the GATK hard filtering recommendations, QD \u0026lt; 20, QUAL \u0026lt; 30, SOR \u0026lt; 3, FS \u0026lt; 60, MQ \u0026lt; 4, MQRankSum \u0026lt; − 12.5 and ReadPosRankSum \u0026lt; − 8, to remove low quality or false positive variants which may contaminate downstream processing, annotation and analysis. Annotation was performed using ANNOVAR (Wang, Li et al. \u003cspan citationid=\"CR45\" class=\"CitationRef\"\u003e2010\u003c/span\u003e) and snpEff (version:5.1d) (Cingolani, Platts et al. \u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e2012\u003c/span\u003e) separately to obtain common annotated genes.\u003c/p\u003e\u003cp\u003e\u003cb\u003eVariant Analysis\u003c/b\u003e\u003c/p\u003e\u003cp\u003eMultianno files obtained after annotating VCF files using ANNOVAR were imported into R and analyzed using the R package MAFtools from Bioconductor (Mayakonda, Lin et al. \u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e2018\u003c/span\u003e). VCF files annotated with snpEff were converted to Mutation Annotation Format (MAF) using VCF2maf perl script (\u003cem\u003eMskcc/VCF2maf\u003c/em\u003e, n.d.) to extract variant information such as gene id, variant, variant location, variant type and effect, HGVS.p. SNPs and InDels were split into different maf files due to quality control and processing differences. Using the MAFtools R package, a summary of the MAF files was plotted and most mutated genes were discovered and reported using the plotmafsummary and oncoplot functions of the package. MAF files containing InDel information were analyzed using shell commands and various information associated with InDels were extracted. InDel affected genes common to all strains were investigated; top mutated genes with more than 15 InDel events were compared and projected with a heatmap using the pheatmap package.\u003c/p\u003e\u003cp\u003e\u003cb\u003eAneuploidy and copy number analysis\u003c/b\u003e\u003c/p\u003e\u003cp\u003eAneuploidy and Somy profiles were calculated using a read depth based approach. Samtools was used to extract raw read depth information of the aligned bases from the BAM files. Normalised read depths were estimated by dividing raw depth by the mean raw read depth of specific chromosomes. The haploid depth was defined as the median of the normalised read depths and chromosomal somy was estimated by multiplying the haploid depth by two given the assumption that there exist two copies of chromosomes. Sample ploidy was defined as the median of the chromosomal somies calculated in the previous step. A heatmap was prepared to visualise the different chromosomal profiles using the pheatmap package in R. Copy Number Variations (CNV) were investigated with Control-FREEC (Boeva, Popova et al. \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e2012\u003c/span\u003e) using a sliding window approach, window size of 5000 and step size of 250. The following parameters were also included in the config file: breakPointThreshold = 0.8, minExpectedGC = 0.35, maxExpectedGC = 0.65, ploidy = 2, and inputFormat = BAM.\u003c/p\u003e\u003cp\u003e\u003cb\u003eChromosomal Somy and Copy Number Variant estimation\u003c/b\u003e\u003c/p\u003e\u003cp\u003eChromosomal Somy was estimated using read depth statistics of the sequenced reads. Raw depth of a position (i \u003csub\u003eRaw depth)\u003c/sub\u003e and mean read depth (Ch \u003csub\u003eMean depth)\u003c/sub\u003e of a chromosome were used to calculate the normalised read depth of a position (i\u003csub\u003eNorm\u003c/sub\u003e) and the chromosomal Somy (S) was calculated using the following formula: 2 × Median( i \u003csub\u003eNorm\u003c/sub\u003e = i \u003csub\u003eRaw depth\u003c/sub\u003e / Ch \u003csub\u003eMean depth\u003c/sub\u003e ). The range of monosomy, disomy, trisomy, tetrasomy, and pentasomy was then used to define the full cell-normalised chromosome depth or S as S \u0026lt; 1.5, 1.5 ≤ S \u0026lt; 2.5, 2.5 ≤ S \u0026lt; 3.5, 3.5 ≤ S \u0026lt; 4.5, and 4.5 ≤ S \u0026lt; 5.5, respectively, as previously described. (Dumetz F, Imamura H, Sanders M, Seblova V, Myskova J, Pescher P, Vanaerschot M, Meehan CJ, Cuypers B, De Muylder G, Späth GF, Bussotti G, Vermeesch JR, Berriman M, Cotton JA, Volf P, Dujardin JC, Domagalska MA. Modulation of Aneuploidy in \u003cem\u003eLeishmania donovani\u003c/em\u003e during Adaptation to Different \u003cem\u003eIn Vitro\u003c/em\u003e and \u003cem\u003eIn Vivo\u003c/em\u003e Environments and Its Impact on Gene Expression. mBio. 2017 May 23;8(3):e00599-17. doi: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1128/mBio.00599-17\u003c/span\u003e\u003cspan address=\"10.1128/mBio.00599-17\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. PMID: 28536289; PMCID: PMC5442457.\u003c/p\u003e\u003cp\u003e)The results were projected onto a heatmap and dendrograms using the R package ‘pheatmap’. The isolates were clustered into groups based on the euclidean distances between their chromosomal profiles. Chromosomes were also clustered into groups based on the Euclidean distance of chromosomal copy numbers profile of a chromosome across isolates.\u003c/p\u003e\u003cp\u003eCONTROL-FREEC (Boeva, Popova et al. \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e2012\u003c/span\u003e) was employed to investigate the Copy Number Variation (CNV). Control-FreeC parameters were defined as follows:\u003c/p\u003e\u003cp\u003eWindow size = 5,000, Step size = 250, breakPointThreshold = 0.8, minExpectedGC = 0.35, maxExpectedGC = 0.65, ploidy = 2, and inputFormat = BAM.\u003c/p\u003e\u003cp\u003e\u003cb\u003eFunctional domain analysis\u003c/b\u003e\u003c/p\u003e\u003cp\u003eFor functional domain analysis, genes affected by missense mutations which were common to most samples were analyzed. Multiple missense variants were observed in most of the genes but variants which affected the majority of the samples were selected for downstream analysis. 70 genes were annotated using the Uniprot database (The UniProt Consortium, 2017); 19 genes were uncharacterized/ hypothetical and were removed from subsequent analysis. Protein sequence information for 51 remaining genes was retrieved from Uniprot, which included protein name, length, domain(s) and domain location(s). The domain annotations were provided by Interpro (Paysan-Lafosse, Blum et al. \u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e2023\u003c/span\u003e). Only 30 protein sequences were found to have relevant Interpro domain annotations; variants found inside these domains were reported.\u003c/p\u003e"},{"header":"Results","content":"\u003cp\u003e\u003cb\u003eSample details and quality check\u003c/b\u003e\u003c/p\u003e\u003cp\u003eThe organism was isolated from human, biological replicate and laboratory-adapted strains. \u003cem\u003eLeishmania\u003c/em\u003e sp. cultured under different conditions and drug resistance has shown different mutations and genetic makeup (Supplementary Table\u0026nbsp;1). The samples were quality checked, processed, and aligned to the reference genome. The Binary Alignment Map (BAM) files were used to assess the alignment and depth statistics of each sample (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e). 2 samples were discarded from further analysis because of insufficient alignment to the reference genome, due to the incompatibility of the read files with BWA alignment software. The 31 remaining samples were used for further analysis. The samples used in this study have been provided in Supplementary Table\u0026nbsp;2.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003e\u003cb\u003eAneuploidy change pattern in\u003c/b\u003e \u003cb\u003eLeishmania donovani\u003c/b\u003e \u003cb\u003esamples from Indian Sub-continent\u003c/b\u003e\u003c/p\u003e\u003cp\u003eAll the analyzed samples have shown tetrasomic state as reported in the earlier studies. In addition to this, trisomy was observed in chromosomes 2 and 8 in BHU 1220, BHU 575, MHOM/IN/1983/AG83, MHOM/IN/2013, MHOM/IN/2014, K133 samples and tetrasomic states in BPK275/0/cl18 samples. Contradictorily, the Nepalese strain BPK275/0/cl18 did not show either trisomy or tetrasomy in chromosomes 2 and 8, and clustered away from the other Nepalese strains, with the K133 strain isolated from India. A few Indian strains along with one Nepalese strain, BPK275/0/cl18, K133PMM/R, K133AS/R, K133WT, MHOM/IN/2010/T3, MHOM/IN/IN/T8, MHOM/IN/2012/T9, MHOM/IN/2014/A2, MHOM/IN/2013/P2, MHOM/IN/2014/H1, BHU 575, showed the disomic state of chromosome 2 and clustered together separate from the other Indian cluster (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e). It should be noted that the strains are clustering into 4 major clusters on the similarity in their aneuploidy profile. These clusters encompass the two distinctly separate groups of Indian strains, a Nepalese cluster and a mixed group with Nepalese and Indian strains. Here, we observed that the Indian cluster of strains isolated from India BHU 1220, MHOM/IN/1983/AG83.i and ii, West Bengal MHOM/IN/2013/M1, Jharkhand MHOM/IN/2014/P6, and MHOM/IN/2014/T10, showed expected karyotypes as most chromosomes were disomic barring the trisomic chromosomes 6 and 15, and the tetrasomic chromosome 31. On the other hand, the second Indian cluster comprising strains isolated from Bihar MHOM/IN/2010/T3, MHOM/IN/2010/T7, MHOM/IN/2012/T9, MHOM/IN/2013/P2, West Bengal MHOM/IN/2012/T8, Jharkhand MHOM/IN/2014/H1, MHOM/IN/2014/A2 and BHU 575 strain showed a remarkable difference in aneuploidy profile; chromosome 31 and 8 showed an increase in copy number which reflected in their tetrasomic and trisomic states, chromosomes 2, 5, 9, 11, 13, 14, 16, 20, 22, 23, 26, 35 showed variable aneuploidy, ranging from disomy to trisomy, which was lacking in the previous Indian cluster. The Nepalese strains grouped together and surprisingly were closer to an Indian cluster due to similarity in aneuploidy profiles, except the chromosome 2 which distinctly showed tetrasomic condition in Nepalese strains BPK275/0/cl18. Finally, the remaining cluster, was a mixed group of one Nepalese strain and three Indian strains. This may have been due to the similarity in tetrasomic state of chromosome 5 along with trisomy observed in chromosomes 6, 9, 13, 15, 20, 23, 26, and 33. Chromosomes 2, 8 and 31 were also observed to be clustered together, possibly due to their peculiar nature of existing in unusually higher copy number states, mostly trisomic and/or tetrasomic with respect to the other chromosomes. The chromosomes 5, 9, 13, 14, 16, 20, 23, 26, 33, and 35 were grouped together, possibly because of their tendency to exhibit variable aneuploidies ranging from disomy to tetrasomy. Chromosomes 6 and 15 show occasional trisomy and hence grouped together. Rest of the chromosomes showed normal copy numbers with minimal deviation from disomy. The overall heatmap of the aneuploidy change in the samples from Indian subcontinent has shown trisomy or tetrasomy in the whole or partially in chromosome 2 and 31. This observation is supported by earlier study that has shown chromosome 31 is superanumerary and tetrasomic (Pilling, Reis-Cunha et al. \u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e2023\u003c/span\u003e). The aneuploidy changes of \u003cem\u003eLeishmania\u003c/em\u003e under the effect of paromomycin can be observed in the samples with BPK* id, most of the samples have shown trisomy in Chromosome 2. WGS samples of late passage has shown tetrasomy in chromosome 31 and trisomy in chromosome 5 and 6. The present analysis show a very high genetic modification in \u003cem\u003eLeishmania donovani\u003c/em\u003e sequenced from patients of Indian sub-continent and drug resistance also alter the aneuploidy of \u003cem\u003eLeishmania\u003c/em\u003e.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003e\u003cb\u003eDistribution of variants among samples\u003c/b\u003e\u003c/p\u003e\u003cp\u003eThe variant analysis has shown that the sample from the Himachal region of India has recorded the highest number of variants in the coding sequences, gene interval and chromosomal interval (Supplementary Figs.\u0026nbsp;1 and 2). Rest of the samples have shown variants in the same range i.e. 1000 to 10000 in number. We observed the genotype distribution of SNPs in different strains; Indian strains MHOM/IN/2010, MHOM/IN/2012, MHOM/IN/2013, and MHOM/IN/2014 showed an average SNP frequency of 1900. These strains were isolated from different regions of India mainly from Bihar, Delhi, Jharkhand, and West Bengal. Strains with unknown states as geographical locations were tagged as \u0026lsquo;Indian\u0026rsquo;. The strain MHOM/IN/2010/T7, isolated from Bihar, had the highest SNP occurrence of 1975 while Jharkhand isolate MHOM/IN/2014/H1 had the lowest SNP occurrence of 1868. Indian strains BHU1220, isolated from Bihar, showed 2089 SNPs, MHOM/IN/1983/AG showed a higher SNP frequency of 2707 and 2719, and BHU575/0/cl2 with 3800 SNPs was the highest. Indian strains K133WT, K133AS-R, and K133PM-R showed abundance with each having 3713, 3281 and 3488 SNPs respectively. We looked at the distribution of genotypes in these samples; 86\u0026ndash;91% of SNPs had Heterozygous Reference Alternate (Het RA) genotype, 9\u0026ndash;13% had Homozygous Alternate (Hom AA) genotype and a miniscule percentage, around 0\u0026ndash;0.002% of SNPs represented the Heterozygous Alternate (Het AA) genotype, which represents two different alternate alleles in place of the reference (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e).\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eThe Nepalese strains BPK275/0/cl18 showed a distinctly higher SNP abundance than Indian strains and an average SNP abundance of 3482 SNPs. BPK275/0/cl18.vii, isolated from Koshi province, showed the lowest SNP abundance of 2962 while BPK275/0/cl18 was the highest \u0026ndash; 4003, among all strains from both India and Nepal. Most of the Nepalese strains were isolated from the Koshi province of Nepal, barring BPK275/0/cl18 of the study SRR15851294 which did not have an associated geographical location. The variant frequency distribution is similar among all the samples except the one obtained from the Himalayan region.\u003c/p\u003e\u003cp\u003eWe also analyzed the Insertion/Deletions occurrence frequency in the samples and reported the different genotypes associated (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e). Quite interestingly, the Indian strains MHOM/IN/2010 T3, T7, MHOM/IN/2012/T8, T9, MHOM/IN/2013/M1, P2, MHOM/IN/2014/A2, H1, P6 and T10 had remarkably high insertion/deletion frequency compared to rest of the samples. The highest occurrence was in P2\u0026ndash;29760, and the lowest \u0026ndash; 18,153 \u0026ndash; in T8. Among the strains, there existed unusually high counts of Het ALT/REF alleles nearing 20,000 to 30,000 in a few strains. The average insertion/deletions variants in these strains were 24,060. While the Het REF/ALT alleles were in abundance, the Hom ALT and Het ALT alleles showed sparse distribution. Allele\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003efrequency ranged from strain MHOM/IN/2014/T10, from Jharkhand, having the most with 314 Hom AA and 16 Het AA alleles and strain MHOM/IN/2014/A2, from West Bengal, having the least with 165 Hom AA and 8 Het AA alleles. Other Indian strains such as the BHU1220, BHU575/0/cl2, MHOM/IN/1983/AG83 and K133 strains showed markedly higher frequency of Hom AA and Het AA while exhibiting negligible Het RA alleles relative to the strains mentioned previously.\u003c/p\u003e\u003cp\u003eThe Nepalese strains BPK275/0/cl18 showed limited InDels relative to the Indian strains, while also exhibiting a higher frequency of Hom ALT and Het ALT alleles. The strain BPK275/0/cl18.vii showed lower amounts of insertion/deletions in comparison to other Nepalese strains. The sample of Himachal state of India has around 1 lakh missense and 40000 InDels.\u003c/p\u003e\u003cp\u003eTo decipher the distribution of different variant types within the strains used in this study, we utilized snpEff for annotation. The method involves identifying and attaching the relevant sequence ontological terms to the variants, as well as predicting their effect types and prospective impacts. Most of the variants appeared to be impacting non-coding genomic regions which included the 5\u0026rsquo;Flank, 3\u0026rsquo;Flank, and IGR. We observed an abundance of Frame_Shift_Del events and Missense_Mutation which impacted the coding portion of the genome. Few variants which impacted the coding regions were associated with in-frame deletion or insertion, missense and silent mutations. A tiny proportion of the variants represented intron, nonsense, nonstop, splice region, splice site and targeted region classes (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e). We looked at the distribution of different variant types across strains. The majority of the variants were in 3\u0026rsquo; Flank or 5\u0026rsquo; Flank across the strains that constituted 75.1% of the total variants. The strains had an average of 7.43% IGR variants. A relatively small share of the variants affected the coding regions of the gene sequences, these were Frame_Shift_Del, Frame_Shift_Ins, and Missense Mutations, contributing to an average of 8.37%, 1.18% and 4.54% respectively (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e).\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eWhen we compared the average variant proportion between Indian and Nepalese strains we found there to be significant differences between Frame_Shift_Del variants, 3\u0026rsquo;Flank and IGR regions.\u003c/p\u003e\u003cp\u003eWe observed that Indian strains MHOM/IN/2010/, MHOM/IN/2012/, MHOM/IN/2013/, and MHOM/IN/2014 exhibited substantial amounts of variants tagged as \u0026lsquo;Frame_Shift_Del\u0026rsquo;, except for the BHU 1220, BHU575/0/cl2, MHOM/IN/1983/AG83 and K133 strains. This might be due to the presence of a higher number of insertion/deletion events, potentially leading to an increased representation of these variants in the samples. This was supported by the higher frequency of these variants relative to other strains in the population. It ranged from 4364 in MHOM/IN/2013/M1 and 1497 in MHOM/IN/2014/H1. The proportion also remained high at 27 to 18% of the total variants in these samples. Frame_Shift_Del events in other Indian strains remained lower; only occurring as less than 1% of the total variants in the samples. The frequency relative to other Indian strains previously mentioned also remained significantly low i.e. 30, 41, 44, 53 respectively. This might be due to the lesser insertion/deletion events in these strains which could be contributing to remarkably higher frame shift events. Whereas the Nepalese strains exhibited less than 1% of these variants, except strain BPK/275/0/cl18.vii which had 5% of \u0026lsquo;Frame_Shift_Del\u0026rsquo; variants, which could be possible due to relatively lower insertion/deletion events in these strains. Nepalese strains also showed a higher mean share of Missense_Mutations, around 5.59%, than the Indian strains which showed 3.78%. IGR also contributed to 9.74% in Nepalese strains while 5.76% in Indian strains. In the Himalayan region sample, downstream and upstream variants comprises of 36% and 35% of the total variants and 9.2% in the exon region.\u003c/p\u003e\u003cp\u003e\u003cb\u003eSingle nucleotide polymorphisms impact functional domains\u003c/b\u003e\u003c/p\u003e\u003cp\u003eWe used the Mutation Annotation Format (MAF) files generated by the R package MAFtools to investigate the abundance of \u0026lsquo;Missense variants\u0026rsquo; in different isolates. We observed a median of 310 missense variants in the isolates affecting an array of genes, mainly imparting function as proteins involved in biosynthesis pathways, transport, anchorage, and as receptor proteins involved in cellular uptake of essential amino acids, interaction with host immune system and drug resistance. Among the strains from the Indian subcontinent those from Nepal showed a relatively high frequency of missense variations. Although the Indian isolate K133WT exhibited the highest missense mutation frequency, also K133 and BHU575/0/cl2 strains had a higher frequency of mutations which could be due to the exposure from Paromomycin (PMM) drugs (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003e). Fifty genes were found to be affected by missense variants in the entire population of\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eisolates (Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e8\u003c/span\u003e). These included genes of ATP-binding cassette (ABC) family, Amastin-like surface proteins, A2 genes, amino acid permeases, and others. ABC subfamily A, member 2 (ABCA2) and member 9 (ABCA9) genes showed the most mutations, 503 and 326 respectively. We observed an increase in copy number of ABC subfamily A, member 9 protein (LDBPK_270840) in all the isolates. ABCA9 protein has two \u0026lsquo;ABC transporter domains\u0026rsquo; from amino acids (aa) 760\u0026ndash;993 and 1547\u0026ndash;1777; while none of the domains were affected by variants, several missense variants of \u0026lsquo;Moderate Impact\u0026rsquo; were observed in the gene sequence. It has been reported that \u003cem\u003eLeishmania\u003c/em\u003e modulates aneuploidy in response to drug pressure, as ABC family has been shown to be present in higher copy number. ABC family of proteins are membrane transporter proteins which have been observed to be involved in drug resistance against antileishmaniasis through efflux (Ouellette, L\u0026eacute;gar\u0026eacute; et al. \u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e1998\u003c/span\u003e). Three amastin genes that exhibit high mutation count; Amastin-like surface protein (LDBPK_341150), Amastin surface glycofamily protein (LDBPK_044340), and Amastin surface glycofamily protein (LDBPK_140500) were among that constituted high number of mutations. It has been proposed that Amastin-like surface glycoproteins play a crucial role in macrophage-parasite interaction and parasite virulence. As reported by in a study wherein they knocked down the δ-Amastin expression in \u003cem\u003eL. braziliensis\u003c/em\u003e using RNA-interference and observed decreased intracellular parasites in amastigotes \u003cem\u003ein-vitro\u003c/em\u003e and \u003cem\u003ein-vivo\u003c/em\u003e. We observed 2 missense mutations and 3 silent mutations in LDBPK_341150, 2 missense and 1 silent mutation in LDBPK_044340 and 1 missense mutation in LDBPK_140500. The 3\u0026rsquo;a2rel (LDBPK_220560) and 5\u0026rsquo;a2rel (LDBPK_220660) related proteins of the A2-A2rel cluster showed an increased missense variant count, the A2rel genes have been predicted to be involved in the visceralization of the leishmaniasis infection, as reported in a study on BALB/c mice by infection of L. major Friedlin V9 strain which was transfected with A2 expressing plasmids .The 3\u0026rsquo;a2rel sequence showed 7 moderate impact missense variants, 2 silent mutations and the 5\u0026rsquo;a2rel sequence showed missense, a modifier 5\u0026rsquo; Flank and silent mutations. Several surface protein encoding genes showed high frequency of missense mutations (Fig.\u0026nbsp;\u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e9\u003c/span\u003e) including the Phosphoglycan-ꞵ-1,3-Galactosyltransferase, partial protein (LDBPK_020150), Mannosyltransferase-like protein (LDBPK_311920) and Amastin-like surface protein (LDBPK_341150). Several studies have implicated the Lipophosphoglycan (LPG), which are glycolipid surface proteins, which play a key role in the propagation of the parasite, through the insect vector, by facilitating the interaction of promastigotes with midgut epithelial lining of sandfly. (Ilg T., Sacks D et al.) LPGs are involved in parasite defence against the host innate immune system by avoiding the formation of membrane attack complexes. These glycoconjugates have also been shown to have a role in promoting cellular uptake of Leishmania parasites into the host macrophages, through the complement system, where they take refuge and multiply inside the acidic phagolysosomes. (SM Puentes et al., PJ Green et al.) In response to the selective pressures by the host environment and immune system Leishmania parasites alter surface physiology by altering LPG side chains which contribute to enhanced survivability and pathogenicity. (Roditi I et al.) The geographical distribution of sandfly influences species-specific LPG alterations and contribute to the successful transmission of parasite from sandfly vector to vertebrate host. (Ilg T.) Differential host-species and host-strain interactions could be contributing to the observed mutational frequency in the Galactosyltransferase gene. Mannosyltransferases are a type of galactosyltransferase which act on mannose and produces Mannogen, a unique carbohydrate material soluble in the cytoplasm, a homopolymer of ꞵ-1,2-linked mannose, 4\u0026ndash;40 residues long. (Ralton et al., Sernee et al.) Mannogen is constitutively produced by seven Mannosyltransferases/phosphorylases (MTPs). Infective stage amastigotes have accumulated Mannogen, as it has been reported that defective Mannogen biosynthesis can lead to loss of virulence and attenuated growth in macrophages. (Sernee, M. Fleur et al.) There exist \u0026#120572;-mannosyltransferases that catalyse the synthesis of N-linked glycans and glycosylphosphatidylinositol (GPI) glycolipids which promotes anchoring of surface proteins, including LPGs, by GPI. (Hilley et al.) Among the multi-hit top mutated genes were Amino acid permease, putative (LDBPK_270530), Tryparedoxin peroxidase (LDBPK_151140), cAMP specific phosphodiesterases (LDBPK_151450). Arginine is an essential amino acid for Leishmania hence it needs to be sourced from exogenous sources. Arginine uptake and homeostasis is maintained by transmembrane proteins such as amino acid permeases which transport Arginine on depletion in the parasite. (Opperdoes FR et al., Darlyuk, Ilona et al.) Tryparedoxin peroxidase enzyme, which is upregulated in the presence of Reactive Oxygen Species (ROOs), helps mitigate cellular toxicity by eliminating toxic agents responsible for Ca2\u0026thinsp;+\u0026thinsp;influx, also an increase in Tryparedoxin peroxidases leads to lesser responsiveness to anti-leishmanial Antimony. (Iyer JP et al.) Various membrane and surface proteins have been implicated in the pathogenicity of Leishmania which also showed high frequency of missense variations, such as the ATP-binding cassette protein subfamily A, member 2, putative (LDBPK_111210) and member 9, putative (LDBPK_111210), 3\u0026rsquo;a2rel-related protein (LDBPK_220560). The A2 gene family is essential for visceralization of the disease as it has been observed to be absent in Leishmania major which causes Mucocutaneous Leishmaniasis (MCL) but is present in L. donovani which results in the visceral manifestation of the disease. (Zhang et al.). Next, we investigated if any of the mutations affected the functional domains of protein products of the gene sequences. We found 20 genes which had a reported functional domain that happened to be impacted by the missense mutations. The findings are reported in the Table\u0026nbsp;1.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003e\u003cb\u003eIndian and Nepalese\u003c/b\u003e \u003cb\u003eL. donovani\u003c/b\u003e \u003cb\u003eisolates have distinct mutational profiles\u003c/b\u003e\u003c/p\u003e\u003cp\u003eWe checked for genes which were affected by indels and found 370 genes common to all isolates. Our analysis revealed differential mutation profiles for each gene in different isolates. A cluster analysis of isolates was performed based on Indel mutation profiles and the results are shown in the below. (Fig.\u0026nbsp;\u003cspan refid=\"Fig10\" class=\"InternalRef\"\u003e10\u003c/span\u003e)\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eWe observed the clustering of \u003cem\u003eLeishmania donovani\u003c/em\u003e strains into three main groups, one of the clusters showed exclusively Indian isolates, cluster A, while the other two showed an intermixing of both Indian and Nepalese isolates, cluster B and C.\u003c/p\u003e\u003cp\u003eCluster A comprised the strains MHOM/IN/2010/T3, MHOM/IN/2010/T7, MHOM/IN/2012/T8, MHOM/IN/2012/T9, MHOM/IN/2013/P2, MHOM/IN/2013/P2/M1, MHOM/IN/2014/A2, MHOM/IN/2014/H1, MHOM/IN/2014/P6, and MHOM/IN/2014/T10. These strains showed increased mutations in Bridge-like lipid transfer protein family member 1 N-terminal domaining containing protein (LDBPK_160180), Mannosyltransferase like-protein (LDBPK_311920), Calpain-like cysteine peptidase, putative (LDBPK_270510, ATP-binding cassette protein subfamily A, member 2, putative (LDBPK_111210), and uncharacterized proteins (LDBPK_333030-LDBPK_333040). Strains M1, P2 and T10 isolated from West Bengal, Bihar and Jharkhand grouped together since they had frequency of variants in the above-mentioned genes.\u003c/p\u003e\u003cp\u003eCluster B comprised Indian and Nepalese isolates; BPK275/0/cl18.i, BPK275/0/cl18.ii, BPK275/0/cl18.iii, BPK275/0/cl18.vi, BPK275/0/cl18.viii, BPK275/0/cl18.ix,BPK275/0/cl18.xi, MHOM/IN/1983/AG83.i, MHOM/IN/1983/AG83.ii, and K133AS-R, K133PMM-R and K133WT. This cluster exhibited higher mutations in Uncharacterized protein (LDBPK_310470), uncharacterized proteins (LDBPK_333030-LDBPK_333040), Calpain-like cysteine peptidase, putative (LDBPK_270500-LDBPK_270510), Phosphoglycan beta 1,3 galactosyltransferase (LDBPK_020150), Amastin-like surface protein, putative (LDBPK_341690), Amino acid permease, putative (LDBPK_272680), Amastin-like surface protein, putative (LDBPK_341720), Amastin surface glycofamily protein-CHR_END (LDBPK_044360-CHR_END), more than 30 variants were seen in these genes. A few more genes were observed that contained mutations, not as many as the above mentioned, these were; pseudo (LDBPK_310470), Uncharacterized protein (LDBPK_120661-LDBPK_120667), ATP-binding cassette protein subfamily A, member 2, putative (LDBPK_111210), Uncharacterized protein (LDBPK_093020), Tyrosine phosphatase family protein (LDBPK_321070), Amastin surface glycofamily protein (LDBPK_044340), and Uncharacterized protein (LDBPK_333030-LDBPK_333040).\u003c/p\u003e\u003cp\u003eCluster C comprised Indian and Nepalese isolates; BHU 1220, BHU 575/0/cl2, BPK275/0/cl18.iv, BPK275/0/cl18.v, BPK275/0/cl18.vi, BPK275/0/cl18.x, BPK275/0/cl18.xii, and BPK275/0/cl18.xiii. Strains showed similar mutation profiles to the strains in cluster B but distinct to the cluster A strains. Isolates showed a peculiar abundance in mutation in the Amastin surface glycoprotein (LDBPK_044360-CHR_END) gene and the Amastin-like surface protein, putative (LDBPK_341720).\u003c/p\u003e\u003cp\u003e\u003cb\u003eIndels aggregate in specific regions of the\u003c/b\u003e \u003cb\u003eL. donovani\u003c/b\u003e \u003cb\u003egenome\u003c/b\u003e\u003c/p\u003e\u003cp\u003eHere we provide a comprehensive report on the Indel variation in the population of Nepalese and Indian isolates (Table\u0026nbsp;2). 91 indels were seen to be occurring in all the samples, with different variant classifications, in noncoding and coding regions of the genome. Only 7.6% i.e. 7 indels affected the coding regions. These variants were classified as either \u0026lsquo;In_Frame_Ins\u0026rsquo; or \u0026lsquo;In_Frame_Del\u0026rsquo;, meaning they existed inside the reading frame of the genes and were size multiples of three, thereby preserving the reading frame. Whereas 92%, i.e. 84 indels were in noncoding regions and mostly were size multiples of two. Indels in coding regions may have a severe impact on the structure and function of gene products and hence are negatively selected, which explains the lower frequency of reading frame disrupting indels. Non-coding indels may be better tolerated due to more lenient selection pressure. It has been proposed that non-coding variants may play a role in influencing the gene expression of upstream and downstream genes. 91 variants were spread over coding and non-coding regions of 54 genes. These variants were observed to be occurring in and around specific nucleotide positions of gene sequences. The ATP-binding cassette protein, family A, member 2 (ABCA2) showed 6 frameshift indels, 585_586insTA, 567_568insCTTGC, 557_558insA, 580_581delCA, 572delA, and 559_563delGGTGT, all of which push the reading frame of the coding sequence and are concentrated around nucleotide (nt) 550 to 590. Hypothetical predicted transmembrane protein (LDBPK_240440) had three in-frame indels and two frameshift indels concentrated at nt 1270 to 1290. IGR between two Amastin-like proteins, LDBPK_080760 and LDBPK_080780 had four indels, which could possibly be involved in influencing gene expressions. Indels were seen in the 5\u0026rsquo; Flank region of the Histone H2 protein. 3',5'-cyclic nucleotide phosphodiesterase, putative indels were at 5\u0026rsquo;Flank upstream of the gene sequence. Sodium stibogluconate resistance protein, putative had indels which resulted in reading frame shift which were predicted to be of high impact. This trend continued in the rest of the genes, thereby indicating that indels tend to arise in specific regions of the gene sequences most prone to alterations.\u003c/p\u003e\u003cp\u003e\u003cb\u003eCopy number alterations\u003c/b\u003e\u003c/p\u003e\u003cp\u003eWe observed that all of the chromosomes were predicted to have local copy number alterations in them. Chromosomes 36 through 26, 24, 22, 21, 16, 12, 7 and 2 were observed to have developed variations in copy number in all 31 samples. While chromosome 1, 3 and 4 were predicted to have local copy number variations in 5, 14 and 17 samples respectively. We checked genes that were associated with loss in copy number and the frequency of samples they occurred in. We observed that most of the genes showing a loss in copy numbers, that occurred in \u0026ge;\u0026thinsp;30 samples, existed on chromosomes 2, 3, 7, 8, 9, 16, 22, 26 through 36. We checked the same for genes showing a gain in copy number and observed that the same chromosomes, which showed increased loss in copy number alterations, also had increased gain in copy number events. We found 393 genes with a gain in copy numbers; the highest number of CNV alterations were predicted in the chromosome 8 and 31 which may be because of their increased chromosomal copy number.\u003c/p\u003e"},{"header":"Discussion","content":"\u003cp\u003eIn this study, WGS data of the Indian sub-continent has been analyzed to investigate the SNP, CNV and ploidy distribution in this region. There are few reports on Indian samples that deals with the genome-wide analysis of paromomycin (PMM) resistant \u003cem\u003eL. donovani\u003c/em\u003e and genetic characterization of \u003cem\u003eL. donovani\u003c/em\u003e variant from Himachal Pradesh (Ghosh, Kumar et al. \u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e2022\u003c/span\u003e, Lypaczewski, Thakur et al. \u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e2022\u003c/span\u003e). The aneuploidy analysis has shown that the samples BPK* that belong to Nepal has shown different aneuploidy patterns as compared to other samples. The possible reason is that \u003cem\u003eL. donovani\u003c/em\u003e samples considered for analysis are PMM resistance that act as stimulus for aneuploidy modification. K133WT (Delhi, India) sample that has maximum number of variants has shown triploidy at chromosome 5. BHU.1220 sample collected during 1992 shows tetraploidy at the chromosome 31 only. Therefore, the analysis clearly indicates that \u003cem\u003eLeishmania\u003c/em\u003e has evolved over the time i.e. from 1992 to 2019 and there has been change in the ploidy in response to drug resistance and other stress conditions. The samples from Bihar, Jharkhand and West Bengal have shown triploidy in chromosome 31, however two samples have shown triploidy. It is evident that \u003cem\u003eL. donovani\u003c/em\u003e has different genetic constitution in Nepal area than Indian states (Fig.\u0026nbsp;\u003cspan refid=\"Fig10\" class=\"InternalRef\"\u003e10\u003c/span\u003e). Drug pressure in K133* sample and BPK* sample may lead to genetic drift in the parasite population that lead to change in ploidy. (Fig.\u0026nbsp;\u003cspan refid=\"Fig11\" class=\"InternalRef\"\u003e11\u003c/span\u003e)\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003e\u003cem\u003eL. donovani\u003c/em\u003e sample from Himachal has shown highest number of variants and this is attributed to its genetic hybridization between \u003cem\u003eL. donovani\u003c/em\u003e parents from \u0026lsquo;Yeti\u0026rsquo; clade reported from Nepal highlands. The SNPs that were present in either samples were removed and the novel SNPs present in this sample has been discussed in the following text. The details of the variants of the sample are given in Supplementary Table\u0026nbsp;3. It was observed that the following proteins were most mutated in the sample of Himachal namely, LdBPK_111210 encoding an ABC protein subfamilyA, LDBPK_060660 protein kinase and putative LdBPK_171090.1, LDBPK_281830, LDBPK_361500, LDBPK_091050, LDBPK_161130, LDBPK_262120, LDBPK_251150, LDBPK_191110 genes encode for hypothetical proteins (Supplementary Fig.\u0026nbsp;3). It is evident that the proteins which have shown more mutations are hypothetical proteins and need to be characterized. However, this sample follows the same pattern as other samples in case of higher mutation rate in transporter protein. Since this sample organism has been reported as an intraspecies hybrid two distinct clades of \u003cem\u003eL. donovani\u003c/em\u003e from Ethiopia can be the reason that there are numerous novel mutations (Lypaczewski, Thakur et al. \u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e2022\u003c/span\u003e). However in all samples of Indian sub-continent transporter protein, amastin like proteins, protein involved in drug resistance, heat shock proteins and hypothetical proteins have shown maximum variants. The details of the variants present in all samples are given in Supplementary Table\u0026nbsp;4 and the name of protein encoded by corresponding gene is given in Supplementary Table\u0026nbsp;5. There is genetic plasticity in \u003cem\u003eLeishmania\u003c/em\u003e genome, CNV affects either some genomic region or whole chromosomes as observed for chromosome 8 and 31 that leads to aneuploidy. In recent studies, only chromosome 31 has shown high number of CNVs, however the present analysis shows high number of CNVs in chromosome 8 and 31 (Ghosh, Kumar et al. \u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e2022\u003c/span\u003e, Lypaczewski, Thakur et al. \u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e2022\u003c/span\u003e). This analysis of Indian sub-continent data is providing a wholesome image of CNV which is different from ealier obsevations. The proteins which have shown maximum missense mutations are transporter protein, proteins involved in carbohydrate metabolism, heat shock protein, amastin proteins, sodium stibogluconate resistant protein and hypothetical proteins. While in the case sample from Delhi, India, mutations were observed in transcription, translation (60S ribosomal protein L22, 60S ribosomal protein L40, RNA binding protein, eukaryotic translation initiation factor 4E-1 and 4E-2), virulence factor (protein-tyrosine phosphatase), and signaling molecule (protein kinase and protein-tyrosine phosphatase) (Ghosh, Kumar et al. \u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e2022\u003c/span\u003e). 42 ABC genes namely ABCA to ABCH has been reported. In previous studies, membrane/transporter proteins like ABCG1, ABCG2, ABCG3, ABCA2, ABCA7, and ABCA8 has shown missense mutations (Ghosh, Verma et al. \u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e2020\u003c/span\u003e). However, in the present study, ABC subfamily A, member 2 (ABCA2) and member 9 (ABCA9) has shown maximum missense mutations which are primarily involved in drug efflux and responsible for multi-drug resistance. Two transmembrane glycorotein i.e. Amsatin-like surface protein and amastin surface glycofamily protein has shown maximum mutations after transporter proteins. As reported in earlier studies, amastin-like protein is reported to be an important precursor for the phenotype resistance development against antimonial and paramomycin (Ghosh, Kumar et al. \u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e2022\u003c/span\u003e). Mannosyltransferase protein has shown high number of missense mutations. The overexpression of Mannosyltransferase protein in promastigote and amastigote forms of \u003cem\u003eL. braziliensis\u003c/em\u003e protects from trivalent antimony (Ribeiro, Rocha et al. \u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e2019\u003c/span\u003e). Heat shock protein 70-related protein 1, putative, mitochondrial has high missense mutations and in the earlier studies, other types of HSP proteins has shown mutations or altered expression and are also linked to miltefosine resistance (Vacchina, Norris-Mullins et al. \u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e2016\u003c/span\u003e, Alcolea, Alonso et al. \u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e2023\u003c/span\u003e). Leishmaniasis treatment failure to sodium stibogluconate has been described in some studies which are related to expression of γ-Glumatylcysteine synthase, however there is no record of the missense mutations of sodium stibogluconate resistant protein putative. It was observed that sodium stiboglocunate resistant protein putative is among the top five mutated protein in the samples of Indian subcontinent. The mutated proteins observed from WGS data of \u003cem\u003eL. donovani\u003c/em\u003e from ISC may act as initial step for further validation of these proteins in the virulence and understanding the drug resistance mechanism of these proteins.\u003c/p\u003e"},{"header":"CONCLUSION","content":"\u003cp\u003eOwing to the changing climate conditions, Indian sub-continent geographical area is more prone to these changes. The analysis of the clinical samples of \u003cem\u003eL. donovani\u003c/em\u003e from Indian sub-continent has provided novel insights into the genetic makeup of the protozoan present here. The overall trend of the mutations in the protein of \u003cem\u003eL. donovani\u003c/em\u003e from Indian sub-continent shows that there is increasing mutations in the proteins which are involved in the virulence and drug-resistant mechanism. The membrane proteins has shown most of the variants as they throw out the drugs from the pathogen by efflux mechanism. CNV analysis has shown different pattern of aneuploidy in the \u003cem\u003eL. donovani\u003c/em\u003e from Indian sub-continent when compared to the earlier studies from other part of the world. Therefore, the proteins and variants obtained can be the founding steps to develop strategic drugs for the treatment of Leishmaniasis.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003ch2\u003eDeclaration of competing interest\u003c/h2\u003e\u003cp\u003eThe authors declare that they have no conflict of interest.\u003c/p\u003e\u003c/p\u003e\u003ch2\u003eAuthor Contribution\u003c/h2\u003e\u003cp\u003eAV has curated the data, carried out the WGS analysis, and written the first draft of the paper. IK: Streamlined the analysis workflow, written the results and discussion. ANJ: has checked, edited, and given suggestions in the manuscript.\u003c/p\u003e\u003ch2\u003eAcknowledgement\u003c/h2\u003e\u003cp\u003eANJ gratefully acknowledges the ICMR project for providing computing facilities.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eAlcolea, P. J., A. Alonso, F. Garc\u0026iacute;a-Tabares, J. Larraga, L. T. Martins, F. J. Loayza, S. Ruiz-Garc\u0026iacute;a and V. Larraga (2023). \u0026quot;An insight into differential protein abundance throughout Leishmania donovani promastigote growth and differentiation.\u0026quot; \u003cu\u003eInternational Microbiology\u003c/u\u003e\u003cstrong\u003e26\u003c/strong\u003e(1): 25-42.\u003c/li\u003e\n\u003cli\u003eAlvarenga, D. G. d., P. M. F. Escalda, A. S. V. d. Costa and M. T. F. D. Monreal (2010). \u0026quot;Leishmaniose visceral: estudo retrospectivo de fatores associados \u0026agrave; letalidade.\u0026quot; \u003cu\u003eRevista da Sociedade Brasileira de Medicina Tropical\u003c/u\u003e\u003cstrong\u003e43\u003c/strong\u003e: 194-197.\u003c/li\u003e\n\u003cli\u003eAndrews, S. (2010). FastQC: a quality control tool for high throughput sequence data, Babraham Bioinformatics, Babraham Institute, Cambridge, United Kingdom.\u003c/li\u003e\n\u003cli\u003eAwasthi, A., R. K. Mathur and B. Saha (2004). \u0026quot;Immune response to Leishmania infection.\u0026quot; \u003cu\u003eIndian Journal of Medical Research\u003c/u\u003e\u003cstrong\u003e119\u003c/strong\u003e: 238-258.\u003c/li\u003e\n\u003cli\u003eBoeva, V., T. Popova, K. Bleakley, P. Chiche, J. Cappo, G. Schleiermacher, I. Janoueix-Lerosey, O. Delattre and E. Barillot (2012). \u0026quot;Control-FREEC: a tool for assessing copy number and allelic content using next-generation sequencing data.\u0026quot; \u003cu\u003eBioinformatics\u003c/u\u003e\u003cstrong\u003e28\u003c/strong\u003e(3): 423-425.\u003c/li\u003e\n\u003cli\u003eBolger, A. M., M. Lohse and B. Usadel (2014). \u0026quot;Trimmomatic: a flexible trimmer for Illumina sequence data.\u0026quot; \u003cu\u003eBioinformatics\u003c/u\u003e\u003cstrong\u003e30\u003c/strong\u003e(15): 2114-2120.\u003c/li\u003e\n\u003cli\u003eBora, N. and A. N. Jha (2020). \u0026quot;In silico metabolic pathway analysis identifying target against leishmaniasis\u0026ndash;a kinetic modeling approach.\u0026quot; \u003cu\u003eFrontiers in Genetics\u003c/u\u003e\u003cstrong\u003e11\u003c/strong\u003e: 179.\u003c/li\u003e\n\u003cli\u003eBora, N. and A. Nath Jha (2019). \u0026quot;An integrative approach using systems biology, mutational analysis with molecular dynamics simulation to challenge the functionality of a target protein.\u0026quot; \u003cu\u003eChemical biology \u0026amp; drug design\u003c/u\u003e\u003cstrong\u003e93\u003c/strong\u003e(6): 1050-1060.\u003c/li\u003e\n\u003cli\u003eCingolani, P., A. Platts, L. L. Wang, M. Coon, T. Nguyen, L. Wang, S. J. Land, X. Lu and D. M. Ruden (2012). \u0026quot;A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3.\u0026quot; \u003cu\u003efly\u003c/u\u003e\u003cstrong\u003e6\u003c/strong\u003e(2): 80-92.\u003c/li\u003e\n\u003cli\u003eClark, K., I. Karsch-Mizrachi, D. J. Lipman, J. Ostell and E. W. Sayers (2016). \u0026quot;GenBank.\u0026quot; \u003cu\u003eNucleic acids research \u003c/u\u003e\u003cstrong\u003e44\u003c/strong\u003e(D1): D67-D72.\u003c/li\u003e\n\u003cli\u003eDanecek, P., J. K. Bonfield, J. Liddle, J. Marshall, V. Ohan, M. O. Pollard, A. Whitwham, T. Keane, S. A. McCarthy and R. M. Davies (2021). \u0026quot;Twelve years of SAMtools and BCFtools.\u0026quot; \u003cu\u003eGigascience \u003c/u\u003e\u003cstrong\u003e10\u003c/strong\u003e(2): giab008.\u003c/li\u003e\n\u003cli\u003ede Paiva, R. M. C., V. Grazielle-Silva, M. S. Cardoso, B. N. Nakagaki, R. P. Mendonca-Neto, A. M. C. Canavaci, N. Souza Melo, P. M. Martinelli, A. P. Fernandes and W. D. daRocha (2015). \u0026quot;Amastin knockdown in Leishmania braziliensis affects parasite-macrophage interaction and results in impaired viability of intracellular amastigotes.\u0026quot; \u003cu\u003ePLoS pathogens \u003c/u\u003e\u003cstrong\u003e11\u003c/strong\u003e(12): e1005296.\u003c/li\u003e\n\u003cli\u003eDowning, T., H. Imamura, S. Decuypere, T. G. Clark, G. H. Coombs, J. A. Cotton, J. D. Hilley, S. de Doncker, I. Maes and J. C. Mottram (2011). \u0026quot;Whole genome sequencing of multiple Leishmania donovani clinical isolates provides insights into population structure and mechanisms of drug resistance.\u0026quot; \u003cu\u003eGenome research \u003c/u\u003e\u003cstrong\u003e21\u003c/strong\u003e(12): 2143-2156.\u003c/li\u003e\n\u003cli\u003eDumetz, F., H. Imamura, M. Sanders, V. Seblova, J. Myskova, P. Pescher, M. Vanaerschot, C. J. Meehan, B. Cuypers and G. De Muylder (2017). \u0026quot;Modulation of aneuploidy in Leishmania donovani during adaptation to different in vitro and in vivo environments and its impact on gene expression.\u0026quot; \u003cu\u003eMBio\u003c/u\u003e\u003cstrong\u003e8\u003c/strong\u003e(3): 10.1128/mbio. 00599-00517. \u003c/li\u003e\n\u003cli\u003eGhosh, S., V. Kumar, A. Verma, T. Sharma, D. Pradhan, A. Selvapandiyan, P. Salotra and R. Singh (2022). \u0026quot;Genome-wide analysis reveals allelic variation and chromosome copy number variation in paromomycin-resistant Leishmania donovani.\u0026quot; \u003cu\u003eParasitology Research \u003c/u\u003e\u003cstrong\u003e121\u003c/strong\u003e(11): 3121-3132.\u003c/li\u003e\n\u003cli\u003eGhosh, S., A. Verma, V. Kumar, D. Pradhan, A. Selvapandiyan, P. Salotra and R. Singh (2020). \u0026quot;Genomic and transcriptomic analysis for identification of genes and interlinked pathways mediating artemisinin resistance in Leishmania donovani.\u0026quot; \u003cu\u003eGenes\u003c/u\u003e\u003cstrong\u003e11\u003c/strong\u003e(11): 1362.\u003c/li\u003e\n\u003cli\u003eJoshi, A. B., M. R. Banjara, M. L. Das, P. Ghale, K. R. Pant, U. R. Pyakurel, G. Dahal, K. P. Paudel, C. L. Das, A. J. T. A. J. o. T. M. Kroeger and Hygiene (2024). \u0026quot;Epidemiological, Serological, and Entomological Investigation of New Visceral Leishmaniasis Foci in Nepal.\u0026quot; \u003cu\u003eThe American Journal of Tropical Medicine and Hygiene\u003c/u\u003e\u003cstrong\u003e110\u003c/strong\u003e(1): 44.\u003c/li\u003e\n\u003cli\u003eKumari, I., D. Lakhanpal, S. Swargam and A. Nath Jha (2023). \u0026quot;Leishmaniasis: Omics approaches to understand its biology from molecule to cell level.\u0026quot; \u003cu\u003eCurrent Protein and Peptide Science\u003c/u\u003e\u003cstrong\u003e24\u003c/strong\u003e(3): 229-239.\u003c/li\u003e\n\u003cli\u003eKumari, P., A. Mamud and A. N. Jha (2023). \u0026quot;Review on the Drug Intolerance and Vaccine Development for the Leishmaniasis.\u0026quot; \u003cu\u003eCurrent Drug Targets\u003c/u\u003e\u003cstrong\u003e24\u003c/strong\u003e(13): 1023-1031.\u003c/li\u003e\n\u003cli\u003eLi, H., B. Handsaker, A. Wysoker, T. Fennell, J. Ruan, N. Homer, G. Marth, G. Abecasis, R. Durbin and G. P. D. P. Subgroup (2009). \u0026quot;The sequence alignment/map format and SAMtools.\u0026quot; \u003cu\u003ebioinformatics \u003c/u\u003e\u003cstrong\u003e25\u003c/strong\u003e(16): 2078-2079.\u003c/li\u003e\n\u003cli\u003eLira, R., S. Sundar, A. Makharia, R. Kenney, A. Gam, E. Saraiva and D. Sacks (1999). \u0026quot;Evidence that the high incidence of treatment failures in Indian kala-azar is due to the emergence of antimony-resistant strains of Leishmania donovani.\u0026quot; \u003cu\u003eThe Journal of infectious diseases \u003c/u\u003e\u003cstrong\u003e180\u003c/strong\u003e(2): 564-567.\u003c/li\u003e\n\u003cli\u003eLypaczewski, P., L. Thakur, A. Jain, S. Kumari, K. Paulini, G. Matlashewski and M. Jain (2022). \u0026quot;An intraspecies Leishmania donovani hybrid from the Indian subcontinent is associated with an atypical phenotype of cutaneous disease.\u0026quot; \u003cu\u003eIscience \u003c/u\u003e\u003cstrong\u003e25\u003c/strong\u003e(2).\u003c/li\u003e\n\u003cli\u003eMann, S., K. Frasca, S. Scherrer, A. F. Henao-Mart\u0026iacute;nez, S. Newman, P. Ramanan and J. A. Suarez (2021). \u0026quot;A review of leishmaniasis: current knowledge and future directions.\u0026quot; \u003cu\u003eCurrent tropical medicine reports \u003c/u\u003e\u003cstrong\u003e8\u003c/strong\u003e: 121-132.\u003c/li\u003e\n\u003cli\u003eMart\u0026iacute;nez, F. V., E. T. Guerrero, R. Arenas and M. Q. Cedillo (2011). \u0026quot;Leishmaniasis en M\u0026eacute;xico.\u0026quot; \u003cu\u003eMedicina Cut\u0026aacute;nea Ibero-Latino-Americana \u003c/u\u003e\u003cstrong\u003e39\u003c/strong\u003e(4): 163-183.\u003c/li\u003e\n\u003cli\u003eMayakonda, A., D.-C. Lin, Y. Assenov, C. Plass and H. P. Koeffler (2018). \u0026quot;Maftools: efficient and comprehensive analysis of somatic variants in cancer.\u0026quot; \u003cu\u003eGenome research \u003c/u\u003e\u003cstrong\u003e28\u003c/strong\u003e(11): 1747-1756.\u003c/li\u003e\n\u003cli\u003eOuellette, M., D. L\u0026eacute;gar\u0026eacute;, A. Haimeur, K. Grondin, G. Roy, C. Brochu and B. Papadopoulou (1998). \u0026quot;ABC transporters in Leishmania and their role in drug resistance.\u0026quot; \u003cu\u003eDrug Resistance Updates \u003c/u\u003e\u003cstrong\u003e1\u003c/strong\u003e(1): 43-48.\u003c/li\u003e\n\u003cli\u003ePatil, R. R. and P. K. Chatterjee (2023). \u0026quot;Epidemiology of Visceral Leishmaniasis in India.\u0026quot;\u003c/li\u003e\n\u003cli\u003ePaysan-Lafosse, T., M. Blum, S. Chuguransky, T. Grego, B. L. Pinto, G. A. Salazar, M. L. Bileschi, P. Bork, A. Bridge and L. Colwell (2023). \u0026quot;InterPro in 2022.\u0026quot; \u003cu\u003eNucleic acids research\u003c/u\u003e \u003cstrong\u003e51\u003c/strong\u003e(D1): D418-D427.\u003c/li\u003e\n\u003cli\u003ePilling, O. A., J. L. Reis-Cunha, C. A. Grace, A. S. Berry, M. W. Mitchell, J. A. Yu, C. R. Malekshahi, E. Krespan, C. K. Go and C. Lombana (2023). \u0026quot;Selective whole-genome amplification reveals population genetics of Leishmania braziliensis directly from patient skin biopsies.\u0026quot; \u003cu\u003ePLoS Pathogens \u003c/u\u003e\u003cstrong\u003e19\u003c/strong\u003e(3): e1011230.\u003c/li\u003e\n\u003cli\u003eReuter, J. A., D. V. Spacek and M. P. Snyder (2015). \u0026quot;High-throughput sequencing technologies.\u0026quot; \u003cu\u003eMolecular cell \u003c/u\u003e\u003cstrong\u003e58\u003c/strong\u003e(4): 586-597.\u003c/li\u003e\n\u003cli\u003eRezaei, Z., A. Tahmasebi and B. Pourabbas (2024). \u0026quot;Using meta-analysis and machine learning to investigate the transcriptional response of immune cells to Leishmania infection.\u0026quot; \u003cu\u003ePLOS Neglected Tropical Diseases \u003c/u\u003e\u003cstrong\u003e18\u003c/strong\u003e(1): e0011892.\u003c/li\u003e\n\u003cli\u003eRibeiro, C. V., B. F. B. Rocha, D. d. S. Moreira, V. Peruhype-Magalh\u0026atilde;es and S. M. F. Murta (2019). \u0026quot;Mannosyltransferase (GPI-14) overexpression protects promastigote and amastigote forms of Leishmania braziliensis against trivalent antimony.\u0026quot; \u003cu\u003eParasites \u0026amp; vectors\u003c/u\u003e\u003cstrong\u003e12\u003c/strong\u003e: 1-7.\u003c/li\u003e\n\u003cli\u003eRiebenbauer, K., S. Czerny, M. Egg, N. Urban, T. Kinaciyan, A. Hampel, L. Fidelsberger, F. Karlhofer, S. Porkert and J. Walochnik (2024). \u0026quot;The changing epidemiology of human leishmaniasis in the non-endemic country of Austria between 2000 to 2021, including a congenital case.\u0026quot; \u003cu\u003ePLOS Neglected Tropical Diseases \u003c/u\u003e\u003cstrong\u003e18\u003c/strong\u003e(1): e0011875.\u003c/li\u003e\n\u003cli\u003eS\u0026aacute;dlov\u0026aacute;, J., M. Yeo, D. S. Mateus, J. Phelan, L. A. Hai, T. Bhattacharyya, S. Kurtev, O. Sebesta, J. Myskova and V. Seblova (2024). \u0026quot;Comparative genomics of Leishmania donovani progeny from genetic crosses in two sand fly species and impact on the diversity of diagnostic and vaccine candidates.\u0026quot; \u003cu\u003ePLOS Neglected Tropical Diseases \u003c/u\u003e\u003cstrong\u003e18\u003c/strong\u003e(1): e0011920.\u003c/li\u003e\n\u003cli\u003eSaha, D., N. J. Borah and A. N. Jha (2023). \u0026quot;Molecular scaffold recognition of drug molecules against essential genes of Leishmania donovani using biocomputing approach.\u0026quot; \u003cu\u003eSouth African Journal of Botany \u003c/u\u003e\u003cstrong\u003e162\u003c/strong\u003e: 52-63.\u003c/li\u003e\n\u003cli\u003eSanti, A. M. M. and S. M. F. Murta (2022). \u0026quot;Impact of genetic diversity and genome plasticity of Leishmania spp. in treatment and the search for novel chemotherapeutic targets.\u0026quot; \u003cu\u003eFrontiers in Cellular and Infection Microbiology\u003c/u\u003e\u003cstrong\u003e12\u003c/strong\u003e: 9.\u003c/li\u003e\n\u003cli\u003eSapp, S. G., R. Low, G. Nine, F. S. Nascimento, Y. Qvarnstrom and J. L. Barratt (2024). \u0026quot;Genetic characterization and description of Leishmania (Leishmania) ellisi sp. nov.: a new human-infecting species from the USA.\u0026quot; \u003cu\u003eParasitology Research \u003c/u\u003e\u003cstrong\u003e123\u003c/strong\u003e(1): 52.\u003c/li\u003e\n\u003cli\u003eSchlein, Y. (1993). \u0026quot;Leishmania and sandflies: interactions in the life cycle and transmission.\u0026quot; \u003cu\u003eParasitology Today \u003c/u\u003e\u003cstrong\u003e9\u003c/strong\u003e(7): 255-258.\u003c/li\u003e\n\u003cli\u003eShirley, M. D., Z. Ma, B. S. Pedersen and S. J. Wheelan (2015). Efficient\u0026quot; pythonic\u0026quot; access to FASTA files using pyfaidx, PeerJ PrePrints.\u003c/li\u003e\n\u003cli\u003eSofi, O. M. U. D., R. Godara, R. Katoch and A. Yadav (2023). \u0026quot;Morphological and molecular evidence of sandfly (Diptera: Psychodidae: Phlebotomine) and its relevance to recent cases of leishmaniasis from Jammu region of North India.\u0026quot; \u003cu\u003eJournal of Vector Borne Diseases \u003c/u\u003e\u003cstrong\u003e60\u003c/strong\u003e(3): 328-332.\u003c/li\u003e\n\u003cli\u003eTorres-Guerrero, E., M. R. Quintanilla-Cedillo, J. Ruiz-Esmenjaud and R. Arenas (2017). \u0026quot;Leishmaniasis: a review.\u0026quot; \u003cu\u003eFResearch \u003c/u\u003e\u003cstrong\u003e6\u003c/strong\u003e.\u003c/li\u003e\n\u003cli\u003eVacchina, P., B. Norris-Mullins, E. Carlson and M. Morales (2016). \u0026quot;A mitochondrial HSP70 (HSPA9B) is linked to miltefosine resistance and stress response in Leishmania donovani.\u0026quot; \u003cu\u003eParasites \u0026amp; vectors \u003c/u\u003e\u003cstrong\u003e9\u003c/strong\u003e: 1-15.\u003c/li\u003e\n\u003cli\u003eVan der Auwera, G. A., M. O. Carneiro, C. Hartl, R. Poplin, G. Del Angel, A. Levy‐Moonshine, T. Jordan, K. Shakir, D. Roazen and J. Thibault (2013). \u0026quot;From FastQ data to high‐confidence variant calls: the genome analysis toolkit best practices pipeline.\u0026quot; \u003cu\u003eCurrent protocols in bioinformatics \u003c/u\u003e\u003cstrong\u003e43\u003c/strong\u003e(1): 11.10. 11-11.10. 33.\u003c/li\u003e\n\u003cli\u003eWang, K., M. Li and H. Hakonarson (2010). \u0026quot;ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data.\u0026quot; \u003cu\u003eNucleic acids research \u003c/u\u003e\u003cstrong\u003e38\u003c/strong\u003e(16): e164-e164.\u003c/li\u003e\n\u003cli\u003eZhang, W. W. and G. Matlashewski (2001). \u0026quot;Characterization of the A2\u0026ndash;A2rel gene cluster in Leishmania donovani: involvement of A2 in visceralization during infection.\u0026quot; \u003cu\u003eMolecular microbiology \u003c/u\u003e\u003cstrong\u003e39\u003c/strong\u003e(4): 935-948.\u003c/li\u003e\n\u003cli\u003eBharadava, K., Upadhyay, T. K., Kaushal, R. S., Ahmad, I., Alraey, Y., Siddiqui, S., \u0026amp; Saeed, M. (2024). Genomic Insight of Leishmania Parasite: In-Depth Review of Drug Resistance Mechanisms and Genetic Mutations. ACS omega, 9(11), 12500-12514.\u003c/li\u003e\n\u003cli\u003eAl Khoury, C., Thoumi, S., Tokajian, S., Sinno, A., Nemer, G., El Beyrouthy, M., \u0026amp; Rahy, K. (2024). ABC transporter inhibition by beauvericin partially overcomes drug resistance in Leishmania tropica. Antimicrobial Agents and Chemotherapy, 68(5), e01368-23.\u003c/li\u003e\n\u003c/ol\u003e"},{"header":"Tables","content":"\u003cp\u003eTables 1 and 2 are available in the Supplementary Files section.\u003c/p\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":true,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"antonie-van-leeuwenhoek","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"anto","sideBox":"Learn more about [Antonie van Leeuwenhoek](https://www.springer.com/journal/10482)","snPcode":"10482","submissionUrl":"https://submission.nature.com/new-submission/10482/3","title":"Antonie van Leeuwenhoek","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"em","reportingPortfolio":"Springer Hybrid","inReviewEnabled":true,"inReviewRevisionsEnabled":false},"keywords":"Whole Genome Sequences, Leishmania, Indian sub-continent, Drug Resistance, Single Nucleotide Polymorphism, Mosaic Aneuploidy","lastPublishedDoi":"10.21203/rs.3.rs-7267613/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-7267613/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eLeishmaniasis is one of the neglected tropical diseases that is endemic to over 90 countries and it\u0026rsquo;s cases are being reported from non-endemic countries as well like Austria. The cases Visceral Leishmaniasis (VL) caused by \u003cem\u003eLeishmania donovani\u003c/em\u003e are concentrated on the Indian Subcontinent. Several studies on the genetic heterogeneity, aneuploidy, and drug resistance emergence reported from Indian subcontinent as well as globally. However, no research has yet inspected the genomic data from the Indian subcontinent to depict a larger microscopic investigation at the genome level. We have considered whole genome sequence (WGS) data from the publicly available database i.e. ENA. Analysis has shown that there is tetraploidy in the chromosome 31, trisomy in chromosome 2 and 8 and trisomy was observed among chromosome 6 and 15 among some samples. This aneuploidy pattern is evolving over time observed in the present study. The pattern of aneuploidy variations of the Indian subcontinent sample differ from from other continents. Further, ATP-binding cassette (ABC) family, Amastin-like surface proteins, A2 genes, amino acid permeases, heat shock 70-related protein 1, mitochondrial precursor and sodium stibogluconate resistance protein were the proteins that exhibited maximum number of mutations among all the analyzed samples. The proteins showing highest number of mutations belong to membrane prtoeins that are involved in drug resistance mechanism. Most of these proteins are involved in the virulence and drug resistant mechanism. The present study provides the possible candidates which can be targeted to disarm the virulence of the protozoan and drug candidates for therapeutic interventions.\u003c/p\u003e","manuscriptTitle":"Insights into Genetic Heterogeneity and Drug Resistance in Leishmania donovani of the Indian sub-continent from genomic data","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-08-11 10:06:35","doi":"10.21203/rs.3.rs-7267613/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"decision","content":"Revision requested","date":"2025-08-22T16:42:33+00:00","index":"","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-08-22T16:09:33+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-08-07T14:38:32+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"248772286517229067201520414179723271903","date":"2025-08-07T13:08:39+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"299056139358416874076492075913957149094","date":"2025-08-06T19:29:25+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2025-08-06T04:13:09+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2025-08-02T01:15:47+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2025-08-02T01:15:46+00:00","index":"","fulltext":""},{"type":"submitted","content":"Antonie van Leeuwenhoek","date":"2025-08-01T05:26:08+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"antonie-van-leeuwenhoek","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"anto","sideBox":"Learn more about [Antonie van Leeuwenhoek](https://www.springer.com/journal/10482)","snPcode":"10482","submissionUrl":"https://submission.nature.com/new-submission/10482/3","title":"Antonie van Leeuwenhoek","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"em","reportingPortfolio":"Springer Hybrid","inReviewEnabled":true,"inReviewRevisionsEnabled":false}}],"origin":"","ownerIdentity":"96700662-ae80-494d-abcc-bf187171d336","owner":[],"postedDate":"August 11th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"published-in-journal","subjectAreas":[],"tags":[],"updatedAt":"2025-11-24T15:59:58+00:00","versionOfRecord":{"articleIdentity":"rs-7267613","link":"https://doi.org/10.1007/s10482-025-02199-1","journal":{"identity":"antonie-van-leeuwenhoek","isVorOnly":false,"title":"Antonie van Leeuwenhoek"},"publishedOn":"2025-11-19 15:57:04","publishedOnDateReadable":"November 19th, 2025"},"versionCreatedAt":"2025-08-11 10:06:35","video":"","vorDoi":"10.1007/s10482-025-02199-1","vorDoiUrl":"https://doi.org/10.1007/s10482-025-02199-1","workflowStages":[]},"version":"v1","identity":"rs-7267613","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-7267613","identity":"rs-7267613","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
Text is read by the "Ask this paper" AI Q&A widget below.
Extraction quality varies by source — PMC NXML preserves structure
cleanly, OA-HTML may include some navigation residue, and OA-PDF can
have broken hyphenation. The publisher copy
(via DOI)
is the canonical version.