Ensemble Techniques for Predictive Modeling of Leishmanial Activity via Molecular Fingerprints

Research Article

Saif Nalband, Pallavi Kiratkar, Maulik Gupta, Mansi Gambhir, Surbhi Sonam, Femi Robert, and A. Amalin Prince

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-6088035/v1
This work is licensed under a CC BY 4.0 License.
Status: Published. Journal publication published 14 Oct, 2025. Read the published version in BMC Medical Informatics and Decision Making.

Abstract

Background: Leishmaniasis, a neglected tropical disease caused by Leishmania protozoan parasites and transmitted by sandflies, poses a significant global health challenge, especially in resource-limited environments. The parasite's life cycle includes amastigote and promastigote stages, each of which contributes to the infection process.
Current therapies for leishmaniasis are limited by considerable side effects and the rise of drug-resistant strains, underscoring the pressing need for new, effective, and safe treatment options. Recent advances in leishmaniasis vaccine development include live attenuated vaccines, recombinant vaccines, and the use of synthetic biology. These approaches aim to induce robust immune responses while ensuring safety. Controlled human infection studies are also being explored to accelerate vaccine development. However, a licensed vaccine remains elusive.

Method: This study introduces a novel drug discovery method for leishmaniasis that uses machine learning and cheminformatics to predict the activity of compounds against Leishmania promastigotes. The dataset comprises 65,057 molecules sourced from the PubChem database, with drug susceptibility assessed by an Alamar Blue-based assay. Molecules are encoded as molecular fingerprints derived from Simplified Molecular Input Line Entry System (SMILES) notations. We employed three fingerprint algorithms (Avalon, MACCS keys, and Pharmacophore) to develop the machine learning models. Several learning algorithms, including random forest, multilayer perceptron, gradient boosting, and decision tree, are used to build models that classify molecules as active or inactive based on their structural and chemical characteristics, which could help accelerate the drug discovery process for leishmaniasis.

Results: We additionally introduced an ensemble-based model, which achieved a peak accuracy of 83.65% and an area under the curve of 0.8367. These results suggest that the approach can support drug discovery efforts against leishmaniasis.

Conclusion: The proposed approach could also serve as a framework for addressing other neglected tropical diseases, offering a promising alternative to conventional drug discovery methods and their associated difficulties.

Keywords: Leishmania, Machine Learning, Molecular Fingerprints, Ensemble Learning

Additional Declarations: No competing interests reported.

Journal history: first submitted to journal 16 Apr, 2025; submission checks completed and reviewers invited 22 Apr, 2025; reviewers agreed 22 Apr, 24 Apr, and 29 Apr, 2025; reviews received 24 Apr and 29 Apr, 2025; editorial decision (revision requested) 05 May, 2025.
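
The Method and Results paragraphs describe classifiers trained on different fingerprints, combined into an ensemble, and scored by accuracy and area under the curve. The abstract does not say how the ensemble combines its members, so the following is only a minimal sketch that assumes soft voting (averaging each model's predicted probability of activity). The model names, probabilities, and labels are hypothetical toy values, not data or results from the study.

```python
from typing import Dict, List, Sequence


def soft_vote(prob_by_model: Dict[str, Sequence[float]]) -> List[float]:
    """Average P(active) across models, molecule by molecule (assumed rule)."""
    columns = list(prob_by_model.values())
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("every model must score the same molecules")
    return [sum(col[i] for col in columns) / len(columns) for i in range(n)]


def accuracy(y_true: Sequence[int], probs: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of molecules whose thresholded prediction matches the label."""
    preds = [1 if p >= threshold else 0 for p in probs]
    return sum(int(p == y) for p, y in zip(preds, y_true)) / len(y_true)


def roc_auc(y_true: Sequence[int], probs: Sequence[float]) -> float:
    """AUC as the chance that a random active outranks a random inactive.

    Ties between an active and an inactive score count as half a win.
    """
    pos = [p for p, y in zip(probs, y_true) if y == 1]
    neg = [p for p, y in zip(probs, y_true) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one active and one inactive molecule")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = active, 0 = inactive) and per-model probabilities.
labels = [1, 1, 1, 0, 0, 0]
scores = {
    "avalon_random_forest": [0.9, 0.6, 0.4, 0.3, 0.7, 0.1],
    "maccs_gradient_boosting": [0.8, 0.7, 0.3, 0.2, 0.4, 0.3],
    "pharmacophore_mlp": [0.7, 0.5, 0.5, 0.4, 0.3, 0.2],
}

ensemble = soft_vote(scores)
print(f"ensemble accuracy: {accuracy(labels, ensemble):.4f}")
print(f"ensemble AUC:      {roc_auc(labels, ensemble):.4f}")
```

Averaging probabilities rather than hard votes keeps each model's confidence in play, which matters for a ranking metric such as the area under the curve. A real pipeline would obtain the probabilities from classifiers fitted on fingerprint bit vectors and would evaluate on held-out molecules.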

Author affiliations
Saif Nalband, Maulik Gupta, Mansi Gambhir: Thapar Institute of Engineering and Technology
Pallavi Kiratkar: Indian Institute of Science Education and Research
Surbhi Sonam: D Y Patil International University
Femi Robert: SRM Institute of Science and Technology, Kattankulathur, Tamil Nadu
A. Amalin Prince (corresponding author): Birla Institute of Technology and Science, Pilani - Goa Campus

Preprint version 1 posted April 23rd, 2025. Version of record: https://doi.org/10.1186/s12911-025-03041-4 (BMC Medical Informatics and Decision Making, published October 14th, 2025).