Large-scale analysis of ligand binding mode similarities in the PDB using interaction fingerprints

preprint OA: closed
Abstract

Three-dimensional structures of protein-ligand complexes are essential for insights into the molecular principles that govern ligand recognition and binding. With more than 180,000 ligand-bound entries in the Protein Data Bank (PDB), representing over two million individual complexes, the volume of available structural data offers unprecedented opportunities for large-scale analysis of interaction patterns. Analysis of interaction patterns across the PDB archive can help discover similarities and differences in the binding modes of ligands, assisting in drug discovery. However, large-scale analysis of up-to-date information remains a significant challenge due to the rapid growth of data.

Here, we introduce the Extended Connectivity Interaction Fingerprint (ECIFP), an interaction-based fingerprint that simplifies 3D protein-ligand contact information into a fingerprint, while retaining key molecular and chemical features of the interacting fragments. The simpler fingerprint representation of the interaction data makes comparison of millions of protein-ligand complexes tractable. Benchmarking shows that ECIFP outperforms ligand-only Extended Connectivity Fingerprints in identifying similar binding sites across identical protein sequences occupied by chemically diverse ligands. Our analysis showed that similarities calculated using ECIFP can be used to compare macromolecular complexes with similar or different ligands.

In this study, we demonstrate two large-scale applications of ECIFP: (1) identification of distinct binding modes for over 9,000 ligands across the entire PDB, and (2) detection of binding-mode similarities among structurally diverse ligands within the same binding site across 48,870 binding sites from over 21,000 proteins.

Competing Interest Statement

The authors have declared no competing interest.
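As a rough illustration of why a fingerprint representation makes comparison across millions of complexes tractable, the sketch below hashes protein-ligand contacts into a fixed set of bit positions and compares two binding modes by set overlap. It is not the authors' ECIFP implementation: the fragment labels, the 2048-bit length, the SHA-256 hashing scheme, and the use of the Tanimoto coefficient are all assumptions made for this example, and the function names are hypothetical.

```python
import hashlib

N_BITS = 2048  # assumed fingerprint length; not taken from the paper


def contact_bit(ligand_fragment: str, protein_fragment: str, n_bits: int = N_BITS) -> int:
    """Map one ligand-fragment/protein-fragment contact to a bit position.

    A stable hash (SHA-256) is used so the same contact always maps to the
    same bit, independent of the Python process.
    """
    key = f"{ligand_fragment}|{protein_fragment}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % n_bits


def interaction_fingerprint(contacts, n_bits: int = N_BITS) -> frozenset:
    """Collapse a list of (ligand_fragment, protein_fragment) contacts
    into the set of bit positions that are switched on."""
    return frozenset(contact_bit(lig, prot, n_bits) for lig, prot in contacts)


def tanimoto(a, b) -> float:
    """Tanimoto (Jaccard) similarity between two sets of on-bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Hypothetical contact labels, invented purely for illustration.
    mode_a = interaction_fingerprint([
        ("aromatic_ring", "PHE_ring"),
        ("carboxylate", "ARG_guanidinium"),
        ("hydroxyl", "SER_hydroxyl"),
    ])
    mode_b = interaction_fingerprint([
        ("aromatic_ring", "PHE_ring"),
        ("carboxylate", "ARG_guanidinium"),
        ("amide_N", "ASP_carboxylate"),
    ])
    print(f"similarity: {tanimoto(mode_a, mode_b):.2f}")
```

Because each complex reduces to a small set of integers, a pairwise comparison costs only a set intersection and union, with no 3D superposition of structures.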

This is a recent paper (2026).

Source provenance

europepmc
last seen: 2026-05-20T01:45:00.602351+00:00