The Viral AlphaFold Database of monomers and homodimers reveals conserved protein folds in viruses of bacteria, archaea, and eukaryotes

preprint OA: closed
Abstract

Viruses are among the most abundant and genetically diverse entities on Earth, yet the functions and evolutionary origins of most viral proteins remain poorly understood. Their rapid evolution often obscures evolutionary relationships, making it difficult to assign functions using sequence-based methods alone. Although conservation of protein fold can reveal deep homologies undetectable by sequence comparison, viral proteins remain vastly underrepresented in structural databases, limiting our ability to explore them at the structural level. Here, we address this gap by clustering all unique viral sequences from the NCBI RefSeq database and predicting the structures of ∼27,000 representative proteins using AlphaFold2, creating a large-scale viral structural resource, the Viral AlphaFold Database (VAD). We uncover ∼10,000 proteins belonging to clusters that share folds across viruses infecting bacteria, archaea, and eukaryotes, revealing shared protein folds across diverse host-infecting viruses. We also predict oligomeric states using AlphaFold2-based homodimer modelling, alongside structural comparisons to the Protein Data Bank, providing valuable new data on the potential for viral proteins to oligomerise. We further reveal that large regions of the viral protein universe remain functionally dark and report the discovery and experimental validation of a previously uncharacterised antiviral toxin-antitoxin (TA) system. VAD is a resource that provides a foundation for exploring viral structure–function relationships, including ancient folds that shape viral interactions across all life. Predicted structures used in this study are available at data-sharing.atkinson-lab.com/vad/.

Competing Interest Statement

The authors have declared no competing interest.

Year: 2025

Source provenance

europepmc
last seen: 2026-05-20T01:45:00.602351+00:00