End-to-end single-stranded DNA sequence design with all-atom structure reconstruction

doi:10.64898/2025.12.05.692525

2025 · doi:10.64898/2025.12.05.692525

preprint OA: closed

Abstract

Designing biological sequences that fold into predefined conformations is a central challenge in bioengineering. Although deep learning has enabled significant advances in protein and RNA sequence design, progress in single-stranded DNA (ssDNA) design has been constrained by the limited availability of structural data. To address this challenge, we introduce InvDNA, a deep learning-based method that designs ssDNA sequences directly from backbone atomic coordinates. This end-to-end formulation avoids the loss of structural information during backbone-to-feature conversion and further accommodates flexible backbone representations, dynamic sequence masking, and structural reconstruction objectives. These strategies bolster InvDNA’s ability to generalize across diverse ssDNA structural contexts while enabling additional functionalities, including generating diverse sequences for a given backbone, reconstructing nucleotide conformations from backbone and preserving functional sites. In benchmarks using experimentally determined ssDNA structures, InvDNA demonstrates more than a twofold improvement in sequence recovery compared with existing ssDNA and RNA sequence design approaches. Further computational validation using AlphaFold3 shows that 44.4% of InvDNA-designed sequences successfully fold into their predefined conformations. Notably, this success rate increases when backbone coordinates are perturbed to diversify the InvDNA-designed sequences. Collectively, these results establish InvDNA as a robust framework for rational ssDNA engineering.

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

We updated the results of secondary structure prediction, analyzed structural redundancy between the training and test sets, and examined the release dates of ssDNA entries in the test set.
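The abstract reports results in terms of sequence recovery but does not define the metric. A minimal sketch follows, assuming the common definition from inverse-folding work: the fraction of positions at which the designed nucleotide matches the native one. The function name `sequence_recovery` and the example sequences are illustrative assumptions and are not taken from the paper or from InvDNA.

```python
def sequence_recovery(native: str, designed: str) -> float:
    """Fraction of aligned positions where the designed base equals the native base.

    Assumed definition (not confirmed by the abstract): per-position identity
    over two sequences of equal length, compared case-insensitively.
    """
    if len(native) != len(designed):
        raise ValueError("native and designed sequences must have equal length")
    if not native:
        raise ValueError("sequences must be non-empty")
    matches = sum(a == b for a, b in zip(native.upper(), designed.upper()))
    return matches / len(native)


# Made-up 10-nt sequences for illustration; 8 of 10 positions match.
native = "ACGTACGTAC"
designed = "ACGTTCGTAA"
print(sequence_recovery(native, designed))
```

Under this definition, a benchmark score would typically be the mean recovery over all test structures; whether the paper averages per structure or per nucleotide is not stated in the abstract.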
