DNA Conformational Flexibility Descriptors Improve Transcription Factor Binding Prediction Across the Protein Families

preprint OA: closed
Abstract

Precise binding of transcription factors (TFs) to specific DNA sequences is fundamental to gene regulation, yet the molecular principles underpinning TF–DNA specificity remain incompletely understood. While nucleotide sequence and DNA shape are known determinants of TF binding, the role of DNA flexibility, encompassing axial, torsional, and stretching dynamics, remains largely unexplored, particularly across diverse TF families. Here, we systematically integrate experimentally and computationally derived DNA flexibility descriptors into predictive models of TF–DNA binding specificity. Through extensive analyses of large-scale in vitro datasets from HT-SELEX, SELEX-Seq, and protein binding microarrays encompassing mammalian and Drosophila TFs, we demonstrate that flexibility-augmented models consistently outperform sequence-based models and, to an extent, DNA shape-augmented models. These improvements are robust across diverse experimental platforms and dataset scales, underscoring the importance of DNA conformational dynamics in indirect readout. Quantitative analyses of position-specific flexibility contributions reveal distinct "flexibility hotspots" within transcription factor binding sites and their flanking regions. This is exemplified by structural insights into the homeodomain TF MSX1, where localized DNA bendability directly correlates with enhanced binding affinity and precise recognition specificity. Finally, leveraging in vivo ChIP-Seq and DNase-Seq data from ENCODE, we further validate that DNA flexibility substantially enhances the identification of functional TF binding sites across various TF families and cellular contexts. Collectively, the current findings substantiate DNA flexibility as a fundamental element of the cis-regulatory code and significantly advance predictive frameworks of gene regulatory networks.

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

Figures 1 and 4 revised. GitHub repository updated: https://github.com/nucleixlab/DNAflexibility_descriptors_for_TFBS
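
As a rough illustration of what "flexibility-augmented" features could look like, the sketch below appends one flexibility value per dinucleotide step to a one-hot sequence encoding. It is not the authors' implementation: the function names, the example site, and the lookup table `PLACEHOLDER_FLEX` are all assumptions made for illustration, and the table holds arbitrary placeholder numbers rather than measured or simulated flexibility values.

```python
from itertools import product

BASES = "ACGT"

# HYPOTHETICAL placeholder scores, NOT real flexibility measurements.
# A real table would hold experimentally or computationally derived
# values, one per dinucleotide step and per flexibility mode.
PLACEHOLDER_FLEX = {
    "".join(pair): i / 15.0
    for i, pair in enumerate(product(BASES, repeat=2))
}


def one_hot(seq):
    """Sequence-only features: four indicator values per position."""
    features = []
    for base in seq:
        features.extend(1.0 if base == b else 0.0 for b in BASES)
    return features


def flexibility_profile(seq, table=PLACEHOLDER_FLEX):
    """One descriptor value per dinucleotide step (len(seq) - 1 steps)."""
    return [table[seq[i:i + 2]] for i in range(len(seq) - 1)]


def augmented_features(seq):
    """Concatenate sequence features with the position-specific profile."""
    seq = seq.upper()
    return one_hot(seq) + flexibility_profile(seq)


site = "TAATTG"  # arbitrary example site, chosen only for illustration
x = augmented_features(site)
print(len(one_hot(site)), len(flexibility_profile(site)), len(x))
```

Because each flexibility value is tied to a specific step in the site, a model fitted on such vectors can in principle assign position-specific weights to flexibility, which is the kind of quantity a "flexibility hotspot" analysis would inspect.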


This paper dates from 2025.

Source provenance

europepmc
last seen: 2026-05-20T01:45:00.602351+00:00