This is a Preprint and has not been peer reviewed. This is version 1 of this Preprint.
Carbonic anhydrases (CAs) attract interest for their critical roles in various physiological processes and their potential application in CO2 sequestration to combat global warming. Although CAs are an important enzyme family, their classification and evolution remain elusive because of their high sequence diversity and long evolutionary history. In this paper, an in silico strategy, Motif-weighted Alignment for Structure-based Protein Classification (MASPC), was developed. It combines OmegaFold-predicted CA structures with a weighted structural motif alignment, TM-weighted, to enable more precise and robust polymorphic analysis of large enzyme datasets. The MASPC strategy was first validated on 74 ground-truth CA structures extracted from the PDB, where it showed improved performance compared to sequence-based polymorphic analysis (ClustalO-RAxML). Subsequently, MASPC was applied to a representative database of 1603 CAs from 117 model organisms, with a focus on the α-, β-, and γ-CA classes, covering organisms from across the evolutionary history of life. The results indicated that α-, β-, and γ-CAs were well grouped in their own classes, with clearer clustering associated with each CA's source organism. The structural differences among the α-, β-, and γ-CAs revealed by MASPC supported the current understanding that the CA classes are the result of convergent evolution. The sub-clusters in α- and β-CAs are highly associated with organisms according to their appearance in evolutionary history, demonstrating a close correlation between CA evolution and the evolution of life. Finally, the MASPC method was applied to identify 27 potential α-CAs from the NCBI database with less than 40% sequence similarity to a template human carbonic anhydrase II (HCA-II) sequence, demonstrating possible applications in enzyme identification studies.
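The abstract does not define the TM-weighted score, so the following is only a minimal sketch of what a motif-weighted variant of the standard TM-score could look like. The TM-score distance scale d0 is the standard one; the function names, the per-residue `weights` vector, and the normalization by the sum of weights are assumptions made for illustration and are not taken from the paper or its repository.

```python
from typing import Dict, Sequence


def tm_d0(target_length: int) -> float:
    """Standard TM-score distance scale d0 (in angstroms) for a target chain.

    Clamped to 0.5 for short chains, where the cube-root formula
    would otherwise become very small or negative (a simplification).
    """
    if target_length <= 21:
        return 0.5
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def motif_weighted_tm(
    aligned_distances: Dict[int, float],
    weights: Sequence[float],
) -> float:
    """Hypothetical motif-weighted TM-score.

    aligned_distances maps a target residue index to the distance (in
    angstroms) between that residue and its aligned partner after
    superposition. Residues missing from the mapping are unaligned and
    contribute nothing. weights holds one non-negative weight per target
    residue; with all weights equal to 1 this reduces to the ordinary
    TM-score for the given alignment.
    """
    target_length = len(weights)
    total_weight = sum(weights)
    if target_length == 0 or total_weight <= 0:
        raise ValueError("weights must be non-empty with a positive sum")

    d0 = tm_d0(target_length)
    score = 0.0
    for index, distance in aligned_distances.items():
        score += weights[index] / (1.0 + (distance / d0) ** 2)
    return score / total_weight


if __name__ == "__main__":
    length = 100
    # Assumed toy alignment: a 10-residue motif superposes exactly,
    # while the remaining residues sit 5 angstroms from their partners.
    distances = {i: (0.0 if i < 10 else 5.0) for i in range(length)}
    uniform = [1.0] * length
    motif_heavy = [5.0 if i < 10 else 1.0 for i in range(length)]
    print("uniform weights:", motif_weighted_tm(distances, uniform))
    print("motif-weighted: ", motif_weighted_tm(distances, motif_heavy))
```

In this toy case, up-weighting the well-conserved motif raises the score relative to uniform weighting, which is the intuition behind letting conserved structural motifs dominate the comparison when the rest of the fold has diverged.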
https://doi.org/10.32942/X25S7R
Bioinformatics, Life Sciences
Protein, alignment, evolution, Carbonic Anhydrase, carbon capture
Published: 2025-02-24 16:01
Last Updated: 2025-02-24 16:01
CC BY Attribution 4.0 International
Conflict of interest statement:
None
Data and Code Availability Statement:
Data and code are available at https://github.com/resplendentHSHI/TMweighted
Language:
English