The Evolution of Cognitive Abilities in Marine Animals: Insights from Cognition Gene Polymorphism in Coelacanths and Lungfish

Research Square
Research Article

Zhizhou Zhang, Shuaiyu Zhang, Yongdong Xu

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-6373286/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted (Version 1)

Abstract

Both coelacanths and lungfish have fossil evidence dating back 400 million years, placing them at a critical evolutionary juncture when marine animals transitioned to terrestrial environments. An intriguing question is how far their cognitive abilities had evolved before they crawled onto land. While no fossil DNA exists for extinct coelacanths or lungfish, studies on their extant species offer clues.
Notably, the biological traits of coelacanths and lungfish have been remarkably stable over the past 70 million years, suggesting exceptional stability in their genomic sequences as well. This raises the possibility of inferring their cognition gene polymorphism patterns (CGPPs) and evolutionary positioning through genomic analyses of modern samples. Comparative analyses with a range of animal taxa and human samples revealed that the CGPPs of both coelacanths and lungfish are evolutionarily closer to those of archaic humans than to those of most other animal groups. The CGPPs appear to occupy an evolutionary inflection point bridging diverse animal lineages to archaic humans.

Keywords: gene polymorphism, cognition, coelacanths, lungfish, evolution

Key Terms Clarification

Cognition genes: Defined as genes linked to neural development, synaptic plasticity, and higher-order cognitive functions (e.g., NRXN1, DCC, GRID2, and EP300; see Table 1).
Polymorphism patterns: Evaluated through allele frequency distributions, haplotype diversity, and conserved/nonconserved substitutions.
Evolutionary positioning: Contextualized via phylogenetic analyses and divergence time estimates relative to archaic and modern humans.

Introduction

The earliest fossil evidence of lungfish dates to approximately 410 million years ago (Early Devonian period), with representative species such as Dipnorhynchus. In contrast, the earliest coelacanth fossils appeared approximately 390 million years ago (Middle Devonian period), exemplified by Eoactinistia. Both belong to the class Sarcopterygii (lobe-finned fish), but lungfish are classified under the subclass Dipnoi, whereas coelacanths fall under Coelacanthimorpha. Lungfish exhibit an evolutionary path closer to the ancestors of tetrapods, whereas coelacanths represent an earlier-diverging independent lineage [1–3].
Coelacanths were thought to have gone extinct approximately 66 million years ago until a living species (Latimeria chalumnae) was discovered in 1938, earning them the title of "living fossils." To date, at least two extant coelacanth species have been recognized: the African/Comoran coelacanth (Latimeria chalumnae) from Tanzania/Comoros and the Indonesian coelacanth (Latimeria menadoensis). Moreover, six extant species of lungfish persist across South America (Lepidosiren paradoxa), Africa, and Australia (Neoceratodus forsteri). The four African species are the marbled lungfish (Protopterus aethiopicus), East African lungfish (Protopterus amphibius), West African lungfish (Protopterus annectens), and slender lungfish (Protopterus dolloi). Fossil and phylogenetic evidence consistently indicates that lungfish originated approximately 20 million years earlier than coelacanths did, confirming their status as the more ancient lineage within this evolutionary narrative.

Both lungfish and coelacanths are evolutionarily close to the ancestors of tetrapods and likely played crucial roles in the transition of marine animals to terrestrial environments, which ultimately gave rise to reptiles. Despite undergoing approximately 400 million years of evolution, the extant species of both groups exhibit remarkable morphological and anatomical stability: their body sizes and structural features remain largely consistent with those of their fossilized counterparts. This astonishing evolutionary stasis suggests that certain lineages of lungfish and coelacanths adapted to relatively stable ecological niches, even as their relatives faced environmental pressures that drove divergent evolutionary pathways toward amphibians, reptiles, birds, mammals, and ultimately humans.
Furthermore, this stability strongly implies that the genomic sequences of lungfish and coelacanths have largely retained features characteristic of their ancestors from 400 million years ago, offering a unique window into the genetic blueprint of early vertebrates [4–8].

The most fundamental distinction between humans and other animals lies in language and abstract cognitive abilities. Considering the evolutionary trajectory of language (which originated as a motor skill governed by the brain for muscular coordination), it is theoretically plausible that all animals possess some degree of language functionality. Similarly, while all animals exhibit varying levels of cognitive capacity, abstract cognitive abilities are generally absent in nonhuman species. Even humans acquired rudimentary abstract cognition only approximately 70,000 years ago, with more sophisticated forms likely emerging as recently as 17,000 years ago [9–10]. This suggests a compelling hypothesis: cognitive evolution has progressed at an exceptionally slow pace since the era of lungfish and coelacanths. Over the vast evolutionary journey spanning amphibians, reptiles, birds, and mammals, biological innovation has focused primarily on optimizing survival skills (e.g., locomotion, predation, reproduction) and anatomical adaptations to environmental pressures. Many species within ancient human evolutionary lineages, like other animals, likely retained only primitive cognitive abilities. This pattern should be reflected in the gene polymorphism patterns observed in genomic sequences, where conserved genetic architectures may mirror the gradual and limited development of higher-order cognition across deep evolutionary timescales.

Terrestrial animals originated in the ocean, and different marine environments are correlated with varying cognitive capabilities among marine organisms.
It is well established that cephalopods (e.g., octopuses) and marine mammals (e.g., cetaceans) exhibit advanced cognitive abilities, such as tool use and social learning [11–13], whereas archaic humans developed complex cognitive traits, including sophisticated social structures, tool innovation, language, and abstract thinking. The existence of highly intelligent marine species raises a compelling question: at which evolutionary stage were key transitional marine animals, such as coelacanths and lungfish, positioned in terms of cognition gene [14–20] evolution? Specifically, which marine organisms exhibit cognition gene polymorphism patterns most closely aligned with those of archaic humans? This inquiry holds significance for understanding the genomic foundations of cognitive evolution across vertebrate lineages.

This study compiled whole-genome sequences from 471 diverse samples, including archaic humans (Neanderthals, Denisovans), modern humans, and other vertebrates (fish, amphibians, reptiles, birds, rodents, mammals). Additionally, we collected four coelacanth whole-genome sequences (representing the two known species, Latimeria chalumnae and Latimeria menadoensis) and three lungfish whole-genome sequences (one South American species, Lepidosiren paradoxa, and two African species, Protopterus annectens and Protopterus aethiopicus). Using these coelacanth and lungfish genomes, we conducted polymorphism screening, analysis, and comparative studies of cognition-associated genes (Table 1) against genomes from other taxa. This work provides an initial characterization of the evolutionary positioning of cognition gene polymorphism patterns in coelacanths and lungfish within the broader context of animal evolution.

Methods

Genome sequences

Genome sequences were downloaded from the ENA database, the SRA database and the Ensembl genome browser.
A total of 471 whole genomes (including 111 ancient genomes, Table 1s) from 5 continents (Africa, Asia, Europe, North America, and South America) were collected. The six representative animal groups include Laurasiatherians (L), amphibians/reptiles (R), fish (F), birds (b), primates (p), and rodents (d), plus miscellaneous taxa (x). The ENA genome sequences are in fastq format, whereas the Ensembl/SRA genome sequences are all assembled full genomes in fa, fn or fna formats; all can be read and scanned with the Python-based hash07plus03 software.

Table 1. Selected human cognition genes in this study [14–20]

No. | Gene | Function, or compromised ability (example) when mutated
1 | ARHGAP11B | Hominin-specific development and evolutionary expansion of the brain neocortex
2 | ASPM | Associated with microcephaly
3 | MCPH1 | Associated with primary autosomal recessive microcephaly and lymphatic malformation
4 | CHRM2 | A nervous system gene associated with depression disorder
5 | IGF2R | Insulin-like growth factor gene associated with behavior/neurological phenotype
6 | THSD7B | Associated with eye diseases/neuronal diseases
7 | SNAP25 | A gene associated with neurotransmitter release
8 | FADS2 | A member of the fatty acid desaturase family, associated with craniofacial abnormalities
9 | DAB1 | A gene linked with nervous system development
10 | NBPF8 | A gene associated with macrocephaly, autism, schizophrenia, cognitive disability
11 | HAR1A | A gene whose expression levels are associated with memory and cognitive abilities
12 | GNB5 | Associated with language delay and cognitive impairment
13 | NRXN1 | Neurexin 1, required for efficient neurotransmission and formation of synaptic contacts
14 | DCC | Associated with impaired intellectual development
15 | GRID2 | Predominant excitatory neurotransmitter receptors in the mammalian brain
16 | EP300 | Associated with rare neurological diseases and impairment of intellectual development
17 | KMT2D | Lysine methyltransferase 2D, associated with intellectual disability and eye diseases
18 | NOTCH2NL | Neural progenitor proliferation and evolutionary expansion of the brain neocortex

Cognition genes and their SNPs

For all human cognition genes, single nucleotide polymorphism (SNP) or single nucleotide variant (SNV) sites in the dbSNP database were selected so that each whole gene region was spanned relatively evenly by the selected sites, plus sites with known clinical effects (as listed in the GeneCards database). Table 1 lists the 18 human cognition genes; a total of 223 SNVs were selected for this study (Table 2s).

Genome sequence analysis software development

The SNP/SNV locus-finding software, which is based on hash tables, processes whole-genome files and rapidly identifies target loci within the genome via a search algorithm to obtain the bases present at each variant site. The software is written in Python [18, 21]. It first processes whole-genome files in three formats (fastq, fna, and fa) according to the characteristics of each format, extracting the sequences and generating standard-format files in which every line contains only the five characters A, T, C, G, and N. The software can process multiple genome files in batches and can restrict the matching length and the number of matches. After extensive validation, hash07plus03 was found to be significantly faster than conventional matching algorithms and other software programs based on the Knuth–Morris–Pratt (KMP) algorithm.

One of the search algorithms in the custom-developed hash07plus03 software constructs a 31-base string (15 flanking bases on each side of an SNV locus combined with the central base N, i.e., 15 + N + 15) and performs exact matching searches across whole-genome sequences. If a precise match is found, the software extracts the central base N as the SNV data; if no match exists, it outputs "0".
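As an illustration of the 15 + N + 15 search described above, the following minimal Python sketch indexes each SNV by its pair of flanking sequences in a hash table (a dict) and slides a window over the input sequences. It is not the hash07plus03 code: the function names, the `flank` parameter, the forward-strand-only scan, and the A/T/G/C ordering of multi-base output are all assumptions made for illustration.

```python
from collections import defaultdict

# Assumed output order for multi-base genotypes (A, T, G, C).
BASE_ORDER = "ATGC"


def build_flank_index(snv_flanks):
    """Map (left flank, right flank) -> SNV id for constant-time lookup."""
    return {(left, right): snv_id for snv_id, (left, right) in snv_flanks.items()}


def call_genotypes(sequences, snv_flanks, flank=15):
    """Return {snv_id: genotype string}, with "0" when no exact flank match exists.

    sequences  : iterable of reads or assembled sequences (hypothetical input)
    snv_flanks : {snv_id: (left_flank, right_flank)}, each flank `flank` bases long
    """
    index = build_flank_index(snv_flanks)
    observed = defaultdict(set)
    window = 2 * flank + 1  # 31 bases when flank == 15
    for seq in sequences:
        seq = seq.upper()
        for start in range(len(seq) - window + 1):
            left = seq[start:start + flank]
            center = seq[start + flank]
            right = seq[start + flank + 1:start + window]
            snv_id = index.get((left, right))
            # Assumption: an ambiguous central base (N) is not counted.
            if snv_id is not None and center in BASE_ORDER:
                observed[snv_id].add(center)
    return {
        snv_id: "".join(b for b in BASE_ORDER if b in observed[snv_id]) or "0"
        for snv_id in snv_flanks
    }


if __name__ == "__main__":
    # Toy data with 3-base flanks so the example stays readable.
    flanks = {"snvA": ("ACG", "TTA"), "snvB": ("GGG", "CCC")}
    reads = ["ACGTTTA", "ccACGGTTAcc", "ACGCTTA"]
    print(call_genotypes(reads, flanks, flank=3))
```

With `flank=15` the window is the 31-base string described in the text; the shorter flank in the toy data is used only to keep the example small.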
For example, if exact bilateral matches yield three central bases N (e.g., T, G, and C), the output would be "TGC". Any SNV site has one of the following 16 genotypes: 0, A, T, G, C, AT, AG, AC, TG, TC, GC, ATG, ATC, AGC, TGC or ATGC.

Sample SNP information extraction

SNP information was extracted from genome files partly with 010Edit software, but mostly with hash07plus03. Across all 471 genomes, file sizes ranged from 200 M to 120 G. Genomes in fastq format smaller than 10 G were generally excluded or used only as a reference.

Calculation of the Levenshtein distance

This method [21–24] directly compares alignment differences between SNP sequences (e.g., frameshift mutations caused by indels), capturing the contribution of structural sequence variation to genetic divergence. It is effective for analyzing mixed SNV datasets (SNP + indel) and is typically applied to assess sequence divergence complexity in cross-species homologous regions (e.g., comparing SNV patterns in regulatory regions across mammals).

Calculation of the Euclidean distance

This method [25–26] is suitable for processing high-dimensional SNV feature matrices (e.g., PCA results), as it intuitively reflects geometric differences in SNV frequencies across multidimensional space. It is particularly effective for allele frequency matrices (e.g., treating the A/T/G/C frequencies at each SNV locus as four-dimensional coordinates). The method offers high computational efficiency and is compatible with most clustering and ordination methods (e.g., PCA, K-means).

Calculation of the Hellinger distance

Specifically designed for compositional data (satisfying ∑p_i = 1, a property of SNV frequencies), this method [27–28] reduces the dominance of high-frequency alleles through a square-root transformation, increasing sensitivity to rare variants.
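The three measures introduced so far can be sketched in a few lines of plain Python. The study computed them with existing R packages, so this is only an illustration; the function names and the equal-weight conversion of a genotype string into A/T/G/C frequencies (`genotype_to_freqs`) are assumptions, not the authors' encoding.

```python
import math

BASES = "ATGC"


def genotype_to_freqs(genotype):
    """Hypothetical encoding: each observed base gets equal weight; "0" -> all zeros."""
    present = [b for b in BASES if b in genotype]
    if not present:
        return [0.0, 0.0, 0.0, 0.0]
    return [1.0 / len(present) if b in present else 0.0 for b in BASES]


def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def euclidean(p, q):
    """Straight-line distance between two equal-length frequency vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


def hellinger(p, q):
    """Hellinger distance between two compositions (each summing to 1); range 0..1."""
    total = sum((math.sqrt(x) - math.sqrt(y)) ** 2 for x, y in zip(p, q))
    return math.sqrt(total) / math.sqrt(2)


if __name__ == "__main__":
    print(levenshtein("ATGC", "ATC"))
    print(euclidean(genotype_to_freqs("A"), genotype_to_freqs("AT")))
    print(hellinger(genotype_to_freqs("A"), genotype_to_freqs("AT")))
```

The square root inside `hellinger` is the transformation that damps high-frequency alleles relative to the plain Euclidean distance.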
The Hellinger distance is robust to zero values (directly compatible with zero-value handling in SNVs) and is used to compare multiallelic SNV frequency distributions across species (e.g., evolutionary differences between homologous loci in fish vs. primates).

Calculation of the Jaccard distance

Focused on presence‒absence patterns while ignoring frequency differences, this metric [29–30] is ideal for comparing SNVs between highly divergent species (e.g., scenarios where humans and fish share minimal SNV loci). It is insensitive to sequencing depth variations and suitable for analyzing the proportion of SNV loci shared across species (e.g., conserved loci across different taxonomic classes).

Computational efficiency

To reduce the computational load, the above distance calculations were performed on a subset of 248 samples from Table 1s (listed in Table 3s). Testing confirmed that the results from this subset were not significantly different from those from the full dataset (data not shown). All similarity (distance) metrics were computed with existing packages in the R programming language, and the R code is available from the authors on request.

PCA/PCoA/t-SNE/UMAP analysis

In this study, the basic clustering analyses of samples were primarily performed using PCA and PCoA. PCA preserves global structure and is suited to linear relationships, but it may fail to reveal local or complex nonlinear patterns in complex samples. PCoA, based on distance matrices, maintains global distance relationships between samples and is sensitive to the choice of distance metric, yet struggles to reflect high-dimensional local structures. In contrast, t-SNE (t-distributed Stochastic Neighbor Embedding), commonly applied in transcriptomics studies, effectively handles nonlinear associations between samples by emphasizing the preservation of local similarities.
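To make the PCoA step concrete, the sketch below recovers the first principal coordinate from a symmetric distance matrix by Gower double-centering followed by power iteration. The study itself used R packages; this standard-library version, its function name, and its restriction to a single axis are assumptions chosen to keep the example short.

```python
import math


def pcoa_first_axis(dist, iterations=200):
    """First principal coordinate of a symmetric distance matrix (list of lists)."""
    n = len(dist)
    squared = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in squared]
    grand_mean = sum(row_mean) / n
    # Gower double-centering: B = -1/2 * J * D^2 * J, written element-wise.
    b = [[-0.5 * (squared[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    # Power iteration for the dominant eigenpair of B.
    vec = [float(i + 1) for i in range(n)]  # arbitrary non-constant start vector
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(b[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:
            break
        vec = [x / norm for x in product]
        eigenvalue = norm  # equals the dominant eigenvalue once vec has converged
    # Coordinates are the eigenvector scaled by the square root of its eigenvalue.
    return [math.sqrt(eigenvalue) * x for x in vec]


if __name__ == "__main__":
    # Three samples lying on a line at positions 0, 1 and 3.
    distances = [[0, 1, 3],
                 [1, 0, 2],
                 [3, 2, 0]]
    print(pcoa_first_axis(distances))
```

For points that truly lie on a line, the differences between the returned coordinates reproduce the input distances, which is a quick sanity check; the sign of the axis is arbitrary.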
t-SNE models neighborhood relationships through probability distributions and excels at capturing high-dimensional complex manifold structures (e.g., cell differentiation trajectories, subpopulation delineation), yielding clearer visual clustering. UMAP, similar to t-SNE but grounded in topological theory, balances local and global structures with faster computation and improved preservation of global relationships, and is gradually emerging as an alternative to t-SNE. PCA, PCoA, t-SNE and UMAP were all performed via R packages.

Mutual Information (MI) analysis

Mutual Information [31–32], a concept from information theory, measures the degree of interdependence between two variables. In SNV data, each SNV locus can be treated as a variable, with its allele frequency or other characteristics serving as variable values. MI captures both linear and nonlinear associations, making it particularly useful for analyzing complex relationships in genetics. Because it requires no distributional assumptions, MI is well suited for the intricate patterns often observed in genetic data. The resulting visualizations may reveal information such as interaction networks between SNVs and functional modules. The MI-generated charts can identify SNVs that covary during evolution, potentially indicating coevolution or functionally linked loci. For example, SNVs in conserved genomic regions may present lower MI values, whereas regions under positive selection may present higher MI values, reflecting stronger associations. With respect to deeper relationships among SNVs, MI can detect nonlinear dependencies that traditional approaches (e.g., linkage analysis) might overlook. These relationships could reveal functional interactions or shared evolutionary pressures. For example, variations in MI values across species may reflect shifts in selective pressures during evolution.
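As a concrete illustration of MI between two SNV loci, the sketch below treats the genotype calls of each locus across samples as a discrete variable and applies the standard plug-in estimator. The authors used existing R packages; the function name, the use of raw genotype strings as variable values, and the choice of base-2 logarithms (bits) are assumptions here, so the values are not directly comparable with the thresholds quoted in the text.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Plug-in estimate of I(X; Y) in bits for two equal-length discrete sequences."""
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    count_x = Counter(x)
    count_y = Counter(y)
    count_xy = Counter(zip(x, y))
    mi = 0.0
    for (a, b), c in count_xy.items():
        p_ab = c / n
        # p(a, b) * log2( p(a, b) / (p(a) * p(b)) )
        mi += p_ab * math.log2(p_ab / ((count_x[a] / n) * (count_y[b] / n)))
    return mi


if __name__ == "__main__":
    # Hypothetical genotype calls at two loci across four samples.
    locus_1 = ["A", "A", "T", "T"]
    locus_2 = ["G", "G", "C", "C"]   # varies together with locus_1
    locus_3 = ["G", "C", "G", "C"]   # varies independently of locus_1
    print(mutual_information(locus_1, locus_2))
    print(mutual_information(locus_1, locus_3))
```

Computing this value for every pair of loci gives the matrix from which a network of SNVs can be built by keeping only pairs above a chosen threshold.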
An increase in MI values in certain genomic regions from fish to humans might suggest increased functional complexity. In general, this analytical approach can uncover global association networks between SNV loci and identify functional modules (e.g., highly interconnected SNV clusters). In cross-species comparisons, conserved high-MI regions may correspond to critical evolutionary nodes. Through MI analysis, functional cooperation and evolutionary constraints among SNVs can be elucidated from a nonlinear perspective, providing new molecular mechanistic insights into complex phenotypic evolution from fish to humans.

In the output table of the MI analysis, Degree indicates the number of connections for an SNV within the network, reflecting its global connectivity. SNVs with a degree ≥ 10 are considered core sites. MI_Mean represents the average mutual information value between the SNV and all others, characterizing the average association strength. MI values are categorized as > 0.3 (strong association), 0.1–0.3 (moderate association), and [...] 0.1 are considered significant). Closeness indicates closeness centrality, reflecting information transfer efficiency (values > 0.3 denote highly efficient nodes). Global_Modularity measures the overall modularity index of the network (values > 0.4 indicate a significant modular structure). The mutual information analysis was performed with existing packages on the R programming platform.

Results

As shown in Fig. 1, five distinct cognition gene polymorphism pattern (CGPP) clusters were identified. The leftmost region comprises a densely packed cluster representing the majority of animal samples, including most marine-derived species. The lower-left corner corresponds to a group of ‘intelligent’ animals, such as dolphins, camels, and certain primates.
The lower-right cluster is exclusively occupied by primates, whereas the upper-right cluster includes modern humans and a subset of archaic human samples (e.g., Neanderthals and Denisovans). Between the leftmost and upper-right clusters lies an intermediate module containing modern humans, some archaic humans, and additional primate samples. Notably, the two clusters encompassing modern humans appear to include transitional archaic human samples bridging these groups. Furthermore, at least three clusters contained primate samples, reflecting divergent evolutionary trajectories within this lineage. Intriguingly, only one coelacanth/lungfish sample (lu3) is displayed in the figure. The analysis revealed that certain marine animals, such as the lungfish lu3, present CGPPs that are phylogenetically closer to archaic humans (e.g., nd2, a Neanderthal sample) than to primates, despite significant divergence between nd2 and primate CGPPs. This observation may imply that the genetic architecture underlying human cognition, particularly its core framework, was already established during the evolutionary stage of fish, predating the emergence of tetrapods.

Figure 2 shows that the most primitive CGPPs are indeed present in fish and amphibian/reptilian species. Notably, one lungfish sample (lu1) clustered within this group, whereas x19 (a coelacanth sample) occupied a more distant position. According to the PCA plot (Fig. 2A), the CGPPs of the three lungfish and four coelacanth samples were relatively close to one another, particularly compared with those of the other animal groups. As shown in Fig. 2C, the archaic human samples sd1 and mb1 (of African origin) presented CGPP features strikingly similar to those of fish and certain Laurasiatherian mammals, further supporting the hypothesis that the earliest human populations originated in Africa.
Intriguingly, the coelacanth sample x19 was significantly different from the other coelacanth samples according to the PCA, which clustered more tightly. Its relative proximity to lu1, sd1, and mb1 suggests that coelacanths and lungfish possessed distinct evolutionary potentials and may have played divergent roles in shaping the cognitive capacities of early hominoids. A critical finding in Fig. 2 is that, while the CGPPs of coelacanths and lungfish macroscopically bridge tetrapods and archaic humans, their positions are phylogenetically closer to those of archaic humans and intermingled with them. This contrasts sharply with the clear positional and feature-based separation observed between archaic humans and the CGPPs of tetrapods or other marine animals. The cognition gene polymorphism patterns (CGPP) of the three archaic human samples—mb1, sd1, and dg2—are notably closer to those of coelacanths and lungfish than expected, particularly for sd1 and dg2. Overall, lungfish and coelacanths occupy comparable evolutionary positions in terms of CGPP, with substantial overlap between the two groups. These findings align with the PCA patterns observed in the aforementioned figures. Notably, and as expected, the CGPP similarity between coelacanths/lungfish and archaic humans was significantly greater than that between coelacanths/lungfish and modern humans (Fig. 3A). Figure 3A includes 248 samples, with 39 labeled for clarity: Modern humans (pp6, p6, dc2, in9, sr2, ga4, gu2, yo3, pe3, sp2), Archaic humans (sc1, us2, bz1, ch1, et1, mo1l, mg1, mb1, sd1, dg2, cz1, de2) and six representative animal groups: Laurasiatherians (L4), amphibians/reptiles (R15, x16), fish (F36), birds (b1), primates (p8), and rodents (d6). The figure also features four coelacanth samples (x19, lc1, lc5, lm1) and three lungfish samples (lu1, lu3, lu4). 
Neither the PCA results nor the Levenshtein similarity calculations in this study conclusively determine whether coelacanths or lungfish exhibit a more ancient cognition gene polymorphism pattern (CGPP) in evolutionary terms, as their CGPPs are intertwined and overlapping. Furthermore, the genetic distances between the CGPPs of both coelacanths/lungfish and archaic humans appeared to be smaller than those between archaic humans and nearly all the tested animal groups. Specifically, this study does not support the anticipated hierarchical progression of "coelacanths/lungfish → six animal groups → archaic humans." Instead, the observed pattern better aligns with a sequence of "six animal groups → coelacanths/lungfish → archaic humans." One plausible explanation is that the six animal groups analyzed here are represented by modern samples rather than fossil-derived samples, which could mask ancestral genetic signals and introduce recency bias (e.g., overemphasizing recent evolutionary traits). Another explanation involves degeneration/specialization. Results from Fig. 3 and later Fig. 5 suggest that the cognitive gene polymorphism patterns in coelacanths and lungfish may reflect bidirectional evolutionary trajectories: one pathway advanced toward higher cognitive capabilities, culminating in humans, while the other involved adaptive degeneration/specialization of cognitive traits, such as the reversion or specialization observed in certain reptiles, birds, and Laurasiatherians.

The results from other similarity metrics align closely with those derived from Levenshtein similarity (Fig. 3). Figures 3B, 3C, and 3D display the curves for the Jaccard distance, Euclidean distance, and Hellinger distance, respectively.
These methods are relatively well suited for analyzing SNV data in this study, and their purpose is to cross-validate whether the SNV polymorphism patterns of lungfish and coelacanths are consistent with Levenshtein similarity—specifically, clustering near early ancient human samples (compared with other major taxonomic groups). Notably, all four distance metrics yielded consistent conclusions in this regard. In the Hellinger distance results, the dendrogram (Fig. 4 ) clearly shows that the lungfish and coelacanth samples are positioned closer to several ancient human samples (nd11, nd3, sd1, and dg2) than to other fish samples. Intriguingly, the PCoA clustering analysis based on Hellinger distance (Fig. 5 ) revealed that seven lungfish and coelacanth samples (lu1, lu3, lu4, lc1, lm1, x19 and lc5) occupied a relatively central position among major animal groups. Specifically, lu1 is closest to the fish sample cluster; lc5 is nearest to the reptile and certain fish samples; lu4 and lm1 are adjacent to a complex mix of taxa, including birds, specific fish, and other animal groups; x19 shows a polymorphism pattern more akin to birds than lu4 and lm1; and lu3 and lc1 align evolutionarily with rodents, laurasiatherians, and humans. The findings in Fig. 5 suggest that lungfish, coelacanths, and their ancient relatives were indeed at a pivotal stage in shaping early genetic patterns. These organisms had the evolutionary flexibility to diverge into reptiles, birds, and rodents/laurasiatherians or remain within the fish lineage. This critical juncture highlights their role in defining ancestral genomic trajectories that later radiated into distinct vertebrate classes. The authors further categorized all aquatic animal samples in Table 1 s into three groups on the basis of habitat type: freshwater, marine, and euryhaline (tolerant to both environments). 
Principal component analysis (PCA) was conducted to examine whether these groups exhibited specific associations with coelacanths or lungfish in terms of gene polymorphism patterns. However, no significant associations were detected (data not shown). Discussion The inclusion of over 400 samples—spanning 6–7 extant animal groups, geographically diverse modern and ancient human populations, and coelacanths and lungfish from distinct regions—introduces significant complexity in genetic backgrounds and evolutionary patterns of cognitive gene polymorphisms across deep evolutionary timescales. These factors make it inherently challenging to apply a single clustering method or similarity metric to analyze all samples simultaneously. Nevertheless, the PCA results and similarity calculations presented here exhibit relatively consistent patterns across most samples, indicating that these findings serve as a robust foundation for further in-depth investigations. The transition of life from marine to terrestrial environments during animal evolution was pioneered by early terrestrial arthropods and transitional vertebrates [ 33 ]. The evolution of cognitive capabilities in marine organisms is a complex, multilayered process shaped by natural selection for survival adaptation [ 13 , 34 – 39 ]. The cognitive foundations of early marine life (beginning ~ 600 million years ago) are marked by neural system origins (Cnidarians such as jellyfish/corals developed simple neural networks enabling tactile perception and reflex behaviors) and instinct dominance (Flatworms such as planarians evolved centralized ganglia, yet behaviors remained governed by genetically encoded instincts). Following the Cambrian Explosion (~ 540 million years ago), cognitive differentiation accelerated in Arthropods (e.g., mantis shrimp) and Cephalopods (octopuses, squids). 
Arthropods evolved compound eyes for complex visual processing (e.g., polarized light detection) and trial-and-error learning to refine hunting strategies. Cephalopods independently developed advanced brains and visual systems (convergent evolution), with species excelling in tool use (e.g., octopuses sheltering in coconut shells), short-term memory, and spatial learning. From ~450 million years ago to the present, cognitive breakthroughs have emerged in social fish (e.g., cleaner wrasses), deep-sea fish and sardine schools. Social fish pass mirror self-recognition tests, recognize > 100 individual faces, and recall interaction histories. Deep-sea fish decode electrosensory (e.g., electric eels) or bioluminescent signals. Sardine schools achieve collective intelligence through decentralized, signal-driven group decision-making. Over the last 50 million years, cognition has advanced further in marine mammals, namely cetaceans (whales, dolphins) and pinnipeds (seals, sea lions). Cetaceans exhibit echolocation-based 3D environmental mapping (odontocetes), cultural transmission (e.g., orca dialects, bottlenose dolphin tool traditions), and cross-modal perception (e.g., associating sounds with visual symbols). Limited syntactic structures suggest proto-linguistic capabilities in cetacean cultures. Pinnipeds demonstrate rudimentary abstract concept learning (e.g., symbolic sequence rules). As shown in Fig. 1 (lower-left quadrant), cetacean samples dp1 and dp3n cluster with ‘smart’ animals such as camels and some primates.

Given that this study compares polygenic polymorphism patterns across taxonomically divergent species samples and lacks direct genetic or biological effect data, analyzing geometric distance differences in SNV profiles remains a reasonable approach, as genuine genetic variation processes inherently depend on these sequence-based geometric divergences.
While the computational principles of these geometric distance metrics differ in sensitivity to specific SNV profile differences and in sample filtering during data processing, their final results share two key commonalities: (1) lungfish and coelacanths consistently cluster near early ancient human samples, and (2) the cognitive gene polymorphism patterns of lungfish and coelacanths appear to represent a transitional state preceding the divergence of major animal lineages. Through mutual information analysis, we identified a subset of potentially significant SNVs (Fig. 6 , Tables 4s-5s). Notably, the top 20 SNVs with the strongest associations did not form robust interaction networks (STRING analysis results, data not shown). These 20 SNV loci were largely absent across major animal groups but were sporadically present in lungfish and coelacanths (Table 4s), suggesting that the complex interactions underlying the cognitive gene polymorphisms observed here represent only a fraction of their evolutionary dynamics. Intriguingly, three of the 20 SNVs (rs532864586, rs75225211, and rs750156118) were found to localize within chromatin loops previously reported by Luo et al., who compared 3D genomes of human, macaque, and mouse brains and identified human-specific chromatin structural changes, including 499 topologically associating domains and 1,266 chromatin loops [ 41 ] (Table 5s). The implications of this overlap warrant further investigation. Additionally, the 223 SNVs used in this study were gathered from local sequences across various genes in the genome via a near-equal density approach (see Fig. 7 ). Future studies should employ larger-scale SNV datasets, although biological phenotypic data for SNVs directly linked to cognitive ability remain scarce. Table 6s contains meta-data for 223 SNVs in which approximately two-thirds of the SNVs have clinical significance information as benign or pathogenic in humans. The results obtained using t-SNE align with Figs. 
1–2 and Fig. 5 (Figs. 9s1–3). Meanwhile, the UMAP-derived results (Fig. 8, Fig. 8s1, Fig. 8s2) reveal additional clustering details. For instance, lu1 is closest to the fish sample cluster, consistent with Fig. 5, while the closer proximity of lu1 to the ancient African hominin samples sd1 and mb1 was revealed only by UMAP. Other samples (lc1, lc5, lm1, lu3, lu4, x19) exhibited largely similar clustering patterns across Figs. 1–2, 5, and 8. A common finding across all results is that the cognitive gene polymorphism patterns of lungfish and coelacanth samples predominantly lie at the interface between clusters of ancient human samples and other animal groups.
Marine cognitive evolution follows multipath trajectories: cephalopods leverage bodily plasticity, fish rely on collective coordination, and mammals evolve social intelligence. These divergences underscore that cognitive capabilities did not evolve linearly but are niche-specialized adaptations. Throughout this evolutionary journey, gradual genomic changes in marine animals have been reflected in gene polymorphism patterns. By analyzing the CGPPs of coelacanths and lungfish, this study establishes a framework for future investigations into the cognitive evolutionary positioning of other marine taxa [41–44].
Conclusion
This study compiled whole-genome sequences from 471 diverse samples, including archaic humans (Neanderthals, Denisovans), modern humans, and other vertebrates (fish, amphibians, reptiles, birds, rodents, and other mammals). Additionally, four coelacanth whole-genome sequences (representing the two extant species) and three lungfish whole-genome sequences (two South American species and one African species) were used. Using these living-fossil coelacanth and lungfish genomes, we conducted polymorphism screening, analysis, and comparative studies of cognition-associated genes against genomes from other taxa.
This work provides an initial characterization of the evolutionary positioning of cognition gene polymorphism patterns (CGPPs) in coelacanths and lungfish within the broader context of animal evolution. While the results derived from the Levenshtein distance serve only as a preliminary reference, they align with the patterns observed via principal component analysis (PCA) and several other geometric distance analyses. Key findings include the following: 1) the CGPPs of both coelacanths and lungfish are phylogenetically closer to those of archaic humans than to those of most other animal groups; and 2) their CGPP occupies an evolutionary inflection (or turning) point, acting as a transitional bridge between diverse animal lineages and archaic humans.
Limitations and Implications: The primary objective of this study was to investigate the evolutionary placement of cognitive gene polymorphism patterns in ancient fish prior to their transition to land, using extant lungfish and coelacanths as proxies. However, conclusions drawn from modern samples, which are distinct from the lungfish and coelacanths of 400 million years ago, are inherently open to skepticism. The methodologies employed here also have limitations when handling complex, cross-species, and cross-temporal datasets that lack phenotypic data. Nonetheless, the consistent patterns observed across multiple analytical approaches lend credibility to the shared findings and offer at least a useful point of reference. Meanwhile, if the goal is to assess the position of cognitive gene polymorphism patterns in extant lungfish and coelacanths among major animal groups, a paradoxical observation emerges: these patterns in modern lungfish and coelacanths are closer to ancient hominin samples (e.g., sd1, Dg2, nd11, mb1, mo1l, nd3, mg1) than to evolutionarily "advanced" groups such as most reptiles, birds, and rodents.
This suggests that during the evolutionary divergence from fish to major terrestrial lineages, cognitive genes did not uniformly progress toward human-like patterns. Instead, they diversified across lineages, undergoing degeneration, specialization, or distinct evolutionary trajectories. However, a small subset of species, likely including ancestral lungfish and coelacanths, embarked on a progressive path from basal forms to advanced hominins. Critically, the foundational framework of cognitive genes may have originated in ancient fish and then persisted, been refined, and expanded in select species along the fish-amphibian-reptile-rodent lineage, ultimately culminating in patterns resembling those observed in ancient hominin samples. Although lungfish and coelacanths from 400 million years ago would not yield results identical to those described above, these findings highlight the unique genomic signatures of coelacanths and lungfish in tracing the origins and pathways of cognitive evolution. They further provide a valuable foundation for exploring the intricate relationship between language and cognition within the framework of multi-gene polymorphism patterns.
Declarations
Ethics approval and consent to participate: This article does not contain any studies with human participants or animals performed by any of the authors.
Consent for publication: All authors consent to publication.
Availability of data and materials: All data generated or analysed during this study are included in this published article and its supplementary information files. All R code can be requested from the corresponding author.
Competing Interests: The authors declare that they have no conflicts of interest.
Funding: This study was supported by a State Language Commission Research Grant (YB135-117), an Association of Chinese Graduate Education Grant (B-2017Y0505-079), a National Research Center for Foreign Language Education Grant (ZGWYJYJJ10A042), and funds from the Marine Antifouling Engineering Technology Center of Shandong Province.
Authors' contributions: ZZ: supervision of the study, manuscript writing, software testing; SZ: software development and testing; YX: supervision of software development, software testing.
References
Braasch, I., Gehrke, A. R., Smith, J. J., Kawasaki, K., Manousaki, T., Pasquier, J., ... & Postlethwait, J. H. (2015). The spotted gar genome illuminates vertebrate evolution and facilitates human-teleost comparisons. Nature Genetics, 48(4), 427–437. https://doi.org/10.1038/ng.3526
Claeson, K. M., Coates, M. I., & Smith, M. M. (2021). Coelacanths and lungfish: The evolution of the sarcopterygian Bauplan. Annual Review of Earth and Planetary Sciences, 49, 501–529. https://doi.org/10.1146/annurev-earth-072420-060741
Smith, J. J., Timoshevskaya, N., Ye, C., Holt, C., Keinath, M. C., Parker, H. J., ... & Voss, S. R. (2018). The lungfish genome expands our understanding of vertebrate genome evolution. Nature Ecology & Evolution, 2(4), 713–722. https://doi.org/10.1038/s41559-018-0471-0
Amemiya, C. T., Alföldi, J., Lee, A. P., Fan, S., Philippe, H., MacCallum, I., ... & Lindblad-Toh, K. (2013). The African coelacanth genome provides insights into tetrapod evolution. Nature, 496(7445), 311–316. https://doi.org/10.1038/nature12027
Schartl, M., Kneitz, S., Ormanns, J., Schmidt, C., Anderson, J. L., Amores, A., ... & Meyer, A. (2024). The genomes of all lungfish inform on genome expansion and tetrapod evolution. Nature, 634(8032), 96–103. https://doi.org/10.1038/s41586-024-07830-1
Meyer, A., Schloissnig, S., Franchini, P., Du, K., Woltering, J. M., Irisarri, I., ... & Venkatesh, B. (2021). Giant lungfish genome elucidates the conquest of land by vertebrates. Nature, 590(7845), 284–289. https://doi.org/10.1038/s41586-021-03198-8
Nikaido, M., Noguchi, H., Nishihara, H., Toyoda, A., Suzuki, Y., Kajitani, R., ... & Okada, N. (2013). Coelacanth genomes reveal signatures for evolutionary transition from water to land. Genome Research, 23(10), 1740–1748. https://doi.org/10.1101/gr.158105.113
Noonan, J. P., Grimwood, J., Danke, J., Schmutz, J., Dickson, M., Amemiya, C. T., & Myers, R. M. (2004). Coelacanth genome sequence reveals the evolutionary history of vertebrate genes. Genome Research, 14(12), 2397–2405. https://doi.org/10.1101/gr.2972804
Gingerich, P. D. (2022). Pattern and rate in the Plio-Pleistocene evolution of modern human brain size. Scientific Reports, 12(1), 11216. https://doi.org/10.1038/s41598-022-15427-9
Ponce de León, M. S., Bienvenu, T., Marom, A., Engel, S., Tafforeau, P., Warren, J. L., ... & Zollikofer, C. P. E. (2021). The primitive brain of early Homo. Science, 372(6538), 165–171. https://doi.org/10.1126/science.aaz0032
Marino, L., Connor, R. C., Fordyce, R. E., Herman, L. M., Hof, P. R., Lefebvre, L., ... & Reiss, D. (2007). Cetaceans have complex brains for complex cognition. PLoS Biology, 5(5), e139. https://doi.org/10.1371/journal.pbio.0050139
Godfrey-Smith, P. (2016). Other minds: The octopus, the sea, and the deep origins of consciousness. Farrar, Straus and Giroux.
Amodio, P., Boeckle, M., Schnell, A. K., Ostojic, L., Fiorito, G., & Clayton, N. S. (2019). Grow smart and die young: Why did cephalopods evolve intelligence? Trends in Ecology & Evolution, 34(1), 45–56. https://doi.org/10.1016/j.tree.2018.10.010
Li, M., Zhang, W., & Zhou, X. (2020). Identification of genes involved in the evolution of human intelligence through combination of interspecies and intraspecies genetic variations. PeerJ, 8, e8912. https://doi.org/10.7717/peerj.8912
Goriounova, N. A., & Mansvelder, H. D. (2019). Genes, cells and brain areas of intelligence. Frontiers in Human Neuroscience, 13, 44. https://doi.org/10.3389/fnhum.2019.00044
Savage, J. E., Jansen, P. R., Stringer, S., Watanabe, K., Bryois, J., de Leeuw, C. A., ... & Posthuma, D. (2018). Genome-wide association meta-analysis in 269,867 individuals identifies new genetic and functional links to intelligence. Nature Genetics, 50(7), 912–919. https://doi.org/10.1038/s41588-018-0152-6
Sniekers, S., Stringer, S., Watanabe, K., Jansen, P. R., Coleman, J. R. I., Krapohl, E., ... & Posthuma, D. (2017). Genome-wide association meta-analysis of 78,308 individuals identifies new loci and genes influencing human intelligence. Nature Genetics, 49(7), 1107–1112. https://doi.org/10.1038/ng.3869
Xia, W., & Zhang, Z. (2023). Language gene polymorphism patterns: Important information on human evolution. Journal of Data Mining in Genomics & Proteomics, 14, 316.
Shi, L., Lin, Q., Su, B., & Zhang, Y. (2017). Regional selection of the brain size regulating gene CASC5 provides new insight into human brain evolution. Human Genetics, 136(2), 193–204. https://doi.org/10.1007/s00439-016-1749-4
Tattersall, I. (2023). Endocranial volumes and human evolution. F1000Research, 12, 565. https://doi.org/10.12688/f1000research.131636.1
Zhang, Z., Zhang, S., Zhou, H., & Xu, Y. (2024). A general evolution landscape of language and cognition genes. Journal of Data Mining in Genomics & Proteomics, 15, 338. https://doi.org/10.XXXX/jdmgn.2024.15.338
Levenshtein, V. I. (1966). Binary codes capable of correcting deletions, insertions, and reversals. Soviet Physics Doklady, 10(8), 707–710.
Sneath, P. H. A., & Sokal, R. R. (1973). Numerical taxonomy: The principles and practice of numerical classification. W.H. Freeman.
Li, Y., et al. (2020). A Euclidean distance-based approach to assess genetic diversity in maize germplasm. BMC Genomics, 21(1), 1–13. https://doi.org/10.1186/s12864-020-07126-4
Hellinger, E. (1909). Neue Begründung der Theorie quadratischer Formen von unendlichvielen Veränderlichen. Journal für die reine und angewandte Mathematik, 136, 210–271.
Lin, H., & Peddada, S. D. (2020). Analysis of microbial compositions: A review of Hellinger distance-based methods. Frontiers in Microbiology, 11, 2154. https://doi.org/10.3389/fmicb.2020.02154
Jaccard, P. (1901). Étude comparative de la distribution florale dans une portion des Alpes et des Jura. Bulletin de la Société Vaudoise des Sciences Naturelles, 37, 547–579.
Chen, J., et al. (2021). Jaccard/Tanimoto similarity test for large-scale genomic datasets. Bioinformatics, 37(18), 2914–2920. https://doi.org/10.1093/bioinformatics/btab176
Levenshtein, V. I. (1966). Binary codes capable of correcting deletions, insertions, and reversals. Soviet Physics Doklady, 10(8), 707–710.
Zhang, Y., et al. (2022). Edit distance-based haplotype clustering for ancient DNA analysis. Nature Computational Science, 2(3), 189–198. https://doi.org/10.1038/s43588-022-00235-0
Korneliussen, T. S., et al. (2014). ANGSD: Analysis of Next Generation Sequencing Data. BMC Bioinformatics, 15(1), 356. https://doi.org/10.1186/s12859-014-0356-4
Li, J., et al. (2021). Detecting epistatic interactions in genome-wide association studies using mutual information. Nucleic Acids Research, 49(15), e86. https://doi.org/10.1093/nar/gkab410
Wilson, H. M., & Anderson, L. I. (2004). Morphology and taxonomy of Paleozoic millipedes (Diplopoda: Chilognatha: Archipolypoda) from Scotland. Journal of Paleontology, 78(1), 169–184. https://doi.org/10.1666/0022-3360(2004)0782.0.CO;2
Emery, N. J., & Clayton, N. S. (2004). The mentality of crows: Convergent evolution of intelligence in corvids and apes. Science, 306(5703), 1903–1907. https://doi.org/10.1126/science.1098410
Godfrey-Smith, P. (2013). Cephalopods and the evolution of the mind. Pacific Conservation Biology, 19(1), 4–9. https://doi.org/10.1071/PC130004
Herculano-Houzel, S. (2017). Numbers of neurons as biological correlates of cognitive capability. Current Opinion in Behavioral Sciences, 16, 1–7. https://doi.org/10.1016/j.cobeha.2017.02.004
Whitehead, H., & Rendell, L. (2015). The evolution of cetacean culture. In The cultural lives of whales and dolphins (pp. 89–132). University of Chicago Press. https://doi.org/10.7208/chicago/9780226895314.001.0001
Brown, C., & Laland, K. N. (2011). Social learning in fishes. In C. Brown, K. N. Laland, & J. Krause (Eds.), Fish cognition and behavior (2nd ed., pp. 186–202). Wiley-Blackwell. https://doi.org/10.1002/9781444342536.ch9
Schnell, A. K., & Clayton, N. S. (2021). Cephalopods: Ambassadors for rethinking cognition. Biochemical and Biophysical Research Communications, 564, 27–36. https://doi.org/10.1016/j.bbrc.2020.12.062
Luo, Y., et al. (2021). 3D genome of macaque fetal brain reveals evolutionary innovations during primate corticogenesis. Cell, 184(4), 723–740. https://doi.org/10.1016/j.cell.2021.01.001
Hain, D., Kutschera, V. E., & Hiller, M. (2023). Modular evolution of cognitive circuits in the vertebrate brain. Nature Ecology & Evolution, 7(4), 589–601. https://doi.org/10.1038/s41559-023-02021-z
Li, X., Chen, Y., & Zhang, Q. (2022). CRISPR screen identifies gene networks underlying behavioral modularity in Drosophila. Cell, 185(12), 2150–2165. https://doi.org/10.1016/j.cell.2022.04.029
Wagner, G. P., & Pavličev, M. (2023). The genomic architecture of cognitive-behavioral modularity: Insights from evolutionary developmental biology. Trends in Genetics, 39(5), 351–365. https://doi.org/10.1016/j.tig.2023.01.004
Chittka, L., & Wilson, C. (2021). Behavioral modularity and the evolution of intelligence. Philosophical Transactions of the Royal Society B, 376(1828), 20200050. https://doi.org/10.1098/rstb.2020.0050
Additional Declarations
No competing interests reported.
Supplementary Files
Table1s471genomesemployedinthisstudy1.xlsx
Table2sTested223SNVsof18cognitiongenes.doc
Table3sSNVsofkeysamplesforsimilaritycalculation.xlsx
Table4sTop20CGSNVsin471samples.xlsx
Table5sTopSNVsFeatures.xlsx
Table6smetadatafor223SNVs.xlsx
figure1s.tif
Figure8s1UMAP2D1vs2.tiff
Figure8s2UMAP2D1vs3.tiff
Figure9s1tSNEtSNE1vstSNE2.tiff
Figure9s2tSNEtSNE1vstSNE3.tiff
Figure9s3tSNEtSNE2vstSNE3.tiff
Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-6373286","acceptedTermsAndConditions":true,"allowDirectSubmit":true,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":445104386,"identity":"3ffdb4b3-4894-4a9f-a70f-2d6eeb47dd11","order_by":0,"name":"Zhizhou Zhang","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAABAElEQVRIiWNgGAWjYDACCQiVwMbAwPiYwYABhGCChLUwGzMYGJCgBYjZpEF2ENQiP7v52cMvf+ry+Nh7zKoLCv4kbmdgPnibh8EuD5cWxjnHzI1leNiK2XiOpd2eYWCQuLOBLdmahyG5GJcWZokEM2kJCZ7ENonkY7d5DAxyNxzgMZPmYTiQ2IBDC5tE+jdpCQMJoJbEtmKIFv5veLXwSOSYSX5IMADbwgy1hQ2vFgmJnDJphgMJIL8kS/MYGNdvOMxmbDnHIBmnFvkZ6dskfwBDTL69x/Azzx85Y4PjzQ9vvKmww6kFHAQ8qFwQYYBHPRAw/sAvPwpGwSgYBSMdAABGHEw6ANFtsgAAAABJRU5ErkJggg==","orcid":"","institution":"Harbin Institute of Technology","correspondingAuthor":true,"prefix":"","firstName":"Zhizhou","middleName":"","lastName":"Zhang","suffix":""},{"id":445104387,"identity":"b59c03eb-7a73-46dc-8514-af44dc37b213","order_by":1,"name":"Shuaiyu Zhang","email":"","orcid":"","institution":"Harbin Institute of Technology","correspondingAuthor":false,"prefix":"","firstName":"Shuaiyu","middleName":"","lastName":"Zhang","suffix":""},{"id":445104388,"identity":"cf1f3b40-a83c-44e9-8185-4b01e56cc265","order_by":2,"name":"Yongdong Xu","email":"","orcid":"","institution":"Harbin Institute of Technology","correspondingAuthor":false,"prefix":"","firstName":"Yongdong","middleName":"","lastName":"Xu","suffix":""}],"badges":[],"createdAt":"2025-04-04 
05:08:10","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-6373286/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-6373286/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":81402828,"identity":"46af77fb-2f81-4b23-888f-db36746e9a76","added_by":"auto","created_at":"2025-04-25 16:59:07","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":224588,"visible":true,"origin":"","legend":"\u003cp\u003ePrincipal component analysis (PCA) clustering plot of cognition gene polymorphisms across all 471 samples. Another angle of this result can be seen in Figure 1 s.\u003c/p\u003e","description":"","filename":"1.png","url":"https://assets-eu.researchsquare.com/files/rs-6373286/v1/9616f2c224a2fba876592b5a.png"},{"id":81401981,"identity":"49c166c2-00c6-4085-beaf-00c81c330266","added_by":"auto","created_at":"2025-04-25 16:51:07","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":593785,"visible":true,"origin":"","legend":"\u003cp\u003ePrecise positioning of coelacanths and lungfish in PCA clustering via progressive sample reduction. \u003cstrong\u003e(A)\u003c/strong\u003eRetains fish, reptiles, rodents, \u003cstrong\u003eLaurasiatherians, \u003c/strong\u003emiscellaneous taxa, and a subset of birds and representative marine animals. \u003cstrong\u003e(B)\u003c/strong\u003e Retains fish, reptiles, rodents, miscellaneous taxa, and a subset of \u003cstrong\u003eLaurasiatherians\u003c/strong\u003e and representative marine animals. 
\u003cstrong\u003e(C)\u003c/strong\u003e Retains fish, reptiles, miscellaneous taxa, and a subset of \u003cstrong\u003eLaurasiatherians.\u003c/strong\u003e\u003c/p\u003e","description":"","filename":"2.png","url":"https://assets-eu.researchsquare.com/files/rs-6373286/v1/3402876e2b4695700808f49d.png"},{"id":81401994,"identity":"c7a65dce-2488-45d1-96f0-b07c54330da9","added_by":"auto","created_at":"2025-04-25 16:51:07","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":940565,"visible":true,"origin":"","legend":"\u003cp\u003eGeometricdistance among sample SNV profiles (relative to a modern human sample pp6) in the context of cognition gene polymorphisms. (A) \u003cstrong\u003eLevenshtein distance; (B) Jaccard distance; (C)\u003c/strong\u003e Euclidean distance\u003cstrong\u003e; (D)\u003c/strong\u003eHellinger distance\u003cstrong\u003e.\u003c/strong\u003e\u003c/p\u003e","description":"","filename":"3.png","url":"https://assets-eu.researchsquare.com/files/rs-6373286/v1/9966d8ca7354f2a17112e956.png"},{"id":81403488,"identity":"5de89f70-1951-4a57-956a-f0646a1879c0","added_by":"auto","created_at":"2025-04-25 17:07:07","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":1194398,"visible":true,"origin":"","legend":"\u003cp\u003eDendrogram generated on the basis of Hellinger distance data. \u003cstrong\u003eNote:\u003c/strong\u003e The labels \"marine\" and \"fish\" here are not contradictory. 
As shown in \u003cstrong\u003eTable 1 s,\u003c/strong\u003e samples labeled \"marine\" represent a group of advanced marine mammals commonly found in oceanic environments.\u003c/p\u003e","description":"","filename":"4.png","url":"https://assets-eu.researchsquare.com/files/rs-6373286/v1/107b0e8bde01b5426e1d44e3.png"},{"id":81402825,"identity":"58ed2f79-3bb5-4d6e-9d4b-508aac038656","added_by":"auto","created_at":"2025-04-25 16:59:07","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":359673,"visible":true,"origin":"","legend":"\u003cp\u003ePCoA results derived from Hellinger distance. Representative animal group samples are marked with boxes, with seven lungfish and coelacanth samples positioned at a relatively central location. The plot also illustrates that cognitive gene polymorphism patterns across different animal groups influence one another in highly intricate ways. Coelacanths or lungfish may have transmitted early cognitive genes to Laurasiatherians and humans through unknown evolutionary mechanisms involving rodent species.\u003c/p\u003e","description":"","filename":"5.png","url":"https://assets-eu.researchsquare.com/files/rs-6373286/v1/ccd5fc0ef5d8b505acc7b208.png"},{"id":81402020,"identity":"83b25168-1e32-49de-a9c4-8e2c829b3ef7","added_by":"auto","created_at":"2025-04-25 16:51:08","extension":"png","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":1852812,"visible":true,"origin":"","legend":"\u003cp\u003eInteraction network of SNVs based on Mutual-Information analysis. Lines (edges) represent MI values above a significance threshold, with thicker lines indicating stronger interdependencies. Nodes with higher connectivity (more edges) suggest that SNVs are potentially involved in functional modules or are under shared evolutionary constraints. Hub nodes: SNVs with numerous connections may reside in functional hotspots or interact with multiple regulatory elements. 
High-MI clusters: Tightly interconnected groups (modules) may represent coregulated loci or genomic regions under coordinated evolutionary pressures.\u003c/p\u003e","description":"","filename":"6.png","url":"https://assets-eu.researchsquare.com/files/rs-6373286/v1/1c4790be35385ebbb10b134f.png"},{"id":81403491,"identity":"8d5a97ae-c7c8-4d31-bd06-d5ca9307ef6e","added_by":"auto","created_at":"2025-04-25 17:07:08","extension":"png","order_by":7,"title":"Figure 7","display":"","copyAsset":false,"role":"figure","size":766442,"visible":true,"origin":"","legend":"\u003cp\u003eChromosomal locations of the SNVs used in this study. The 18 horizontal lines represent 18 genes, with numbers at both ends of each line indicating the chromosomal coordinates of the gene. The red markers along the lines denote the approximate positions of individual SNVs.\u003c/p\u003e","description":"","filename":"7.png","url":"https://assets-eu.researchsquare.com/files/rs-6373286/v1/2fcbe7bebd3fcb11493ab89c.png"},{"id":81402827,"identity":"648cb36e-6558-493a-8cdf-b149936c2726","added_by":"auto","created_at":"2025-04-25 16:59:07","extension":"png","order_by":8,"title":"Figure 8","display":"","copyAsset":false,"role":"figure","size":316036,"visible":true,"origin":"","legend":"\u003cp\u003eThe evolutionary placement of lungfish and coelacanth revealed by UMAP (Uniform Manifold Approximation and Projection) analysis generally aligns with the results shown in Figures 2 and 5. 
Additional similar findings can be observed in Supplementary Figures 8s1 and 8s2, as well as in the t-SNE-derived Figures 9s1-3.\u003c/p\u003e","description":"","filename":"8.png","url":"https://assets-eu.researchsquare.com/files/rs-6373286/v1/482a50ad72111e690cb4e593.png"},{"id":87712299,"identity":"19316906-396a-4981-8033-acde59a2a3bb","added_by":"auto","created_at":"2025-07-28 08:47:18","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":7628683,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-6373286/v1/a1e6c804-16cc-4a97-91e4-1892fd009ff2.pdf"},{"id":81402826,"identity":"05e5707f-70f4-4601-b877-454d918c2446","added_by":"auto","created_at":"2025-04-25 16:59:07","extension":"xlsx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":42304,"visible":true,"origin":"","legend":"","description":"","filename":"Table1s471genomesemployedinthisstudy1.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-6373286/v1/6c4db9d8b8fd47910e178dc0.xlsx"},{"id":81401979,"identity":"499ebdb3-3965-4d82-a775-08318b5a73b3","added_by":"auto","created_at":"2025-04-25 16:51:07","extension":"doc","order_by":2,"title":"","display":"","copyAsset":false,"role":"supplement","size":128512,"visible":true,"origin":"","legend":"","description":"","filename":"Table2sTested223SNVsof18cognitiongenes.doc","url":"https://assets-eu.researchsquare.com/files/rs-6373286/v1/6b7b3898678ef60e3eb6023f.doc"},{"id":81401992,"identity":"6547af7f-a051-4a4f-8d68-b120f2a12d46","added_by":"auto","created_at":"2025-04-25 
16:51:07","extension":"xlsx","order_by":3,"title":"","display":"","copyAsset":false,"role":"supplement","size":190899,"visible":true,"origin":"","legend":"","description":"","filename":"Table3sSNVsofkeysamplesforsimilaritycalculation.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-6373286/v1/fd24c8008d78d0dd34d237b0.xlsx"},{"id":81402007,"identity":"11b4d842-e47b-4148-abe5-40e7de3c3017","added_by":"auto","created_at":"2025-04-25 16:51:08","extension":"xlsx","order_by":4,"title":"","display":"","copyAsset":false,"role":"supplement","size":50450,"visible":true,"origin":"","legend":"","description":"","filename":"Table4sTop20CGSNVsin471samples.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-6373286/v1/cdb074dfc0819a26285fbc97.xlsx"},{"id":81403490,"identity":"a5b1bde0-e5a9-42e9-a970-3fdefea14403","added_by":"auto","created_at":"2025-04-25 17:07:08","extension":"xlsx","order_by":5,"title":"","display":"","copyAsset":false,"role":"supplement","size":12251,"visible":true,"origin":"","legend":"","description":"","filename":"Table5sTopSNVsFeatures.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-6373286/v1/b2176a8aa216fd7af92d1e07.xlsx"},{"id":81401990,"identity":"32a5e280-f42c-4088-b6ec-cee5cf0e1fd4","added_by":"auto","created_at":"2025-04-25 16:51:07","extension":"xlsx","order_by":6,"title":"","display":"","copyAsset":false,"role":"supplement","size":28430,"visible":true,"origin":"","legend":"","description":"","filename":"Table6smetadatafor223SNVs.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-6373286/v1/d0d54beee01c8c8f9da706a4.xlsx"},{"id":81401983,"identity":"47b35ffc-089d-471e-aa02-5f807be31d01","added_by":"auto","created_at":"2025-04-25 
16:51:07","extension":"tif","order_by":7,"title":"","display":"","copyAsset":false,"role":"supplement","size":211290,"visible":true,"origin":"","legend":"","description":"","filename":"figure1s.tif","url":"https://assets-eu.researchsquare.com/files/rs-6373286/v1/017ad3d131d00585d17534db.tif"},{"id":81402041,"identity":"7161a610-1944-4998-beda-57f4c7bc4d56","added_by":"auto","created_at":"2025-04-25 16:51:13","extension":"tiff","order_by":8,"title":"","display":"","copyAsset":false,"role":"supplement","size":86400204,"visible":true,"origin":"","legend":"","description":"","filename":"Figure8s1UMAP2D1vs2.tiff","url":"https://assets-eu.researchsquare.com/files/rs-6373286/v1/bc5d1a65f68a01d63307e3d0.tiff"},{"id":81402040,"identity":"ef1882ec-ed54-4d30-8798-5afdfb7043bc","added_by":"auto","created_at":"2025-04-25 16:51:12","extension":"tiff","order_by":9,"title":"","display":"","copyAsset":false,"role":"supplement","size":86400204,"visible":true,"origin":"","legend":"","description":"","filename":"Figure8s2UMAP2D1vs3.tiff","url":"https://assets-eu.researchsquare.com/files/rs-6373286/v1/9fd550905860a0bef0597d9d.tiff"},{"id":81402009,"identity":"a6e16bfe-ae07-4110-bb76-07d6302f3ef9","added_by":"auto","created_at":"2025-04-25 16:51:08","extension":"tiff","order_by":10,"title":"","display":"","copyAsset":false,"role":"supplement","size":3456304,"visible":true,"origin":"","legend":"","description":"","filename":"Figure9s1tSNEtSNE1vstSNE2.tiff","url":"https://assets-eu.researchsquare.com/files/rs-6373286/v1/fc2255d5a7f2a8a9536fcb36.tiff"},{"id":81402034,"identity":"9a84341d-5e3d-4dbd-8b91-8edff11d23f3","added_by":"auto","created_at":"2025-04-25 
16:51:09","extension":"tiff","order_by":11,"title":"","display":"","copyAsset":false,"role":"supplement","size":3184818,"visible":true,"origin":"","legend":"","description":"","filename":"Figure9s2tSNEtSNE1vstSNE3.tiff","url":"https://assets-eu.researchsquare.com/files/rs-6373286/v1/557f0d5eca831c7e851fe3fd.tiff"},{"id":81402841,"identity":"7dc383f3-c92f-4380-afe7-83e958bbf59c","added_by":"auto","created_at":"2025-04-25 16:59:08","extension":"tiff","order_by":12,"title":"","display":"","copyAsset":false,"role":"supplement","size":3171696,"visible":true,"origin":"","legend":"","description":"","filename":"Figure9s3tSNEtSNE2vstSNE3.tiff","url":"https://assets-eu.researchsquare.com/files/rs-6373286/v1/e1e968d6a1ef20387ccfc513.tiff"}],"financialInterests":"No competing interests reported.","formattedTitle":"The Evolution of Cognitive Abilities in Marine Animals: Insights from Cognition Gene Polymorphism in Coelacanths and Lungfish","fulltext":[{"header":"Key Terms Clarification","content":"\u003cul type=\"disc\"\u003e\n \u003cli\u003e\u003cstrong\u003eCognition genes\u003c/strong\u003e: Defined as genes linked to neural development, synaptic plasticity, and higher-order cognitive functions (e.g.,\u003cem\u003e\u0026nbsp;NRXN1\u003c/em\u003e, \u003cem\u003eDCC\u003c/em\u003e, Grid2, and \u003cem\u003eEP300\u003c/em\u003e; see Table 1).\u003c/li\u003e\n \u003cli\u003e\u003cstrong\u003ePolymorphism patterns\u003c/strong\u003e: Evaluated through allele frequency distributions, haplotype diversity, and conserved/nonconserved substitutions.\u003c/li\u003e\n \u003cli\u003e\u003cstrong\u003eEvolutionary positioning\u003c/strong\u003e: Contextualized via phylogenetic analyses and divergence time estimates relative to archaic and modern humans.\u003c/li\u003e\n\u003c/ul\u003e"},{"header":"Introduction","content":"\u003cp\u003eThe earliest fossil evidence of lungfish dates back approximately 410\u0026nbsp;million years ago (Early Devonian period), with representative species such 
as Dipnorhynchus. In contrast, the earliest coelacanth fossils appeared approximately 390\u0026nbsp;million years ago (Middle Devonian period), exemplified by Eoactinistia. Both belong to the class Sarcopterygii (lobe-finned fish), but lungfish are classified under the subclass Dipnoi, whereas coelacanths fall under Coelacanthimorpha. Lungfish exhibit an evolutionary path closer to the ancestors of tetrapods, whereas coelacanths represent an earlier-diverging independent lineage [\u003cspan additionalcitationids=\"CR2\" citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eCoelacanths were once thought to have gone extinct approximately 66\u0026nbsp;million years ago before the discovery of living species in 1938 (e.g., \u003cem\u003eLatimeria chalumnae\u003c/em\u003e), earning them the title of \"living fossils.\" To date, at least two extant coelacanth species have been recognized: the African/Comoran coelacanth (\u003cem\u003eLatimeria chalumnae\u003c/em\u003e) from Tanzania/Comoros and the Indonesian coelacanth (\u003cem\u003eLatimeria menadoensis\u003c/em\u003e). Moreover, six extant species of lungfish persist across South America (\u003cem\u003eLepidosiren paradoxa\u003c/em\u003e), Africa, and Australia (\u003cem\u003eNeoceratodus forsteri\u003c/em\u003e). The four African species include marbled lungfish (\u003cem\u003eProtopterus aethiopicus\u003c/em\u003e), East African lungfish (\u003cem\u003eProtopterus amphibius\u003c/em\u003e), West African lungfish (\u003cem\u003eProtopterus annectens\u003c/em\u003e), and slender lungfish (\u003cem\u003eProtopterus dolloi\u003c/em\u003e). 
Fossil and phylogenetic evidence consistently indicates that lungfish originated approximately 20\u0026nbsp;million years earlier than coelacanths did, confirming their status as the more ancient lineage within this evolutionary narrative.\u003c/p\u003e \u003cp\u003eBoth lungfish and coelacanths are evolutionarily close to the ancestors of tetrapods and likely played crucial roles in the transition of marine animals to terrestrial environments, ultimately giving rise to reptiles. Despite undergoing approximately 400\u0026nbsp;million years of evolution, the extant species of both groups exhibit remarkable morphological and anatomical stability\u0026mdash;their body sizes and structural features remain largely consistent with those of their fossilized counterparts. This evolutionary stasis suggests that certain lineages of lungfish and coelacanths adapted to relatively stable ecological niches, even as their relatives faced environmental pressures that drove divergent evolutionary pathways toward amphibians, reptiles, birds, mammals, and ultimately humans. Furthermore, this stability implies that the genomic sequences of lungfish and coelacanths have largely retained features characteristic of their ancestors from 400\u0026nbsp;million years ago, offering a unique window into the genetic blueprint of early vertebrates [\u003cspan additionalcitationids=\"CR5 CR6 CR7\" citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eThe most fundamental distinction between humans and other animals lies in language and abstract cognitive abilities. Considering the evolutionary trajectory of language\u0026mdash;which originated as a motor skill governed by the brain for muscular coordination\u0026mdash;it is theoretically plausible that all animals possess some degree of language functionality.
Similarly, while all animals exhibit varying levels of cognitive capacity, abstract cognitive abilities are generally absent in nonhuman species. Even humans acquired rudimentary abstract cognition only approximately 70,000 years ago, with more sophisticated forms likely emerging as recently as 17,000 years ago [\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e]. This suggests a compelling hypothesis: cognitive evolution has progressed at an exceptionally slow pace since the era of lungfish and coelacanths. Over the vast evolutionary journey spanning amphibians, reptiles, birds, and mammals, biological innovation has focused primarily on optimizing survival skills (e.g., locomotion, predation, reproduction) and anatomical adaptations to environmental pressures. Like other animals, many species within ancient human evolutionary lineages likely retained only primitive cognitive abilities. This pattern should be reflected in the gene polymorphism patterns observed in genomic sequences, where conserved genetic architectures may mirror the gradual and limited development of higher-order cognition across deep evolutionary timescales.\u003c/p\u003e \u003cp\u003eTerrestrial animals originated in the ocean, and different marine environments are correlated with varying cognitive capabilities among marine organisms. It is well established that cephalopods (e.g., octopuses) and marine mammals (e.g., cetaceans) exhibit advanced cognitive abilities, such as tool use and social learning [\u003cspan additionalcitationids=\"CR12\" citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e], whereas archaic humans developed complex cognitive traits, including sophisticated social structures, tool innovation, language, and abstract thinking.
The existence of highly intelligent marine species raises a compelling question: At which evolutionary stage did key transitional marine animals\u0026mdash;such as coelacanths and lungfish\u0026mdash;position themselves in terms of cognition gene [\u003cspan additionalcitationids=\"CR15 CR16 CR17 CR18 CR19\" citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e] evolution? Specifically, which marine organisms exhibit cognition gene polymorphism patterns most closely aligned with those of archaic humans? This inquiry holds significance for understanding the genomic foundations of cognitive evolution across vertebrate lineages.\u003c/p\u003e \u003cp\u003eThis study compiled whole-genome sequences from 471 diverse samples, including archaic humans (Neanderthals, Denisovans), modern humans, and other vertebrates (fish, amphibians, reptiles, birds, rodents, mammals). Additionally, we collected four coelacanth whole-genome sequences (representing two known species, \u003cem\u003eLatimeria chalumnae\u003c/em\u003e and \u003cem\u003eLatimeria menadoensis\u003c/em\u003e) and three lungfish whole-genome sequences (one South American species, \u003cem\u003eLepidosiren paradoxa\u003c/em\u003e, and two African species, \u003cem\u003eProtopterus annectens\u003c/em\u003e and \u003cem\u003eProtopterus aethiopicus\u003c/em\u003e). Using these coelacanth and lungfish genomes, we conducted polymorphism screening, analysis, and comparative studies of cognition-associated genes (Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e) against genomes from other taxa.
This work provides an initial characterization of the evolutionary positioning of cognition gene polymorphism patterns in coelacanths and lungfish within the broader context of animal evolution.\u003c/p\u003e"},{"header":"Methods","content":"\u003cp\u003e \u003cb\u003eGenome sequences\u003c/b\u003e Genome sequences were downloaded from the ENA database, the SRA database and the Ensembl genome browser. A total of 471 whole genomes (including 111 ancient genomes, Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003es) from 5 continents (Africa, Asia, Europe, North America, and South America) were collected. The six representative animal groups include Laurasiatherians (L), amphibians/reptiles (R), fish (F), birds (b), primates (p), and rodents (d), plus miscellaneous taxa (x). The ENA genome sequences are in fastq format, whereas the Ensembl/SRA genome sequences are all assembled full genomes in fa, fn or fna formats, and all can be read and scanned with the Python-based hash07plus03 software.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eSelected human cognition genes in this study\u003csup\u003e[14\u0026ndash;20]\u003c/sup\u003e\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"3\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eGene\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\"
colname=\"c3\"\u003e \u003cp\u003eFunction, or compromised ability (example) when mutated\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eARHGAP11B\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eHominin-specific development and evolutionary expansion of the brain neocortex\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eASPM\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eAssociated with microcephaly\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eMCPH1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eAssociated with primary autosomal recessive microcephaly\u0026nbsp;and\u0026nbsp;lymphatic malformation\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eCHRM2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eA nervous system gene associated with depressive disorder\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eIGF2R\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eInsulin-like growth factor gene associated with behavior/neurological
phenotype\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eTHSD7B\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eAssociated with eye diseases/neuronal diseases\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eSNAP25\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eA gene associated with neurotransmitter release\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eFADS2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eA member of the fatty acid desaturase family, associated with craniofacial abnormalities\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eDAB1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eA gene linked with nervous system development\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e10\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNBPF8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eA gene associated with macrocephaly, autism, schizophrenia, cognitive disability\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e11\u003c/p\u003e \u003c/td\u003e \u003ctd
align=\"left\" colname=\"c2\"\u003e \u003cp\u003eHAR1A\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eA gene whose expression levels are associated with memory and cognitive abilities\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e12\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eGNB5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eAssociated with language delay and cognitive impairment\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e13\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNRXN1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eNeurexin 1 required for efficient neurotransmission and formation of synaptic contacts\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e14\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eDCC\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eAssociated with impaired intellectual development\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e15\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eGRID2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003ePredominant excitatory neurotransmitter receptors in the mammalian brain\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e16\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eEP300\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e
\u003cp\u003eAssociated with rare neurological diseases and impairment of intellectual development\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e17\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eKMT2D\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eLysine Methyltransferase 2D, associated with intellectual disability and eye diseases\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e18\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNOTCH2NL\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eNeural progenitor proliferation and evolutionary expansion of the brain neocortex\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003eCognition genes and their SNPs\u003c/h2\u003e \u003cp\u003eFor all human cognition genes, single nucleotide polymorphism (SNP) or single nucleotide variant (SNV) sites in the dbSNP database were selected so that each whole gene region was spanned relatively evenly by the selected sites, together with sites that have known clinical effects (as listed in the GeneCards database). Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e lists 18 human cognition genes, and a total of 223 SNVs were selected for this study (Table\u0026nbsp;2s).\u003c/p\u003e \u003cp\u003e \u003cb\u003eGenome sequence analysis software development\u003c/b\u003e The SNP/SNV locus-finding software, which is based on hash tables, processes whole-genome files and rapidly identifies target loci within the genome via a search algorithm to obtain the base present at each variant site.
The software is written in Python [\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e, \u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e]. Initially, it processes three different formats of whole-genome files\u0026mdash;fastq, fna, and fa\u0026mdash;on the basis of their unique characteristics, extracting gene sequences and generating standard format files in which every line contains only the five characters A, T, C, G, and N. During use, the software can process multiple genome files in batches and impose restrictions on the matching length and the number of matches. In extensive validation, hash07plus03 ran significantly faster than conventional matching algorithms and other software programs based on the Knuth\u0026ndash;Morris\u0026ndash;Pratt (KMP) algorithm. One of the search algorithms in the custom-developed software hash07plus03 involves constructing a 31-base string (15 flanking bases on each side of an SNV locus combined with the central base \u003cem\u003eN\u003c/em\u003e, i.e., 15\u0026thinsp;+\u0026thinsp;\u003cem\u003eN\u003c/em\u003e\u0026thinsp;+\u0026thinsp;15) to perform exact matching searches across whole-genome sequences. If a precise match is found, the software extracts the central base \u003cem\u003eN\u003c/em\u003e as the SNV data; if no match exists, it outputs \"0\". For example, if exact bilateral matches yield three central bases \u003cem\u003eN\u003c/em\u003e (e.g., T, G, and C), the output would be \"TGC\". Any SNV site therefore has one of the following 16 genotypes: 0, A, T, G, C, AT, AG, AC, TG, TC, GC, ATG, ATC, AGC, TGC or ATGC.\u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003eSample SNP information extraction\u003c/h3\u003e\n\u003cp\u003eSome SNP information was extracted from genome files with 010Edit software, but most was extracted with the hash07plus03 software. Across all 471 genomes, the sizes ranged from 200M to 120G.
Genomes in fastq format that were smaller than 10G were generally excluded or used only as a reference.\u003c/p\u003e\n\u003ch3\u003eCalculation of the Levenshtein distance\u003c/h3\u003e\n\u003cp\u003eThis method [\u003cspan additionalcitationids=\"CR22 CR23\" citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e] directly compares SNP sequence alignment differences (e.g., frameshift mutations caused by INDELs), capturing contributions of structural sequence variations to genetic divergence. It is effective for analyzing mixed SNV datasets (SNP\u0026thinsp;+\u0026thinsp;InDel) and is typically applied to assess sequence divergence complexity in cross-species homologous regions (e.g., comparing SNV patterns in regulatory regions across mammals).\u003c/p\u003e\n\u003ch3\u003eCalculation of the Euclidean distance\u003c/h3\u003e\n\u003cp\u003eThis method [\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e] is suitable for processing high-dimensional SNV feature matrices (e.g., PCA results), as it intuitively reflects geometric differences in SNV frequencies across multidimensional space. It is particularly effective for allele frequency matrices (e.g., treating the A/T/G/C frequencies at each SNV locus as four-dimensional coordinates).
The method offers high computational efficiency and is compatible with most clustering and ordination methods (e.g., K-means, PCA).\u003c/p\u003e\n\u003ch3\u003eCalculation of the Hellinger distance\u003c/h3\u003e\n\u003cp\u003eSpecifically designed for compositional data (satisfying \u0026sum;p_i\u0026thinsp;=\u0026thinsp;1, a property of SNV frequencies), this method [\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e] reduces the dominance of high-frequency alleles through square-root transformation, increasing sensitivity to rare variants. It is robust to zero values (directly compatible with zero-value handling in SNVs) and is used to compare multiallelic SNV frequency distributions across species (e.g., evolutionary differences between homologous loci in fish vs. primates).\u003c/p\u003e \u003cdiv id=\"Sec8\" class=\"Section2\"\u003e \u003ch2\u003eCalculation of the Jaccard distance\u003c/h2\u003e \u003cp\u003eFocused on presence‒absence patterns while ignoring frequency differences, this metric [\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e] is ideal for comparing SNVs between highly divergent species (e.g., scenarios where humans and fish share minimal SNV loci). It is insensitive to sequencing depth variations and suitable for analyzing cross-species shared SNV locus proportions (e.g., conserved loci across different taxonomic classes).\u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003eComputational efficiency\u003c/h3\u003e\n\u003cp\u003eTo reduce the computational load, the above distance calculations were performed on a subset of 248 samples from Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003es (referenced in Table\u0026nbsp;3s).
Testing confirmed that the results from this subset were not significantly different from those from the full dataset (data not shown). All similarity (distance) metrics were computed with existing packages in the R programming language, and all R code is available from the authors on request.\u003c/p\u003e\n\u003ch3\u003ePCA/PCoA/t-SNE/UMAP analysis\u003c/h3\u003e\n\u003cp\u003eIn this study, the basic clustering analyses of samples were primarily performed using PCA and PCoA methods. While PCA preserves global structures, it may inadequately reveal certain local patterns in complex samples, being suitable for linear relationships but potentially losing complex nonlinear patterns. PCoA, based on distance matrices, maintains global distance relationships between samples and is sensitive to distance metrics, yet struggles to reflect high-dimensional local structures. In contrast, t-SNE (t-distributed Stochastic Neighbor Embedding), commonly applied in transcriptomics studies, effectively handles nonlinear associations between samples by emphasizing the preservation of local similarities. It models neighborhood relationships through probability distributions and excels at capturing high-dimensional complex manifold structures (e.g., cell differentiation trajectories, subpopulation delineation), yielding clearer visual clustering. UMAP, similar to t-SNE but grounded in topological theory, balances local and global structures with faster computational speed and improved preservation of global relationships, and is gradually emerging as an alternative to t-SNE.
PCA, PCoA, t-SNE and UMAP were all performed via R packages.\u003c/p\u003e \u003cdiv id=\"Sec11\" class=\"Section2\"\u003e \u003ch2\u003eMutual Information (MI) analysis\u003c/h2\u003e \u003cp\u003eMutual Information [\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e], a concept from information theory, measures the degree of interdependence between two variables. In SNV data, each SNV locus can be treated as a variable, with its allele frequency or other characteristics serving as variable values. MI captures both linear and nonlinear associations, making it particularly useful for analyzing complex relationships in genetics. Unlike methods requiring distributional assumptions, MI is well suited for the intricate patterns often observed in genetic data. The resulting visualizations may reveal information such as interaction networks between SNVs and functional modules. The MI-generated charts can identify SNVs that covary during evolution, potentially indicating coevolution or functionally linked loci. For example, SNVs in conserved genomic regions may present lower MI values, whereas regions under positive selection may present higher MI values, reflecting stronger associations. With respect to deeper relationships among SNVs, MI can detect nonlinear dependencies that traditional approaches (e.g., linkage analysis) might overlook. These relationships could reveal functional interactions or shared evolutionary pressures. For example, variations in MI values across species may reflect shifts in selective pressures during evolution. An increase in MI values in certain genomic regions from fish to humans might suggest increased functional complexity. In general, this analytical approach can uncover global association networks between SNV loci and identify functional modules (e.g., highly interconnected SNV clusters).
In cross-species comparisons, conserved high-MI regions may correspond to critical evolutionary nodes. Through MI analysis, functional cooperation and evolutionary constraints among SNVs can be elucidated from a nonlinear perspective, providing new molecular mechanistic insights into complex phenotypic evolution from fish to humans. In the output table of Mutual Information (MI) analysis, Degree indicates the number of connections for an SNV within the network, reflecting its global connectivity. SNVs with a degree\u0026thinsp;\u0026ge;\u0026thinsp;10 are considered core sites. MI_Mean represents the average mutual information value between the SNV and all others, characterizing the average association strength. MI values are categorized as: \u0026gt;0.3 (strong association), 0.1\u0026ndash;0.3 (moderate association), and \u0026lt;\u0026thinsp;0.1 (weak association). Module denotes module membership, revealing potential functional groupings. Betweenness reflects betweenness centrality, identifying key pathway nodes (values\u0026thinsp;\u0026gt;\u0026thinsp;0.1 are considered significant). Closeness indicates closeness centrality, reflecting information transfer efficiency (values\u0026thinsp;\u0026gt;\u0026thinsp;0.3 denote highly efficient nodes). Global_Modularity measures the overall modularity index of the network (values\u0026thinsp;\u0026gt;\u0026thinsp;0.4 indicate a significant modular structure). The mutual information analysis was performed by existing packages on the R programming platform.\u003c/p\u003e \u003c/div\u003e"},{"header":"Results","content":"\u003cp\u003eAs shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e, five distinct cognition gene polymorphism pattern (CGPP) clusters were identified. The leftmost region comprises a densely packed cluster representing the majority of animal samples, including most marine-derived species. 
The lower-left corner corresponds to a group of \u0026lsquo;intelligent\u0026rsquo; animals, such as dolphins, camels, and certain primates. The lower-right cluster is exclusively occupied by primates, whereas the upper-right cluster includes modern humans and a subset of archaic human samples (e.g., Neanderthals and Denisovans). Between the leftmost and upper-right clusters lies an intermediate module containing modern humans, some archaic humans, and additional primate samples. Notably, two clusters encompassing modern humans appear to include transitional archaic human samples bridging these groups. Furthermore, at least three clusters contained primate samples, reflecting divergent evolutionary trajectories within this lineage. Intriguingly, only one coelacanth/lungfish sample (lu3) is displayed in the figure. The analysis revealed that certain marine animals, such as the lungfish lu3, present CGPPs that are phylogenetically closer to archaic humans (e.g., nd2, a Neanderthal sample) than to primates, despite significant divergence between nd2 and primate CGPPs. This observation may imply that the genetic architecture underlying human cognition\u0026mdash;particularly its core framework\u0026mdash;was already established during the evolutionary stage of fish, predating the emergence of tetrapods.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eFigure 2 shows that the most primitive cognition gene polymorphism patterns (CGPPs) are indeed present in fish and amphibian/reptilian species. Notably, one lungfish sample (lu1) clustered within this group, whereas x19 (a coelacanth sample) occupied a more distant position. According to the PCA plot (Fig.\u0026nbsp;2A), the CGPPs of the three lungfish and four coelacanth samples were relatively close to one another, particularly compared with those of the other animal groups.
As shown in Fig.\u0026nbsp;2C, the archaic human samples sd1 and mb1 (of African origin) presented CGPP features strikingly similar to those of fish and certain Laurasiatherian mammals, further supporting the hypothesis that the earliest human populations originated in Africa. Intriguingly, in the PCA the coelacanth sample x19 differed markedly from the other coelacanth samples, which clustered more tightly. Its relative proximity to lu1, sd1, and mb1 suggests that coelacanths and lungfish possessed distinct evolutionary potentials and may have played divergent roles in shaping the cognitive capacities of early hominoids.\u003c/p\u003e\u003cp\u003eA critical finding in Fig. 2 is that, while the CGPPs of coelacanths and lungfish macroscopically bridge tetrapods and archaic humans, their positions are phylogenetically closer to those of archaic humans and intermingled with them. This contrasts sharply with the clear positional and feature-based separation observed between archaic humans and the CGPPs of tetrapods or other marine animals.\u003c/p\u003e\n\u003cp\u003eThe cognition gene polymorphism patterns (CGPPs) of the three archaic human samples\u0026mdash;mb1, sd1, and dg2\u0026mdash;are notably closer to those of coelacanths and lungfish than expected, particularly for sd1 and dg2. Overall, lungfish and coelacanths occupy comparable evolutionary positions in terms of CGPP, with substantial overlap between the two groups. These findings align with the PCA patterns observed in the aforementioned figures. Notably, and as expected, the CGPP similarity between coelacanths/lungfish and archaic humans was significantly greater than that between coelacanths/lungfish and modern humans (Fig. 3A).
Figure 3A includes 248 samples, with 39 labeled for clarity: Modern humans (pp6, p6, dc2, in9, sr2, ga4, gu2, yo3, pe3, sp2), Archaic humans (sc1, us2, bz1, ch1, et1, mo1l, mg1, mb1, sd1, dg2, cz1, de2) and six representative animal groups: Laurasiatherians (L4), amphibians/reptiles (R15, x16), fish (F36), birds (b1), primates (p8), and rodents (d6). The figure also features four coelacanth samples (x19, lc1, lc5, lm1) and three lungfish samples (lu1, lu3, lu4).\u003c/p\u003e\n\u003cp\u003eNeither the PCA results nor the Levenshtein similarity calculations in this study conclusively determine whether coelacanths or lungfish exhibit a more ancient cognition gene polymorphism pattern (CGPP) in evolutionary terms, as their CGPPs are intertwined and overlapping. Furthermore, the genetic distances between the CGPPs of both coelacanths/lungfish and archaic humans appeared to be smaller than those between archaic humans and nearly all the tested animal groups. Specifically, this study does not support the anticipated hierarchical progression of \u0026quot;coelacanths/lungfish \u0026rarr; six animal groups \u0026rarr; archaic humans.\u0026quot; Instead, the observed pattern better aligns with a sequence of \u0026quot;six animal groups \u0026rarr; coelacanths/lungfish \u0026rarr; archaic humans.\u0026quot; One plausible explanation is that the six animal groups analyzed here are represented by modern samples rather than fossil-derived samples, which could mask ancestral genetic signals and introduce recency bias (e.g., overemphasizing recent evolutionary traits). Another explanation involves degeneration/specialization. Results from Fig. 3 and later Fig.
\u003cspan class=\"InternalRef\"\u003e5\u003c/span\u003e suggest that the cognitive gene polymorphism patterns in coelacanths and lungfish may reflect bidirectional evolutionary trajectories: one pathway advanced toward higher cognitive capabilities, culminating in humans, while the other involved adaptive degeneration/specialization of cognitive traits\u0026mdash;such as reversion or specialization observed in certain reptiles, birds, and Laurasiatherians.\u003c/p\u003e\n\u003cp\u003eThe results from other similarity metrics align closely with those derived from Levenshtein similarity (Fig.\u0026nbsp;3). Figures\u0026nbsp;3B, 3C, and 3D display the curves for the Jaccard distance, Euclidean distance, and Hellinger distance, respectively. These methods are relatively well suited for analyzing SNV data in this study, and their purpose is to cross-validate whether the SNV polymorphism patterns of lungfish and coelacanths are consistent with Levenshtein similarity\u0026mdash;specifically, clustering near early ancient human samples (compared with other major taxonomic groups). Notably, all four distance metrics yielded consistent conclusions in this regard.\u003c/p\u003e\n\u003cp\u003eIn the Hellinger distance results, the dendrogram (Fig. \u003cspan class=\"InternalRef\"\u003e4\u003c/span\u003e) clearly shows that the lungfish and coelacanth samples are positioned closer to several ancient human samples (nd11, nd3, sd1, and dg2) than to other fish samples. Intriguingly, the PCoA clustering analysis based on Hellinger distance (Fig. \u003cspan class=\"InternalRef\"\u003e5\u003c/span\u003e) revealed that seven lungfish and coelacanth samples (lu1, lu3, lu4, lc1, lm1, x19 and lc5) occupied a relatively central position among major animal groups. 
Specifically, lu1 is closest to the fish sample cluster; lc5 is nearest to the reptile and certain fish samples; lu4 and lm1 are adjacent to a complex mix of taxa, including birds, specific fish, and other animal groups; x19 shows a polymorphism pattern more akin to birds than lu4 and lm1; and lu3 and lc1 align evolutionarily with rodents, laurasiatherians, and humans.\u003c/p\u003e\n\u003cp\u003eThe findings in Fig. \u003cspan class=\"InternalRef\"\u003e5\u003c/span\u003e suggest that lungfish, coelacanths, and their ancient relatives were indeed at a pivotal stage in shaping early genetic patterns. These organisms had the evolutionary flexibility to diverge into reptiles, birds, and rodents/laurasiatherians or remain within the fish lineage. This critical juncture highlights their role in defining ancestral genomic trajectories that later radiated into distinct vertebrate classes.\u003c/p\u003e\n\u003cp\u003eThe authors further categorized all aquatic animal samples in Table \u003cspan class=\"InternalRef\"\u003e1\u003c/span\u003es into three groups on the basis of habitat type: freshwater, marine, and euryhaline (tolerant to both environments). Principal component analysis (PCA) was conducted to examine whether these groups exhibited specific associations with coelacanths or lungfish in terms of gene polymorphism patterns. However, no significant associations were detected (data not shown).\u003c/p\u003e"},{"header":"Discussion","content":"\u003cp\u003eThe inclusion of over 400 samples\u0026mdash;spanning 6\u0026ndash;7 extant animal groups, geographically diverse modern and ancient human populations, and coelacanths and lungfish from distinct regions\u0026mdash;introduces significant complexity in genetic backgrounds and evolutionary patterns of cognitive gene polymorphisms across deep evolutionary timescales. These factors make it inherently challenging to apply a single clustering method or similarity metric to analyze all samples simultaneously. 
Nevertheless, the PCA results and similarity calculations presented here exhibit relatively consistent patterns across most samples, indicating that these findings serve as a robust foundation for further in-depth investigations.\u003c/p\u003e \u003cp\u003eThe transition of life from marine to terrestrial environments during animal evolution was pioneered by early terrestrial arthropods and transitional vertebrates [\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e]. The evolution of cognitive capabilities in marine organisms is a complex, multilayered process shaped by natural selection for survival adaptation [\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e, \u003cspan additionalcitationids=\"CR35 CR36 CR37 CR38\" citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e]. The cognitive foundations of early marine life (beginning\u0026thinsp;~\u0026thinsp;600\u0026nbsp;million years ago) are marked by neural system origins (Cnidarians such as jellyfish/corals developed simple neural networks enabling tactile perception and reflex behaviors) and instinct dominance (Flatworms such as planarians evolved centralized ganglia, yet behaviors remained governed by genetically encoded instincts). Following the Cambrian Explosion (~\u0026thinsp;540\u0026nbsp;million years ago), cognitive differentiation accelerated in Arthropods (e.g., mantis shrimp) and Cephalopods (octopuses, squids). Arthropods evolved compound eyes for complex visual processing (e.g., polarized light detection) and trial-and-error learning to refine hunting strategies. Cephalopods independently developed advanced brains and visual systems (convergent evolution), with species excelling in tool use (e.g., octopuses sheltering in coconut shells), short-term memory, and spatial learning. 
From ~\u0026thinsp;450\u0026nbsp;million years ago to the present, cognitive breakthroughs have emerged in social fish (e.g., cleaner wrasses), deep-sea fish, and sardine schools. Social fish pass mirror self-recognition tests, recognize\u0026thinsp;\u0026gt;\u0026thinsp;100 individual faces, and recall interaction histories. Deep-sea fish decode electrosensory (e.g., electric eels) or bioluminescent signals. Sardine schools achieve collective intelligence through decentralized, signal-driven group decision-making. Over the last 50\u0026nbsp;million years, cognitive complexity has increased in marine mammals, notably Cetaceans (whales, dolphins) and Pinnipeds (seals, sea lions). Cetaceans exhibit echolocation-based 3D environmental mapping (odontocetes), cultural transmission (e.g., orca dialects, bottlenose dolphin tool traditions), and cross-modal perception (e.g., associating sounds with visual symbols). Limited syntactic structures suggest proto-linguistic capabilities in cetacean cultures. Pinnipeds demonstrate rudimentary abstract concept learning (e.g., symbolic sequence rules). As shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e (lower-left quadrant), cetacean samples dp1 and dp3n cluster with \u0026lsquo;smart\u0026rsquo; animals such as camels and some primates.\u003c/p\u003e \u003cp\u003eBecause this study compares polygenic polymorphism patterns across taxonomically divergent species and lacks direct genetic or biological effect data, analyzing geometric distance differences in SNV profiles remains a reasonable approach, as genuine genetic variation is ultimately expressed through these sequence-level differences. 
While the computational principles of these geometric distance metrics differ in sensitivity to specific SNV profile differences and in sample filtering during data processing, their final results share two key commonalities: (1) lungfish and coelacanths consistently cluster near early ancient human samples, and (2) the cognitive gene polymorphism patterns of lungfish and coelacanths appear to represent a transitional state preceding the divergence of major animal lineages.\u003c/p\u003e \u003cp\u003eThrough mutual information analysis, we identified a subset of potentially significant SNVs (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e6\u003c/span\u003e, Tables\u0026nbsp;4s-5s). Notably, the top 20 SNVs with the strongest associations did not form robust interaction networks (STRING analysis results, data not shown). These 20 SNV loci were largely absent across major animal groups but were sporadically present in lungfish and coelacanths (Table\u0026nbsp;4s), suggesting that the complex interactions underlying the cognitive gene polymorphisms observed here represent only a fraction of their evolutionary dynamics. Intriguingly, three of the 20 SNVs (rs532864586, rs75225211, and rs750156118) were found to localize within chromatin loops previously reported by Luo et al., who compared 3D genomes of human, macaque, and mouse brains and identified human-specific chromatin structural changes, including 499 topologically associating domains and 1,266 chromatin loops [\u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e] (Table\u0026nbsp;5s). 
The implications of this overlap warrant further investigation.\u003c/p\u003e \u003cp\u003eAdditionally, the 223 SNVs used in this study were gathered from local sequences across various genes in the genome via a near-equal density approach (see Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e7\u003c/span\u003e). Future studies should employ larger-scale SNV datasets, although biological phenotypic data for SNVs directly linked to cognitive ability remain scarce. Table\u0026nbsp;6s contains metadata for the 223 SNVs, approximately two-thirds of which carry clinical significance annotations (benign or pathogenic) in humans.\u003c/p\u003e \u003cp\u003eThe results obtained using t-SNE align with Figs.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e\u0026ndash;2 and Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e5\u003c/span\u003e (Figs.\u0026nbsp;9s1-3). Meanwhile, the UMAP-derived results (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e8\u003c/span\u003e, Fig.\u0026nbsp;8s1, Fig.\u0026nbsp;8s2) reveal additional clustering details. For instance, lu1 is closest to the fish sample cluster, consistent with Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e5\u003c/span\u003e, whereas the closer proximity of \"lu1\" to the ancient African hominin samples \"sd1\" and \"mb1\" was revealed only by UMAP. Other samples (lc1, lc5, lm1, lu3, lu4, x19) exhibited largely similar clustering patterns across Figs.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e\u0026ndash;2, \u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e5\u003c/span\u003e, and \u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e8\u003c/span\u003e. 
A common finding across all results is that the cognitive gene polymorphism patterns of lungfish and coelacanth samples predominantly lie at the interface between clusters of ancient human samples and other animal groups.\u003c/p\u003e \u003cp\u003eMarine cognitive evolution follows multipath trajectories: cephalopods leverage bodily plasticity, fish rely on collective coordination, and mammals evolve social intelligence. These divergences underscore that cognitive capabilities did not evolve linearly but are niche-specialized adaptations. Throughout this evolutionary journey, gradual genomic changes in marine animals have been reflected in gene polymorphism patterns. By analyzing the CGPPs of coelacanths and lungfish, this study establishes a framework for future investigations into the cognitive evolutionary positioning of other marine taxa [\u003cspan additionalcitationids=\"CR42 CR43\" citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e].\u003c/p\u003e"},{"header":"Conclusion","content":"\u003cp\u003eThis study compiled whole-genome sequences from 471 diverse samples, including archaic humans (Neanderthals, Denisovans), modern humans, and other vertebrates (fish, amphibians, reptiles, birds, rodents, and other mammals). Additionally, four coelacanth whole-genome sequences (representing the two extant species) and three lungfish whole-genome sequences (two South American species and one African species) were used. Using these living-fossil coelacanth and lungfish genomes, we conducted polymorphism screening, analysis, and comparative studies of cognition-associated genes against genomes from other taxa. 
This work provides an initial characterization of the evolutionary positioning of cognition gene polymorphism patterns (CGPPs) in coelacanths and lungfish within the broader context of animal evolution.\u003c/p\u003e \u003cp\u003eWhile the results derived from the Levenshtein distance serve only as a preliminary reference, they align with the patterns observed via principal component analysis (PCA) and several other geometric distance analyses. Key findings include the following: 1) the CGPPs of both coelacanths and lungfish are phylogenetically closer to those of archaic humans than to those of most other animal groups; and 2) their CGPPs occupy an evolutionary inflection (or turning) point, acting as a transitional bridge between diverse animal lineages and archaic humans.\u003c/p\u003e \u003cp\u003eLimitations and Implications: The primary objective of this study was to investigate the evolutionary placement of cognitive gene polymorphism patterns in ancient fish prior to their transition to land, using extant lungfish and coelacanths as proxies. However, conclusions drawn from modern samples, which are distinct from the lungfish and coelacanths of 400\u0026nbsp;million years ago, should be interpreted with caution. The methodologies employed here also exhibit limitations when handling complex, cross-species, and cross-temporal datasets lacking phenotypic data. Nonetheless, the consistent patterns observed across multiple analytical approaches lend some credibility to the shared findings, which offer at least a useful point of reference. 
Meanwhile, if the goal is to assess the position of cognitive gene polymorphism patterns in extant lungfish and coelacanths among major animal groups, a paradoxical observation emerges: these patterns in modern lungfish and coelacanths are closer to ancient hominin samples (e.g., \u003cem\u003esd1\u003c/em\u003e, \u003cem\u003edg2\u003c/em\u003e, \u003cem\u003end11\u003c/em\u003e, \u003cem\u003emb1\u003c/em\u003e, \u003cem\u003emo1l\u003c/em\u003e, \u003cem\u003end3\u003c/em\u003e, \u003cem\u003emg1\u003c/em\u003e) than to evolutionarily \"advanced\" groups such as most reptiles, birds, and rodents. This suggests that during the evolutionary divergence from fish to major terrestrial lineages, cognitive genes did not uniformly progress toward human-like patterns. Instead, they diversified across lineages, undergoing degeneration, specialization, or distinct evolutionary trajectories. However, a small subset of species, likely including ancestral lungfish and coelacanths, embarked on a progressive path from basal forms to advanced hominins. Critically, the foundational framework of cognitive genes may have originated in ancient fish and then persisted, being refined and expanded in select species along the fish-amphibian-reptile-rodent lineage, ultimately culminating in patterns resembling those observed in ancient hominin samples.\u003c/p\u003e \u003cp\u003eAlthough lungfish and coelacanths from 400\u0026nbsp;million years ago would not exhibit results identical to those described above, these findings highlight the unique genomic signatures of coelacanths and lungfish in tracing the origins and pathways of cognitive evolution. 
They further provide a valuable foundation for exploring the intricate relationship between language and cognition within the framework of multi-gene polymorphism patterns.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003eEthics approval and consent to participate: This article does not contain any studies with human participants or animals performed by any of the authors.\u003c/p\u003e\n\n\u003cp\u003eConsent for publication: All authors consent to this publication.\u003c/p\u003e\n\n\u003cp\u003eAvailability of data and materials: All data generated or analysed during this study are included in this published article and its supplementary information files. All R code can be requested from the corresponding author.\u003c/p\u003e\n\n\u003cp\u003eCompeting Interests: The authors declare that they have no conflicts of interest.\u003c/p\u003e\n\n\u003cp\u003eFunding: This study was supported by a State Language Commission Research Grant (YB135-117), Association of Chinese Graduate Education Grant (B-2017Y0505-079), National Research Center for Foreign Language Education Grant (ZGWYJYJJ10A042) and funds from the Marine Antifouling Engineering Technology Center of Shandong Province.\u003c/p\u003e\n\n\u003cp\u003eAuthors\u0026apos; contributions: ZZ: Instructor of this study, manuscript writing, software testing;\u003c/p\u003e\n\u003cp\u003eSZ: Writing software for this study; plus software testing;\u003c/p\u003e\n\u003cp\u003eYX: Instructor for writing software for this study; plus software testing;\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003e\u003cstrong\u003eBraasch, I., Gehrke, A. R., Smith, J. J., Kawasaki, K., Manousaki, T., Pasquier, J., ... \u0026amp; Postlethwait, J. H.\u003c/strong\u003e (2015). The spotted gar genome illuminates vertebrate evolution and facilitates human-teleost comparisons. \u003cem\u003eNature Genetics\u003c/em\u003e, \u003cem\u003e48\u003c/em\u003e(4), 427\u0026ndash;437. 
https://doi.org/10.1038/ng.3526\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eClaeson, K. M., Coates, M. I., \u0026amp; Smith, M. M.\u003c/strong\u003e (2021). Coelacanths and lungfish: The evolution of the sarcopterygian Bauplan. \u003cem\u003eAnnual Review of Earth and Planetary Sciences\u003c/em\u003e, \u003cem\u003e49\u003c/em\u003e, 501\u0026ndash;529. https://doi.org/10.1146/annurev-earth-072420-060741\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSmith, J. J., Timoshevskaya, N., Ye, C., Holt, C., Keinath, M. C., Parker, H. J., ... \u0026amp; Voss, S. R.\u003c/strong\u003e (2018). The lungfish genome expands our understanding of vertebrate genome evolution. \u003cem\u003eNature Ecology \u0026amp; Evolution\u003c/em\u003e, \u003cem\u003e2\u003c/em\u003e(4), 713\u0026ndash;722. https://doi.org/10.1038/s41559-018-0471-0\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eAmemiya, C. T., Alf\u0026ouml;ldi, J., Lee, A. P., Fan, S., Philippe, H., MacCallum, I., ... \u0026amp; Lindblad-Toh, K.\u003c/strong\u003e (2013). The African coelacanth genome provides insights into tetrapod evolution. \u003cem\u003eNature\u003c/em\u003e, \u003cem\u003e496\u003c/em\u003e(7445), 311\u0026ndash;316. https://doi.org/10.1038/nature12027\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSchartl, M., Kneitz, S., Ormanns, J., Schmidt, C., Anderson, J. L., Amores, A., ... \u0026amp; Meyer, A.\u003c/strong\u003e (2024). The genomes of all lungfish inform on genome expansion and tetrapod evolution. \u003cem\u003eNature\u003c/em\u003e, \u003cem\u003e634\u003c/em\u003e(8032), 96\u0026ndash;103. https://doi.org/10.1038/s41586-024-07830-1\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeyer, A., Schloissnig, S., Franchini, P., Du, K., Woltering, J. M., Irisarri, I., ... \u0026amp; Venkatesh, B.\u003c/strong\u003e (2021). Giant lungfish genome elucidates the conquest of land by vertebrates. \u003cem\u003eNature\u003c/em\u003e, \u003cem\u003e590\u003c/em\u003e(7845), 284\u0026ndash;289. 
https://doi.org/10.1038/s41586-021-03198-8\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eNikaido, M., Noguchi, H., Nishihara, H., Toyoda, A., Suzuki, Y., Kajitani, R., ... \u0026amp; Okada, N.\u003c/strong\u003e (2013). Coelacanth genomes reveal signatures for evolutionary transition from water to land. \u003cem\u003eGenome Research\u003c/em\u003e, \u003cem\u003e23\u003c/em\u003e(10), 1740\u0026ndash;1748. https://doi.org/10.1101/gr.158105.113\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eNoonan, J. P., Grimwood, J., Danke, J., Schmutz, J., Dickson, M., Amemiya, C. T., \u0026amp; Myers, R. M.\u003c/strong\u003e (2004). Coelacanth genome sequence reveals the evolutionary history of vertebrate genes. \u003cem\u003eGenome Research\u003c/em\u003e, \u003cem\u003e14\u003c/em\u003e(12), 2397\u0026ndash;2405. https://doi.org/10.1101/gr.2972804\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eGingerich, P. D.\u003c/strong\u003e (2022). Pattern and rate in the Plio-Pleistocene evolution of modern human brain size. \u003cem\u003eScientific Reports\u003c/em\u003e, \u003cem\u003e12\u003c/em\u003e(1), 11216. https://doi.org/10.1038/s41598-022-15427-9\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003ePonce de Le\u0026oacute;n, M. S., Bienvenu, T., Marom, A., Engel, S., Tafforeau, P., Warren, J. L., ... \u0026amp; Zollikofer, C. P. E.\u003c/strong\u003e (2021). The primitive brain of early \u003cem\u003eHomo\u003c/em\u003e. \u003cem\u003eScience\u003c/em\u003e, \u003cem\u003e372\u003c/em\u003e(6538), 165\u0026ndash;171. https://doi.org/10.1126/science.aaz0032\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMarino, L., Connor, R. C., Fordyce, R. E., Herman, L. M., Hof, P. R., Lefebvre, L., ... \u0026amp; Reiss, D.\u003c/strong\u003e (2007). Cetaceans have complex brains for complex cognition. \u003cem\u003ePLoS Biology\u003c/em\u003e, \u003cem\u003e5\u003c/em\u003e(5), e139. 
https://doi.org/10.1371/journal.pbio.0050139\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eGodfrey-Smith, P.\u003c/strong\u003e (2016). \u003cem\u003eOther minds: The octopus, the sea, and the deep origins of consciousness\u003c/em\u003e. Farrar, Straus and Giroux.\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eAmodio, P., Boeckle, M., Schnell, A. K., Ostojic, L., Fiorito, G., \u0026amp; Clayton, N. S.\u003c/strong\u003e (2019). Grow smart and die young: Why did cephalopods evolve intelligence? \u003cem\u003eTrends in Ecology \u0026amp; Evolution\u003c/em\u003e, \u003cem\u003e34\u003c/em\u003e(1), 45\u0026ndash;56. https://doi.org/10.1016/j.tree.2018.10.010\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eLi, M., Zhang, W., \u0026amp; Zhou, X.\u003c/strong\u003e (2020). Identification of genes involved in the evolution of human intelligence through combination of interspecies and intraspecies genetic variations. \u003cem\u003ePeerJ\u003c/em\u003e, \u003cem\u003e8\u003c/em\u003e, e8912. https://doi.org/10.7717/peerj.8912\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eGoriounova, N. A., \u0026amp; Mansvelder, H. D.\u003c/strong\u003e (2019). Genes, cells and brain areas of intelligence. \u003cem\u003eFrontiers in Human Neuroscience\u003c/em\u003e, \u003cem\u003e13\u003c/em\u003e, 44. https://doi.org/10.3389/fnhum.2019.00044\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSavage, J. E., Jansen, P. R., Stringer, S., Watanabe, K., Bryois, J., de Leeuw, C. A., ... \u0026amp; Posthuma, D.\u003c/strong\u003e (2018). Genome-wide association meta-analysis in 269,867 individuals identifies new genetic and functional links to intelligence. \u003cem\u003eNature Genetics\u003c/em\u003e, \u003cem\u003e50\u003c/em\u003e(7), 912\u0026ndash;919. https://doi.org/10.1038/s41588-018-0152-6\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSniekers, S., Stringer, S., Watanabe, K., Jansen, P. R., Coleman, J. R. I., Krapohl, E., ... \u0026amp; Posthuma, D.\u003c/strong\u003e (2017). 
Genome-wide association meta-analysis of 78,308 individuals identifies new loci and genes influencing human intelligence. \u003cem\u003eNature Genetics\u003c/em\u003e, \u003cem\u003e49\u003c/em\u003e(7), 1107\u0026ndash;1112. https://doi.org/10.1038/ng.3869\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eXia, W., \u0026amp; Zhang, Z.\u003c/strong\u003e (2023). Language gene polymorphism patterns: Important information on human evolution. \u003cem\u003eJournal of Data Mining in Genomics \u0026amp; Proteomics\u003c/em\u003e, \u003cem\u003e14\u003c/em\u003e, 316.\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eShi, L., Lin, Q., Su, B., \u0026amp; Zhang, Y.\u003c/strong\u003e (2017). Regional selection of the brain size regulating gene \u003cem\u003eCASC5\u003c/em\u003e provides new insight into human brain evolution. \u003cem\u003eHuman Genetics\u003c/em\u003e, \u003cem\u003e136\u003c/em\u003e(2), 193\u0026ndash;204. https://doi.org/10.1007/s00439-016-1749-4\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTattersall, I.\u003c/strong\u003e (2023). Endocranial volumes and human evolution. \u003cem\u003eF1000Research\u003c/em\u003e, \u003cem\u003e12\u003c/em\u003e, 565. https://doi.org/10.12688/f1000research.131636.1\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eZhang, Z., Zhang, S., Zhou, H., \u0026amp; Xu, Y.\u003c/strong\u003e (2024). A general evolution landscape of language and cognition genes. \u003cem\u003eJournal of Data Mining in Genomics \u0026amp; Proteomics\u003c/em\u003e, \u003cem\u003e15\u003c/em\u003e, 338. https://doi.org/10.XXXX/jdmgn.2024.15.338\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eLevenshtein, V. I.\u003c/strong\u003e (1966). Binary codes capable of correcting deletions, insertions, and reversals. \u003cem\u003eSoviet Physics Doklady\u003c/em\u003e, \u003cem\u003e10\u003c/em\u003e(8), 707\u0026ndash;710.\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSneath, P. H. A., \u0026amp; Sokal, R. R.\u003c/strong\u003e (1973). 
\u003cem\u003eNumerical taxonomy: The principles and practice of numerical classification\u003c/em\u003e. W.H. Freeman.\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eLi, Y., et al.\u003c/strong\u003e (2020). A Euclidean distance-based approach to assess genetic diversity in maize germplasm. \u003cem\u003eBMC Genomics, 21\u003c/em\u003e(1), 1\u0026ndash;13. https://doi.org/10.1186/s12864-020-07126-4\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eHellinger, E.\u003c/strong\u003e (1909). Neue Begr\u0026uuml;ndung der Theorie quadratischer Formen von unendlichvielen Ver\u0026auml;nderlichen. \u003cem\u003eJournal f\u0026uuml;r die reine und angewandte Mathematik, 136\u003c/em\u003e, 210\u0026ndash;271.\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eLin, H., \u0026amp; Peddada, S. D.\u003c/strong\u003e (2020). Analysis of microbial compositions: A review of Hellinger distance-based methods. \u003cem\u003eFrontiers in Microbiology, 11\u003c/em\u003e, 2154. https://doi.org/10.3389/fmicb.2020.02154\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eJaccard, P.\u003c/strong\u003e (1901). \u0026Eacute;tude comparative de la distribution florale dans une portion des Alpes et des Jura. \u003cem\u003eBulletin de la Soci\u0026eacute;t\u0026eacute; Vaudoise des Sciences Naturelles, 37\u003c/em\u003e, 547\u0026ndash;579.\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eChen, J., et al.\u003c/strong\u003e (2021). Jaccard/Tanimoto similarity test for large-scale genomic datasets. \u003cem\u003eBioinformatics, 37\u003c/em\u003e(18), 2914\u0026ndash;2920. https://doi.org/10.1093/bioinformatics/btab176\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eLevenshtein, V. I.\u003c/strong\u003e (1966). Binary codes capable of correcting deletions, insertions, and reversals. \u003cem\u003eSoviet Physics Doklady, 10\u003c/em\u003e(8), 707\u0026ndash;710.\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eZhang, Y., et al.\u003c/strong\u003e (2022). 
Edit distance-based haplotype clustering for ancient DNA analysis. \u003cem\u003eNature Computational Science, 2\u003c/em\u003e(3), 189\u0026ndash;198. https://doi.org/10.1038/s43588-022-00235-0\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eKorneliussen, T. S., et al.\u003c/strong\u003e (2014). ANGSD: Analysis of Next Generation Sequencing Data. \u003cem\u003eBMC Bioinformatics, 15\u003c/em\u003e(1), 356. https://doi.org/10.1186/s12859-014-0356-4\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eLi, J., et al.\u003c/strong\u003e (2021). Detecting epistatic interactions in genome-wide association studies using mutual information. \u003cem\u003eNucleic Acids Research, 49\u003c/em\u003e(15), e86. https://doi.org/10.1093/nar/gkab410\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eWilson, H. M., \u0026amp; Anderson, L. I.\u003c/strong\u003e (2004). Morphology and taxonomy of Paleozoic millipedes (Diplopoda: Chilognatha: Archipolypoda) from Scotland. \u003cem\u003eJournal of Paleontology\u003c/em\u003e, \u003cem\u003e78\u003c/em\u003e(1), 169\u0026ndash;184. https://doi.org/10.1666/0022-3360(2004)078\u0026lt;0169: MATOPM \u0026gt;2.0. CO;2\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eEmery, N. J., \u0026amp; Clayton, N. S.\u003c/strong\u003e (2004). The mentality of crows: Convergent evolution of intelligence in corvids and apes. \u003cem\u003eScience\u003c/em\u003e, \u003cem\u003e306\u003c/em\u003e(5703), 1903\u0026ndash;1907. https://doi.org/10.1126/science.1098410\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eGodfrey-Smith, P.\u003c/strong\u003e (2013). Cephalopods and the evolution of the mind. \u003cem\u003ePacific Conservation Biology\u003c/em\u003e, \u003cem\u003e19\u003c/em\u003e(1), 4\u0026ndash;9. https://doi.org/10.1071/PC130004\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eHerculano-Houzel, S.\u003c/strong\u003e (2017). Numbers of neurons as biological correlates of cognitive capability. 
\u003cem\u003eCurrent Opinion in Behavioral Sciences\u003c/em\u003e, \u003cem\u003e16\u003c/em\u003e, 1\u0026ndash;7. https://doi.org/10.1016/j.cobeha.2017.02.004\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eWhitehead, H., \u0026amp; Rendell, L.\u003c/strong\u003e (2015). The evolution of cetacean culture. In \u003cem\u003eThe cultural lives of whales and dolphins\u003c/em\u003e (pp. 89\u0026ndash;132). University of Chicago Press. https://doi.org/10.7208/chicago/9780226895314.001.0001\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eBrown, C., \u0026amp; Laland, K. N.\u003c/strong\u003e (2011). Social learning in fishes. In C. Brown, K. N. Laland, \u0026amp; J. Krause (Eds.), \u003cem\u003eFish cognition and behavior\u003c/em\u003e (2nd ed., pp. 186\u0026ndash;202). Wiley-Blackwell. https://doi.org/10.1002/9781444342536.ch9\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSchnell, A. K., \u0026amp; Clayton, N. S.\u003c/strong\u003e (2021). Cephalopods: Ambassadors for rethinking cognition. \u003cem\u003eBiochemical and Biophysical Research Communications\u003c/em\u003e, \u003cem\u003e564\u003c/em\u003e, 27\u0026ndash;36. https://doi.org/10.1016/j.bbrc.2020.12.062\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eLuo, Y., et al.\u003c/strong\u003e (2021). 3D genome of macaque fetal brain reveals evolutionary innovations during primate corticogenesis. \u003cem\u003eCell, 184\u003c/em\u003e(4), 723\u0026ndash;740. https://doi.org/10.1016/j.cell.2021.01.001\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eHain, D., Kutschera, V. E., \u0026amp; Hiller, M.\u003c/strong\u003e (2023). Modular evolution of cognitive circuits in the vertebrate brain. \u003cem\u003eNature Ecology \u0026amp; Evolution\u003c/em\u003e, \u003cem\u003e7\u003c/em\u003e(4), 589\u0026ndash;601. https://doi.org/10.1038/s41559-023-02021-z\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eLi, X., Chen, Y., \u0026amp; Zhang, Q.\u003c/strong\u003e (2022). 
CRISPR screen identifies gene networks underlying behavioral modularity in \u003cem\u003eDrosophila\u003c/em\u003e. \u003cem\u003eCell\u003c/em\u003e, \u003cem\u003e185\u003c/em\u003e(12), 2150\u0026ndash;2165. https://doi.org/10.1016/j.cell.2022.04.029\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eWagner, G. P., \u0026amp; Pavli\u003c/strong\u003e\u003cstrong\u003eč\u003c/strong\u003e\u003cstrong\u003eev, M.\u003c/strong\u003e (2023). The genomic architecture of cognitive-behavioral modularity: Insights from evolutionary developmental biology. \u003cem\u003eTrends in Genetics\u003c/em\u003e, \u003cem\u003e39\u003c/em\u003e(5), 351\u0026ndash;365. https://doi.org/10.1016/j.tig.2023.01.004\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eChittka, L., \u0026amp; Wilson, C.\u003c/strong\u003e (2021). Behavioral modularity and the evolution of intelligence. \u003cem\u003ePhilosophical Transactions of the Royal Society B\u003c/em\u003e, \u003cem\u003e376\u003c/em\u003e(1828), 20200050. https://doi.org/10.1098/rstb.2020.0050\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"gene polymorphism, cognition, coelacanths, lungfish, evolution","lastPublishedDoi":"10.21203/rs.3.rs-6373286/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-6373286/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eBoth coelacanths and lungfish have fossil evidence dating back 400\u0026nbsp;million years, placing them at a critical evolutionary juncture when marine animals have transitioned to terrestrial environments. An intriguing question lies in the extent to which their cognitive abilities had evolved before they crawled onto land. While no fossil DNA exist for extinct coelacanths or lungfish, studies on their extant species offer clues. Notably, the biological traits of coelacanths and lungfish have been remarkably stable over the past 70\u0026nbsp;million years, suggesting exceptional stability in their genomic sequences as well. This raises the possibility of inferring their cognition gene polymorphism patterns (CGPP) and evolutionary positioning through genomic analyses of modern samples. Comparative analyses with a range of animal taxa and human samples revealed that the CGPP of both coelacanths and lungfish are evolutionarily closer to those of archaic humans than those of most other animal groups. 
The CGPP appears to occupy an evolutionary inflection point bridging diverse animal lineages to archaic humans.\u003c/p\u003e","manuscriptTitle":"The Evolution of Cognitive Abilities in Marine Animals: Insights from Cognition Gene Polymorphism in Coelacanths and Lungfish","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-04-25 16:51:02","doi":"10.21203/rs.3.rs-6373286/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"ea12ba86-9ffd-48e5-baa8-d0a5d94411ae","owner":[],"postedDate":"April 25th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[],"tags":[],"updatedAt":"2025-07-28T08:39:04+00:00","versionOfRecord":[],"versionCreatedAt":"2025-04-25 16:51:02","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-6373286","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-6373286","identity":"rs-6373286","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}