On the utility of Deep Learning for model classification and parameter estimation on complex diversification scenarios

doi:10.1101/2025.08.27.671290

2025 · doi:10.1101/2025.08.27.671290

preprint OA: closed

Abstract

Birth-death models applied to dated phylogenies are a useful tool to study past diversification dynamics. Parameters in these stochastic models are typically inferred using likelihood-based methods such as Maximum Likelihood Estimation (MLE) or Bayesian Inference. However, these approaches run into computational tractability problems for models of moderate to high complexity. One approach to increase model complexity while remaining computationally tractable in the context of birth-death modelling is machine learning. So far, these techniques have been explored in the context of serially-sampled phylogenies (phylodynamics) and trait-dependent birth-death models. Here, we explored the power of Convolutional Neural Networks (CNNs), a type of Deep Learning (DL) method, to solve classification and regression (parameter estimation) tasks under constant-rate and time-homogeneous, rate-variable birth-death models. In particular, we compared six diversification scenarios: Constant Birth-Death, High-Extinction, Mass-Extinction, Diversity-Dependent, Stasis-and-Radiate, and Waxing-and-Waning. We simulated 10,000 phylogenetic trees under each diversification scenario, which were encoded using a vectorization procedure that captures the topology and branch length information. The encoded trees were used to train or test a set of CNN models tailored to three empirical case studies differing in the number of tips. We compared CNN performance with MLE inference. Our results show that CNNs exhibited classification accuracy levels of 93-78%, whereas maximum likelihood estimation achieved levels of 74-70%. The most difficult scenarios to predict for the CNNs were the high-extinction and mass-extinction scenarios, which were often misidentified as one another. For the regression tasks, mean average errors were comparable between CNN models and MLE inference, and both approaches had difficulty estimating ratio parameters such as mass-extinction survival and turnover. Finally, we applied our CNNs to three empirical studies (eucalypts, conifers and cetaceans) and discussed potential shortcomings and future avenues for improvement in the application of deep-learning birth-death modelling approaches.

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

∗ co-senior authors

Author name corrected; the rest is unchanged until the next revision.
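
To make the simulation step mentioned in the abstract more concrete, the following is a minimal sketch of a constant-rate birth-death process, assuming a simple Gillespie-style scheme that tracks only the number of lineages through time. It is not the authors' simulator: it does not build a tree or encode topology and branch lengths, and the function name `simulate_birth_death` and its parameters are hypothetical, chosen for illustration only.

```python
import random


def simulate_birth_death(birth_rate, death_rate, max_time, n_start=1, seed=None):
    """Simulate lineage counts under a constant-rate birth-death process.

    Illustrative sketch only (hypothetical helper, not the paper's code).
    Returns a list of (time, n_lineages) pairs, starting at (0.0, n_start).
    """
    rng = random.Random(seed)
    t, n = 0.0, n_start
    history = [(t, n)]
    per_lineage_rate = birth_rate + death_rate
    while n > 0 and per_lineage_rate > 0:
        # Waiting time to the next event is exponential with the total rate.
        t += rng.expovariate(n * per_lineage_rate)
        if t > max_time:
            break  # the next event would fall after the observation window
        # The event is a speciation with probability birth / (birth + death).
        if rng.random() < birth_rate / per_lineage_rate:
            n += 1
        else:
            n -= 1
        history.append((t, n))
    return history


if __name__ == "__main__":
    # Assumed example rates; any non-negative values can be used.
    trajectory = simulate_birth_death(1.0, 0.5, max_time=5.0, seed=42)
    final_time, final_n = trajectory[-1]
    print(f"{len(trajectory) - 1} events, {final_n} lineages at t={final_time:.3f}")
```

Rate-variable scenarios such as mass extinction or diversity dependence would require the rates to change with time or with the current number of lineages, which this sketch deliberately leaves out.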
