Mining for Species, Locations, Habitats, and Ecosystems from Scientific Papers in Invasion Biology: A Large-Scale Exploratory Study with Large Language Models

This is a Preprint and has not been peer reviewed. This is version 1 of this Preprint.

This paper presents an exploratory study that harnesses the capabilities of large language models (LLMs) to mine key ecological entities from invasion biology literature. Specifically, we focus on extracting species names, their locations, associated habitats, and ecosystems, information that is critical for understanding species spread, predicting future invasions, and informing conservation efforts. Traditional text mining approaches often struggle with the complexity of ecological terminology and the subtle linguistic patterns found in these texts. By applying general-purpose LLMs without domain-specific fine-tuning, we uncover both the promise and limitations of using these models for ecological entity extraction. In doing so, this study lays the groundwork for more advanced, automated knowledge extraction tools that can aid researchers and practitioners in understanding and managing biological invasions.

DOI: https://doi.org/10.32942/X29D1X
Subjects: Engineering, Life Sciences
Keywords: large language models, Information Extraction, Generative AI, invasion biology, Literature review, prompt engineering, schema-based information extraction
Published: 2025-03-06 03:35
License: CC-By Attribution-ShareAlike 4.0 International
Conflict of interest statement: None
Data and Code Availability Statement: https://doi.org/10.5281/zenodo.13956882
Language: English
