The role of genetic observatory networks in the detection and forecasting of marine non-indigenous species

Justine Pagnier, Tobias Andermann, Mats Andersson, Matthias Obst

This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-8702791/v1 This work is licensed under a CC BY 4.0 License. Status: Under Review. Version 1 posted.

Abstract

Marine biological invasions threaten ocean health, yet management remains reactive rather than proactive. Predictive tools, such as species distribution models, have the potential to provide indications about which areas are particularly likely to be colonised by non-indigenous species, allowing a more proactive management approach. 
Here we introduce an integrated framework utilising DNA-based monitoring data from genetic observatory networks to identify non-indigenous species for modelling and to independently validate the species distribution models forecasting invasion risk areas across European seas. We modelled habitat suitability for 69 marine non-indigenous species using global occurrence data and an ensemble modelling approach based on 6,555 individual species distribution models built with five different algorithms. Model validation against independent DNA-based detections from observatory networks showed 90% of observations occurred in predicted suitable habitat, confirming robust predictive capacity. Under current conditions, models identify invasion hotspots in the North Sea, North Atlantic, Mediterranean, and Black Sea. Climate projections to 2100 reveal pronounced vulnerability in Arctic and subarctic regions (up to 300% increase in habitat suitability under SSP5-8.5 scenario), while Mediterranean regions show modest change. We further demonstrate how the models can be applied in preventive action by supporting decisions in ballast water management. By coupling standardised and spatiotemporally consistent molecular monitoring with predictive modelling, we provide a scalable approach for marine biosecurity forecasts in a rapidly changing ocean.

Subject terms: Biological sciences/Ecology/Biodiversity; Biological sciences/Ecology/Invasive species; Biological sciences/Ecology/Ecological modelling; Biological sciences/Biological techniques/Genetic techniques

Introduction

In a rapidly changing world, the ability to predict ecological dynamics and potential disruptions is essential. Marine ecosystems are increasingly threatened by rising temperatures 1 – 3 , ocean acidification 2 , 4 , salinity changes 5 , and intensifying human activities 6 . 
Among these pressures, biological invasions by non-indigenous species (NIS) represent a persistent and accelerating driver of ecological change. These species, also called alien , non-native , or introduced species, are often transported through shipping, aquaculture, or drifting marine debris 7 . Some become invasive , outcompeting native organisms, restructuring food webs, and altering ecosystem services 8 . Biological invasions across marine, terrestrial, and freshwater ecosystems are now recognized as one of the top five drivers of global biodiversity loss 9 – 11 . Their global economic costs have risen by 702% from 1980–1999 to 2000–2019 12,13 . Anticipating and mitigating invasions has become a priority for managing and preserving ocean ecosystems 14 . Biological invasions happen through distinct, yet coupled, stages of transport, introduction, establishment, and spread 8 , each presenting a separate opportunity for intervention. Yet, current marine biosecurity remains largely reactive rather than preventive, with countries typically deploying expensive post-invasion responses that rarely prove effective 15 , 16 . Detection efforts, although effective, operate after introduction has already occurred. They allow responses during the establishment and spread phases, when eradication becomes exponentially more difficult and costly. However, species detection in marine environments presents significant logistical challenges. The ocean's vast scale and dynamic nature make systematic surveillance difficult, and traditional survey methods remain spatially patchy, resource-intensive, and challenging to standardise across regions. In this context, DNA-based monitoring with metabarcoding offers a solution, enabling rapid, standardised detection of biodiversity from small environmental samples. 
The ARMS-MBON network (Autonomous Reef Monitoring Structures - Marine Biodiversity Observation Network) illustrates this approach: through standardised deployments of ARMS units at marine observatories across Europe and adjacent regions since 2018, this network provides comparable DNA-based biodiversity data that capture a wide spectrum of benthic organisms 17 . Recent studies demonstrate that ARMS-MBON can detect marine NIS effectively and at early stages 18 – 20 . Additionally, many of the standards and methods developed by ARMS-MBON are now implemented in national monitoring programs ( e.g ., Swedish Ports Monitoring program 21 , 22 ). Yet, while improved detection accelerates response time, it remains inherently reactive. Prediction, on the other hand, enables proactive intervention before introduction occurs. Species Distribution Models (SDMs) 23 , 24 , also known as Habitat Suitability or Ecological Niche-based Models, provide a way to forecast where species are likely to establish under current and future conditions. By identifying which regions provide suitable habitat for potential invaders, we can target prevention efforts at the transport stage. For example, this enables strengthening biosecurity at high-risk ports, prioritising surveillance in vulnerable ecosystems, and allocating management resources where they will be most effective. This is particularly crucial in marine systems, where dispersal can span hundreds of kilometers 7 , 25 . SDMs have been increasingly used to forecast climate-driven range shifts 26 – 29 , identify suitable habitats in data-poor regions 30 , and assess invasion risk 31 – 34 . One key application of spatial risk forecasts is the designation of Ballast Water Contingency Areas (BWCAs), which are zones where ships can perform ballast water exchange in case of contingencies. These areas must be carefully defined not to favour the spread of NIS into new areas. 
However, current marine biosecurity lacks integrated analytical frameworks that couple these predictive models with sensor networks and management decision tools. This disconnect, where monitoring data, ecological forecasts, and policy interventions remain poorly integrated, creates time lags that prevent proactive NIS management. The convergence of genetic monitoring and species distribution modelling offers a solution: a continuous feedback loop from detection to risk assessment to intervention. Such integrated frameworks align with emerging digital infrastructures like Digital Twins of the Ocean (DTOs) 35 , which combine oceanographic, biological, and socioeconomic data into dynamic systems for scenario exploration and early warning. In this study, we demonstrate the validity of coupling DNA-based monitoring with species distribution modelling for NIS risk assessment and management. Specifically, we address three questions: (i) can SDMs trained on global occurrence data accurately predict current distributions of NIS detected by genetic observatories? (ii) can ensemble models identify spatial hotspots with elevated establishment potential (i.e., areas where multiple NIS find suitable habitat simultaneously) under current and future ocean conditions? and (iii) how can such spatially explicit forecasts support preventive NIS management? Here, we modelled habitat suitability for 69 marine NIS detected by genetic observatories across European seas using global occurrence records from the Global Biodiversity Information Facility ( www.gbif.org ) and the Ocean Biodiversity Information System ( www.obis.org ), coupled with environmental predictors from Bio-ORACLE 36 . We then validate predictions against available independent DNA-based detections. Our models predict where NIS can establish based on environmental conditions. This provides a foundation for risk assessment that can be integrated with species-specific impact evaluations to prioritise management actions. 
By projecting suitability under multiple climate scenarios, we assess how alien spread may shift as ocean conditions change, providing a blueprint for a sensor-model-decision framework for evidence-based marine biosecurity in a changing ocean. We illustrate how these spatial forecasts can be integrated with existing marine and digital infrastructure to support ecosystem-based management in the future.

Results

We modelled habitat suitability under current and future ocean conditions for 69 marine NIS detected by genetic observatory networks (ARMS-MBON 17 , Swedish Ports Monitoring 21 , 22 ). These species span 10 major taxonomic groups (22 arthropods, 11 annelids, 10 molluscs, 7 rhodophytes, 7 chordates, 4 bryozoans, 3 cnidarians, 3 ochrophytes, one ctenophore, and one dinoflagellate, see Supplementary File 1). The models yield species-specific predictions of suitable habitat and of climate-driven shifts in invasion pressure across European waters.

Model assessments

Individual models

Of the 81 NIS initially selected, 69 passed quality criteria and produced viable models, with 95 individual models built per species (6,555 SDMs total): 75 cross-validation models (5 algorithms × 5 k-folds × 3 pseudo-absence datasets), 15 fold-averaged models (one per algorithm and pseudo-absence dataset), and 5 full-dataset models, all trained on global occurrence data from GBIF and OBIS. Models were evaluated using 5-fold cross-validation, with performance assessed using the true skill statistic (TSS) and the area under the receiver operating characteristic curve (ROC-AUC). All algorithms demonstrated strong discrimination ability on validation data (Table 1 ), with mean AUC values ranging from 0.859 ± 0.132 (GAM) to 0.917 ± 0.0827 (RF), with MAXNET close behind (0.916 ± 0.0747). Threshold-dependent performance showed greater variability, with TSS scores ranging from 0.611 ± 0.244 (GAM) to 0.687 ± 0.188 (MAXNET). 
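The model counts and the TSS metric reported above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the split of the 15 fold-averaged models into 5 algorithms × 3 pseudo-absence datasets and of the 5 full-dataset models into one per algorithm is an assumption inferred from the totals, and the confusion-matrix counts passed to `tss` are hypothetical.

```python
# Bookkeeping for the per-species model counts reported in the text.
ALGORITHMS = ["GAM", "MARS", "MAXNET", "RF", "XGBOOST"]
N_FOLDS = 5
N_PSEUDO_ABSENCE_SETS = 3
N_SPECIES = 69

n_cv = len(ALGORITHMS) * N_FOLDS * N_PSEUDO_ABSENCE_SETS  # cross-validation models
n_fold_avg = len(ALGORITHMS) * N_PSEUDO_ABSENCE_SETS      # assumed: one per algorithm and dataset
n_full = len(ALGORITHMS)                                  # assumed: one per algorithm
models_per_species = n_cv + n_fold_avg + n_full
total_models = models_per_species * N_SPECIES


def tss(tp, fp, tn, fn):
    """True skill statistic: sensitivity + specificity - 1."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1


print(models_per_species, total_models)            # 95 6555
print(round(tss(tp=40, fp=10, tn=90, fn=10), 3))   # 0.7 (hypothetical counts)
```

TSS ranges from −1 to 1, with 0 indicating performance no better than random, which is why it is more sensitive to the chosen presence/absence threshold than AUC.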
As expected, calibration metrics were uniformly higher than validation metrics across all algorithms (Table 1 ), indicating train-test gaps of varying magnitude between algorithms. For example, RF showed potential overfitting with near-perfect calibration (AUC ≈ 1.00, TSS = 0.989 ± 0.00983) compared to validation scores (AUC = 0.917 ± 0.0827, TSS = 0.647 ± 0.213). In contrast, MAXNET exhibited the smallest train-test gap (calibration TSS = 0.826 ± 0.0644 vs. validation TSS = 0.687 ± 0.188), while other algorithms showed intermediate patterns. Despite these gaps, validation metrics remained high across all algorithms (mean AUC = 0.892 ± 0.101, mean TSS = 0.644 ± 0.214), indicating robust discriminatory ability. The ensemble modelling framework, which combines predictions from algorithms with complementary strengths and weaknesses, provides additional regularisation for final projections (see next section).

Table 1 Model evaluation metrics. Mean (± standard deviation) values across all species, pseudo-absence datasets, and model runs, for each algorithm and for all algorithms combined.

| Metric | GAM | MARS | MAXNET | RF | XGBOOST | All |
| --- | --- | --- | --- | --- | --- | --- |
| Calibration TSS | 0.813 ± 0.118 | 0.841 ± 0.0877 | 0.826 ± 0.0644 | 0.989 ± 0.00983 | 0.861 ± 0.088 | 0.866 ± 0.104 |
| Validation TSS | 0.611 ± 0.244 | 0.648 ± 0.201 | 0.687 ± 0.188 | 0.647 ± 0.213 | 0.625 ± 0.210 | 0.644 ± 0.214 |
| Calibration AUC | 0.948 ± 0.0397 | 0.961 ± 0.0288 | 0.961 ± 0.0209 | 1.00 ± 0.000673 | 0.968 ± 0.0247 | 0.968 ± 0.0314 |
| Validation AUC | 0.859 ± 0.132 | 0.883 ± 0.102 | 0.916 ± 0.0747 | 0.917 ± 0.0827 | 0.888 ± 0.0893 | 0.892 ± 0.101 |

The number of occurrence records used per species ranged from 10 (minimum allowed) to 847 (median: 118), with a clear positive relationship between sample size and model performance. Species with fewer than 50 occurrence records showed the highest variability and lowest mean performance (AUC: 0.822 ± 0.15; TSS: 0.491 ± 0.3), while species with 50–200 occurrences demonstrated substantial improvement (AUC: 0.906 ± 0.06; TSS: 0.67 ± 0.14) (Fig. 1 a). 
Model performance continued to improve with increasing sample size, reaching optimal and stable performance in species with more than 100 occurrences (AUC: >0.94; TSS: >0.75). Beyond improving performance, larger occurrence datasets substantially enhanced the consistency of model performance. The standard deviation decreased by 84% for TSS (from 0.300 to 0.048) and by 87% for AUC (from 0.149 to 0.019) across the sample size gradient (Fig. 1 b), indicating that models built with > 100 occurrences produce more consistent predictions across different algorithms and validation folds.

Ensemble models

Given the variable performance across individual algorithms, we constructed ensemble models to combine the strengths of multiple approaches. To create robust ensemble predictions for each species, we retained only individual models that met minimum performance thresholds (TSS > 0.6 and AUC > 0.85). Four species had fewer than 10 models available for ensemble construction and were flagged for cautious interpretation: Apionsoma (Apionsoma) misakianum (n = 9 models), Herdmania momus (n = 5), Tharyx setigera (n = 4), and Pseudocalanus acuspes (n = 4). Figures showing the performance distribution across species and algorithms are available in Supplementary Figs. 1 and 2. Ensemble models showed improved performance compared to individual algorithms, with average scores of AUC = 0.976 ± 0.0137 and TSS = 0.863 ± 0.055 across all 69 species (calibration values, as ensemble models do not have validation values). Performance varied by species, with AUC ranging from 0.927 to 0.997 and TSS from 0.707 to 0.970. Overall, ensemble models improved predictive performance, providing robust suitability estimates for subsequent spatial analyses.

External validation with independent DNA-based detections

We used metabarcoding-based species detections from genetic observatory networks as an independent dataset to evaluate our model predictions. 
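As a rough illustration of this kind of presence-only external validation, the sketch below computes the share of independent detections that fall in cells whose ensemble suitability exceeds a cutoff. The function name, the handling of missing values and the five suitability values are hypothetical; only the 0.5 and 0.8 cutoffs correspond to those used in this study.

```python
import numpy as np


def share_in_suitable_habitat(suitability_at_detections, cutoff):
    """Fraction of detections whose predicted suitability exceeds the cutoff."""
    values = np.asarray(suitability_at_detections, dtype=float)
    values = values[~np.isnan(values)]  # drop detections outside the modelled grid
    if values.size == 0:
        raise ValueError("no detections with a suitability value")
    return float((values > cutoff).mean())


# Hypothetical ensemble suitability extracted at five detection sites.
detections = [0.9, 0.6, 0.4, 0.85, 0.7]
print(share_in_suitable_habitat(detections, 0.5))  # 0.8
print(share_in_suitable_habitat(detections, 0.8))  # 0.4
```

Only presences enter the calculation, so the result is a sensitivity-type measure and says nothing about how well unsuitable areas are predicted.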
These detections from ARMS-MBON deployments (2020 and 2021) and Swedish Ports Monitoring (2023 and 2025) were unpublished in GBIF/OBIS at the time of modelling and were therefore excluded from model training. This yielded an independent validation dataset for 33 species. Our model predictions showed strong agreement with independent ARMS observations (Table 2 ), as approximately 90% of detections occurred in areas predicted as suitable (suitability > 0.5), and around 47% in areas predicted as highly suitable (suitability > 0.8) by our ensemble models (Supplementary Fig. 3). Species-specific and combined validation results are provided in Supplementary File 2. Absences were not tested, as DNA-based non-detections do not confirm true absences.

Individual suitability maps and uncertainty assessment

Weighted ensemble model predictions provide habitat suitability estimates ranging from 0 (unsuitable) to 1 (highly suitable) for each species. We assessed prediction uncertainty using two complementary metrics: the coefficient of variation (EMcv in BIOMOD2, renamed CoV here) across algorithm predictions, and committee averaging (EMca in BIOMOD2, renamed CA here), which shows the proportion of models agreeing on presence/absence. However, coefficient of variation values are inflated in areas of very low habitat suitability due to division by near-zero mean values, and we therefore recommend interpreting uncertainty with care in these areas. All species-specific habitat suitability, CoV and CA maps are available on Figshare (see Data Availability Statement), for current ocean conditions and three 2100 climate scenarios (SSP1-2.6, SSP2-4.5, SSP5-8.5). Here we present detailed results for two example species spanning different taxonomic groups and geographic patterns: the mollusc Crepidula fornicata and the copepod Acartia (Acanthacartia) tonsa (Fig. 2 ). C. 
fornicata shows widespread suitable habitat throughout the North Atlantic, North Sea, and Kattegat/Skagerrak regions, with particularly high suitability along the southern and eastern coastline of the North Sea (Fig. 2 a). Despite this broad predicted range, uncertainty remains relatively low across most suitable areas (Fig. 2 b), with CoV values predominantly below 0.5, indicating good consensus among modelling algorithms. Model agreement is high (CA > 0.75) throughout suitable areas (Fig. 2 c), suggesting robust predictions in these areas. A. tonsa displays a different geographical pattern, with suitable habitat concentrated in the Baltic Sea, and discrete areas in the Mediterranean and Black Sea (Fig. 2 d). Uncertainty patterns for this species show moderate CoV values (0.25–0.75) in key suitable areas, while model agreement remains high (CA > 0.75) in core suitable habitats such as the southern and central Baltic Sea (Fig. 2 f). For both species, the coefficient of variation increases with distance to the coast, and these areas are also areas where the models do not fully agree on absences or presences. Climate projections under the moderate emissions scenario (SSP2-4.5, 2100) reveal species-specific patterns of habitat redistribution (Fig. 3 ). C. fornicata exhibits the most important redistribution, with substantial decreases in suitability (< -0.15) predicted throughout its current core areas in the North Sea, English Channel, and Irish Sea (Fig. 3 a). In parallel, we observe suitability gains in previously unsuitable Arctic waters, particularly the Barents Sea and northern Norwegian coast, suggesting a northward displacement of suitable habitat. For A. tonsa , projected changes are more geographically restricted (Fig. 3 c). The species faces moderate to substantial decreases across much of the Baltic Sea, currently a core suitable habitat, as well as scattered decreases in the Mediterranean and Black Sea. 
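To make the three ensemble layers discussed for these species concrete (weighted suitability, CoV and CA), the following sketch derives them from a stack of individual model predictions. It is not the BIOMOD2 implementation: the TSS-proportional weights, the population standard deviation and the fixed 0.5 presence cutoff are assumptions made for illustration, and the toy predictions are hypothetical. Only the selection thresholds (TSS > 0.6, AUC > 0.85) come from the text.

```python
import numpy as np

TSS_MIN, AUC_MIN = 0.6, 0.85  # selection thresholds stated in the text


def summarise_ensemble(predictions, tss, auc, presence_cutoff=0.5):
    """Return weighted mean, CoV and committee averaging per grid cell.

    predictions: array (n_models, n_cells) of suitability values in [0, 1].
    """
    predictions = np.asarray(predictions, dtype=float)
    tss = np.asarray(tss, dtype=float)
    auc = np.asarray(auc, dtype=float)

    keep = (tss > TSS_MIN) & (auc > AUC_MIN)
    if not keep.any():
        raise ValueError("no model passes the selection thresholds")
    kept = predictions[keep]

    weights = tss[keep] / tss[keep].sum()  # assumption: weights proportional to TSS
    weighted_mean = weights @ kept

    # CoV = SD / mean across models; left undefined (NaN) where the mean is zero.
    plain_mean = kept.mean(axis=0)
    with np.errstate(divide="ignore", invalid="ignore"):
        cov = np.where(plain_mean > 0, kept.std(axis=0) / plain_mean, np.nan)

    # Committee averaging: share of retained models voting "presence" per cell.
    ca = (kept >= presence_cutoff).mean(axis=0)
    return weighted_mean, cov, ca


# Three hypothetical models, two grid cells; the third model fails both thresholds.
preds = [[0.9, 0.02], [0.7, 0.01], [0.1, 0.9]]
mean, cov, ca = summarise_ensemble(preds, tss=[0.8, 0.7, 0.5], auc=[0.95, 0.90, 0.80])
print(mean, cov, ca)
```

In the toy example the second cell has near-zero suitability in both retained models, yet its CoV is higher than in the first cell, which illustrates why CoV should be read with care where mean suitability is very low.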
All species-specific projections for all three SSP scenarios (SSP1-2.6, SSP2-4.5, SSP5-8.5) and associated uncertainty metrics are available in the data repository.

Current priority areas for NIS management

Beyond species-specific assessments, identifying geographic areas where multiple NIS may co-occur is critical for prioritising limited biosecurity resources. To identify priority areas for biosecurity alert and intervention, we aggregated suitability predictions across all 69 modelled species, revealing geographic hotspots where environmental conditions support multiple NIS (Fig. 4 ). Mean suitability values reflect the average habitat favourability across all 69 modelled species in each grid cell, with higher values indicating that, on average, environmental conditions are more favourable for the species pool. Note that this metric captures average suitability rather than species richness. Areas with high mean values may reflect either many species with moderate suitability or fewer species with very high suitability. Severe priority areas (S ≥ 0.6, dark red) are spatially restricted to small, fragmented coastal patches primarily concentrated along both sides of the English Channel and the northern part of the Bay of Biscay. High priority zones (S = 0.5–0.6, red-orange) extend these hotspots along the North Sea coastlines and up to the Kattegat/Skagerrak region. These zones also comprise the northern Adriatic Sea, the Atlantic side of the Strait of Gibraltar, the northern Black Sea coast, the Tunisian coast, the Egyptian coast around the Suez Canal, parts of the southern French Mediterranean coast and of the northern parts of the Norwegian coastline. Moderate priority areas (S = 0.3–0.5) show broader geographic extent, covering additional Scandinavian coastal regions (including the Baltic), Icelandic coasts, and appearing in scattered locations throughout the Mediterranean and along Atlantic-facing shores. 
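The aggregation behind the priority map can be sketched as a per-cell mean over the species stack followed by binning. The class edges (0.3, 0.5, 0.6) are those reported for Fig. 4; the function name, the label strings and the toy suitability values are hypothetical, and the sketch assumes the input contains ocean cells only, with no missing values.

```python
import numpy as np


def priority_classes(stack):
    """stack: (n_species, n_cells) suitability; returns mean S and class labels."""
    mean_s = np.asarray(stack, dtype=float).mean(axis=0)
    labels = np.array(["low", "moderate", "high", "severe"])
    # digitize maps S < 0.3 -> 0, 0.3 <= S < 0.5 -> 1, 0.5 <= S < 0.6 -> 2, S >= 0.6 -> 3
    return mean_s, labels[np.digitize(mean_s, [0.3, 0.5, 0.6])]


# Three hypothetical species, four grid cells.
stack = [
    [0.9, 0.2, 0.50, 0.4],
    [0.5, 0.1, 0.60, 0.3],
    [0.7, 0.3, 0.55, 0.5],
]
mean_s, classes = priority_classes(stack)
print(classes)  # ['severe' 'low' 'high' 'moderate']
```

Because the metric is a mean, the first cell is classed as severe even though only one of the three species has very high suitability there, which matches the caveat that mean suitability is not species richness.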
Arctic waters and offshore areas show consistently low suitability (S < 0.3), reflecting limited NIS establishment under current conditions.

Future changes in NIS suitability and regional patterns

We projected habitat suitability to 2100 using environmental data from Bio-ORACLE v3.0 based on CMIP6 climate models under three emission scenarios: SSP1-2.6 (high mitigation), SSP2-4.5 (moderate mitigation), and SSP5-8.5 (high emissions). Change in mean suitability was calculated as the difference between future (2090–2100 average) and current (2000–2020 average) conditions across all 69 species. Projections under future climate scenarios revealed predominantly northward increases in mean habitat suitability, with magnitude strongly dependent on emission pathway (Fig. 5 ). Under the low emission scenario (SSP1-2.6), changes in mean suitability were relatively modest, with 96% of the total study area (European continental shelf) experiencing minimal change (between −0.05 and +0.05, white on maps). Only 2.8% of the total study area showed slight to moderate increases, primarily in northern latitudes (Fig. 5 a), while less than 1% exhibited slight decreases along the coast of the North Sea and the Kattegat. The intermediate emission scenario (SSP2-4.5) projected more pronounced increases, with 16% of the total study area experiencing slight to substantial increases, particularly in the Norwegian and Barents Seas (Fig. 5 b), and only 0.4% showing slight decreases in small coastal areas. The high emission scenario (SSP5-8.5) produced the most dramatic changes: 55% of the total study area showed slight to moderate increases, while 3.6% exhibited important to major increases (Fig. 5 c). These increasingly suitable areas were concentrated in northern regions, especially in Arctic and sub-Arctic waters. We analysed these changes across 18 Marine Ecoregions of the World (MEOW) 37 within European waters to identify regional patterns of climate-driven alien spread risk. 
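The change metric used in this section (future minus current mean suitability, with changes within ±0.05 treated as minimal) can be sketched as follows. The function name, the optional cell-area weighting and the toy values are hypothetical; equal-area cells are assumed when no areas are supplied, which may differ from how the study-area percentages were actually computed.

```python
import numpy as np


def change_summary(current_mean, future_mean, cell_area=None, band=0.05):
    """Difference map plus area shares of stable, increasing and decreasing cells."""
    current = np.asarray(current_mean, dtype=float)
    future = np.asarray(future_mean, dtype=float)
    delta = future - current
    # Assumption: equal-area cells unless explicit areas are provided.
    area = np.ones_like(delta) if cell_area is None else np.asarray(cell_area, dtype=float)
    total = area.sum()
    return {
        "delta": delta,
        "stable": float(area[np.abs(delta) <= band].sum() / total),
        "increase": float(area[delta > band].sum() / total),
        "decrease": float(area[delta < -band].sum() / total),
    }


# Four hypothetical grid cells.
summary = change_summary([0.2, 0.4, 0.6, 0.1], [0.22, 0.55, 0.5, 0.1])
print(summary["stable"], summary["increase"], summary["decrease"])  # 0.5 0.25 0.25
```

On a regular latitude-longitude grid, cell area shrinks towards the poles, so passing true cell areas matters most for exactly the Arctic regions where the largest changes are projected.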
Regional analysis revealed a pronounced latitudinal gradient (Fig. 6 , Supplementary Table 1). Arctic and subarctic ecoregions showed both the largest relative (Fig. 6 a) and absolute (Fig. 6 b) increases. A detailed table with the computed relative and absolute changes can be found in Supplementary File 3. Arctic regions exhibited the most substantial relative changes under SSP5-8.5: the North and East Barents Sea showed a 301% ± 135% increase (mean ± SD across pixels within the ecoregion), followed by South and West Iceland (132% ± 70%), North and East Iceland (129% ± 50%), and Northern Norway and Finnmark (104% ± 71%). Even under the conservative SSP1-2.6 scenario, the Barents Sea showed a 65% ± 57% increase. Mid-latitude Atlantic regions (North Sea, Celtic Seas, South European Atlantic Shelf) demonstrated more moderate responses, with increases of 70–90% under SSP5-8.5 and 10–15% under SSP1-2.6. Mediterranean ecoregions generally showed modest increases of 10–30% under SSP5-8.5, with the Adriatic and Ionian Seas (28–29%) at the higher end and the Alboran Sea showing minimal change. Under lower-emission scenarios, Mediterranean increases were around 10% or below. The Black Sea exhibited near-zero or slightly negative changes across all scenarios despite currently high habitat suitability.

Ballast water contingency areas (BWCA)

We then explored a potential application of the above results for biosecurity. Using the aggregated suitability map for current conditions, we identified potential Ballast Water Contingency Areas in the eastern North Sea and Baltic Sea by applying spatial criteria: we retained areas with mean habitat suitability below 0.2 that lie more than 7 km from Marine Protected Areas, offshore wind farms, and coastlines. These represent invasion cold spots where NIS establishment risk is minimised (Fig. 7 ). 
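The two spatial criteria can be expressed as a simple boolean mask over the grid. The sketch below assumes that per-cell distances to the nearest Marine Protected Area, offshore wind farm and coastline have already been computed; the function name, argument names and toy values are hypothetical, while the default thresholds (0.2 and 7 km) are the illustrative ones used here.

```python
import numpy as np


def bwca_candidates(mean_suitability, dist_mpa_km, dist_wind_km, dist_coast_km,
                    max_suitability=0.2, min_distance_km=7.0):
    """Boolean mask of cells that meet both contingency-area criteria."""
    s = np.asarray(mean_suitability, dtype=float)
    # Distance to the closest excluded feature of any kind.
    nearest = np.minimum.reduce([
        np.asarray(dist_mpa_km, dtype=float),
        np.asarray(dist_wind_km, dtype=float),
        np.asarray(dist_coast_km, dtype=float),
    ])
    return (s < max_suitability) & (nearest > min_distance_km)


# Four hypothetical grid cells.
mask = bwca_candidates(
    mean_suitability=[0.10, 0.15, 0.30, 0.05],
    dist_mpa_km=[10, 5, 20, 30],
    dist_wind_km=[12, 40, 25, 8],
    dist_coast_km=[9, 15, 30, 6.5],
)
print(mask)  # [ True False False False]
```

Cells with missing suitability (for example land cells stored as NaN) fail the first comparison and are therefore excluded automatically.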
Ships operating in these regions could be directed to these zones during contingency situations requiring ballast water discharge. Note that these thresholds are illustrative and may be adjusted based on specific contingency scenarios and regional biosecurity priorities.

Discussion

The application of species distribution models (SDMs) to non-indigenous species (NIS) has been contested due to potential violations of the niche-environment equilibrium assumption 38 – 40 . For example, recent work on the Pacific oyster invasion in Swedish waters has shown that species-habitat associations at invasion fronts may differ from established populations 41 . However, Jiménez-Valverde et al. 42 argue that when appropriately constructed (i.e. using distributional data from all invaded regions, predictors linked to physiological requirements, and careful model evaluation), SDMs provide robust predictions for invasion risk assessment. We show here that a dynamic integration of SDMs with standardised genetic observatories addresses these limitations by allowing species distributions to be continuously updated, thereby constantly refining estimates of the realised environmental preferences in invaded regions. By modelling the habitat suitability of 69 non-indigenous marine metazoan and algal taxa (spanning ten phyla) detected on Autonomous Reef Monitoring Structures across Europe, we obtained species-specific habitat suitability maps (current and future scenarios) and maps of priority areas for NIS management. These species were systematically and manually curated to confirm their introduced status, distinguishing genuine non-indigenous taxa from native species and potential contamination 19 . This curation step is essential because, while DNA-based monitoring efficiently captures cryptic invasions and early-stage establishments in near real-time 43 , 44 , genetic detections alone cannot distinguish introduced from native occurrences. 
Our validation demonstrates that ensembles of models trained on global occurrence data effectively predict the distributions of non-indigenous species detected through genetic observatories. Our ensemble modelling approach combines predictions from algorithms with complementary strengths and weaknesses, effectively regularising final projections by balancing conservative and flexible model types with different overfitting tendencies. Across 33 tested species, 90% of their independent DNA-based detections from the ARMS-MBON network and the Swedish Ports Monitoring occurred within areas predicted as suitable habitat (> 0.5). Importantly, this framework allows continuous model validation: as new DNA-based detections emerge, they can be used to assess model accuracy before being incorporated into updated training datasets, creating an iterative cycle of prediction, validation, and refinement. It is important to note that our habitat suitability projections are impact-neutral: we model environmental suitability for non-indigenous species regardless of whether individual species have demonstrated ecological or economic impacts. Not all non-indigenous species cause negative effects, and impacts vary substantially across species and contexts 45 . Our framework provides spatially explicit predictions of where species are likely to establish based on environmental conditions, but decisions about management priorities require integration of species-specific impact assessments alongside habitat suitability. This approach enables evidence-based, targeted resource allocation rather than categorical responses to all non-indigenous species. Stacking individual species projections allows the delineation of priority areas for NIS management. Under current conditions, these concentrate in the North Sea, with high mean habitat suitability extending along both sides of the English Channel, the northern Bay of Biscay, and through coastal regions to the Kattegat/Skagerrak. 
In contrast, the Mediterranean Sea exhibits patchier suitability despite experiencing high propagule pressure from shipping traffic 46 and connectivity to tropical source regions via the Suez Canal 47 . These areas of elevated invasion risk reflect current environmental constraints on NIS establishment, while the species-based predictions (e.g., Crepidula fornicata and Acartia (Acanthacartia) tonsa ) illustrate the underlying individual responses to environmental conditions driving species-specific range expansions. Climate change is rapidly reshaping these biogeographic dynamics. Future projections show scenario-dependent changes in habitat suitability across European waters, with high-emissions scenarios (SSP5-8.5) predicting increases across 58.79% of the study region by 2100 (over three times the extent under moderate emissions). A pronounced latitudinal gradient emerges across all scenarios, with Arctic and subarctic regions experiencing the greatest increases. The Barents Sea exhibits a three-fold increase in habitat suitability under SSP5-8.5. These projections indicate potential weakening of thermal barriers that currently constrain poleward expansion 1 , suggesting that climate-driven range shifts will define future invasion dynamics in European seas. This pattern aligns with observed increases in NIS discovery rates in Arctic systems globally 48 , with hotspots emerging in the Iceland Shelf, Barents Sea, Norwegian Sea, Hudson Bay, and Chukchi/Eastern Bering seas 32 , 49 . Together, these findings indicate that Arctic and subarctic marine ecosystems face heightened invasion pressure as warming relaxes historical thermal constraints. Concurrent expansion of Arctic shipping routes driven by sea ice reduction may further increase introduction opportunities through ballast water and biofouling vectors 50 . Hence, the identified sub-/arctic regions should be actively protected from potential introductions in the future. 
Our habitat suitability projections provide spatially explicit information for multiple biosecurity applications. Ballast water discharge represents one of the primary vectors for marine biological invasions globally, with vessels transporting thousands of species across biogeographic barriers annually 51 , 52 . We demonstrate one decision-support application through identification of ballast water contingency areas in Northern European waters. This approach builds upon recent regulatory developments in the OSPAR commission, where the Intra-North Sea Ballast Water Contingency Area was implemented in 2025 and will remain effective until at least 2030 53 . By overlaying aggregated current habitat suitability map with Marine Protected Areas and Offshore Wind Farms in HELCOM regions, we delineate zones where ballast water discharge presents reduced invasion risk, i.e. areas characterised by low mean suitability and enough distance from sites that could facilitate secondary spread 54 , 55 . This approach offers a practical complement to existing ballast water management regulations, particularly for vessels unable to perform mid-ocean exchange due to safety constraints or operational limitations 56 . Our contingency zone framework provides port authorities and vessel operators with spatially explicit alternatives: if mid-ocean exchange is not feasible, discharge in designated low-risk areas minimises the probability that propagules encounter suitable habitat or colonise ecologically sensitive sites. As climate scenarios shift the geography of habitat suitability, these contingency zones can be updated, ensuring management strategies remain responsive to changing species’ spreads. Additionally, risk assessments at port-level can help prioritise inspection and monitoring efforts by cross-referencing vessel arrivals with projected suitability for species known to be present in source regions 57 . Surveillance optimisation represents another key application. 
Deploying genetic observatories in areas with high projected suitability but no current detections enables interception during establishment phases when eradication remains feasible 43 . This proactive approach contrasts with reactive monitoring in areas where invasions have already occurred, potentially reducing management costs and ecological impacts. Model projections and genetic monitoring both carry inherent uncertainties that must be acknowledged. First, our taxonomic scope is constrained to benthic fauna and flora detectable on ARMS units using COI and 18S metabarcoding, excluding most fish and pelagic species. This is particularly relevant for the Mediterranean, where our species list likely underrepresents Lessepsian migrants from the Suez Canal, many of which are mobile taxa not always detected on ARMS. Additionally, the geographic distribution of ARMS deployments shapes our species inventory, with undersampled regions potentially harboring undetected NIS. Genetic monitoring faces inherent challenges that may affect detection accuracy. False negatives can occur due to low species abundance, seasonal variability in DNA shedding rates, or limited DNA persistence in the environment 58 , 59 . Metabarcoding identifications are further constrained by incomplete reference databases, cryptic species complexes, and potential sequence misassignments, though our use of manually curated species lists, verified against WRiMS records, substantially mitigates these taxonomic uncertainties. SDMs project habitat suitability based on realised environmental niches captured in available occurrence data, which may not fully represent a species' fundamental niche or account for biotic interactions, dispersal limitations, or rapid evolutionary adaptation 38 . 
Recent methodological advances combining population genomics with SDMs show promise for incorporating genetic connectivity and local adaptation signals into niche models 60 , though such approaches require genetic data types beyond the scope of metabarcoding-based observatories. Similarly, joint species distribution models (JSDMs) offer complementary approaches by explicitly accounting for species co-occurrences and potential biotic interactions through residual correlation structures 61 , 62 . Therefore, our projections represent potential suitability rather than realised distributions, as we do not explicitly model dispersal, biotic interactions, or propagule pressure. Additionally, our reliance on pseudo-absences rather than confirmed absences introduces additional uncertainty, as background sampling points may occur in locations where species are present but undetected. However, as ARMS-MBON time series lengthen with continued deployment, the accumulation of temporal sampling will allow distinction between true absences and non-detection events, improving the reliability of absence inference from DNA-based datasets and enabling more robust model development 63 . Environmental projections derived from CMIP6 climate models carry inherent uncertainty that increases with projection distance into the future, primarily due to scenario divergence and inter-model variability 64 , 65 . While sea surface temperature projections for European shelf seas have medium-high confidence in IPCC assessments, other oceanographic variables show greater inter-model disagreement. Importantly, projections to 2100 may involve novel climate combinations outside the environmental space captured in our training data, requiring model extrapolation into conditions with no modern analogue 24 . This is particularly relevant under high-emission scenarios (SSP5-8.5), where the magnitude of environmental change increases uncertainty in predicted responses. 
We partially address these uncertainties through ensemble modelling and coefficient of variation analyses, though our focus on relative spatial patterns (identifying where risk increases most) should be more robust than absolute distribution predictions. Forecasting ecological futures requires humility: marine biological systems are complex and subject to contingencies our models cannot anticipate 66 . However, our modelling approach can accommodate much of the complexity when continuously updated with emerging data and validated against empirical observations. Integrating anthropogenic pressure layers (including port traffic, shipping networks, and aquaculture facilities) would transform habitat suitability into comprehensive invasion risk maps by accounting for introduction pathways alongside environmental suitability. Network analysis linking source populations to European coasts via shipping routes or currents could refine arrival probabilities. In the future, it will also be possible to include oceanographic models to incorporate species dispersal dynamics in this framework, which would enable prediction of spread rates and invasion corridors after establishment. This, however, requires more data on the studied species, especially when considering marine invertebrates which can often be overlooked 67 , and is therefore beyond the scope of this study. Integration with genetic observatory infrastructure represents a critical next step. Recent advances in multi-omics-driven frameworks for invasive species management emphasise the value of integrating molecular detection with predictive modelling to enable proactive, rather than reactive, biosecurity responses 68 . The ARMS-MBON network and similar DNA-based monitoring networks can generate continuous species observations data layers that can validate and refine model predictions on a regular basis. 
Standardising data pipelines between genetic observatories will enable automated model updates as new detections accumulate, transforming static forecasts into dynamic early warning systems. Our framework supports iterative updating: as new NIS arrive or established NIS shift their ranges, incorporating the new observations and rerunning projections will track the invasion frontiers. Automated workflows integrating occurrence databases, environmental data streams, and modelling pipelines could enable annual or event-triggered model updates. This adaptive approach treats forecasting as an ongoing automated process rather than a one-time analysis. The integration of standardised genetic monitoring with ensemble modelling provides actionable spatial information for proactive biosecurity. Integration into the European Digital Twin of the Ocean infrastructure would enable real-time decision support, connecting invasion forecasts to marine spatial planning and climate adaptation strategies. Such systems require sustained investment in cyberinfrastructure and data standardisation across monitoring networks, but the foundational components demonstrated here confirm that this vision is achievable and point toward marine biosecurity infrastructures fit for an era of accelerating global change.

Material and Methods

Data

Species selection

Species were included if detected in any processed ARMS data from 2018–2024, comprising: (1) the ARMS-MBON network (19 observatories across 14 countries, 2018–2021) 19 , or (2) the Swedish Ports Monitoring program (SPM; 23 sites, 2023–2024) 21 , 22 . Candidate NIS were initially detected using standardised genetic protocols based on DNA metabarcoding of mitochondrial COI and nuclear 18S rRNA markers, following the methodology established by the ARMS-MBON consortium 17 , 18 , 69 . Initial DNA-based species detections were cross-referenced against records from the World Register of Introduced Marine Species (WRiMS) 70 .
Ambiguous or mis-assigned operational taxonomic units (OTUs) were removed through manual curation, retaining only confident species-level identifications with confirmed alien status in Europe. The detailed methodology and the finalised ARMS-MBON NIS list have been published in Pagnier et al. 19 . The same method was applied to ARMS metabarcoding data obtained from the SPM campaigns of 2023 and 2024. All steps and code used to extract NIS occurrences from ARMS metabarcoding datasets have been published by Daraghmeh 71 . This process yielded 81 marine NIS with confirmed genetic detections and verified alien status in at least one European marine region (Supplementary File 1). This complete dataset (all years) was used only for species selection, to maximise detection of non-indigenous species present in European waters. Publication status in GBIF/OBIS at the time of modelling, however, determined whether specific ARMS detections were included in model training or reserved for independent validation (Table 2 ). Because our modelling workflow is based on GBIF/OBIS data, published ARMS detections were included in training by default (see Occurrence data compilation and cleaning).

Table 2 Genetic observatory datasets and their role in the modelling workflow. ARMS-MBON stands for Autonomous Reef Monitoring Structures - Marine Biodiversity Observatory Network, a pan-European network that samples and sequences in a standardised way. SPM stands for Swedish Ports Monitoring, a program that uses methods similar to those of ARMS-MBON for ARMS sampling, sequencing and bioinformatics, but focuses on ports located in Sweden. ARMS-MBON and SPM data selected for model training were complemented by other GBIF and OBIS occurrences (see next section). The SPM dataset from 2025 had not been fully processed at the time of species selection and was therefore only used for validation.
| Dataset | Years | N sites | Publication status (August 2025) | Species selection | Model training | Independent validation |
|---|---|---|---|---|---|---|
| ARMS-MBON | 2018–2019 | 15 | Published in GBIF/OBIS | ✓ | ✓ | - |
| ARMS-MBON | 2020–2021 | 13 | Unpublished | ✓ | - | ✓ |
| SPM | 2023 | 6 | Unpublished | ✓ | - | ✓ |
| SPM | 2024 | 11 | Published in GBIF/OBIS | ✓ | ✓ | - |
| SPM | 2025 | 10 | Unpublished | - | - | ✓ |

Occurrence data compilation and cleaning

Global occurrence records of the 81 selected species were obtained from two databases to maximise spatial coverage: the Global Biodiversity Information Facility (GBIF, www.gbif.org ), using the rgbif R package v.3.8.1 72 , and the Ocean Biodiversity Information System (OBIS, www.obis.org ), using the robis R package v.2.11.3 73 . The availability of ARMS data in GBIF/OBIS determined their use in our modelling workflow (Table 2 ). We applied a multi-step filtering procedure to ensure data quality. First, occurrence points on land were removed to exclude specimens stored in research facilities and records with erroneous georeferencing. Duplicates and points with invalid coordinates or geospatial issues were also removed. We only kept occurrences from January 2000 to July 2025 to match the temporal coverage of the environmental predictors used in modelling (see Environmental predictors section). This initial filtering resulted in 226,478 occurrence points across 81 species. To reduce spatial sampling bias, we applied spatial thinning using the spThin R package v.0.2.0 74 to all species with sufficient records, retaining only one occurrence per 10 km × 10 km grid cell using the thin function. After spatial thinning, 25,745 occurrences remained across the 81 species. During model formatting with BIOMOD2 75 , occurrences falling outside the extent of environmental predictors or coinciding with cells containing missing values were excluded. This resulted in 12,300 occurrence points across all species. This substantial loss was due to the continental shelf restriction of the environmental predictors (see next section).
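The thinning outcome described above (at most one record per 10 km × 10 km cell) can be illustrated with a minimal sketch. This is not the authors' code: their workflow runs in R with spThin, which thins by minimum nearest-neighbour distance rather than an explicit grid. The function names, the flat-Earth conversion constant, and the example coordinates below are all assumptions made for illustration.

```python
import math

KM_PER_DEG = 111.32  # approximate length of one degree of latitude in km (assumption)


def grid_cell(lon, lat, cell_km=10.0):
    """Return a (row, col) index for an approximately cell_km x cell_km cell."""
    row = math.floor(lat * KM_PER_DEG / cell_km)
    # Scale longitude with the latitude at the centre of the row,
    # so every point in the same row uses the same east-west scale.
    row_centre_lat = (row + 0.5) * cell_km / KM_PER_DEG
    km_per_deg_lon = KM_PER_DEG * math.cos(math.radians(row_centre_lat))
    col = math.floor(lon * km_per_deg_lon / cell_km)
    return row, col


def thin_by_grid(points, cell_km=10.0):
    """Keep the first occurrence encountered in each grid cell."""
    kept = {}
    for lon, lat in points:
        kept.setdefault(grid_cell(lon, lat, cell_km), (lon, lat))
    return list(kept.values())


# Hypothetical (lon, lat) records; the first two lie a few hundred metres apart.
records = [(11.970, 57.700), (11.972, 57.701), (12.500, 57.700), (3.000, 51.500)]
thinned = thin_by_grid(records)
print(len(records), "records before thinning,", len(thinned), "after")
```

Keeping the first record per cell is an arbitrary choice for the sketch; a randomised selection repeated several times would be closer in spirit to distance-based thinning.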
Twelve species with fewer than 10 remaining occurrences were excluded from further analysis. The final dataset comprised 69 species with sufficient data for modelling. A table summarising the number of occurrences per species at each stage can be found in Supplementary File 4. All raw and thinned/cleaned occurrences can be accessed on Figshare (see Data availability statement). Environmental predictors Predictors selection We selected 34 environmental variables from Bio-ORACLE v.3.0 36 representing key physiological drivers (temperature, salinity, oxygen, nutrients) and biological productivity (chlorophyll, primary productivity). Variables were downloaded using the biooracler R package v.0.0.0.9000 ( https://github.com/bio-oracle/biooracler ). We generated an additional distance-to-coast layer by calculating the distance from each marine cell to the nearest coastline using the Bio-ORACLE bathymetry raster as reference. All layers had a resolution of 0.05°. To reduce multicollinearity, we applied Variance Inflation Factor (VIF) analysis using the vifstep function from the usdm R package v.2.1–7 76 , iteratively removing variables with VIF > 10. This reduced the predictor set to 19 variables (Supplementary Table 2). To ensure consistency between current and future projections, we retained surface chlorophyll concentration rather than mean depth chlorophyll concentration, as the latter was unavailable for future climate scenarios. Temporal scope We obtained the same 19 environmental predictors for both current conditions (2000–2020 average) and end-of-century projections (2090–2100 average) under three CMIP6-based scenarios: SSP1-2.6 (high mitigation pathway), SSP2-4.5 (moderate mitigation scenario), and SSP5-8.5 (high emission scenario). Bathymetry and distance-to-coast layers remained constant across all scenarios. Spatial processing Bio-ORACLE layers contained missing values in shallow coastal cells where many occurrence records were located. 
To retain these ecologically relevant areas, we applied focal interpolation using nearest-neighbor values to estimate environmental conditions for coastal cells, with coastlines defined using the medium-scale world map from the rnaturalearth R package v.1.1.0 77 . We then restricted all environmental stacks to the continental shelf (0-200 m depth) based on the bathymetry layer. This spatial constraint was critical for two reasons: (1) it focused pseudo-absence selection on environmentally accessible habitats, reducing bias; and (2) it prevented models from simply learning to distinguish coastal from open-ocean environments, as our target species are predominantly coastal taxa. The final processed stacks comprised 19 environmental layers for four temporal scenarios (current + 3 SSP scenarios). Pseudo-absences generation We used a hybrid pseudo-absence (PA) selection strategy to balance spatial realism with geographic coverage in our models. We generated three independent PA datasets per species, each using a 50:50 mixture of two sampling strategies, to reduce sensitivity to pseudo-absence configuration. Half of the pseudo-absences were sampled using a disk-based approach: points were selected at distances between 20 and 100 km from observed presences, ensuring that pseudo-absences represented geographically accessible environments where species were not detected. The remaining half were randomly distributed across the study area (continental shelf only), providing broader environmental coverage and reducing spatial bias inherent in presence-only data. The number of pseudo-absences was scaled to three times the number of presence records per species, following BIOMOD2 recommendations 78 . A summary of the number of pseudo-absences generated per species can be found in Supplementary File 4. Before model fitting, we implemented a deduplication procedure to remove spatially redundant pseudo-absence points. 
When multiple pseudo-absences from different replicates overlapped spatially, we retained unique coordinates while tracking which replicates used each location.

Modelling

Model construction

Species distribution models (SDMs) were constructed using functions from the BIOMOD2 R package v.4.2-6-2 75 . For each species, we implemented an ensemble modelling approach, combining outputs from five algorithms: Generalized Additive Models (GAMs), Multivariate Adaptive Regression Splines (MARS), Maximum Entropy (MAXNET), Random Forest (RF), and Extreme Gradient Boosting (XGBOOST). These algorithms were selected to represent diverse modelling approaches (parametric vs. machine learning; linear vs. non-linear) and capture different aspects of species-environment relationships, with ensemble predictions reducing bias associated with any single algorithm 79 . Occurrence data were randomly divided into five folds, with each fold serving once as validation data while the remaining four folds trained the model. This process was repeated for all three PA datasets, resulting in 75 cross-validated models per species (5 algorithms × 5 folds × 3 PA datasets). Additionally, 15 fold-averaged models (5 algorithms × 3 PA datasets) and 5 full-dataset models (trained on all data) were produced, yielding 95 models in total per species. Models were then evaluated using two complementary metrics: the True Skill Statistic (TSS), which accounts for both sensitivity and specificity while remaining independent of prevalence 80 , and the Area Under the ROC Curve (AUC), which measures discrimination ability across all thresholds 81 . TSS and AUC were calculated on validation data (20% of occurrences in each fold) to assess model generalisation. To assess the relationship between sample size and model performance, we binned species into five categories by number of occurrence records (the uppermost bin being > 500 records) and calculated mean performance metrics and standard deviations for each bin.
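The model counts and the TSS definition given above can be checked with a short worked example. This is an illustrative sketch only: the counts are those stated in the text, the confusion-matrix values are hypothetical, and the helper name `true_skill_statistic` is an assumption rather than part of the authors' R workflow.

```python
def true_skill_statistic(tp, fn, tn, fp):
    """TSS = sensitivity + specificity - 1, ranging from -1 to 1."""
    sensitivity = tp / (tp + fn)  # proportion of presences correctly predicted
    specificity = tn / (tn + fp)  # proportion of (pseudo-)absences correctly predicted
    return sensitivity + specificity - 1


# Counts as stated in the text.
n_algorithms, n_folds, n_pa_sets, n_species = 5, 5, 3, 69
cross_validated = n_algorithms * n_folds * n_pa_sets  # 5 x 5 x 3
fold_averaged = n_algorithms * n_pa_sets              # 5 x 3
full_dataset = 5                                      # as stated in the text
models_per_species = cross_validated + fold_averaged + full_dataset
total_models = models_per_species * n_species

# Hypothetical confusion matrix for one validation fold.
example_tss = true_skill_statistic(tp=45, fn=5, tn=135, fp=15)

print(models_per_species, total_models, round(example_tss, 3))
```

The total reproduces the 6,555 individual models reported for the 69 species.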
LOESS smoothing was applied with the geom_smooth function to visualise performance trends across the sample size gradient.

Ensemble models and current projections

For each species, only individual models with TSS > 0.6 and AUC > 0.85 (indicating reliable predictive performance) were kept. Ensemble models were produced in BIOMOD2 using the weighted mean approach ( EMwmean ) to combine predictions from individual algorithms. In this ensemble type, each algorithm's contribution is weighted by its TSS from model calibration. This means that models with higher predictive accuracy contribute more strongly to the final ensemble, while models with similar scores receive comparable weights. Current projection maps were built using the BIOMOD_EnsembleForecasting function, the ensemble model and the stack of environmental predictors for the current scenario. The resulting maps therefore represent the consensus probability of suitable habitat across all algorithms, emphasising the most reliable models while retaining balanced representation of the ensemble.

Assessing ensemble models' uncertainty

In addition to the ensemble weighted average projections, we also produced coefficient of variation maps (named EMcv by BIOMOD2, renamed CoV here) and committee averaging maps (named EMca by BIOMOD2, renamed CA here) for each species and scenario (current and future). The CoV represents the coefficient of variation (standard deviation divided by mean) of predicted habitat suitability across all selected algorithms within the ensemble. High CoV values indicate areas where models disagree strongly on species suitability, whereas low values indicate higher inter-model agreement. Note that CoV is inflated in areas with very low mean suitability, where the denominator approaches zero. CoV maps were therefore restricted to areas with suitability > 0.3, displaying low-suitability areas (≤ 0.3) in white to indicate where CoV estimates are unreliable for uncertainty assessments. CoV values from BIOMOD2 (given as percentages) were divided by 100 for consistency.
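For a single grid cell, the TSS-weighted ensemble mean and the masked coefficient of variation described above can be sketched as follows. This is a simplified illustration, not BIOMOD2's implementation: weights are taken as directly proportional to TSS, the sample standard deviation is used, and the mask is applied to the weighted mean. All three choices, along with the function names and example values, are assumptions.

```python
import statistics


def weighted_mean(predictions, tss_scores):
    """Ensemble suitability: predictions weighted in proportion to each model's TSS."""
    total_weight = sum(tss_scores)
    return sum(p * w for p, w in zip(predictions, tss_scores)) / total_weight


def masked_cov(predictions, tss_scores, min_suitability=0.3):
    """CoV (sd / mean) across models, or None where ensemble suitability is <= threshold."""
    if weighted_mean(predictions, tss_scores) <= min_suitability:
        return None  # shown in white on the maps: CoV is unreliable here
    return statistics.stdev(predictions) / statistics.mean(predictions)


# Hypothetical suitability predictions (0-1 scale) from three retained models.
tss_scores = [0.65, 0.75, 0.90]
suitable_cell = [0.70, 0.80, 0.90]
unsuitable_cell = [0.05, 0.10, 0.30]

ensemble_value = weighted_mean(suitable_cell, tss_scores)
cov_suitable = masked_cov(suitable_cell, tss_scores)
cov_unsuitable = masked_cov(unsuitable_cell, tss_scores)
print(round(ensemble_value, 3), round(cov_suitable, 3), cov_unsuitable)
```

The second cell shows why the mask matters: its unmasked CoV would be large simply because the mean suitability is close to zero.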
The CA (committee averaging) represents the proportion of models agreeing on species presence or absence after transforming continuous predictions into binary classifications using thresholds that maximise TSS on validation data. This metric serves both as a prediction and an uncertainty measure: values near 1.0 indicate strong consensus for presence, values near 0.0 indicate consensus for absence, and values near 0.5 indicate high disagreement (approximately half the models predict presence while the other half predict absence). CA was rescaled from 0-1000 to 0–1 for consistency with other outputs. Future habitat suitability projections Weighted ensemble models were projected onto three future climate scenarios (SSP1-2.6, SSP2-4.5, SSP5-8.5) using BIOMOD_EnsembleForecasting with 2090–2099 environmental predictors. Projections used the same spatial extent as model training (0-200m continental shelf). Processing of projections maps To restrict analyses to ecologically relevant marine areas and species' non-indigenous ranges, we applied a multi-step spatial masking procedure on all BIOMOD’s projections outputs. First, all terrestrial cells were converted to NA. Second, we restricted projections to European waters using Marine Ecoregions of the World (MEOW) 37 . Importantly, species were masked to only those MEOWs where they are considered non-indigenous. This species-specific regional masking ensures that hotspot analyses reflect true invasion risk rather than including native populations. The MEOW assignments per species are provided in Supplementary File 5. Finally, we rescaled all current and future habitat suitability maps from BIOMOD2's native 0–1000 scale to a 0–1 range for intuitive interpretation. External validation with independent ARMS detections To assess model performance against spatially and temporally independent data, we compared ensemble predictions to ARMS detections that were excluded from model training. 
Specifically, we used: (1) ARMS-MBON detections from 2020–2021, and (2) SPM detections from 2023 and 2025 (Table 2 ). These datasets were unpublished in GBIF/OBIS at the time of modelling (August 2025) and therefore represent truly independent validation data, whereas ARMS-MBON 2018–2019 and SPM 2024 data had been published and were included in the training dataset along with other global occurrences. While these data did not allow formal statistical validation due to species-level imbalances in sample sizes and the lack of confirmed absences (DNA-based absences do not confirm true absence 82 ), they provided a valuable real-world test of model predictions. For each independent detection, we extracted the predicted habitat suitability value (0–1) at that location from our current-day ensemble models. Of the 69 modelled species, 51 had independent DNA-based detections available, of which 33 had detections falling within the projection extent (continental shelf). We calculated the proportion of these detections located in areas predicted suitable (> 0.5) and areas predicted highly suitable (> 0.8). Details can be found in Supplementary File 2. Analysis of model predictions We analysed habitat suitability projections using R v.4.3.1 with the terra v.1.8–54 83 , raster v.3.6–32 84 , and dplyr v1.1.4 85 packages for spatial processing, and ggplot2 v.3.5.2 86 for visualisation. Analyses were conducted at two scales: individual species and aggregated across all 69 species. Individual species analyses For each species, we generated five map types to characterise current distributions, uncertainty, and climate-driven changes: Current habitat suitability: Weighted ensemble predictions showing continuous suitability values (0–1 scale) Prediction uncertainty (CoV): Coefficient of variation maps masked to areas with suitability > 0.3, as CoV values are inflated in low-suitability areas due to near-zero denominators. 
Areas with S ≤ 0.3 were displayed in white to indicate unreliable CoV estimates.
Model agreement (CA): Committee averaging maps showing the proportion of models agreeing on presence or absence (see Ensemble Models section).
Projected change: Difference maps (ΔS = S_future - S_current) for SSP2-4.5 as an illustrative scenario, classified into seven categories running from moderate decrease through to the largest increase class (ΔS > 0.20). The ± 0.05 threshold excludes typical modelling noise while capturing ecologically meaningful shifts.
Suitability transitions: Binary classification of pixels using S = 0.5 as threshold into four categories: (a) remains unsuitable (current < 0.5, future < 0.5), (b) becomes suitable (current < 0.5, future ≥ 0.5), (c) remains suitable (current ≥ 0.5, future ≥ 0.5), and (d) becomes unsuitable (current ≥ 0.5, future < 0.5).

Aggregated multi-species analyses

To identify broad-scale NIS spread patterns, we calculated mean ensemble suitability across all 69 species for each grid cell, producing three analysis types:
Priority areas for biosecurity: Mean suitability values across species were classified into management priority categories using 0.1 intervals, from negligible at the lowest values up to the highest category (S > 0.6). Higher values indicate areas where environmental conditions support a greater proportion of the modelled NIS.
Scenario-based change maps: For all three SSP scenarios (SSP1-2.6, SSP2-4.5, SSP5-8.5), we calculated the change in mean suitability (ΔS = S_future - S_current) and classified values using the same seven categories as for the individual species maps. We quantified the percentage of the study area falling into each change category for each scenario.
Regional analysis by ecoregion: We used Marine Ecoregions of the World (MEOW) 37 to analyse 18 ecoregions within European waters: North and East Barents Sea, Northern Norway and Finnmark, Southern Norway, North and East Iceland, South and West Iceland, Faroe Plateau, North Sea, Celtic Seas, South European Atlantic Shelf, Baltic Sea, Black Sea, Adriatic Sea, Ionian Sea, Aegean Sea, Alboran Sea, Western Mediterranean, Levantine Sea, and Tunisian Plateau/Gulf of Sidra. For each ecoregion, we extracted pixel values using terra::extract and calculated both absolute change (ΔSmean) and percentage change (100 × ΔSmean / Scurrent) for all three scenarios. Statistics reported as mean ± standard deviation across all pixels within each ecoregion. Ecoregions were ordered by centroid latitude for visualisation. Ballast water contingency areas (BWCA) Spatial data on Offshore Wind Farms (OWFs) and Marine Protected Areas (MPAs) were downloaded from EMODnet ( www.emodnet.ec.europa.eu ). Distance to OWFs and MPAs were computed as new rasters, with the same method as the one employed above for distance to land. To identify potential Ballast Water Contingency Areas (BWCAs), we applied four spatial criteria: (1) low mean habitat suitability across all 69 modelled NIS, (2) minimum distance d from MPAs, (3) minimum distance d from OWFs, and (4) minimum distance d from coastline. Binary raster layers were created for each criterion using user-defined thresholds. For examples shown in Fig. 7 , thresholds were selected as: suitability = 0.2, d = 7 km. All code used to generate these analyses is fully accessible, allowing users to adjust thresholds, incorporate additional constraints, or tailor the approach to regional management needs and local environmental conditions. See Code Availability Statement. 
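The four BWCA criteria described above amount to a cell-wise logical AND of binary layers. The sketch below illustrates that logic on a handful of hypothetical cells; it is not the authors' raster code. The field names, the strict "<" on suitability, and the inclusive ">=" on distances are assumptions, while the thresholds (suitability = 0.2, d = 7 km) are those quoted for Fig. 7.

```python
def is_contingency_cell(cell, max_suitability=0.2, min_distance_km=7.0):
    """True when a cell meets all four criteria: low suitability and far enough
    from MPAs, offshore wind farms, and the coastline."""
    return (
        cell["mean_suitability"] < max_suitability
        and cell["dist_mpa_km"] >= min_distance_km
        and cell["dist_owf_km"] >= min_distance_km
        and cell["dist_coast_km"] >= min_distance_km
    )


# Hypothetical grid cells; all values are illustrative, not model output.
cells = [
    {"id": "A", "mean_suitability": 0.12, "dist_mpa_km": 15.0, "dist_owf_km": 22.0, "dist_coast_km": 30.0},
    {"id": "B", "mean_suitability": 0.45, "dist_mpa_km": 40.0, "dist_owf_km": 35.0, "dist_coast_km": 50.0},
    {"id": "C", "mean_suitability": 0.10, "dist_mpa_km": 3.0, "dist_owf_km": 25.0, "dist_coast_km": 12.0},
    {"id": "D", "mean_suitability": 0.18, "dist_mpa_km": 9.0, "dist_owf_km": 8.0, "dist_coast_km": 7.0},
]

candidate_ids = [c["id"] for c in cells if is_contingency_cell(c)]
print(candidate_ids)
```

Because each criterion is an independent binary layer, tightening one threshold (for example a larger distance from MPAs) only requires changing a single argument, which mirrors the user-adjustable design described in the text.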
Declarations Data availability All species distribution model outputs, including individual species ensemble projections and uncertainty metrics (CoV and CA), are publicly available on Figshare at https://figshare.com/s/ab27e1dcaee11ba59e88 . The repository includes: (i) habitat suitability maps (GeoTIFF format) for 69 marine non-indigenous species under current conditions and three future climate scenarios (SSP1-2.6, SSP2-4.5, SSP5-8.5 for 2090–2099); (ii) ensemble uncertainty metrics (coefficient of variation and committee averaging maps) for all species and all scenarios; (iii) aggregated mean suitability maps across all species and scenarios; and (iv) processed environmental predictor layers (Bio-ORACLE v.3.0 derived) used for modelling. Species occurrence data used for model training were obtained from the Global Biodiversity Information Facility (GBIF, https://www.gbif.org/ ) and Ocean Biodiversity Information System (OBIS, https://obis.org/ ) and are accessible through these public repositories. Raw GBIF downloads can be accessed via the DOIs found in Supplementary File 4 and are also provided on the Figshare repository alongside the raw OBIS downloads. Final cleaned and spatially thinned occurrence datasets for all 69 modelled species (n = 12,300 occurrence points) are provided in the Figshare repository as CSV files with coordinates. Model performance metrics (TSS and AUC scores) for all 6,555 individual species distribution models (69 species × 5 algorithms × 5 folds × 3 pseudo-absence datasets, plus fold-averaged and full-dataset models) are provided as a comprehensive CSV file in the Figshare repository. ARMS-MBON detection data (2018–2019) are available through GBIF (18S: https://www.gbif.org/dataset/6f2f07f3-1ef4-4f82-b3d4-d3bd1406650f ; COI: https://www.gbif.org/dataset/b9afe2d0-b264-4422-bf8c-096b2a53c18f ). 
ARMS-MBON detection data (2020–2021) were published in December 2025 (post-modelling) and are available through GBIF (18S: https://www.gbif.org/dataset/6f2f07f3-1ef4-4f82-b3d4-d3bd1406650f ; COI: https://www.gbif.org/dataset/542b31ca-712f-43dd-8f1d-8eeefb88c9a3 ). Swedish Ports Monitoring data (2024) are available through https://www.gbif.se/ipt/resource?r=nis_monitor_2024&v=1.1 . ARMS validation data that were unpublished at the time of modelling (ARMS-MBON 2020–2021; SPM 2023, 2025) are available in Supplementary File 2.

Code availability

All scripts for data collection, species distribution modelling, visualisation and analyses are publicly available via GitHub ( https://github.com/JustinePa/ARMS-marine-alien-SDM/tree/main ) and Figshare ( https://figshare.com/s/ab27e1dcaee11ba59e88 ).

Conflict of interest statement

The authors declare no conflict of interest.

Author contributions

J.P. and M.O. conceptualised the study and designed the research framework. J.P. performed all data collection, species distribution modelling, spatial analyses, and visualisation. T.A. provided critical feedback on modelling methodology and model evaluation approaches. M.G.A. developed the computational framework for identifying ballast water contingency areas based on model outputs and created Fig. 7. J.P. wrote the original manuscript draft. M.O. and T.A. supervised the research. All authors contributed to manuscript revision and approved the final version.

Acknowledgements

This work was supported by the SciLifeLab & Wallenberg Data Driven Life Science Program (grant: KAW2024.0159). The project was also supported financially by a grant from the DTO-bioflow project (grant no. 101112823), the EU project MARCO BOLO (grant no. 101082021) and Swedish Biodiversity Data Infrastructure (grant no. 2019-00242).
Computations and data processing were enabled by resources provided by the National Academic Infrastructure for Supercomputing in Sweden (NAISS), partially funded by the Swedish Research Council through grant agreement no. 2022-06725, using the Dardel high-performance computing system at the PDC Center for High Performance Computing, KTH Royal Institute of Technology, Stockholm, Sweden (https://www.pdc.kth.se). We thank the Biodiversity Data Lab at the University of Uppsala for providing feedback during project development, particularly on the statistical modelling approach. Support and feedback were also received from the Swedish Transport Agency, the Swedish Agency for Marine and Water Management, and the OSPAR/HELCOM Joint Task Group on Ballast Water Management Convention (BWMC) and Biofouling.

References
1. Venegas RM, Acevedo J, Treml EA (2023) Three decades of ocean warming impacts on marine ecosystems: A review and perspective. Deep Sea Res Part II Top Stud Oceanogr 212:105318
2. Alter K et al (2024) Hidden impacts of ocean warming and acidification on biological responses of marine animals revealed through meta-analysis. Nat Commun 15:2885
3. Wernberg T et al (2025) Marine heatwaves as hot spots of climate change and impacts on biodiversity and ecosystem services. Nat Rev Biodivers 1:461–479
4. Shi Y, Li Y (2024) Impacts of ocean acidification on physiology and ecology of marine invertebrates: a comprehensive review. Aquat Ecol 58:207–226
5. Röthig T et al (2023) Human-induced salinity changes impact marine organisms and ecosystems. Glob Change Biol 29:4731–4749
6. Korpinen S et al (2021) Combined effects of human pressures on Europe’s marine ecosystems. Ambio 50:1325–1336
7. Katsanevakis S, Zenetos A, Belchior C, Cardoso AC (2013) Invading European Seas: Assessing pathways of introduction of marine aliens. Ocean Coast Manag 76:64–74
8. Haubrock PJ et al (2025) The spread of non-native species. Biol Rev
9. Vilà M et al (2010) How well do we understand the impacts of alien species on ecosystem services? A pan-European, cross-taxa assessment. Front Ecol Environ 8:135–144
10. IPBES (2023) IPBES Invasive Alien Species Assessment: Summary for Policymakers. https://doi.org/10.5281/zenodo.11254974
11. Seebens H et al (2025) Biological invasions: a global assessment of geographic distributions, long-term trends, and data gaps. Biol Rev Camb Philos Soc 100:2542–2583
12. Turbelin AJ et al (2023) Biological invasions are as costly as natural hazards. Perspect Ecol Conserv 21:143–150
13. Diagne C et al (2021) High and rising economic costs of biological invasions worldwide. Nature 592:571–576
14. Pyšek P et al (2020) Scientists’ warning on invasive alien species. Biol Rev 95:1511–1534
15. Ahmed DA et al (2022) Managing biological invasions: the cost of inaction. Biol Invasions 24:1927–1946
16. Cuthbert RN et al (2022) Biological invasion costs reveal insufficient proactive management worldwide. Sci Total Environ 819:153404
17. Obst M et al (2020) A Marine Biodiversity Observation Network for Genetic Monitoring of Hard-Bottom Communities (ARMS-MBON). Front Mar Sci 7:572680
18. Daraghmeh N et al (2025) A Long-Term Ecological Research Data Set From the Marine Genetic Monitoring Program ARMS-MBON 2018–2020. Mol Ecol Resour 25:e14073
19. Pagnier J, Daraghmeh N, Obst M (2025) Using the long-term genetic monitoring network ARMS-MBON to detect marine non-indigenous species along the European coasts. Biol Invasions 27:1–26
20. Piazza A, Mikac B, Colangelo MA, Costantini F (2026) ARMS in ports: monitoring non-indigenous species through Autonomous Reef Monitoring Structures. Mar Pollut Bull 222:118545
21. Obst M (2024) National monitoring program for non-indigenous species (NIS), Sweden. https://doi.org/10.15468/ckcffa (2025)
22. Sunberg P, Eriksson A-K, Breidenbach M, Panova M, Obst M (2025) Övervakning av främmande marina arter med eDNA i södra Bohuslän 2024. https://www.lansstyrelsen.se/vastra-gotaland/om-oss/vara-tjanster/publikationer/2025/overvakning-av-frammande-marina-arter-med-edna-i-sodra-bohuslan-2024.html
23. Guisan A, Zimmermann NE (2000) Predictive habitat distribution models in ecology. Ecol Model 135:147–186
24. Elith J, Leathwick JR (2009) Species Distribution Models: Ecological Explanation and Prediction Across Space and Time. Annu Rev Ecol Evol Syst 40:677–697
25. Katsanevakis S et al (2023) Marine invasive alien species in Europe: 9 years after the IAS Regulation. Front Mar Sci 10:1271755
26. Yang T, Liu X, Han Z (2022) Predicting the Effects of Climate Change on the Suitable Habitat of Japanese Spanish Mackerel (Scomberomorus niphonius) Based on the Species Distribution Model. Front Mar Sci 9:927790
27. Sun Y et al (2024) Simulating the changes of the habitats suitability of chub mackerel (Scomber japonicus) in the high seas of the North Pacific Ocean using ensemble models under medium to long-term future climate scenarios. Mar Pollut Bull 207:116873
28. Gouvêa LP, Krause-Jensen D, Duarte CM, Assis J (2025) Projected impacts of future climate change on the aboveground biomass of seagrasses at global scale. Sci Total Environ 966:178680
29. Leitão F, Cánovas F (2025) Predicting climate change impacts on marine fisheries, biodiversity and economy in the Canary/Iberia current upwelling system. J Environ Manage 384:125537
30. De Wysiecki A et al (2025) Using global occurrence data to predict suitable habitats for widely distributed marine species in data-scarce regions. Biodivers Conserv 34:1497–1523
31. Robinson NM, Nelson WA, Costello MJ, Sutherland JE, Lundquist CJ (2017) A Systematic Review of Marine-Based Species Distribution Models (SDMs) with Recommendations for Best Practice. Front Mar Sci 4:421
32. Goldsmit J et al (2020) What and where? Predicting invasion hotspots in the Arctic marine realm. Glob Change Biol 26:4752–4771
33. Melo-Merino SM, Reyes-Bonilla H, Lira-Noriega A (2020) Ecological niche models and species distribution models in marine environments: A literature review and spatial analysis of evidence. Ecol Model 415:108837
34. Kumschick S et al (2025) Mapping potential environmental impacts of alien species in the face of climate change. Biol Invasions 27:43
35. Obst M, Huertas C, De Carlo F, DTO-BioFlow (2025) DUC 1 - Data-driven Strategies for Invasive Species Management. https://doi.org/10.5281/zenodo.17484964
36. Assis J et al (2024) Bio-ORACLE v3.0. Pushing marine data layers to the CMIP6 Earth System Models of climate change research. Glob Ecol Biogeogr 33:e13813
37. Spalding MD et al (2007) Marine Ecoregions of the World: A Bioregionalization of Coastal and Shelf Areas. Bioscience 57:573–583
38. Václavík T, Meentemeyer RK (2012) Equilibrium or not? Modelling potential distribution of invasive species in different stages of invasion. Divers Distrib 18:73–83
39. Early R, Sax DF (2014) Climatic niche shifts between species’ native and naturalized ranges raise concern for ecological forecasts during invasions and climate change. Glob Ecol Biogeogr 23:1356–1365
40. Srivastava V, Lafond V, Griess VC (2019) Species distribution models (SDM): applications, benefits and challenges in invasive species management. CABI Rev 1–13. https://doi.org/10.1079/PAVSNNR201914020
41. Hedensjö A, Strand Å, Laugen AT (2025) Habitat Preferences at the Leading Edge of a Marine Bioinvasion. Ecol Evol 15:e72475
42. Jiménez-Valverde A et al (2011) Use of niche models in invasive species risk assessments. Biol Invasions 13:2785–2797
43. Holman LE et al (2019) Detection of introduced and resident marine species using environmental DNA metabarcoding of sediment and water. Sci Rep 9:11559
44. Morisette J et al (2021) Strategic considerations for invasive species managers in the utilization of environmental DNA (eDNA): steps for incorporating this powerful surveillance tool. Manag Biol Invasions Int J Appl Res Biol Invasions 12:747–775
45. Sax DF, Schlaepfer MA, Olden JD (2022) Valuing the contributions of non-native species to people and nature. Trends Ecol Evol 37:1058–1066
46. Ulman A et al (2019) A Hitchhiker’s guide to Mediterranean marina travel for alien species. J Environ Manage 241:328–339
47. Katsanevakis S et al (2014) Invading the Mediterranean Sea: biodiversity patterns shaped by human activities. Front Mar Sci 1:32
48. Chan FT et al (2019) Climate change opens new frontiers for marine species in the Arctic: Current trends and future invasion risks. Glob Change Biol 25:25–38
49. Goldsmit J, McKindsey C, Archambault P, Howland KL (2019) Ecological risk assessment of predicted marine invasions in the Canadian Arctic. PLoS ONE 14:e0211815
50. Aksenov Y et al (2017) On the future navigability of Arctic sea routes: High-resolution projections of the Arctic Ocean and sea ice. Mar Policy 75:300–317
51. Bailey SA (2015) An overview of thirty years of research on ballast water as a vector for aquatic invasive species to freshwater and marine environments. Aquat Ecosyst Health Manag 18:261–268
52. Seebens H, Schwartz N, Schupp PJ, Blasius B (2016) Predicting the spread of marine species introduced by global shipping. Proc Natl Acad Sci USA 113:5646–5651
53. OSPAR (2025) Intra North Sea Ballast Water Contingency and Compliance Area in accordance with BWM.2/Circ.62 and MEPC.387(81)
54. Burfeind DD, Pitt KA, Connolly RM, Byers JE (2013) Performance of non-native species within marine reserves. Biol Invasions 15:17–28
55. Adams TP, Miller RG, Aleynik D, Burrows MT (2014) Offshore marine renewable energy devices as stepping stones across biogeographical boundaries. J Appl Ecol 51:330–338
56. Gollasch S, David M (2019) Ballast Water: Problems and Management. In: World Seas: An Environmental Evaluation. Elsevier, pp 237–250. https://doi.org/10.1016/B978-0-12-805052-1.00014-0
57. Seebens H, Gastner MT, Blasius B (2013) The risk of marine bioinvasion caused by global shipping. Ecol Lett 16:782–790
58. Blackman R et al (2024) Environmental DNA: The next chapter. Mol Ecol 33:e17355
59. Çevik T, Çevik N (2025) Environmental DNA (eDNA): A review of ecosystem biodiversity detection and applications. Biodivers Conserv 34:2999–3035
60. Rius M, Pascual M (2025) Genomics-informed Modelling: Advancing Our Understanding of Non-indigenous Species’ Colonization and Spread. In: Invasion Genomics, pp 162–174
61. Pollock LJ et al (2014) Understanding co-occurrence by modelling species simultaneously with a Joint Species Distribution Model (JSDM). Methods Ecol Evol 5:397–406
62. Tikhonov G, Abrego N, Dunson D, Ovaskainen O (2017) Using joint species distribution models for evaluating how species-to-species associations depend on the environmental context. Methods Ecol Evol 8:443–452
63. Guillera-Arroita G (2017) Modelling of species distributions, range dynamics and communities under imperfect detection: advances, challenges and opportunities. Ecography 40:281–295
64. Hawkins E, Sutton R (2009) The Potential to Narrow Uncertainty in Regional Climate Predictions. Bull Am Meteorol Soc 90:1095–1108
65. Frölicher TL, Rodgers KB, Stock CA, Cheung WWL (2016) Sources of uncertainties in 21st century projections of potential ocean ecosystem stressors. Glob Biogeochem Cycles 30:1224–1243
66. Vucetich JA, Hoy SR, Peterson RO (2025) More reason for humility in our relationships with ecological communities. Bioscience 75:163–171
67. Chen EY-S (2021) Often Overlooked: Understanding and Meeting the Current Challenges of Marine Invertebrate Conservation. Front Mar Sci 8:690704
68. Zhan A (2025) Multi-Omics-Driven Adaptive Management of Biological Invasions: Toward a Proactive, Predictive, and Integrative Framework. Biol Divers 2
69. Pagnier J et al (2025) A long-term ecological research dataset from the marine genetic monitoring programme ARMS-MBON 2020–2021. Biodivers Data J 13:e148981
70. Costello MJ et al (2026) World Register of Introduced Marine Species (WRiMS). https://doi.org/10.14284/347
71. Daraghmeh N (2024) ARMS-MBON 18S rRNA and COI gene metabarcoding: scanning for non-indigenous species
72. Chamberlain S et al (2024) rgbif: Interface to the Global Biodiversity Information Facility API
73. Provoost P, Bosch S, Appeltans W, OBIS (2022) robis: Ocean Biodiversity Information System (OBIS) Client
74. Aiello-Lammens ME, Boria RA, Radosavljevic A, Vilela B, Anderson RP (2015) spThin: an R package for spatial thinning of species occurrence records for use in ecological niche models. Ecography 38:541–545
75. Thuiller W, Lafourcade B, Engler R, Araújo MB (2009) BIOMOD – a platform for ensemble forecasting of species distributions. Ecography 32:369–373
76. Naimi B (2023) usdm: Uncertainty Analysis for Species Distribution Models
77. Massicotte P, South A, Hufkens K (2025) rnaturalearth: World Map Data from Natural Earth
78. Guéguen M, Blancheteau H, Lemaire-Patin R, Thuiller W (2025) Pseudo-absences. https://biomodhub.github.io/biomod2/articles/vignette_pseudoAbsences.html
79. Araújo MB, New M (2007) Ensemble forecasting of species distributions. Trends Ecol Evol 22:42–47
80. Allouche O, Tsoar A, Kadmon R (2006) Assessing the accuracy of species distribution models: prevalence, kappa and the true skill statistic (TSS). J Appl Ecol 43:1223–1232
81. Fielding AH, Bell JF (1997) A review of methods for the assessment of prediction errors in conservation presence/absence models. Environ Conserv 24:38–49
82. Duarte S, Vieira PE, Lavrador AS, Costa FO (2021) Status and prospects of marine NIS detection and monitoring through (e)DNA metabarcoding. Sci Total Environ 751:141729
83. Hijmans RJ et al (2025) terra: Spatial Data Analysis
84. Hijmans RJ et al (2025) raster: Geographic Data Analysis and Modeling
85. Wickham H et al (2023) dplyr: A Grammar of Data Manipulation
86. Wickham H et al (2025) ggplot2: Create Elegant Data Visualisations Using the Grammar of Graphics

Additional Declarations
There is NO Competing Interest.

Supplementary Files
Supplementary File 1: SupplementaryFile1Specieslistanddetails.xlsx
Supplementary File 2: SupplementaryFile2Externalvalidationdataandresults.xlsx
Supplementary File 3: SupplementaryFile3Futurechangesperecoregions.xlsx
Supplementary File 4: SupplementaryFile4TrackingofoccurrencesPAnumberpremodelling.xlsx
Supplementary File 5: SupplementaryFile5MarineEcoregionsperspecies.xlsx
Supplementary Information: PagnieretalSupplementaryInformation.docx
Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-8702791","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Article","associatedPublications":[],"authors":[{"id":581425031,"identity":"a918e351-2af3-4635-bd52-c48b5d4b0175","order_by":0,"name":"Justine Pagnier","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAA/klEQVRIiWNgGAWjYBACxgYgkQDjARlyxGhhbEDWYky0RXCQ2IBdFQIwz0h//uDhDhsGc/bj1x48qLiXPj8igfnDB3xWzMgxbEg8k8Zg2ZNTbpBwpjh3440ENskZ+LUwNiS2HWYwOJCTJpHYlpC7cUYCGzMPXi3pD4Fa/jMYnH8D1PIvId1wRgLz5z94tSQAHdZ2gMHgRvoxicSGhAR5iQQGaXzeZ+x5YzgjsS2Zx3LGGzaJhGMJhht4HrZJ9uDRYtie/uDjzzY7OXP+9GeSP2oS5OXbkw9/+IFPSwOE5jEAIRAwOIAaURhAHsYwYGB/ABHBr2EUjIJRMApGIAAALTxSkcHgI5AAAAAASUVORK5CYII=","orcid":"https://orcid.org/0000-0002-6531-1374","institution":"Department of Marine Sciences, SciLifeLab, University of Gothenburg","correspondingAuthor":true,"prefix":"","firstName":"Justine","middleName":"","lastName":"Pagnier","suffix":""},{"id":581425032,"identity":"99c30985-71b8-4fc2-b177-4c004c96736a","order_by":1,"name":"Tobias Andermann","email":"","orcid":"https://orcid.org/0000-0002-0932-1623","institution":"Uppsala University","correspondingAuthor":false,"prefix":"","firstName":"Tobias","middleName":"","lastName":"Andermann","suffix":""},{"id":581425033,"identity":"6ae08f21-9d1d-4b9b-a184-d1709109462b","order_by":2,"name":"Mats Andersson","email":"","orcid":"","institution":"Swedish Veterinary Agency 
(SVA)","correspondingAuthor":false,"prefix":"","firstName":"Mats","middleName":"","lastName":"Andersson","suffix":""},{"id":581425034,"identity":"f7ceefad-5438-48af-93e9-0ca97db508b4","order_by":3,"name":"Matthias Obst","email":"","orcid":"","institution":"Department of Marine Sciences, University of Gothenburg, Sweden","correspondingAuthor":false,"prefix":"","firstName":"Matthias","middleName":"","lastName":"Obst","suffix":""}],"badges":[],"createdAt":"2026-01-26 17:35:36","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-8702791/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-8702791/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":101480284,"identity":"f74d3596-67ff-43cb-aadc-a228010dfa0f","added_by":"auto","created_at":"2026-01-30 07:56:45","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":394498,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eModels’ performance metrics across varying sample sizes.\u003c/strong\u003e \u003cstrong\u003e(a)\u003c/strong\u003eRelationship between model performance (AUC and TSS) and number of occurrence records for individual species (n = 69). Each point represents a single SDM. LOESS smoothing was applied to visualise performance trends across the sample size gradient. 
\u003cstrong\u003e(b)\u003c/strong\u003e Standard deviation of model performance across binned occurrence categories, as a measure of performance consistency.\u003c/p\u003e","description":"","filename":"floatimage1.png","url":"https://assets-eu.researchsquare.com/files/rs-8702791/v1/6ebedad4f5a7091702dee225.png"},{"id":101480296,"identity":"066374d2-7ad6-47fc-a899-e1d50bd3739c","added_by":"auto","created_at":"2026-01-30 07:56:52","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":973251,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eHabitat suitability, coefficients of variation (CoV), and model agreement (CA) for two non-indigenous species across European seas.\u003c/strong\u003ePanels show ensemble model predictions for \u003cem\u003eCrepidula fornicata\u003c/em\u003e (a-c) and \u003cem\u003eAcartia (Acanthacartia) tonsa\u003c/em\u003e (d-f). (a, d) Habitat suitability from ensemble models, with values ranging from 0 (unsuitable, yellow) to 1 (highly suitable, dark blue). (b, e) Coefficient of variation (CoV) displaying prediction uncertainty only in areas with suitability \u0026gt; 0.3, with white areas indicating low suitability where CoV is not shown. Lower CoV values (light colors) indicate higher model consensus, while higher CoV values (dark red) indicate greater disagreement among algorithms. 
(c, f) Committee averaging showing the proportion of models agreeing on habitat suitability, with blue indicating high agreement among models and red indicating low agreement.\u003c/p\u003e","description":"","filename":"floatimage2.png","url":"https://assets-eu.researchsquare.com/files/rs-8702791/v1/b5a9c9dba0631b50311df8de.png"},{"id":101480301,"identity":"ee0889cb-329e-4319-b6fb-3fe4aa205e51","added_by":"auto","created_at":"2026-01-30 07:56:54","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":653401,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eProjected changes in habitat suitability and transitions for two marine non-indigenous species in European waters under SSP2-4.5 climate scenario by 2100. \u003c/strong\u003e(a, c) Projected change in habitat suitability between present conditions and 2100 for \u003cem\u003eCrepidula fornicata\u003c/em\u003e (a) and \u003cem\u003eAcartia (Acanthacartia) tonsa\u003c/em\u003e(c). Blue colours indicate decreasing suitability, while red/orange colours indicate increasing suitability. (b, d) Suitability transitions based on a threshold of S = 0.5 for \u003cem\u003eC. fornicata\u003c/em\u003e (b) and \u003cem\u003eA. tonsa\u003c/em\u003e (d). 
Grey areas remain unsuitable in both periods (S \u0026lt; 0.5), red areas become newly suitable, orange areas persist as suitable, and blue areas become unsuitable.\u003c/p\u003e","description":"","filename":"floatimage3.png","url":"https://assets-eu.researchsquare.com/files/rs-8702791/v1/e07c586a1a0dc53bc44caf70.png"},{"id":101480332,"identity":"08666775-8afe-4691-ae74-3f9575dc1c0d","added_by":"auto","created_at":"2026-01-30 07:57:08","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":461256,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003ePriority areas for non-indigenous species management in European seas under current climate conditions.\u003c/strong\u003e This map shows classified mean habitat suitability based on aggregated ensemble predictions from 69 marine non-indigenous species (NIS). Colors represent management priority levels derived from average ensemble suitability values (S), ranging from Negligible (S\u0026lt;0.1, light blue) to Severe (S≥0.6, dark red).\u003c/p\u003e","description":"","filename":"floatimage4.png","url":"https://assets-eu.researchsquare.com/files/rs-8702791/v1/40c28078ccd5ed83555037f8.png"},{"id":101480276,"identity":"f341f0ec-893b-427b-a33b-37074053ac59","added_by":"auto","created_at":"2026-01-30 07:56:42","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":525930,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eProjected changes in mean habitat suitability (69 NIS) in European seas. \u003c/strong\u003eProjected changes in mean habitat suitability by 2100 under three climate scenarios: (a) SSP1-2.6, (b) SSP2-4.5, and (c) SSP5-8.5. Changes are calculated as the difference between future and current mean suitability (positive values indicate increased suitability, negative values indicate decreased suitability). 
Colors represent magnitude of change, classified as: moderate decrease (\u0026lt; -0.10), slight decrease (-0.10 to -0.05), no change (-0.05 to 0.05), slight increase (0.05 to 0.10), moderate increase (0.10 to 0.15), substantial increase (0.15 to 0.20), and major increase (\u0026gt; 0.20).\u003c/p\u003e","description":"","filename":"floatimage5.png","url":"https://assets-eu.researchsquare.com/files/rs-8702791/v1/c2ee646bd2714da0f02d147b.png"},{"id":101480340,"identity":"c39ea360-8651-478a-9bbe-71218f8c6b9a","added_by":"auto","created_at":"2026-01-30 07:57:09","extension":"png","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":200460,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eRegional changes in mean habitat suitability (69 marine NIS) under three climate change scenarios.\u003c/strong\u003e \u003cstrong\u003e(a)\u003c/strong\u003e Relative change in mean habitat suitability from current to 2100 across three emission scenarios (SSP1-2.6, SSP2-4.5, SSP5-8.5). \u003cstrong\u003e(b)\u003c/strong\u003e Absolute change in mean habitat suitability for the same regions and scenarios. Error bars represent spatial variability (standard deviation) within each ecoregion. 
Ecoregions are ordered by latitude from north (top) to south (bottom).\u003c/p\u003e","description":"","filename":"floatimage6.png","url":"https://assets-eu.researchsquare.com/files/rs-8702791/v1/84631ad43df2f648ba9f04b9.png"},{"id":101480302,"identity":"90c70bf1-cf50-46e5-872c-fd5c911c1363","added_by":"auto","created_at":"2026-01-30 07:57:02","extension":"png","order_by":7,"title":"Figure 7","display":"","copyAsset":false,"role":"figure","size":695492,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eIdentification of ballast water contingency areas based on marine alien species’ habitat suitability and potential stepping stones (Marine Protected Areas and Offshore WindFarms).\u003c/strong\u003e \u003cstrong\u003e(a)\u003c/strong\u003e Overlay of current mean habitat suitability for non-indigenous species with offshore wind farms (OWF, yellow) and Marine Protected Areas (MPA, teal). \u003cstrong\u003e(b)\u003c/strong\u003e Proposed ballast water contingency areas (brown) corresponding to identified cold spots where low habitat suitability for alien species suggests reduced ecological risk for ballast water discharge or exchange operations.\u003c/p\u003e","description":"","filename":"floatimage7.png","url":"https://assets-eu.researchsquare.com/files/rs-8702791/v1/a49954bd380f9890d7219b6e.png"},{"id":101480418,"identity":"b2ade15d-4c35-4b47-8401-83c1f036f479","added_by":"auto","created_at":"2026-01-30 07:57:22","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":3838447,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-8702791/v1/22b28d7a-6d92-43e7-ac92-b775fb4fac32.pdf"},{"id":101480282,"identity":"babcc001-f08d-4b03-bbfd-a5fc504bb945","added_by":"auto","created_at":"2026-01-30 
07:56:44","extension":"xlsx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":13143,"visible":true,"origin":"","legend":"Supplementary File 1","description":"","filename":"SupplementaryFile1Specieslistanddetails.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-8702791/v1/87865a5abee90ef583aea021.xlsx"},{"id":101480328,"identity":"66f722c0-a0fb-4338-bedb-69e4ae0ed91a","added_by":"auto","created_at":"2026-01-30 07:57:06","extension":"xlsx","order_by":2,"title":"","display":"","copyAsset":false,"role":"supplement","size":24542,"visible":true,"origin":"","legend":"Supplementary File 2","description":"","filename":"SupplementaryFile2Externalvalidationdataandresults.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-8702791/v1/6cd77f4ee12df5d02bbc7b3f.xlsx"},{"id":101480326,"identity":"b32d972b-0866-4c65-be82-86cf082f73a0","added_by":"auto","created_at":"2026-01-30 07:57:05","extension":"xlsx","order_by":3,"title":"","display":"","copyAsset":false,"role":"supplement","size":9238,"visible":true,"origin":"","legend":"Supplementary File 3","description":"","filename":"SupplementaryFile3Futurechangesperecoregions.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-8702791/v1/fe189c68d4b6e416d9cc5162.xlsx"},{"id":101480331,"identity":"744cf80e-c9b3-45e3-964a-424bc55a9c95","added_by":"auto","created_at":"2026-01-30 07:57:07","extension":"xlsx","order_by":4,"title":"","display":"","copyAsset":false,"role":"supplement","size":7475,"visible":true,"origin":"","legend":"Supplementary File 5","description":"","filename":"SupplementaryFile5MarineEcoregionsperspecies.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-8702791/v1/17d02a0e817633eb0f10bf90.xlsx"},{"id":101480317,"identity":"27518b69-192c-4a82-8471-f84a7a99305e","added_by":"auto","created_at":"2026-01-30 
07:57:03","extension":"docx","order_by":5,"title":"","display":"","copyAsset":false,"role":"supplement","size":2840326,"visible":true,"origin":"","legend":"Supplementary Information","description":"","filename":"PagnieretalSupplementaryInformation.docx","url":"https://assets-eu.researchsquare.com/files/rs-8702791/v1/c605586aaba9d176e952f05a.docx"},{"id":101480294,"identity":"ad1f8ffe-7426-4aa8-bba7-ac9430b0a6d0","added_by":"auto","created_at":"2026-01-30 07:56:52","extension":"xlsx","order_by":6,"title":"","display":"","copyAsset":false,"role":"supplement","size":20173,"visible":true,"origin":"","legend":"Supplementary File 4","description":"","filename":"SupplementaryFile4TrackingofoccurrencesPAnumberpremodelling.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-8702791/v1/a993364f653518df2a8ad620.xlsx"}],"financialInterests":"There is \u003cb\u003eNO\u003c/b\u003e Competing Interest.","formattedTitle":"The role of genetic observatory networks in the detection and forecasting of marine non-indigenous species","fulltext":[{"header":"Introduction","content":"\u003cp\u003eIn a rapidly changing world, the ability to predict ecological dynamics and potential disruptions is essential. Marine ecosystems are increasingly threatened by rising temperatures\u003csup\u003e\u003cspan additionalcitationids=\"CR2\" citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e\u003c/sup\u003e, ocean acidification\u003csup\u003e\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e,\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e\u003c/sup\u003e, salinity changes\u003csup\u003e\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e\u003c/sup\u003e, and intensifying human activities\u003csup\u003e\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e\u003c/sup\u003e. 
Among these pressures, biological invasions by non-indigenous species (NIS) represent a persistent and accelerating driver of ecological change. These species, also called \u003cem\u003ealien\u003c/em\u003e, \u003cem\u003enon-native\u003c/em\u003e, or \u003cem\u003eintroduced\u003c/em\u003e species, are often transported through shipping, aquaculture, or drifting marine debris\u003csup\u003e\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e\u003c/sup\u003e. Some become \u003cem\u003einvasive\u003c/em\u003e, outcompeting native organisms, restructuring food webs, and altering ecosystem services\u003csup\u003e\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e\u003c/sup\u003e. Biological invasions across marine, terrestrial, and freshwater ecosystems are now recognized as one of the top five drivers of global biodiversity loss\u003csup\u003e\u003cspan additionalcitationids=\"CR10\" citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e\u003c/sup\u003e. Their global economic costs have risen by 702% from 1980\u0026ndash;1999 to 2000\u0026ndash;2019\u003csup\u003e12,13\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eAnticipating and mitigating invasions has become a priority for managing and preserving ocean ecosystems\u003csup\u003e\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e\u003c/sup\u003e. Biological invasions happen through distinct, yet coupled, stages of transport, introduction, establishment, and spread\u003csup\u003e\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e\u003c/sup\u003e, each presenting a separate opportunity for intervention. 
Yet, current marine biosecurity remains largely reactive rather than preventive, with countries typically deploying expensive post-invasion responses that rarely prove effective\u003csup\u003e\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e,\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eDetection efforts, although effective, operate after introduction has already occurred. They allow responses during the establishment and spread phases, when eradication becomes exponentially more difficult and costly. However, species detection in marine environments presents significant logistical challenges. The ocean's vast scale and dynamic nature make systematic surveillance difficult, and traditional survey methods remain spatially patchy, resource-intensive, and challenging to standardise across regions. In this context, DNA-based monitoring with metabarcoding offers a solution, enabling rapid, standardised detection of biodiversity from small environmental samples. The ARMS-MBON network (Autonomous Reef Monitoring Structures - Marine Biodiversity Observation Network) illustrates this approach: through standardised deployments of ARMS units at marine observatories across Europe and adjacent regions since 2018, this network provides comparable DNA-based biodiversity data that capture a wide spectrum of benthic organisms\u003csup\u003e\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e\u003c/sup\u003e. Recent studies demonstrate that ARMS-MBON can detect marine NIS effectively and at early stages\u003csup\u003e\u003cspan additionalcitationids=\"CR19\" citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e\u003c/sup\u003e. 
Additionally, many of the standards and methods developed by ARMS-MBON are now implemented in national monitoring programs (\u003cem\u003ee.g\u003c/em\u003e., Swedish Ports Monitoring program\u003csup\u003e\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e,\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e\u003c/sup\u003e). Yet, while improved detection accelerates response time, it remains inherently reactive.\u003c/p\u003e \u003cp\u003ePrediction, on the other hand, enables proactive intervention before introduction occurs. Species Distribution Models (SDMs)\u003csup\u003e\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e,\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e\u003c/sup\u003e, also known as Habitat Suitability or Ecological Niche-based Models, provide a way to forecast where species are likely to establish under current and future conditions. By identifying which regions provide suitable habitat for potential invaders, we can target prevention efforts at the transport stage. For example, this enables strengthening biosecurity at high-risk ports, prioritising surveillance in vulnerable ecosystems, and allocating management resources where they will be most effective. This is particularly crucial in marine systems, where dispersal can span hundreds of kilometers\u003csup\u003e\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e,\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e\u003c/sup\u003e. 
SDMs have been increasingly used to forecast climate-driven range shifts\u003csup\u003e\u003cspan additionalcitationids=\"CR27 CR28\" citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e\u003c/sup\u003e, identify suitable habitats in data-poor regions\u003csup\u003e\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e\u003c/sup\u003e, and assess invasion risk\u003csup\u003e\u003cspan additionalcitationids=\"CR32 CR33\" citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e\u003c/sup\u003e. One key application of spatial risk forecasts is the designation of Ballast Water Contingency Areas (BWCAs), which are zones where ships can perform ballast water exchange in case of contingencies. These areas must be carefully defined so that they do not favour the spread of NIS into new areas.\u003c/p\u003e \u003cp\u003eHowever, current marine biosecurity lacks integrated analytical frameworks that couple these predictive models with sensor networks and management decision tools. This disconnect, where monitoring data, ecological forecasts, and policy interventions remain poorly integrated, creates time lags that prevent proactive NIS management. The convergence of genetic monitoring and species distribution modelling offers a solution: a continuous feedback loop from detection to risk assessment to intervention. 
Such integrated frameworks align with emerging digital infrastructures like Digital Twins of the Ocean (DTOs)\u003csup\u003e\u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e\u003c/sup\u003e, which combine oceanographic, biological, and socioeconomic data into dynamic systems for scenario exploration and early warning.\u003c/p\u003e \u003cp\u003eIn this study, we demonstrate the validity of coupling DNA-based monitoring with species distribution modelling for NIS risk assessment and management. Specifically, we address three questions: (i) can SDMs trained on global occurrence data accurately predict current distributions of NIS detected by genetic observatories? (ii) can ensemble models identify spatial hotspots with elevated establishment potential (i.e., areas where multiple NIS find suitable habitat simultaneously) under current and future ocean conditions? and (iii) how can such spatially explicit forecasts support preventive NIS management?\u003c/p\u003e \u003cp\u003eHere, we modelled habitat suitability for 69 marine NIS detected by genetic observatories across European seas using global occurrence records from the Global Biodiversity Information Facility (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e\u003ca href=\"http://www.gbif.org\" target=\"_blank\"\u003ewww.gbif.org\u003c/a\u003e\u003c/span\u003e\u003cspan address=\"http://www.gbif.org\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e) and the Ocean Biodiversity Information System (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e\u003ca href=\"http://www.obis.org\" target=\"_blank\"\u003ewww.obis.org\u003c/a\u003e\u003c/span\u003e\u003cspan address=\"http://www.obis.org\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e), coupled with environmental predictors from Bio-ORACLE\u003csup\u003e\u003cspan citationid=\"CR36\" 
class=\"CitationRef\"\u003e36\u003c/span\u003e\u003c/sup\u003e. We then validate predictions against available independent DNA-based detections. Our models predict where NIS can establish based on environmental conditions. This provides a foundation for risk assessment that can be integrated with species-specific impact evaluations to prioritise management actions. By projecting suitability under multiple climate scenarios, we assess how alien spread may shift as ocean conditions change, providing a blueprint for a sensor-model-decision framework for evidence-based marine biosecurity in a changing ocean. We illustrate how these spatial forecasts can be integrated with existing marine and digital infrastructure to support ecosystem-based management in the future.\u003c/p\u003e"},{"header":"Results","content":"\u003cp\u003eWe successfully modelled habitat suitability across current and future ocean conditions for various marine NIS detected by genetic observatory networks (ARMS-MBON\u003csup\u003e\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e\u003c/sup\u003e, Swedish Ports Monitoring\u003csup\u003e\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e,\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e\u003c/sup\u003e). 
These species span 10 major taxonomic groups (22 arthropods, 11 annelids, 10 molluscs, 7 rhodophytes, 7 chordates, 4 bryozoans, 3 cnidarians, 3 ochrophytes, one ctenophore, and one dinoflagellate; see Supplementary File 1). The models yield species-specific predictions of suitable habitat and of shifts in invasion pressure under climate change across European waters.\u003c/p\u003e \u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003eModel assessments\u003c/h2\u003e \u003cp\u003eIndividual models\u003c/p\u003e \u003cp\u003eOf the 81 NIS initially selected, 69 passed quality criteria and produced viable models, with 95 individual models built per species (6,555 SDMs total): 75 cross-validation models (5 algorithms \u0026times; 5 k-folds \u0026times; 3 pseudo-absences datasets), 15 fold-averaged models (5 algorithms \u0026times; 3 pseudo-absences datasets), and 5 full-dataset models, trained on global occurrence data from GBIF and OBIS. Models were evaluated using 5-fold cross-validation, with performance assessed using TSS and ROC-AUC metrics.\u003c/p\u003e \u003cp\u003eAll algorithms demonstrated strong discrimination ability on validation data (Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e), with AUC values ranging from 0.859\u0026thinsp;\u0026plusmn;\u0026thinsp;0.132 (GAM) to 0.917\u0026thinsp;\u0026plusmn;\u0026thinsp;0.0827 (RF), closely followed by MAXNET (0.916\u0026thinsp;\u0026plusmn;\u0026thinsp;0.0747). 
Threshold-dependent performance showed greater variability, with TSS scores ranging from 0.611\u0026thinsp;\u0026plusmn;\u0026thinsp;0.244 (GAM) to 0.687\u0026thinsp;\u0026plusmn;\u0026thinsp;0.188 (MAXNET).\u003c/p\u003e \u003cp\u003eAs expected, calibration metrics were uniformly higher than validation metrics across all algorithms (Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e), indicating train-test gaps of varying magnitude between algorithms. 
For example, RF showed potential overfitting with near-perfect calibration (AUC\u0026thinsp;\u0026asymp;\u0026thinsp;1.00, TSS\u0026thinsp;=\u0026thinsp;0.989\u0026thinsp;\u0026plusmn;\u0026thinsp;0.00983) compared to validation scores (AUC\u0026thinsp;=\u0026thinsp;0.917\u0026thinsp;\u0026plusmn;\u0026thinsp;0.0827, TSS\u0026thinsp;=\u0026thinsp;0.647\u0026thinsp;\u0026plusmn;\u0026thinsp;0.213). In contrast, MAXNET exhibited the smallest train-test gap (calibration TSS\u0026thinsp;=\u0026thinsp;0.826\u0026thinsp;\u0026plusmn;\u0026thinsp;0.0644 vs. validation TSS\u0026thinsp;=\u0026thinsp;0.687\u0026thinsp;\u0026plusmn;\u0026thinsp;0.188), while other algorithms showed intermediate patterns. Despite these gaps, validation metrics remained high across all algorithms (mean AUC\u0026thinsp;=\u0026thinsp;0.892\u0026thinsp;\u0026plusmn;\u0026thinsp;0.101, mean TSS\u0026thinsp;=\u0026thinsp;0.644\u0026thinsp;\u0026plusmn;\u0026thinsp;0.214), indicating robust discriminatory ability. The ensemble modelling framework, which combines predictions from algorithms with complementary strengths and weaknesses, provides additional regularisation for final projections (see next section).\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003e\u003cb\u003eModels evaluation metrics.\u003c/b\u003e Average (\u0026plusmn;\u0026thinsp;standard deviation) values across all species, pseudo-absences\u0026rsquo; datasets, and model runs, for each algorithm and for all combined.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"7\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\"\u0026plusmn;\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv 
align=\"char\" char=\"\u0026plusmn;\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\"\u0026plusmn;\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\"\u0026plusmn;\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\"\u0026plusmn;\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\"\u0026plusmn;\" class=\"colspec\" colname=\"c7\" colnum=\"7\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eGAM\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eMARS\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eMAXNET\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eRF\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c6\"\u003e \u003cp\u003eXGBOOST\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c7\"\u003e \u003cp\u003eAll\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCalibration TSS\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c2\"\u003e \u003cp\u003e0.813\u0026thinsp;\u0026plusmn;\u0026thinsp;0.118\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c3\"\u003e \u003cp\u003e0.841\u0026thinsp;\u0026plusmn;\u0026thinsp;0.0877\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c4\"\u003e \u003cp\u003e0.826\u0026thinsp;\u0026plusmn;\u0026thinsp;0.0644\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c5\"\u003e \u003cp\u003e0.989 \u0026plusmn; 
0.00983\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c6\"\u003e \u003cp\u003e0.861\u0026thinsp;\u0026plusmn;\u0026thinsp;0.088\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c7\"\u003e \u003cp\u003e0.866\u0026thinsp;\u0026plusmn;\u0026thinsp;0.104\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eValidation TSS\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c2\"\u003e \u003cp\u003e0.611\u0026thinsp;\u0026plusmn;\u0026thinsp;0.244\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c3\"\u003e \u003cp\u003e0.648\u0026thinsp;\u0026plusmn;\u0026thinsp;0.201\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c4\"\u003e \u003cp\u003e0.687\u0026thinsp;\u0026plusmn;\u0026thinsp;0.188\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c5\"\u003e \u003cp\u003e0.647\u0026thinsp;\u0026plusmn;\u0026thinsp;0.213\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c6\"\u003e \u003cp\u003e0.625 \u0026plusmn; 0.210\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c7\"\u003e \u003cp\u003e0.644\u0026thinsp;\u0026plusmn;\u0026thinsp;0.214\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCalibration AUC\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c2\"\u003e \u003cp\u003e0.948\u0026thinsp;\u0026plusmn;\u0026thinsp;0.0397\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c3\"\u003e \u003cp\u003e0.961\u0026thinsp;\u0026plusmn;\u0026thinsp;0.0288\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c4\"\u003e 
\u003cp\u003e0.961\u0026thinsp;\u0026plusmn;\u0026thinsp;0.0209\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c5\"\u003e \u003cp\u003e1.00\u0026thinsp;\u0026plusmn;\u0026thinsp;0.000673\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c6\"\u003e \u003cp\u003e0.968\u0026thinsp;\u0026plusmn;\u0026thinsp;0.0247\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c7\"\u003e \u003cp\u003e0.968\u0026thinsp;\u0026plusmn;\u0026thinsp;0.0314\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eValidation AUC\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c2\"\u003e \u003cp\u003e0.859\u0026thinsp;\u0026plusmn;\u0026thinsp;0.132\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c3\"\u003e \u003cp\u003e0.883\u0026thinsp;\u0026plusmn;\u0026thinsp;0.102\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c4\"\u003e \u003cp\u003e0.916\u0026thinsp;\u0026plusmn;\u0026thinsp;0.0747\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c5\"\u003e \u003cp\u003e0.917\u0026thinsp;\u0026plusmn;\u0026thinsp;0.0827\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c6\"\u003e \u003cp\u003e0.888\u0026thinsp;\u0026plusmn;\u0026thinsp;0.0893\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c7\"\u003e \u003cp\u003e0.892\u0026thinsp;\u0026plusmn;\u0026thinsp;0.101\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eThe number of occurrence records used per species ranged from 10 (minimum allowed) to 847 (median: 118), with a clear positive relationship between sample size and model performance. 
Species with fewer than 50 occurrence records showed the highest variability and lowest mean performance (AUC: 0.822\u0026thinsp;\u0026plusmn;\u0026thinsp;0.15; TSS: 0.491\u0026thinsp;\u0026plusmn;\u0026thinsp;0.3), while species with 50\u0026ndash;200 occurrences demonstrated substantial improvement (AUC: 0.906\u0026thinsp;\u0026plusmn;\u0026thinsp;0.06; TSS: 0.67\u0026thinsp;\u0026plusmn;\u0026thinsp;0.14) (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003ea). Model performance continued to improve with increasing sample size, reaching optimal and stable performance in species with more than 100 occurrences (AUC: \u0026gt;0.94; TSS: \u0026gt;0.75).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eBeyond improving mean performance, larger occurrence datasets also made model performance substantially more consistent. The standard deviation of TSS decreased by 84% (from 0.300 to 0.048) and that of AUC by 87% (from 0.149 to 0.019) across the sample size gradient (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eb), indicating that models built with \u0026gt;\u0026thinsp;100 occurrences produce more consistent predictions across different algorithms and validation folds.\u003c/p\u003e \u003cp\u003eEnsemble models\u003c/p\u003e \u003cp\u003eGiven the variable performance across individual algorithms, we constructed ensemble models to draw on the strengths of multiple approaches. 
To create robust ensemble predictions for each species, we retained only individual models that met minimum performance thresholds (TSS\u0026thinsp;\u0026gt;\u0026thinsp;0.6 and AUC\u0026thinsp;\u0026gt;\u0026thinsp;0.85).\u003c/p\u003e \u003cp\u003eFour species had fewer than 10 models available for ensemble construction and were flagged for cautious interpretation: \u003cem\u003eApionsoma (Apionsoma) misakianum\u003c/em\u003e (n\u0026thinsp;=\u0026thinsp;9 models), \u003cem\u003eHerdmania momus\u003c/em\u003e (n\u0026thinsp;=\u0026thinsp;5), \u003cem\u003eTharyx setigera\u003c/em\u003e (n\u0026thinsp;=\u0026thinsp;4), and \u003cem\u003ePseudocalanus acuspes\u003c/em\u003e (n\u0026thinsp;=\u0026thinsp;4). Figures showing the performance distribution across species and algorithms are available in Supplementary Figs.\u0026nbsp;1 and 2.\u003c/p\u003e \u003cp\u003eEnsemble models showed improved performance compared to individual algorithms, with average scores of AUC\u0026thinsp;=\u0026thinsp;0.976\u0026thinsp;\u0026plusmn;\u0026thinsp;0.0137 and TSS\u0026thinsp;=\u0026thinsp;0.863\u0026thinsp;\u0026plusmn;\u0026thinsp;0.055 across all 69 species (calibration values, as ensemble models do not have validation values). 
Performance varied by species, with AUC ranging from 0.927 to 0.997 and TSS from 0.707 to 0.970.\u003c/p\u003e \u003cp\u003eOverall, ensemble models improved predictive performance, providing robust suitability estimates for subsequent spatial analyses.\u003c/p\u003e \u003cp\u003eExternal validation with independent DNA-based detections\u003c/p\u003e \u003cp\u003eWe used metabarcoding-based species detections from genetic observatory networks as an independent dataset to evaluate our model predictions. These detections from ARMS-MBON deployments (2020 and 2021) and Swedish Ports Monitoring (2023 and 2025) had not been published in GBIF/OBIS at the time of modelling and were therefore excluded from model training. 
An independent validation dataset was therefore available for 33 species.\u003c/p\u003e \u003cp\u003eOur model predictions showed strong agreement with independent ARMS observations (Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e) as approximately 90% of detections occurred in areas predicted as suitable (suitability\u0026thinsp;\u0026gt;\u0026thinsp;0.5), and around 47% in areas predicted as highly suitable (suitability\u0026thinsp;\u0026gt;\u0026thinsp;0.8) by our ensemble models (Supplementary Fig.\u0026nbsp;3). Species-specific and combined validation results are provided in Supplementary File 2. Absences were not tested as DNA-based absences do not confirm true absences.\u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003eIndividual suitability maps and uncertainty assessment\u003c/h3\u003e\n\u003cp\u003eWeighted ensemble model predictions provide habitat suitability estimates ranging from 0 (unsuitable) to 1 (highly suitable) for each species. We assessed prediction uncertainty using two complementary metrics: coefficient of variation (\u003cem\u003eEMcv\u003c/em\u003e in BIOMOD2, renamed CoV here) across algorithm predictions, and committee averaging (\u003cem\u003eEMca\u003c/em\u003e in BIOMOD2, renamed CA here) showing the proportion of models agreeing on presence/absence. 
However, coefficient of variation values are inflated in areas of very low habitat suitability due to division by near-zero mean values, and we therefore recommend interpreting uncertainty with care in these areas.\u003c/p\u003e \u003cp\u003eAll species-specific habitat suitability, CoV and CA maps are available on Figshare (see Data Availability Statement), for current ocean conditions and three 2100 climate scenarios (SSP1-2.6, SSP2-4.5, SSP5-8.5).\u003c/p\u003e \u003cp\u003eHere we present detailed results for two example species spanning different taxonomic groups and geographic patterns: the mollusc \u003cem\u003eCrepidula fornicata\u003c/em\u003e and the copepod \u003cem\u003eAcartia (Acanthacartia) tonsa\u003c/em\u003e (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003cem\u003eC. fornicata\u003c/em\u003e shows widespread suitable habitat throughout the North Atlantic, North Sea, and Kattegat/Skagerrak regions, with particularly high suitability along the southern and eastern coastline of the North Sea (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003ea). Despite this broad predicted range, uncertainty remains relatively low across most suitable areas (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eb), with CoV values predominantly below 0.5, indicating good consensus among modelling algorithms. Model agreement is high (CA\u0026thinsp;\u0026gt;\u0026thinsp;0.75) throughout suitable areas (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003ec), suggesting robust predictions in these areas.\u003c/p\u003e \u003cp\u003e \u003cem\u003eA. 
tonsa\u003c/em\u003e displays a different geographical pattern, with suitable habitat concentrated in the Baltic Sea, and discrete areas in the Mediterranean and Black Sea (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003ed). Uncertainty patterns for this species show moderate CoV values (0.25\u0026ndash;0.75) in key suitable areas, while model agreement remains high (CA\u0026thinsp;\u0026gt;\u0026thinsp;0.75) in core suitable habitats such as the southern and central Baltic Sea (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003ef).\u003c/p\u003e \u003cp\u003eFor both species, the coefficient of variation increases with distance from the coast, and these offshore areas are also where the models do not fully agree on presence or absence.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eClimate projections under the moderate emissions scenario (SSP2-4.5, 2100) reveal species-specific patterns of habitat redistribution (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003cem\u003eC. fornicata\u003c/em\u003e exhibits the more pronounced redistribution, with substantial decreases in suitability (\u0026lt;\u0026thinsp;\u0026minus;0.15) predicted throughout its current core areas in the North Sea, English Channel, and Irish Sea (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003ea). In parallel, we observe suitability gains in previously unsuitable Arctic waters, particularly the Barents Sea and northern Norwegian coast, suggesting a northward displacement of suitable habitat. For \u003cem\u003eA. tonsa\u003c/em\u003e, projected changes are more geographically restricted (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003ec). 
The species faces moderate to substantial decreases across much of the Baltic Sea, currently a core suitable habitat, as well as scattered decreases in the Mediterranean and Black Sea.\u003c/p\u003e \u003cp\u003eAll species-specific projections for all three SSP scenarios (SSP1-2.6, SSP2-4.5, SSP5-8.5) and associated uncertainty metrics are available in the data repository.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e\n\u003ch3\u003eCurrent priority areas for NIS management\u003c/h3\u003e\n\u003cp\u003eBeyond species-specific assessments, identifying geographic areas where multiple NIS may co-occur is critical for prioritising limited biosecurity resources. To identify priority areas for biosecurity alert and intervention, we aggregated suitability predictions across all 69 modelled species, revealing geographic hotspots where environmental conditions support multiple NIS (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e). Mean suitability values reflect the average habitat favorability across all 69 modelled species in each grid cell, with higher values indicating that, on average, environmental conditions are more favorable for the species pool. Note that this metric captures average suitability rather than species richness. Areas with high mean values may reflect either many species with moderate suitability or fewer species with very high suitability.\u003c/p\u003e \u003cp\u003eSevere priority areas (S\u0026thinsp;\u0026ge;\u0026thinsp;0.6, dark red) are spatially restricted to small, fragmented coastal patches primarily concentrated along both sides of the English Channel and the northern part of the Bay of Biscay. High priority zones (S\u0026thinsp;=\u0026thinsp;0.5\u0026ndash;0.6, red-orange) extend these hotspots along the North Sea coastlines and up to the Kattegat/Skagerrak region. 
These zones also comprise the northern Adriatic Sea, the Atlantic side of the Strait of Gibraltar, the northern Black Sea coast, the Tunisian coast, the Egyptian coast around the Suez Canal, parts of the southern French Mediterranean coast and of the northern parts of the Norwegian coastline. Moderate priority areas (S\u0026thinsp;=\u0026thinsp;0.3\u0026ndash;0.5) show broader geographic extent, covering additional Scandinavian coastal regions (including the Baltic), Icelandic coasts, and appearing in scattered locations throughout the Mediterranean and along Atlantic-facing shores. Arctic waters and offshore areas show consistently low suitability (S\u0026thinsp;\u0026lt;\u0026thinsp;0.3), reflecting limited NIS establishment under current conditions.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e\n\u003ch3\u003eFuture changes in NIS suitability and regional patterns\u003c/h3\u003e\n\u003cp\u003eWe projected habitat suitability to 2100 using environmental data from Bio-ORACLE v3.0 based on CMIP6 climate models under three emission scenarios: SSP1-2.6 (high mitigation), SSP2-4.5 (moderate mitigation), and SSP5-8.5 (high emissions). Change in mean suitability was calculated as the difference between future (2090\u0026ndash;2100 average) and current (2000\u0026ndash;2020 average) conditions across all 69 species.\u003c/p\u003e \u003cp\u003eProjections under future climate scenarios revealed predominantly northward increases in mean habitat suitability, with magnitude strongly dependent on emission pathway (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eUnder the low emission scenario (SSP1-2.6), changes in mean suitability were relatively modest with 96% of the total study area (European continental shelf) experiencing minimal change (between \u0026minus;\u0026thinsp;0.05 and +\u0026thinsp;0.05, white on maps). 
Only 2.8% of the total study area showed slight to moderate increases, primarily in northern latitudes (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003ea), while less than 1% exhibited slight decreases along the coast of the North Sea and the Kattegat.\u003c/p\u003e \u003cp\u003eThe intermediate emission scenario (SSP2-4.5) projected more pronounced increases, with 16% of the total study area experiencing slight to substantial increases, particularly in the Norwegian and Barents Seas (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eb), and only 0.4% showing slight decreases in small coastal areas.\u003c/p\u003e \u003cp\u003eThe high emission scenario (SSP5-8.5) produced the most dramatic changes: 55% of the total study area showed slight to moderate increases, while 3.6% exhibited important to major increases (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003ec). These increasingly suitable areas were concentrated in northern regions, especially in Arctic and sub-Arctic waters.\u003c/p\u003e \u003cp\u003eWe analysed these changes across 18 Marine Ecoregions of the World (MEOW)\u003csup\u003e\u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e\u003c/sup\u003e within European waters to identify regional patterns of climate-driven alien spread risk. Regional analysis revealed a pronounced latitudinal gradient (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e, Supplementary Table\u0026nbsp;1). Arctic and subarctic ecoregions showed both the largest relative (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003ea) and absolute (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003eb) increases. 
A detailed table with the computed relative and absolute changes can be found in Supplementary File 3.\u003c/p\u003e \u003cp\u003eArctic regions exhibited the most substantial relative changes under SSP5-8.5: the North and East Barents Sea showed a 301% \u0026plusmn; 135% increase (mean\u0026thinsp;\u0026plusmn;\u0026thinsp;SD across pixels within the ecoregion), followed by Northern Norway and Finnmark (104% \u0026plusmn; 71%), and Icelandic waters (North and East Iceland: 129% \u0026plusmn; 50%; South and West Iceland: 132% \u0026plusmn; 70%). Even under the conservative SSP1-2.6 scenario, the Barents Sea showed a 65% \u0026plusmn; 57% increase.\u003c/p\u003e \u003cp\u003eMid-latitude Atlantic regions (North Sea, Celtic Seas, South European Atlantic Shelf) demonstrated more moderate responses, with increases of 70\u0026ndash;90% under SSP5-8.5 and 10\u0026ndash;15% under SSP1-2.6.\u003c/p\u003e \u003cp\u003eMediterranean ecoregions generally showed modest increases of 10\u0026ndash;30% under SSP5-8.5, with the Adriatic and Ionian Seas (28\u0026ndash;29%) at the higher end and the Alboran Sea showing minimal change. Under lower-emission scenarios, Mediterranean increases were around 10% or below.\u003c/p\u003e \u003cp\u003eThe Black Sea exhibited near-zero or slightly negative changes across all scenarios despite currently high habitat suitability.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e\n\u003ch3\u003eBallast water contingency areas (BWCA)\u003c/h3\u003e\n\u003cp\u003eWe then explored a potential application of the above results for biosecurity. Using the aggregated suitability map for current conditions, we identified potential Ballast Water Contingency Areas in the eastern North Sea and Baltic Sea by applying spatial criteria: we kept areas with mean habitat suitability below 0.2 and located\u0026thinsp;\u0026gt;\u0026thinsp;7 kilometers from Marine Protected Areas, offshore wind farms, and coastlines. 
These represent invasion cold spots where NIS establishment risk is minimised (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eShips operating in these regions could be directed to these zones during contingency situations requiring ballast water discharge. Note that these thresholds are illustrative and may be adjusted based on specific contingency scenarios and regional biosecurity priorities.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e"},{"header":"Discussion","content":"\u003cp\u003eThe application of species distribution models (SDMs) to non-indigenous species (NIS) has been contested due to potential violations of the niche-environment equilibrium assumption\u003csup\u003e\u003cspan additionalcitationids=\"CR39\" citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e\u003c/sup\u003e. For example, recent work on the Pacific oyster invasion in Swedish waters has shown that species-habitat associations at invasion fronts may differ from established populations\u003csup\u003e\u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e\u003c/sup\u003e. However, Jim\u0026eacute;nez-Valverde et al.\u003csup\u003e42\u003c/sup\u003e argue that when appropriately constructed (\u003cem\u003ei.e.\u003c/em\u003e using distributional data from all invaded regions, predictors linked to physiological requirements, and careful model evaluation), SDMs provide robust predictions for invasion risk assessment. 
We show here that a dynamic integration of SDMs with standardised genetic observatories addresses these limitations: it allows species distributions to be updated continuously, thereby constantly refining estimates of the realised environmental preferences in invaded regions.\u003c/p\u003e \u003cp\u003eBy modelling the habitat suitability of 69 non-indigenous marine metazoan and algal taxa (spanning ten phyla) detected on Autonomous Reef Monitoring Structures across Europe, we obtained species-specific habitat suitability maps (current and future scenarios) and maps of priority areas for NIS management. These species were systematically and manually curated to confirm their introduced status, distinguishing genuine non-indigenous taxa from native species and potential contamination\u003csup\u003e\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e\u003c/sup\u003e. This curation step is essential because, while DNA-based monitoring efficiently captures cryptic invasions and early-stage establishments in near real-time\u003csup\u003e\u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e43\u003c/span\u003e,\u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e\u003c/sup\u003e, genetic detections alone cannot distinguish introduced from native occurrences.\u003c/p\u003e \u003cp\u003eOur validation demonstrates that ensembles of models trained on global occurrence data effectively predict the distributions of non-indigenous species detected through genetic observatories. Our ensemble modelling approach combines predictions from algorithms with complementary strengths and weaknesses, effectively regularising final projections by balancing conservative and flexible model types with different overfitting tendencies. 
Across 33 tested species, 90% of their independent DNA-based detections from the ARMS-MBON network and the Swedish Ports Monitoring occurred within areas predicted as suitable habitat (\u0026gt;\u0026thinsp;0.5). Importantly, this framework allows continuous model validation: as new DNA-based detections emerge, they can be used to assess model accuracy before being incorporated into updated training datasets, creating an iterative cycle of prediction, validation, and refinement.\u003c/p\u003e \u003cp\u003eIt is important to note that our habitat suitability projections are impact-neutral: we model environmental suitability for non-indigenous species regardless of whether individual species have demonstrated ecological or economic impacts. Not all non-indigenous species cause negative effects, and impacts vary substantially across species and contexts\u003csup\u003e\u003cspan citationid=\"CR45\" class=\"CitationRef\"\u003e45\u003c/span\u003e\u003c/sup\u003e. Our framework provides spatially explicit predictions of where species are likely to establish based on environmental conditions, but decisions about management priorities require integration of species-specific impact assessments alongside habitat suitability. This approach enables evidence-based, targeted resource allocation rather than categorical responses to all non-indigenous species.\u003c/p\u003e \u003cp\u003eStacking individual species projections allows the delineation of priority areas for NIS management. Under current conditions, these concentrate in the North Sea, with high mean habitat suitability extending along both sides of the English Channel, the northern Bay of Biscay, and through coastal regions to the Kattegat/Skagerrak. 
In contrast, the Mediterranean Sea exhibits patchier suitability despite experiencing high propagule pressure from shipping traffic\u003csup\u003e\u003cspan citationid=\"CR46\" class=\"CitationRef\"\u003e46\u003c/span\u003e\u003c/sup\u003e and connectivity to tropical source regions via the Suez Canal\u003csup\u003e\u003cspan citationid=\"CR47\" class=\"CitationRef\"\u003e47\u003c/span\u003e\u003c/sup\u003e. These areas of elevated invasion risk reflect current environmental constraints on NIS establishment, while the species-based predictions (e.g., \u003cem\u003eCrepidula fornicata\u003c/em\u003e and \u003cem\u003eAcartia (Acanthacartia) tonsa\u003c/em\u003e) illustrate the underlying individual responses to environmental conditions driving species-specific range expansions.\u003c/p\u003e \u003cp\u003eClimate change is rapidly reshaping these biogeographic dynamics. Future projections show scenario-dependent changes in habitat suitability across European waters, with high-emissions scenarios (SSP5-8.5) predicting increases across 58.79% of the study region by 2100 (over three times the extent under moderate emissions). A pronounced latitudinal gradient emerges across all scenarios, with Arctic and subarctic regions experiencing the greatest increases. The Barents Sea exhibits a three-fold increase in habitat suitability under SSP5-8.5. These projections indicate potential weakening of thermal barriers that currently constrain poleward expansion\u003csup\u003e\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e\u003c/sup\u003e, suggesting that climate-driven range shifts will define future invasion dynamics in European seas. 
This pattern aligns with observed increases in NIS discovery rates in Arctic systems globally\u003csup\u003e\u003cspan citationid=\"CR48\" class=\"CitationRef\"\u003e48\u003c/span\u003e\u003c/sup\u003e, with hotspots emerging in the Iceland Shelf, Barents Sea, Norwegian Sea, Hudson Bay, and Chukchi/Eastern Bering seas\u003csup\u003e\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e,\u003cspan citationid=\"CR49\" class=\"CitationRef\"\u003e49\u003c/span\u003e\u003c/sup\u003e. Together, these findings indicate that Arctic and subarctic marine ecosystems face heightened invasion pressure as warming relaxes historical thermal constraints. Concurrent expansion of Arctic shipping routes driven by sea ice reduction may further increase introduction opportunities through ballast water and biofouling vectors\u003csup\u003e\u003cspan citationid=\"CR50\" class=\"CitationRef\"\u003e50\u003c/span\u003e\u003c/sup\u003e. Hence, the identified Arctic and subarctic regions should be actively protected from future introductions.\u003c/p\u003e \u003cp\u003eOur habitat suitability projections provide spatially explicit information for multiple biosecurity applications. Ballast water discharge represents one of the primary vectors for marine biological invasions globally, with vessels transporting thousands of species across biogeographic barriers annually\u003csup\u003e\u003cspan citationid=\"CR51\" class=\"CitationRef\"\u003e51\u003c/span\u003e,\u003cspan citationid=\"CR52\" class=\"CitationRef\"\u003e52\u003c/span\u003e\u003c/sup\u003e. We demonstrate one decision-support application through identification of ballast water contingency areas in Northern European waters. This approach builds upon recent regulatory developments in the OSPAR Commission, where the Intra-North Sea Ballast Water Contingency Area was implemented in 2025 and will remain in effect until at least 2030\u003csup\u003e53\u003c/sup\u003e. 
By overlaying the aggregated current habitat suitability map with Marine Protected Areas and Offshore Wind Farms in HELCOM regions, we delineate zones where ballast water discharge presents reduced invasion risk, i.e. areas characterised by low mean suitability and sufficient distance from sites that could facilitate secondary spread\u003csup\u003e\u003cspan citationid=\"CR54\" class=\"CitationRef\"\u003e54\u003c/span\u003e,\u003cspan citationid=\"CR55\" class=\"CitationRef\"\u003e55\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eThis approach offers a practical complement to existing ballast water management regulations, particularly for vessels unable to perform mid-ocean exchange due to safety constraints or operational limitations\u003csup\u003e\u003cspan citationid=\"CR56\" class=\"CitationRef\"\u003e56\u003c/span\u003e\u003c/sup\u003e. Our contingency zone framework provides port authorities and vessel operators with spatially explicit alternatives: if mid-ocean exchange is not feasible, discharge in designated low-risk areas minimises the probability that propagules encounter suitable habitat or colonise ecologically sensitive sites. As climate scenarios shift the geography of habitat suitability, these contingency zones can be updated, ensuring management strategies remain responsive to shifting species distributions. Additionally, port-level risk assessments can help prioritise inspection and monitoring efforts by cross-referencing vessel arrivals with projected suitability for species known to be present in source regions\u003csup\u003e\u003cspan citationid=\"CR57\" class=\"CitationRef\"\u003e57\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eSurveillance optimisation represents another key application. 
Deploying genetic observatories in areas with high projected suitability but no current detections enables interception during establishment phases when eradication remains feasible\u003csup\u003e\u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e43\u003c/span\u003e\u003c/sup\u003e. This proactive approach contrasts with reactive monitoring in areas where invasions have already occurred, potentially reducing management costs and ecological impacts.\u003c/p\u003e \u003cp\u003eModel projections and genetic monitoring both carry inherent uncertainties that must be acknowledged. First, our taxonomic scope is constrained to benthic fauna and flora detectable on ARMS units using COI and 18S metabarcoding, excluding most fish and pelagic species. This is particularly relevant for the Mediterranean, where our species list likely underrepresents Lessepsian migrants from the Suez Canal, many of which are mobile taxa not always detected on ARMS. Additionally, the geographic distribution of ARMS deployments shapes our species inventory, with undersampled regions potentially harboring undetected NIS.\u003c/p\u003e \u003cp\u003eGenetic monitoring faces inherent challenges that may affect detection accuracy. False negatives can occur due to low species abundance, seasonal variability in DNA shedding rates, or limited DNA persistence in the environment\u003csup\u003e\u003cspan citationid=\"CR58\" class=\"CitationRef\"\u003e58\u003c/span\u003e,\u003cspan citationid=\"CR59\" class=\"CitationRef\"\u003e59\u003c/span\u003e\u003c/sup\u003e. 
Metabarcoding identifications are further constrained by incomplete reference databases, cryptic species complexes, and potential sequence misassignments, though our use of manually curated species lists, verified against WRiMS records, substantially mitigates these taxonomic uncertainties.\u003c/p\u003e \u003cp\u003eSDMs project habitat suitability based on realised environmental niches captured in available occurrence data, which may not fully represent a species' fundamental niche or account for biotic interactions, dispersal limitations, or rapid evolutionary adaptation\u003csup\u003e\u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e\u003c/sup\u003e. Recent methodological advances combining population genomics with SDMs show promise for incorporating genetic connectivity and local adaptation signals into niche models\u003csup\u003e\u003cspan citationid=\"CR60\" class=\"CitationRef\"\u003e60\u003c/span\u003e\u003c/sup\u003e, though such approaches require genetic data types beyond the scope of metabarcoding-based observatories. Similarly, joint species distribution models (JSDMs) offer complementary approaches by explicitly accounting for species co-occurrences and potential biotic interactions through residual correlation structures\u003csup\u003e\u003cspan citationid=\"CR61\" class=\"CitationRef\"\u003e61\u003c/span\u003e,\u003cspan citationid=\"CR62\" class=\"CitationRef\"\u003e62\u003c/span\u003e\u003c/sup\u003e. Because we do not explicitly model dispersal, biotic interactions, or propagule pressure, our projections represent potential suitability rather than realised distributions. Our reliance on pseudo-absences rather than confirmed absences introduces further uncertainty, as background sampling points may occur in locations where species are present but undetected. 
However, as ARMS-MBON time series lengthen with continued deployment, the accumulation of temporal sampling will allow distinction between true absences and non-detection events, improving the reliability of absence inference from DNA-based datasets and enabling more robust model development\u003csup\u003e\u003cspan citationid=\"CR63\" class=\"CitationRef\"\u003e63\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eEnvironmental projections derived from CMIP6 climate models carry inherent uncertainty that increases with projection distance into the future, primarily due to scenario divergence and inter-model variability\u003csup\u003e\u003cspan citationid=\"CR64\" class=\"CitationRef\"\u003e64\u003c/span\u003e,\u003cspan citationid=\"CR65\" class=\"CitationRef\"\u003e65\u003c/span\u003e\u003c/sup\u003e. While sea surface temperature projections for European shelf seas have medium-high confidence in IPCC assessments, other oceanographic variables show greater inter-model disagreement. Importantly, projections to 2100 may involve novel climate combinations outside the environmental space captured in our training data, requiring model extrapolation into conditions with no modern analogue\u003csup\u003e\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e\u003c/sup\u003e. This is particularly relevant under high-emission scenarios (SSP5-8.5), where the magnitude of environmental change increases uncertainty in predicted responses. 
We partially address these uncertainties through ensemble modelling and coefficient of variation analyses, though our focus on relative spatial patterns (identifying where risk increases most) should be more robust than absolute distribution predictions.\u003c/p\u003e \u003cp\u003eForecasting ecological futures requires humility: marine biological systems are complex and subject to contingencies our models cannot anticipate\u003csup\u003e\u003cspan citationid=\"CR66\" class=\"CitationRef\"\u003e66\u003c/span\u003e\u003c/sup\u003e. However, our modelling approach can accommodate much of the complexity when continuously updated with emerging data and validated against empirical observations.\u003c/p\u003e \u003cp\u003eIntegrating anthropogenic pressure layers (including port traffic, shipping networks, and aquaculture facilities) would transform habitat suitability into comprehensive invasion risk maps by accounting for introduction pathways alongside environmental suitability. Network analysis linking source populations to European coasts via shipping routes or currents could refine arrival probabilities. In the future, it will also be possible to include oceanographic models to incorporate species dispersal dynamics in this framework, which would enable prediction of spread rates and invasion corridors after establishment. This, however, requires more data on the studied species, especially when considering marine invertebrates which can often be overlooked\u003csup\u003e\u003cspan citationid=\"CR67\" class=\"CitationRef\"\u003e67\u003c/span\u003e\u003c/sup\u003e, and is therefore beyond the scope of this study.\u003c/p\u003e \u003cp\u003eIntegration with genetic observatory infrastructure represents a critical next step. 
Recent advances in multi-omics-driven frameworks for invasive species management emphasise the value of integrating molecular detection with predictive modelling to enable proactive, rather than reactive, biosecurity responses\u003csup\u003e\u003cspan citationid=\"CR68\" class=\"CitationRef\"\u003e68\u003c/span\u003e\u003c/sup\u003e. The ARMS-MBON network and similar DNA-based monitoring networks can generate continuous layers of species observation data with which to validate and refine model predictions on a regular basis. Standardising data pipelines between genetic observatories will enable automated model updates as new detections accumulate, transforming static forecasts into dynamic early warning systems. Our framework supports iterative updating: as new NIS arrive or established NIS shift their ranges, incorporating the new observations and rerunning projections will track the invasion frontiers. Automated workflows integrating occurrence databases, environmental data streams, and modelling pipelines could enable annual or event-triggered model updates. This adaptive approach treats forecasting as an automated process rather than a one-time analysis.\u003c/p\u003e \u003cp\u003eThe integration of standardised genetic monitoring with ensemble modelling provides actionable spatial information for proactive biosecurity. Integration into the European Digital Twin of the Ocean infrastructure would enable real-time decision support, connecting invasion forecasts to marine spatial planning and climate adaptation strategies. 
Such systems require sustained investment in cyberinfrastructure and data standardisation across monitoring networks, but foundational components demonstrated here confirm this vision is achievable and point toward marine biosecurity infrastructures fit for an era of accelerating global change.\u003c/p\u003e"},{"header":"Material and Methods","content":"\u003cdiv id=\"Sec10\" class=\"Section2\"\u003e \u003ch2\u003eData\u003c/h2\u003e \u003cp\u003eSpecies selection\u003c/p\u003e \u003cp\u003eSpecies were included if detected in any processed ARMS data from 2018\u0026ndash;2024, comprising: (1) the ARMS-MBON network (19 observatories across 14 countries, 2018\u0026ndash;2021)\u003csup\u003e\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e\u003c/sup\u003e, or (2) the Swedish Ports Monitoring program (SPM; 23 sites, 2023\u0026ndash;2024)\u003csup\u003e\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e,\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eCandidate NIS species were initially detected using standardised genetic protocols based on DNA metabarcoding of mitochondrial COI and nuclear 18S rRNA markers, following the methodology established by the ARMS-MBON consortium\u003csup\u003e\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e,\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e,\u003cspan citationid=\"CR69\" class=\"CitationRef\"\u003e69\u003c/span\u003e\u003c/sup\u003e. Initial DNA-based species detections were cross-referenced against records from the World Register of Introduced Marine Species (WRiMS)\u003csup\u003e\u003cspan citationid=\"CR70\" class=\"CitationRef\"\u003e70\u003c/span\u003e\u003c/sup\u003e. 
Ambiguous or mis-assigned operational taxonomic units (OTUs) were removed through manual curation, retaining only confident species-level identifications and confirmed alien status in Europe. The detailed methodology and finalised ARMS-MBON NIS list have been published in Pagnier et al.\u003csup\u003e19\u003c/sup\u003e. The same method was applied to ARMS metabarcoding data obtained from the SPM campaigns of 2023 and 2024. All steps and code used to extract NIS occurrences from ARMS metabarcoding datasets have been published by Daraghmeh\u003csup\u003e\u003cspan citationid=\"CR71\" class=\"CitationRef\"\u003e71\u003c/span\u003e\u003c/sup\u003e. This process yielded 81 marine NIS with confirmed genetic detections and verified alien status in at least one European marine region (Supplementary File 1).\u003c/p\u003e \u003cp\u003eThis complete dataset (all years) was used only for species selection to maximise detection of non-indigenous species present in European waters. However, publication status in GBIF/OBIS at the time of modelling determined whether specific ARMS detections were included in model training or reserved for independent validation (Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e). Because our modelling workflow draws on GBIF/OBIS data, published ARMS detections were included in training by default (see Occurrence data compilation and cleaning).\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eGenetic observatory datasets and their role in modelling workflow. ARMS-MBON stands for Autonomous Reef Monitoring Structures - Marine Biodiversity Observatory Network, which is a pan-European network sampling and sequencing in a standardised way. 
SPM stands for Swedish Ports Monitoring, which is a program using similar methods as ARMS MBON for ARMS sampling, sequencing and bioinformatics, but focused on ports located in Sweden. ARMS-MBON and SPM data selected for model training were complemented by other GBIF and OBIS occurrences (see next section). The SPM dataset from 2025 had not been fully processed at the time of species selection and was therefore only used for validation.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"7\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c7\" colnum=\"7\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eDataset\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eYears\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eN sites\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003ePublication status (August 2025)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eSpecies selection\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c6\"\u003e \u003cp\u003eModel training\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c7\"\u003e \u003cp\u003eIndependent validation\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e 
\u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eARMS-\u003c/p\u003e \u003cp\u003eMBON\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e2018\u0026ndash;2019\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e15\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003ePublished in GBIF/OBIS\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e✓\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e✓\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e\u0026mdash;\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eARMS-\u003c/p\u003e \u003cp\u003eMBON\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e2020\u0026ndash;2021\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e13\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eUnpublished\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e✓\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e\u0026mdash;\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e✓\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSPM\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e2023\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eUnpublished\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e✓\u003c/p\u003e \u003c/td\u003e 
\u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e\u0026mdash;\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e✓\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSPM\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e2024\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e11\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003ePublished in GBIF/OBIS\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e✓\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e✓\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e\u0026mdash;\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSPM\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e2025\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e10\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eUnpublished\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e\u0026mdash;\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e\u0026mdash;\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e✓\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eOccurrence data compilation and cleaning\u003c/p\u003e \u003cp\u003eGlobal occurrence records of the 81 selected species were obtained from two databases to maximise spatial coverage: the Global Biodiversity Information Facility (GBIF, \u003cspan 
class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e\u003ca href=\"https://orcid.org/0000-0002-6531-1374\" target=\"_blank\"\u003ewww.gbif.org\u003c/a\u003e\u003c/span\u003e\u003cspan address=\"http://www.gbif.org\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e) using the \u003cem\u003ergbif\u003c/em\u003e R package v.3.8.1\u003csup\u003e72\u003c/sup\u003e, and the Ocean Biodiversity Information System (OBIS, \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e\u003ca href=\"https://orcid.org/0000-0002-6531-1374\" target=\"_blank\"\u003ewww.obis.org\u003c/a\u003e\u003c/span\u003e\u003cspan address=\"http://www.obis.org\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e), using \u003cem\u003erobis\u003c/em\u003e package v.2.11.3\u003csup\u003e73\u003c/sup\u003e. ARMS data availability in GBIF/OBIS determined their use in our modelling workflow (Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eWe applied a multi-step filtering procedure to ensure data quality. First, occurrence points on land were removed to exclude specimens stored in research facilities or records with erroneous georeferencing. Duplicates and points with invalid coordinates or geospatial issues were also removed. We only kept occurrences from January 2000 to July 2025 to match the temporal resolution of the environmental predictors used in modelling (see Environmental predictors section). This initial filtering resulted in 226,478 occurrence points across 81 species.\u003c/p\u003e \u003cp\u003eTo reduce spatial sampling bias, we applied spatial thinning using the \u003cem\u003espThin\u003c/em\u003e R package v.0.2.0\u003csup\u003e74\u003c/sup\u003e to all species with sufficient records, retaining only one occurrence per 10 km \u0026times; 10 km grid cell using the \u003cem\u003ethin\u003c/em\u003e function. 
After spatial thinning, 25,745 occurrences remained across the 81 species.\u003c/p\u003e \u003cp\u003eDuring model formatting with BIOMOD2\u003csup\u003e75\u003c/sup\u003e, occurrences falling outside the extent of environmental predictors or coinciding with cells containing missing values were excluded. This left 12,300 occurrence points across all species. This substantial loss was due to the continental shelf restriction of the environmental predictors (see next section). Twelve species with fewer than 10 remaining occurrences were excluded from further analysis. The final dataset comprised 69 species with sufficient data for modelling.\u003c/p\u003e \u003cp\u003eA table summarising the number of occurrences per species at each stage can be found in Supplementary File 4. All raw and thinned/cleaned occurrences can be accessed on Figshare (see Data availability statement).\u003c/p\u003e \u003cp\u003eEnvironmental predictors\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec11\" class=\"Section2\"\u003e \u003ch2\u003ePredictors selection\u003c/h2\u003e \u003cp\u003eWe selected 34 environmental variables from Bio-ORACLE v.3.0\u003csup\u003e36\u003c/sup\u003e representing key physiological drivers (temperature, salinity, oxygen, nutrients) and biological productivity (chlorophyll, primary productivity). Variables were downloaded using the \u003cem\u003ebiooracler\u003c/em\u003e R package v.0.0.0.9000 (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://github.com/bio-oracle/biooracler\u003c/span\u003e\u003cspan address=\"https://github.com/bio-oracle/biooracler\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e). We generated an additional distance-to-coast layer by calculating the distance from each marine cell to the nearest coastline using the Bio-ORACLE bathymetry raster as reference. 
All layers had a resolution of 0.05\u0026deg;.\u003c/p\u003e \u003cp\u003eTo reduce multicollinearity, we applied Variance Inflation Factor (VIF) analysis using the \u003cem\u003evifstep\u003c/em\u003e function from the \u003cem\u003eusdm\u003c/em\u003e R package v.2.1\u0026ndash;7\u003csup\u003e76\u003c/sup\u003e, iteratively removing variables with VIF\u0026thinsp;\u0026gt;\u0026thinsp;10. This reduced the predictor set to 19 variables (Supplementary Table\u0026nbsp;2). To ensure consistency between current and future projections, we retained surface chlorophyll concentration rather than mean depth chlorophyll concentration, as the latter was unavailable for future climate scenarios.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec12\" class=\"Section2\"\u003e \u003ch2\u003eTemporal scope\u003c/h2\u003e \u003cp\u003eWe obtained the same 19 environmental predictors for both current conditions (2000\u0026ndash;2020 average) and end-of-century projections (2090\u0026ndash;2100 average) under three CMIP6-based scenarios: SSP1-2.6 (high mitigation pathway), SSP2-4.5 (moderate mitigation scenario), and SSP5-8.5 (high emission scenario). Bathymetry and distance-to-coast layers remained constant across all scenarios.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec13\" class=\"Section2\"\u003e \u003ch2\u003eSpatial processing\u003c/h2\u003e \u003cp\u003eBio-ORACLE layers contained missing values in shallow coastal cells where many occurrence records were located. To retain these ecologically relevant areas, we applied focal interpolation using nearest-neighbor values to estimate environmental conditions for coastal cells, with coastlines defined using the medium-scale world map from the \u003cem\u003ernaturalearth\u003c/em\u003e R package v.1.1.0\u003csup\u003e77\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eWe then restricted all environmental stacks to the continental shelf (0-200 m depth) based on the bathymetry layer. 
This spatial constraint was critical for two reasons: (1) it focused pseudo-absence selection on environmentally accessible habitats, reducing bias; and (2) it prevented models from simply learning to distinguish coastal from open-ocean environments, as our target species are predominantly coastal taxa. The final processed stacks comprised 19 environmental layers for four temporal scenarios (current\u0026thinsp;+\u0026thinsp;3 SSP scenarios).\u003c/p\u003e \u003cp\u003ePseudo-absences generation\u003c/p\u003e \u003cp\u003eWe used a hybrid pseudo-absence (PA) selection strategy to balance spatial realism with geographic coverage in our models. We generated three independent PA datasets per species, each using a 50:50 mixture of two sampling strategies, to reduce sensitivity to pseudo-absence configuration. Half of the pseudo-absences were sampled using a disk-based approach: points were selected at distances between 20 and 100 km from observed presences, ensuring that pseudo-absences represented geographically accessible environments where species were not detected. The remaining half were randomly distributed across the study area (continental shelf only), providing broader environmental coverage and reducing spatial bias inherent in presence-only data.\u003c/p\u003e \u003cp\u003eThe number of pseudo-absences was scaled to three times the number of presence records per species, following BIOMOD2 recommendations\u003csup\u003e\u003cspan citationid=\"CR78\" class=\"CitationRef\"\u003e78\u003c/span\u003e\u003c/sup\u003e. A summary of the number of pseudo-absences generated per species can be found in Supplementary File 4.\u003c/p\u003e \u003cp\u003eBefore model fitting, we implemented a deduplication procedure to remove spatially redundant pseudo-absence points. 
When multiple pseudo-absences from different replicates overlapped spatially, we retained unique coordinates while tracking which replicates used each location.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec14\" class=\"Section2\"\u003e \u003ch2\u003eModelling\u003c/h2\u003e \u003cp\u003eModel construction\u003c/p\u003e \u003cp\u003eSpecies distribution models (SDMs) were constructed using functions from the \u003cem\u003eBIOMOD2\u003c/em\u003e R package v.4.2-6-2\u003csup\u003e75\u003c/sup\u003e. For each species, we implemented an ensemble modelling approach, combining outputs from multiple algorithms including Generalized Additive Models (GAMs), Multivariate Adaptive Regression Splines (MARS), Maximum Entropy (MAXNET), Random Forest (RF), and Extreme Gradient Boosting (XGBOOST). These algorithms were selected to represent diverse modelling approaches (parametric vs. machine learning; linear vs. non-linear) and capture different aspects of species-environment relationships, with ensemble predictions reducing bias associated with any single algorithm\u003csup\u003e\u003cspan citationid=\"CR79\" class=\"CitationRef\"\u003e79\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eOccurrence data were randomly divided into five folds, with each fold serving once as validation data while the remaining four folds trained the model. This process was repeated for all three PA datasets, resulting in 75 cross-validated models per species (5 algorithms \u0026times; 5 folds \u0026times; 3 PA datasets). 
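\u003c/p\u003e \u003cp\u003eAs a quick arithmetic check, the model tally reported in this section can be reproduced in a few lines of Python. This is a minimal sketch rather than the authors' code: the function and constant names are illustrative, and the counts of fold-averaged and full-dataset models are the ones stated in this section.\u003c/p\u003e

```python
# Tally of models per species, using only counts stated in the Methods.
N_ALGORITHMS = 5    # GAM, MARS, MAXNET, RF, XGBOOST
N_FOLDS = 5         # five-fold cross-validation
N_PA_DATASETS = 3   # independent pseudo-absence datasets
N_SPECIES = 69      # species retained for modelling


def models_per_species(n_algorithms, n_folds, n_pa_datasets):
    '''Return the number of models of each kind built for one species.'''
    cross_validated = n_algorithms * n_folds * n_pa_datasets
    fold_averaged = n_algorithms * n_pa_datasets
    full_dataset = n_algorithms
    return {
        'cross_validated': cross_validated,
        'fold_averaged': fold_averaged,
        'full_dataset': full_dataset,
        'total': cross_validated + fold_averaged + full_dataset,
    }


counts = models_per_species(N_ALGORITHMS, N_FOLDS, N_PA_DATASETS)
print(counts['cross_validated'])      # 75
print(counts['total'])                # 95
print(counts['total'] * N_SPECIES)    # 6555
```

\u003cp\u003eThe last figure matches the 6,555 individual models reported for the 69 species.\u003c/p\u003e \u003cp\u003e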
Additionally, 15 fold-averaged models (5 algorithms \u0026times; 3 PA datasets) and 5 full-dataset models (trained on all data) were produced, yielding 95 total models per species.\u003c/p\u003e \u003cp\u003eModels were then evaluated using two complementary metrics: the True Skill Statistic (TSS), which accounts for both sensitivity and specificity while remaining independent of prevalence\u003csup\u003e\u003cspan citationid=\"CR80\" class=\"CitationRef\"\u003e80\u003c/span\u003e\u003c/sup\u003e, and the Area Under the ROC Curve (AUC), which measures discrimination ability across all thresholds\u003csup\u003e\u003cspan citationid=\"CR81\" class=\"CitationRef\"\u003e81\u003c/span\u003e\u003c/sup\u003e. TSS and AUC were calculated on validation data (20% of occurrences in each fold) to assess model generalisation.\u003c/p\u003e \u003cp\u003eTo assess the relationship between sample size and model performance, we binned species into five occurrence categories (\u0026lt;\u0026thinsp;50, 50\u0026ndash;100, 100\u0026ndash;200, 200\u0026ndash;500, \u0026gt;\u0026thinsp;500 records) and calculated mean performance metrics and standard deviations for each bin. LOESS smoothing was applied with the \u003cem\u003egeom_smooth\u003c/em\u003e function to visualize performance trends across the sample size gradient.\u003c/p\u003e \u003cp\u003eEnsemble models and current projections\u003c/p\u003e \u003cp\u003eFor each species, individual models with TSS\u0026thinsp;\u0026gt;\u0026thinsp;0.6 and AUC\u0026thinsp;\u0026gt;\u0026thinsp;0.85 (indicating reliable predictive performance) were kept. Ensemble models were produced in \u003cem\u003eBIOMOD2\u003c/em\u003e using the weighted mean approach (\u003cem\u003eEMwmean\u003c/em\u003e) to combine predictions from individual algorithms. In this ensemble type, each algorithm\u0026rsquo;s contribution is weighted by TSS during model calibration. 
This means that models with higher predictive accuracy contribute more strongly to the final ensemble, while models with similar scores receive comparable weights.\u003c/p\u003e \u003cp\u003eCurrent projection maps were built using the \u003cem\u003eBIOMOD_EnsembleForecasting\u003c/em\u003e function, the ensemble model and the stack of environmental predictors for current conditions. The resulting maps therefore represent the consensus probability of suitable habitat across all algorithms, emphasising the most reliable models while retaining balanced representation of the ensemble.\u003c/p\u003e \u003cp\u003eAssessing ensemble models\u0026rsquo; uncertainty\u003c/p\u003e \u003cp\u003eIn addition to the ensemble weighted average projections, we also produced coefficient of variation maps (named \u003cem\u003eEMcv\u003c/em\u003e by BIOMOD2, renamed CoV here) and committee averaging maps (named \u003cem\u003eEMca\u003c/em\u003e by BIOMOD2, renamed CA here) for each species and scenario (current and future).\u003c/p\u003e \u003cp\u003eThe CoV represents the coefficient of variation (standard deviation divided by mean) of predicted habitat suitability across all selected algorithms within the ensemble. High CoV values indicate areas where models disagree strongly on species suitability, whereas low values indicate higher inter-model agreement. Note that CoV is inflated in areas with very low mean suitability (\u0026le;\u0026thinsp;0.3) due to near-zero denominators. For visualisation, we therefore masked CoV values to areas with suitability\u0026thinsp;\u0026gt;\u0026thinsp;0.3, displaying low-suitability areas (\u0026le;\u0026thinsp;0.3) in white to indicate where CoV estimates are unreliable for uncertainty assessments. 
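\u003c/p\u003e \u003cp\u003eThe masking rule can be illustrated for a single grid cell with a short Python sketch. It is not the BIOMOD2 implementation: the function name and example values are hypothetical, and the use of the sample standard deviation is an assumption, since the estimator is not stated in the text.\u003c/p\u003e

```python
from statistics import mean, stdev

MASK_THRESHOLD = 0.3  # mean suitability at or below which CoV is hidden


def masked_cov(model_predictions, mask_threshold=MASK_THRESHOLD):
    '''Return the CoV for one grid cell, or None where the cell is masked.

    model_predictions: suitability values (0-1) from the selected models.
    '''
    cell_mean = mean(model_predictions)
    if cell_mean <= mask_threshold:
        # Shown in white on the maps: a near-zero mean inflates the ratio.
        return None
    # Assumption: sample standard deviation (n - 1 denominator).
    return stdev(model_predictions) / cell_mean


# Hypothetical predictions from three models for three cells.
cells = {
    'models agree, suitable': [0.80, 0.82, 0.78],
    'models disagree, suitable': [0.40, 0.90, 0.65],
    'low suitability (masked)': [0.02, 0.20, 0.05],
}
for label, predictions in cells.items():
    print(label, masked_cov(predictions))
```

\u003cp\u003eIn the third cell the models differ by an order of magnitude in relative terms, yet all predict low suitability, which is why the ratio is not reported there.\u003c/p\u003e \u003cp\u003e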
CoV values from BIOMOD2 (given as percentages) were divided by 100 for consistency.\u003c/p\u003e \u003cp\u003eThe CA (committee averaging) represents the proportion of models predicting species presence after transforming continuous predictions into binary classifications using thresholds that maximise TSS on validation data. This metric serves both as a prediction and an uncertainty measure: values near 1.0 indicate strong consensus for presence, values near 0.0 indicate consensus for absence, and values near 0.5 indicate high disagreement (approximately half the models predict presence while the other half predict absence). CA was rescaled from 0\u0026ndash;1000 to 0\u0026ndash;1 for consistency with other outputs.\u003c/p\u003e \u003cp\u003eFuture habitat suitability projections\u003c/p\u003e \u003cp\u003eWeighted ensemble models were projected onto three future climate scenarios (SSP1-2.6, SSP2-4.5, SSP5-8.5) using \u003cem\u003eBIOMOD_EnsembleForecasting\u003c/em\u003e with 2090\u0026ndash;2099 environmental predictors. Projections used the same spatial extent as model training (0\u0026ndash;200 m continental shelf).\u003c/p\u003e \u003cp\u003eProcessing of projection maps\u003c/p\u003e \u003cp\u003eTo restrict analyses to ecologically relevant marine areas and species' non-indigenous ranges, we applied a multi-step spatial masking procedure to all BIOMOD2 projection outputs. First, all terrestrial cells were converted to NA. Second, we restricted projections to European waters using Marine Ecoregions of the World (MEOW)\u003csup\u003e\u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e\u003c/sup\u003e. Importantly, species were masked to only those MEOWs where they are considered non-indigenous. This species-specific regional masking ensures that hotspot analyses reflect true invasion risk rather than including native populations. 
The MEOW assignments per species are provided in Supplementary File 5.\u003c/p\u003e \u003cp\u003eFinally, we rescaled all current and future habitat suitability maps from BIOMOD2's native 0\u0026ndash;1000 scale to a 0\u0026ndash;1 range for intuitive interpretation.\u003c/p\u003e \u003cp\u003eExternal validation with independent ARMS detections\u003c/p\u003e \u003cp\u003eTo assess model performance against spatially and temporally independent data, we compared ensemble predictions to ARMS detections that were excluded from model training. Specifically, we used: (1) ARMS-MBON detections from 2020\u0026ndash;2021, and (2) SPM detections from 2023 and 2025 (Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e). These datasets were unpublished in GBIF/OBIS at the time of modelling (August 2025) and therefore represent truly independent validation data, whereas ARMS-MBON 2018\u0026ndash;2019 and SPM 2024 data had been published and were included in the training dataset along with other global occurrences. While these data did not allow formal statistical validation due to species-level imbalances in sample sizes and the lack of confirmed absences (DNA-based absences do not confirm true absence\u003csup\u003e\u003cspan citationid=\"CR82\" class=\"CitationRef\"\u003e82\u003c/span\u003e\u003c/sup\u003e), they provided a valuable real-world test of model predictions.\u003c/p\u003e \u003cp\u003eFor each independent detection, we extracted the predicted habitat suitability value (0\u0026ndash;1) at that location from our current-day ensemble models. Of the 69 modelled species, 51 had independent DNA-based detections available, of which 33 had detections falling within the projection extent (continental shelf). We calculated the proportion of these detections located in areas predicted suitable (\u0026gt;\u0026thinsp;0.5) and areas predicted highly suitable (\u0026gt;\u0026thinsp;0.8). 
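\u003c/p\u003e \u003cp\u003eThe two proportions amount to a simple threshold count over the extracted values. The Python sketch below uses made-up suitability values, not the study data, and the function name is illustrative; only the 0.5 and 0.8 thresholds come from the text.\u003c/p\u003e

```python
SUITABLE = 0.5          # threshold for 'predicted suitable'
HIGHLY_SUITABLE = 0.8   # threshold for 'predicted highly suitable'


def proportion_above(suitabilities, threshold):
    '''Share of detections whose predicted suitability exceeds threshold.'''
    if not suitabilities:
        raise ValueError('no detections within the projection extent')
    hits = sum(1 for value in suitabilities if value > threshold)
    return hits / len(suitabilities)


# Hypothetical suitability values extracted at ten independent detections.
extracted = [0.92, 0.85, 0.61, 0.55, 0.47, 0.83, 0.30, 0.74, 0.66, 0.95]

print(proportion_above(extracted, SUITABLE))         # 0.8
print(proportion_above(extracted, HIGHLY_SUITABLE))  # 0.4
```

\u003cp\u003eBecause only detections are available, this measures how often presences fall in predicted suitable habitat and says nothing about false positives.\u003c/p\u003e \u003cp\u003e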
Details can be found in Supplementary File 2.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec15\" class=\"Section2\"\u003e \u003ch2\u003eAnalysis of model predictions\u003c/h2\u003e \u003cp\u003eWe analysed habitat suitability projections using R v.4.3.1 with the terra v.1.8\u0026ndash;54\u003csup\u003e83\u003c/sup\u003e, raster v.3.6\u0026ndash;32\u003csup\u003e84\u003c/sup\u003e, and dplyr v1.1.4\u003csup\u003e85\u003c/sup\u003e packages for spatial processing, and ggplot2 v.3.5.2\u003csup\u003e86\u003c/sup\u003e for visualisation. Analyses were conducted at two scales: individual species and aggregated across all 69 species.\u003c/p\u003e \u003cp\u003eIndividual species analyses\u003c/p\u003e \u003cp\u003eFor each species, we generated five map types to characterise current distributions, uncertainty, and climate-driven changes:\u003c/p\u003e \u003cp\u003e \u003col\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eCurrent habitat suitability: Weighted ensemble predictions showing continuous suitability values (0\u0026ndash;1 scale)\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003ePrediction uncertainty (CoV): Coefficient of variation maps masked to areas with suitability\u0026thinsp;\u0026gt;\u0026thinsp;0.3, as CoV values are inflated in low-suitability areas due to near-zero denominators. 
Areas with S\u0026thinsp;\u0026le;\u0026thinsp;0.3 were displayed in white to indicate unreliable CoV estimates.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eModel agreement (CA): Committee averaging maps showing the proportion of models agreeing on presence or absence (see Ensemble Models section)\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eProjected change: Difference maps (ΔS\u0026thinsp;=\u0026thinsp;S\u003csub\u003efuture\u003c/sub\u003e - S\u003csub\u003ecurrent\u003c/sub\u003e) for SSP2-4.5 as an illustrative scenario, classified into seven categories: moderate decrease (\u0026lt;-0.10), slight decrease (-0.10 to -0.05), no change (-0.05 to 0.05), slight increase (0.05 to 0.10), moderate increase (0.10 to 0.15), substantial increase (0.15 to 0.20), and major increase (\u0026gt;\u0026thinsp;0.20). The \u0026plusmn;\u0026thinsp;0.05 threshold excludes typical modelling noise while capturing ecologically meaningful shifts.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eSuitability transitions: Binary classification of pixels using S\u0026thinsp;=\u0026thinsp;0.5 as threshold into four categories: (a) remains unsuitable (current\u0026thinsp;\u0026lt;\u0026thinsp;0.5, future\u0026thinsp;\u0026lt;\u0026thinsp;0.5), (b) becomes suitable (current\u0026thinsp;\u0026lt;\u0026thinsp;0.5, future\u0026thinsp;\u0026ge;\u0026thinsp;0.5), (c) remains suitable (current\u0026thinsp;\u0026ge;\u0026thinsp;0.5, future\u0026thinsp;\u0026ge;\u0026thinsp;0.5), and (d) becomes unsuitable (current\u0026thinsp;\u0026ge;\u0026thinsp;0.5, future\u0026thinsp;\u0026lt;\u0026thinsp;0.5).\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003c/ol\u003e \u003c/p\u003e \u003cp\u003eAggregated multi-species analyses\u003c/p\u003e \u003cp\u003eTo identify broad-scale NIS spread patterns, we calculated mean ensemble suitability across all 69 species for 
each grid cell, producing three analysis types:\u003c/p\u003e \u003cp\u003e \u003col\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003ePriority areas for biosecurity: Mean suitability values across species were classified into management priority categories using 0.1 intervals: negligible (S\u0026thinsp;\u0026lt;\u0026thinsp;0.1), low (0.1\u0026ndash;0.2), low-moderate (0.2\u0026ndash;0.3), moderate (0.3\u0026ndash;0.4), moderate-high (0.4\u0026ndash;0.5), high (0.5\u0026ndash;0.6), and severe (\u0026gt;\u0026thinsp;0.6). Higher values indicate areas where environmental conditions support a greater proportion of the modelled NIS.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eScenario-based change maps: For all three SSP scenarios (SSP1-2.6, SSP2-4.5, SSP5-8.5), we calculated the change in mean suitability (ΔS\u0026thinsp;=\u0026thinsp;S\u003csub\u003efuture\u003c/sub\u003e - S\u003csub\u003ecurrent\u003c/sub\u003e) and classified values using the same seven categories as individual species maps. We quantified the percentage of the study area falling into each change category for each scenario.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eRegional analysis by ecoregion: We used Marine Ecoregions of the World (MEOW)\u003csup\u003e\u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e\u003c/sup\u003e to analyse 18 ecoregions within European waters: North and East Barents Sea, Northern Norway and Finnmark, Southern Norway, North and East Iceland, South and West Iceland, Faroe Plateau, North Sea, Celtic Seas, South European Atlantic Shelf, Baltic Sea, Black Sea, Adriatic Sea, Ionian Sea, Aegean Sea, Alboran Sea, Western Mediterranean, Levantine Sea, and Tunisian Plateau/Gulf of Sidra. 
For each ecoregion, we extracted pixel values using \u003cem\u003eterra::extract\u003c/em\u003e and calculated both absolute change (ΔSmean) and percentage change (100\u0026thinsp;\u0026times;\u0026thinsp;ΔSmean / Scurrent) for all three scenarios. Statistics are reported as mean\u0026thinsp;\u0026plusmn;\u0026thinsp;standard deviation across all pixels within each ecoregion. Ecoregions were ordered by centroid latitude for visualisation.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003c/ol\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec16\" class=\"Section2\"\u003e \u003ch2\u003eBallast water contingency areas (BWCA)\u003c/h2\u003e \u003cp\u003eSpatial data on Offshore Wind Farms (OWFs) and Marine Protected Areas (MPAs) were downloaded from EMODnet (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e\u003ca href=\"http://www.emodnet.ec.europa.eu\" target=\"_blank\"\u003ewww.emodnet.ec.europa.eu\u003c/a\u003e\u003c/span\u003e\u003cspan address=\"http://www.emodnet.ec.europa.eu\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e). Distances to OWFs and MPAs were computed as new rasters, using the same method as for the distance-to-coast layer described above.\u003c/p\u003e \u003cp\u003eTo identify potential Ballast Water Contingency Areas (BWCAs), we applied four spatial criteria: (1) low mean habitat suitability across all 69 modelled NIS, (2) minimum distance \u003cem\u003ed\u003c/em\u003e from MPAs, (3) minimum distance \u003cem\u003ed\u003c/em\u003e from OWFs, and (4) minimum distance \u003cem\u003ed\u003c/em\u003e from coastline. Binary raster layers were created for each criterion using user-defined thresholds. 
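\u003c/p\u003e \u003cp\u003eCombining the four criteria amounts to a cell-wise logical AND of the binary layers. The Python sketch below illustrates this on a tiny hypothetical grid and is not the authors' implementation: the function name and grid values are made up, and the comparison directions (suitability strictly below its threshold, distances at or above \u003cem\u003ed\u003c/em\u003e) are assumptions. The default thresholds are those used for the Fig. 7 examples.\u003c/p\u003e

```python
def bwca_mask(mean_suitability, dist_mpa_km, dist_owf_km, dist_coast_km,
              max_suitability=0.2, min_distance_km=7.0):
    '''Return a grid of booleans: True where all four criteria are met.

    All inputs are equally sized lists of rows (one value per grid cell).
    Assumption: suitability must be strictly below max_suitability and
    every distance at or above min_distance_km.
    '''
    mask = []
    for suit_row, mpa_row, owf_row, coast_row in zip(
            mean_suitability, dist_mpa_km, dist_owf_km, dist_coast_km):
        mask.append([
            suit < max_suitability
            and mpa >= min_distance_km
            and owf >= min_distance_km
            and coast >= min_distance_km
            for suit, mpa, owf, coast in zip(
                suit_row, mpa_row, owf_row, coast_row)
        ])
    return mask


# Hypothetical 2 x 3 grid of mean suitability and distances (km).
suitability = [[0.10, 0.30, 0.15], [0.05, 0.18, 0.25]]
to_mpa = [[10.0, 12.0, 3.0], [8.0, 20.0, 30.0]]
to_owf = [[9.0, 15.0, 20.0], [7.5, 6.0, 25.0]]
to_coast = [[12.0, 30.0, 40.0], [7.0, 15.0, 50.0]]

for row in bwca_mask(suitability, to_mpa, to_owf, to_coast):
    print(row)
```

\u003cp\u003eOnly the two cells in the first column pass all four tests; each of the others fails on exactly one criterion, which is a convenient way to see what each threshold excludes.\u003c/p\u003e \u003cp\u003e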
For examples shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003e, thresholds were selected as: suitability\u0026thinsp;=\u0026thinsp;0.2, d\u0026thinsp;=\u0026thinsp;7 km.\u003c/p\u003e \u003cp\u003eAll code used to generate these analyses is fully accessible, allowing users to adjust thresholds, incorporate additional constraints, or tailor the approach to regional management needs and local environmental conditions. See Code Availability Statement.\u003c/p\u003e \u003c/div\u003e "},{"header":"Declarations","content":"\u003cdiv id=\"Sec17\" class=\"Section2\"\u003e \u003ch2\u003eData availability\u003c/h2\u003e \u003cp\u003eAll species distribution model outputs, including individual species ensemble projections and uncertainty metrics (CoV and CA), are publicly available on Figshare at \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://figshare.com/s/ab27e1dcaee11ba59e88\u003c/span\u003e\u003cspan address=\"https://figshare.com/s/ab27e1dcaee11ba59e88\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. 
The repository includes: (i) habitat suitability maps (GeoTIFF format) for 69 marine non-indigenous species under current conditions and three future climate scenarios (SSP1-2.6, SSP2-4.5, SSP5-8.5 for 2090\u0026ndash;2099); (ii) ensemble uncertainty metrics (coefficient of variation and committee averaging maps) for all species and all scenarios; (iii) aggregated mean suitability maps across all species and scenarios; and (iv) processed environmental predictor layers (Bio-ORACLE v.3.0 derived) used for modelling.\u003c/p\u003e \u003cp\u003eSpecies occurrence data used for model training were obtained from the Global Biodiversity Information Facility (GBIF, \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.gbif.org/\u003c/span\u003e\u003cspan address=\"https://www.gbif.org/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003cspan type=\"Underline\" class=\"Underline\" name=\"Emphasis\"\u003e)\u003c/span\u003e and Ocean Biodiversity Information System (OBIS, \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://obis.org/\u003c/span\u003e\u003cspan address=\"https://obis.org/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003cspan type=\"Underline\" class=\"Underline\" name=\"Emphasis\"\u003e)\u003c/span\u003e and are accessible through these public repositories. Raw GBIF downloads can be accessed via the DOIs found in Supplementary File 4 and are also provided on the Figshare repository alongside the raw OBIS downloads. 
Final cleaned and spatially thinned occurrence datasets for all 69 modelled species (n\u0026thinsp;=\u0026thinsp;12,300 occurrence points) are provided in the Figshare repository as CSV files with coordinates.\u003c/p\u003e \u003cp\u003eModel performance metrics (TSS and AUC scores) for all 6,555 individual species distribution models (69 species \u0026times; 5 algorithms \u0026times; 5 folds \u0026times; 3 pseudo-absence datasets, plus fold-averaged and full-dataset models) are provided as a comprehensive CSV file in the Figshare repository.\u003c/p\u003e \u003cp\u003eARMS-MBON detection data (2018\u0026ndash;2019) are available through GBIF (18S: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.gbif.org/dataset/6f2f07f3-1ef4-4f82-b3d4-d3bd1406650f\u003c/span\u003e\u003cspan address=\"https://www.gbif.org/dataset/6f2f07f3-1ef4-4f82-b3d4-d3bd1406650f\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e; COI: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.gbif.org/dataset/b9afe2d0-b264-4422-bf8c-096b2a53c18f\u003c/span\u003e\u003cspan address=\"https://www.gbif.org/dataset/b9afe2d0-b264-4422-bf8c-096b2a53c18f\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e ). 
ARMS-MBON detection data (2020\u0026ndash;2021) were published in December 2025 (post-modelling) and are available through GBIF (18S: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.gbif.org/dataset/6f2f07f3-1ef4-4f82-b3d4-d3bd1406650f\u003c/span\u003e\u003cspan address=\"https://www.gbif.org/dataset/6f2f07f3-1ef4-4f82-b3d4-d3bd1406650f\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e ; COI: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.gbif.org/dataset/542b31ca-712f-43dd-8f1d-8eeefb88c9a3\u003c/span\u003e\u003cspan address=\"https://www.gbif.org/dataset/542b31ca-712f-43dd-8f1d-8eeefb88c9a3\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003cspan type=\"Underline\" class=\"Underline\" name=\"Emphasis\"\u003e).\u003c/span\u003e\u003c/p\u003e \u003cp\u003eSwedish Ports Monitoring data (2024) are available through \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.gbif.se/ipt/resource?r=nis_monitor_2024\u0026amp;v=1.1\u003c/span\u003e\u003cspan address=\"https://www.gbif.se/ipt/resource?r=nis_monitor_2024\u0026amp;v=1.1\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. 
Unpublished ARMS validation data (ARMS-MBON 2020\u0026ndash;2021; SPM 2023, 2025) are available in Supplementary File 2.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec18\" class=\"Section2\"\u003e \u003ch2\u003eCode availability\u003c/h2\u003e \u003cp\u003eAll scripts for data collection, species distribution modelling, visualisation and analyses are publicly available via GitHub (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://github.com/JustinePa/ARMS-marine-alien-SDM/tree/main\u003c/span\u003e\u003cspan address=\"https://github.com/JustinePa/ARMS-marine-alien-SDM/tree/main\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003cspan type=\"Underline\" class=\"Underline\" name=\"Emphasis\"\u003e)\u003c/span\u003e and Figshare (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://figshare.com/s/ab27e1dcaee11ba59e88\u003c/span\u003e\u003cspan address=\"https://figshare.com/s/ab27e1dcaee11ba59e88\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003cspan type=\"Underline\" class=\"Underline\" name=\"Emphasis\"\u003e).\u003c/span\u003e\u003c/p\u003e \u003c/div\u003e\u003ch2\u003eConflict of interest statement\u003c/h2\u003e \u003cp\u003eThe authors declare no conflict of interest.\u003c/p\u003e\u003ch2\u003eAuthors contributions\u003c/h2\u003e \u003cp\u003eJ.P. and M.O. conceptualised the study and designed the research framework. J.P. performed all data collection, species distribution modelling, spatial analyses, and visualisation. T.A. provided critical feedback on modelling methodology and model evaluation approaches. M.G.A. developed the computational framework for identifying ballast water contingency areas based on model outputs and created Fig.\u0026nbsp;7. J.P. wrote the original manuscript draft. M.O. and T.A. supervised the research. 
All authors contributed to manuscript revision and approved the final version.\u003c/p\u003e\u003ch2\u003eAcknowledgements\u003c/h2\u003e \u003cp\u003eThis work was supported by the SciLifeLab \u0026amp; Wallenberg Data Driven Life Science Program (grant: KAW2024.0159). The project was also supported financially by grants from the DTO-bioflow project (grant no. 101112823), the EU project MARCO BOLO (grant no. 101082021) and the Swedish Biodiversity Data Infrastructure (grant no. 2019-00242). Computations and data processing were enabled by resources provided by the National Academic Infrastructure for Supercomputing in Sweden (NAISS), partially funded by the Swedish Research Council through grant agreement no. 2022\u0026ndash;06725, enabling computing using the Dardel high-performance computing system at the PDC Center for High Performance Computing, KTH Royal Institute of Technology, Stockholm, Sweden (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.pdc.kth.se\u003c/span\u003e\u003cspan address=\"https://www.pdc.kth.se\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e). We thank the Biodiversity Data Lab at the University of Uppsala for providing feedback during project development, particularly on the statistical modelling approach. Support and feedback were also received from the Swedish Transport Agency, the Swedish Agency for Marine and Water Management, and the OSPAR/HELCOM Joint Task Group on Ballast Water Management Convention (BWMC) and Biofouling.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eVenegas RM, Acevedo J, Treml EA (2023) Three decades of ocean warming impacts on marine ecosystems: A review and perspective. 
Deep Sea Res Part II Top Stud Oceanogr 212:105318\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAlter K et al (2024) Hidden impacts of ocean warming and acidification on biological responses of marine animals revealed through meta-analysis. Nat Commun 15:2885\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWernberg T et al (2025) Marine heatwaves as hot spots of climate change and impacts on biodiversity and ecosystem services. Nat Rev Biodivers 1:461\u0026ndash;479\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eShi Y, Li Y (2024) Impacts of ocean acidification on physiology and ecology of marine invertebrates: a comprehensive review. Aquat Ecol 58:207\u0026ndash;226\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eR\u0026ouml;thig T et al (2023) Human-induced salinity changes impact marine organisms and ecosystems. Glob Change Biol 29:4731\u0026ndash;4749\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKorpinen S et al (2021) Combined effects of human pressures on Europe\u0026rsquo;s marine ecosystems. Ambio 50:1325\u0026ndash;1336\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKatsanevakis S, Zenetos A, Belchior C, Cardoso AC (2013) Invading European Seas: Assessing pathways of introduction of marine aliens. Ocean Coast Manag 76:64\u0026ndash;74\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHaubrock PJ et al (2025) The spread of non-native species. Biol Rev\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eVil\u0026agrave; M et al (2010) How well do we understand the impacts of alien species on ecosystem services? A pan-European, cross-taxa assessment. Front Ecol Environ 8:135\u0026ndash;144\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eIPBES. 