How to optimally allocate sampling effort in experimental ecology

Andreas H. Schweiger, Aron Garthen, Michael Bahn, David Chalcraft, and 3 more

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-7921667/v1
This work is licensed under a CC BY 4.0 License.
Status: Published. Journal publication published 13 Feb, 2026 in Scientific Reports.
Version 1 posted 11

Abstract

A major aim of experimental ecology is to quantify responses to environmental change. Study designs which optimally capture response patterns are currently debated. A key point in the discussion is how a limited total number of samples should ideally be allocated to replication versus the number of locations along the environmental gradient. Here, we assess how to optimally allocate sampling effort for maximizing prediction accuracy in gradient designs.
To this end, we performed simulations on artificial data for different sampling approaches, with or without a priori knowledge of the underlying patterns, and applied a set of commonly observed response shapes. Overall, unreplicated sampling with equidistant, systematic placement along the gradient of interest, at as many locations or levels as affordable, turned out to be the best approach for unknown response shapes. Replication was found to be beneficial when a priori knowledge exists about the underlying, simple (e.g. linear or humped) response shape.

Subject areas: Biological sciences/Ecology; Earth and environmental sciences/Ecology; Physical sciences/Mathematics and computing
Keywords: global change experiments, ecological models, experimental design, non-linear responses, polynomial fits, prediction success

Introduction

A key aspect of ecological research is to quantify ecological responses to environmental change. Ecological responses are often characterized by high degrees of non-linearity, which challenge research approaches (Schweiger 2017; Kreyling et al. 2018). Gradient studies, which analyze ecological responses along continuous environmental gradients, have been proposed as the most appropriate approach for detecting response patterns with high prediction accuracy (Schweiger et al. 2016; Kreyling et al. 2018; Manning 2019). Response patterns deduced from such gradient studies may not only advance ecological understanding, but can also inform models used for projecting possible future ecosystem responses, an important tool for climate change impact assessments (De Boeck et al. 2020). Gradient designs and analyses are frequently used in subdisciplines of ecology such as vegetation ecology, biogeography or macroecology. However, they remain underrepresented in other fields such as experimental ecology.
While the quantification of ecological response patterns is of key interest in experimental ecology, during the last decades researchers have predominantly applied suboptimal sampling designs and analytical approaches when studying these responses (i.e. analysis of group contrasts; see black bars in Fig. 1, and Figure S1 for the temporal development). Replication is a crucial component of sampling design, especially when contrasts between different groups are analyzed (e.g. in classical treatment vs. control experimental designs). This also holds true for gradient designs, where a high number of replicates at each sampling location along the investigated gradient would benefit prediction accuracy if sampling resources were unlimited. However, given the limited number of samples which can realistically be taken, an inevitable trade-off emerges between replication and the number of sampling locations which can be used to cover the environmental gradient of interest. This trade-off has provoked discussions about the optimal sampling procedure, i.e. whether to put more emphasis on the number of sampling locations or experimental levels at the expense of local replication, or whether to put more emphasis on replication but with fewer locations or levels sampled along the gradient of interest (Oksanen 2001; Davies & Gray 2015; Schweiger et al. 2016; Kreyling et al. 2018; Chalcraft 2019). What is known from previous quantitative studies on the role of replication in gradient studies is that, without a priori knowledge of the underlying response shape and with random sampling locations along the gradient, unreplicated designs outperform replicated designs in their prediction accuracy for a given sampling effort (Kreyling et al. 2018).
Systematic sampling along a gradient, however, can result in an advantage of replicated designs for simple response shapes such as linear or quadratic (humped) relationships when response shape and gradient length are a priori known (Chalcraft 2019). These contrasting perspectives on the importance of replication differ in three aspects: (1) the different metrics of prediction accuracy used in the two studies, (2) differences in sampling strategy, i.e. random placement of sampling locations along the investigated gradient vs. an equidistant, systematic sampling, and (3) the assumption on whether the underlying response shape to be analyzed is a priori known or unknown. These apparent differences lead to the hypothesis that sampling strategy, i.e. the decision where to locate sampling along the investigated gradient, might be decisive for whether replication at the expense of sampling locations has positive or negative effects on prediction accuracy. Sampling strategy is further expected to be strongly affected by a priori knowledge of the underlying response shape as well as by the length of the investigated gradient. However, an understanding of the effects of sampling strategy is so far lacking. To address this lack of knowledge and to reconcile the contrasting perspectives, we extended the artificial data simulations from the previous studies of Kreyling et al. (2018) and Chalcraft (2019) to different sampling approaches, combining different sampling procedures (i.e. how to trade off number of sampling locations against number of replicates per location) and sampling strategies (i.e. where to place sample locations along the investigated gradient) used in regression-type analyses (for methodological details see the Online Methods section). We tested this for a set of six response shapes representing typical shapes commonly observed in ecology (Fig. 2).
We focused on the effects of different sampling approaches on prediction accuracy in relation to the effects of replication. We furthermore tested for the effects of model assumptions, i.e. whether the underlying response shape is a priori known. We investigated these effects for the two different measures of prediction accuracy used by Kreyling et al. (2018) and Chalcraft (2019) and an additional measure based on the root mean square error (RMSE) to achieve a balanced view of different perspectives on prediction accuracy. We finally investigated the effects of gradient length, i.e. whether the response of interest along the underlying environmental gradient is partly or fully sampled. Based on our previous evidence, we hypothesized that the sampling procedure in interaction with the sampling strategy would have significant effects on prediction accuracy in gradient studies. More specifically, we expected that replication would increase prediction accuracy when the underlying response shape is a priori known and, thus, the locations of the critical response points (for definition see below) along the environmental gradient are also known and accounted for in the sampling approach. We expected this to be especially relevant for non-linear response shapes, i.e. shapes that cannot mathematically be described as a simple linear model of the form y = a∙x + b. However, we expected replication to yield lower prediction accuracy when the underlying response shape is unknown and, thus, critical response points are unknown and therefore not accounted for in the sampling approach. We defined critical response points as locations along the environmental gradient where sampling is crucial for an accurate prediction of the studied response along this gradient. These can be response extremes, i.e. maximum or minimum response values along the studied gradient, or regions of strong response changes, i.e.
regions of maximum slope of the response pattern – or a combination of both (Fig. 2). Based on our analyses, we derived recommendations on how to optimize sampling approaches in observational and experimental ecological research, including when to replicate and when to apply non-replicated study designs. Furthermore, we provide recommendations on optimized sampling strategies, i.e. where to best sample along gradients.

Results

Comparing the three measures of prediction accuracy, multiple R² showed significantly higher sensitivity to variations in the different simulation settings than RMSE or Chalcraft’s prediction success (Chalcraft 2019) (see conditional R² values for the four models at 20% and 100% noise in Table 2, and Supplementary Figures S3 and S4 as well as Tables S1 to S6 for more details). In the following we show and discuss the results obtained for all three measures of prediction success. The number of replicates in combination with a priori knowledge of the underlying response shape (i.e. Model 3) turned out to be the best set of explanatory variables for prediction accuracy based on multiple R² (see marginal R² for 20% and 100% noise in Table 2). The other three sets of predictors describing prediction accuracy, i.e. replication (Model 1), replication * gradient length (Model 2) and replication * total sample size (Model 4), explained significantly less variation in prediction accuracy (i.e. multiple R²) and were not statistically different from each other. For RMSE and Chalcraft’s prediction success, all models showed very low explanatory power with no significant differences between the models, except for Chalcraft’s prediction success at 100% noise, where the replication-only model (Model 1) explained significantly less variation than the other models, in which replication interacted with the other explanatory variables.
Details on the individual model statistics for the different sampling strategies and response shapes are available in Tables S1 to S6. For the best explanatory model based on multiple R² (i.e. Model 3), replication turned out to be the main explanatory variable of prediction accuracy, with 23 ± 31% (arithmetic mean ± standard deviation) of explained total variation at 20% noise and 31 ± 32% at 100% noise (Figures S5 and S8). The explanatory power of a priori knowledge of the underlying response shape and its shared predictive power with replication was significantly lower (i.e. 2.0 ± 4.2% at 20% noise and 0.70 ± 1.3% at 100% noise; in all cases p < 0.001 in multi-comparison tests). This strong, individual effect of replication on prediction accuracy vanished when combined with gradient length (Model 2) or total sample size (Model 4) as interacting predictors (see Figures S5 and S8). For RMSE and Chalcraft’s prediction success, relative contributions of replicates, a priori knowledge of the underlying response shape and their combination were all minor (see Supplementary Figures S6, S7, S9 and S10). Replication showed significant negative effects on prediction accuracy for 94% (multiple R²), 33% (Chalcraft’s prediction success) and 39% (RMSE) of all tested cases at 20% noise (results of a linear mixed model accounting for knowledge of the underlying response shape; see Figure 3 and Table S7 for details). Positive effects of replication were detected in 0% (R² and Chalcraft) and 3% (RMSE) of cases, whereas non-significant effects were observed in 6, 58 and 67% of all tested cases for multiple R², RMSE and Chalcraft’s prediction success, respectively. Similar patterns were observed at 100% noise, except for Chalcraft’s prediction success, for which no significant effects of replication were observed (see Supplementary Fig. S11 and Table S7 for more details).
Prediction accuracy differed among the sampling strategies, with systematic sampling yielding the highest prediction accuracy. This was consistent across the three measures of prediction accuracy and the different noise levels, and was independent of a priori knowledge of the underlying response pattern (Supplementary Tables S8–10 and Figs. S12–14).

Discussion

Detailed understanding of the effects of the sampling approach on prediction accuracy is paramount when aiming for scientifically sound and cost-efficient study designs. Here, we assessed how to optimize the number of sampling locations along gradients relative to the number of replicates per location for a given number of available samples. Our premise was that more samples are generally beneficial to increase statistical power, but they come at a higher cost in terms of resources and personnel. We furthermore investigated how to optimally place the samples along the investigated gradient to maximize prediction accuracy in gradient studies for the major response shapes commonly observed in ecology. Previous quantitative studies on the role of replication have shown that replication is not necessary or can even have negative effects on prediction accuracy in scenarios where the underlying response patterns are unknown (Kreyling et al. 2018). Systematic sampling along a gradient, however, can result in an advantage of replicated designs for simple response shapes such as linear or quadratic (humped) relationships when response shape and gradient length are a priori known (Chalcraft 2019). These two studies differed in their underlying assumptions, in that Kreyling et al. assumed a lack of a priori knowledge of the underlying response shape, while Chalcraft assumed the underlying response shape to be known (see Supplementary Material of Chalcraft 2019).
By including a priori knowledge of the response shape in our simulations, we were able to reproduce the findings of Chalcraft for the response shapes he investigated (i.e. linear and centered hump). This resolves the apparent discrepancy between the two studies by showing that the presence or absence of a priori knowledge of the underlying response shape is decisive for whether replication increases prediction accuracy or not. Unreplicated designs yielded higher prediction accuracies in our study when the response shape was unknown or the known response shape was more complex (i.e. exponential or logistic). However, replication turned out to be beneficial when the underlying response shape was a priori known and rather simple (i.e. linear or hump; see Fig. 4 for a summary). We detected a positive interaction between replication and knowledge of the underlying response shape, meaning that the negative effect of replication tends to be higher for unknown response shapes and the negative effect of missing a priori knowledge about the underlying response shape is stronger for stronger replication. However, this interactive effect was rather weak (see Figs. S5a, S6a and S7a). Furthermore, systematic (equidistant) sampling along the investigated gradient turned out to be generally superior to all other tested sampling strategies. This second finding might be a major relief for researchers who are worried about how to best sample responses along environmental gradients or how to decide upon experimental treatment levels without any a priori knowledge of the underlying response shapes. Still, it might be difficult to implement systematic, equidistant sampling in practice, as scaling between the investigated driver and response might often be non-linear (e.g. metabolic rates double every 10 degrees of temperature increase, or biological/ecological responses to increasing precipitation will often be log-scaled).
Under such circumstances it might not be entirely clear what equidistant exactly means and, thus, whether linear or non-linear response shapes have to be assumed or predictor values have to be transformed to obtain linear scaling. In such cases, random placement of samples along the investigated gradient might provide an alternative solution when the locations of critical response points are unknown and therefore cannot be accounted for in a preferential sampling strategy. Surprisingly, even if the underlying pattern is known, systematic sampling performed best, or at least not systematically worse than preferential sampling designs. Preferential sampling covering critical response points such as local extremes or parts of the gradient with strong changes (steep slopes) can become especially beneficial for prediction accuracy when the response shape is a priori known (i.e. logistic response shapes in Fig. 4). A priori knowledge of the investigated response shape – which can be obtained from pilot studies or might be inferred from existing literature – can furthermore significantly increase sampling efficiency and prediction accuracy in subsequent studies (see Supplementary Tables S8–10). Based on the advanced understanding emerging from our simulations, we derived a set of recommendations to optimize sampling in ecological research (Fig. 4). Replication obviously increases the prediction accuracy for any specific location along the environmental gradient in the presence of (white) noise by providing a better estimate through averaging. This is especially crucial when aiming for contrasts between factorial groups such as in two-level manipulation experiments, and when data variance for specific treatment conditions (i.e. within different treatment groups) should be minimal to increase predictive accuracy at the single location (Chalcraft 2019; Li et al. 2020).
Such classical, replicated experiments are unavoidable whenever binary environmental drivers are tested, such as presence or absence of specific species or functional groups, or the effects of sites or management schemes differing non-continuously or along unknown gradients (Kreyling et al. 2018). However, classical experimental and analytical approaches, such as a two-level manipulation of environmental factors, are highlighted as inappropriate when aiming to characterize the non-linear processes regulating ecosystem responses to multifactor drivers of global change (Rineau et al. 2019) or to quantify phenotypic plasticity (Morel-Journel et al. 2020). Gradient designs and regression-type analytical approaches are furthermore better suited when aiming for the characterization of response patterns along environmental gradients. Besides their advantages for mechanistic understanding and model extrapolation, gradient designs capable of capturing non-linear responses are paramount when studying tipping points and thresholds of ecosystems approaching critical regime shifts (Scheffer et al. 2009, 2012; Bardgett & Caruso 2020; Berdugo et al. 2020; Ingrisch et al. 2023). Unreplicated gradient designs can furthermore help to avoid pseudo-replication in less controlled settings such as field experiments or other empirical investigations along environmental gradients (Schöps et al. 2020), a common criticism of replicated designs under such conditions (Hurlbert 1984). Gradient designs will be especially suitable for large-scale citizen science projects such as iNaturalist, where people collect vast amounts of biological information along environmental gradients covering entire countries or continents (Taylor & Guralnick 2019; Barve et al. 2020).
By tackling a wide range of different settings in our simulations, we consider our analyses representative of the sampling strategies and procedures commonly used in ecological research focusing on gradient analyses. Depending on the length of the gradient along which responses are measured and analyzed, different response shapes might be identified as most accurate to describe the underlying response. This phenomenon of contrasting response patterns being identified for the same underlying process is reported for well-known functional relationships such as the biodiversity-productivity relationship and can cause misguided discussions about the underlying mechanisms (Guo et al. 2023). Although our simulations covered variation in gradient length (i.e. scenarios with and without predictor extremes), we did not explicitly tackle the topic of gradient sampling beyond commonly considered ranges – a fact that calls for future studies in this direction. One empirical approach to resolve this phenomenon of incomplete gradient sampling might be to enlarge the range of investigated environmental conditions into extreme conditions, even beyond the biological limits of the studied responses (e.g. species-specific mortality; Kreyling et al. 2014; De Boeck et al. 2020). Usually, experiments keep the range of investigated conditions within conservative boundaries, presumably because more extreme (although potentially realistic) treatments may have a catastrophic impact on a studied organism or ecosystem, which potentially results in the loss of costly replicated samples, e.g. through the death of organisms when physiological limits are crossed under extreme environmental conditions (Rineau et al. 2019). Unreplicated gradient designs will allow for such extensions into the extremes without losing too many samples (cf. Kreyling et al. 2018).
Gradient studies that realize a wide range of environmental conditions will furthermore provide understanding of how far a certain response of a specific organism or ecosystem is situated relative to its lower or upper tolerance limit (Rineau et al. 2019).

Conclusions

High prediction success in gradient analyses is determined by two factors: (1) identification of the response shape and (2) precise estimation of model parameters to maximize predictive accuracy across a wide range of environmental conditions. To achieve this, sampling has to be optimized under limited resources and, thus, limited total sample size. For gradient studies, we have shown that available resources should be invested into increasing the number of sampling locations at the expense of replication when the underlying response shape is unknown or complex; nevertheless, replication can be beneficial in gradient studies when the response shapes are simple and known. Unreplicated designs will serve for covering investigated gradients more densely and for pushing experimental systems beyond historical and forecasted extremes. The latter will be decisive for global change impact research, as it enhances our understanding of stressor–response relationships and thresholds in state and impact beyond already realized environmental conditions. Our simulations furthermore show that systematic sampling along the gradient of interest generally outperforms all other tested sampling strategies, except for complex, a priori known response shapes, for which preferential sampling of critical response points (i.e. local extremes or parts of the gradient with strong changes) can be beneficial. Such a priori knowledge of the underlying response shape can be instrumental for designing the most informative experiments in the most efficient way.

Online Methods

Artificial data simulations

We performed simulations based on artificial data similar to Schweiger et al. (2016).
To be fully comparable to Chalcraft (2019), we focused on one-factorial responses. We varied four parameters in our simulations: (1) response shape (i.e. mathematical function underlying the observable response along a gradient of environmental conditions), (2) sampling procedure (i.e. total sample size and how to trade off number of sampling locations against number of replicates per location along the environmental gradient), (3) sampling strategy (i.e. where to place sampling locations along the investigated gradient), and (4) level of stochasticity in the response. We simulated six different linear to highly nonlinear response shapes representing typical and varied shapes frequently used in ecology and other disciplines of natural sciences to describe response patterns (Fig. 2). Each response shape simulates a response variable (y; e.g. any numeric biotic variable such as photosynthetic activity, species richness or population viability) in response to a numeric environmental driver (x; e.g. temperature, water availability, soil pH or nutrient availability). The specific responses of y to changes in x were formulated as linear or nonlinear functions of the form y_i = f_i(x_i) (Schweiger et al. 2016; Kreyling et al. 2018; Chalcraft 2019). For transferability and to allow general conclusions, we scaled our variables in arbitrary units (cf. Schweiger et al. 2016; Kreyling et al. 2018). Different sampling procedures were realized by systematically varying the number of total samples drawn from the underlying response shape and the number of locations at which these samples were placed. We used 6 to 96 samples to cover the range of total sample sizes commonly realized in univariate ecological experiments.
This total number of possible observations can either be used for covering the study gradient with many sampling locations, thus reducing (or completely abolishing) replicates at each sampling location, or conversely for increasing the number of replicates sampled from fewer sampling locations along the gradient. A single sample at a location corresponds to no replication, irrespective of the total number of samples drawn, because we define replicates as the number of samples taken at a single sampling location along the driver gradient (cf. Schweiger et al. 2016; Kreyling et al. 2018). Our sampling procedures are therefore combinations of 3 to 96 locations sampled and 1 to 32 replicates per location. We considered six sampling strategies, differing in the way the sampling locations were placed along the driver gradient. Besides random selection of sampling locations (cf. Kreyling et al. 2018) and a systematic, equidistant sampling (with the two ends of the gradient always sampled; cf. Chalcraft 2019), we applied a set of four preferential sampling strategies to account for critical response points. Criticality of sampling locations was quantified (1) as the slope of the respective response curve at the specific location (the higher the slope at a particular location, the higher the criticality of the response value at this location), (2) as the extremeness of the response value at a specific location (relative to the arithmetic mean of the minimum and maximum of all response values along the studied gradient), and (3) as a combination of both factors, equally weighted. These criticality values were standardized and used to quantify the sampling probability of the locations along the investigated gradient in an adjusted, randomized sampling.
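The trade-off between locations and replicates, and three of the placement strategies described above, can be sketched in a few lines. This is a minimal Python illustration under stated assumptions, not the authors' R code: the function names (`allocations`, `systematic_locations`, `random_locations`, `slope_criticality`, `preferential_locations`), the unit gradient from 0 to 1, the grid of 1000 candidate positions, the central-difference slope and the example `logistic` curve are all hypothetical choices made for illustration.

```python
import math
import random

def allocations(total, min_locations=3, max_replicates=32):
    """Every (locations, replicates) pair that uses exactly `total` samples."""
    return [(total // r, r)
            for r in range(1, max_replicates + 1)
            if total % r == 0 and total // r >= min_locations]

def systematic_locations(k, lo=0.0, hi=1.0):
    """k equidistant positions (k >= 2); both gradient ends are included."""
    return [lo + (hi - lo) * i / (k - 1) for i in range(k)]

def random_locations(k, rng, lo=0.0, hi=1.0):
    """k positions drawn uniformly at random along the gradient."""
    return sorted(rng.uniform(lo, hi) for _ in range(k))

def slope_criticality(f, grid, h=1e-6):
    """Absolute central-difference slope of f on the grid, scaled to sum to 1."""
    slopes = [abs(f(x + h) - f(x - h)) / (2 * h) for x in grid]
    total = sum(slopes)
    if total == 0:  # flat response: fall back to equal weights
        return [1 / len(grid)] * len(grid)
    return [s / total for s in slopes]

def preferential_locations(f, k, rng, n_grid=1000):
    """Draw k positions with probability proportional to slope criticality."""
    grid = systematic_locations(n_grid)
    weights = slope_criticality(f, grid)
    # rng.choices samples with replacement, so a position can repeat.
    return sorted(rng.choices(grid, weights=weights, k=k))

def logistic(x):
    """Hypothetical logistic response on the unit gradient."""
    return 1 / (1 + math.exp(-12 * (x - 0.5)))

rng = random.Random(42)
for n_loc, n_rep in allocations(24):
    print(n_loc, "locations x", n_rep, "replicates")
print(systematic_locations(5))
print(preferential_locations(logistic, 6, rng))
```

With a total of 96 samples, `allocations` spans 96 unreplicated locations down to 3 locations with 32 replicates each, which matches the range of sampling procedures stated above. The extremeness-based and combined criticality measures would replace `slope_criticality` with a different weighting function.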
To account for a non-linear scaling of the environmental gradient, we additionally applied a log-systematic sampling strategy, where sampling locations were distributed equidistantly on a log scale along the driver gradient. Such non-linear scaling of the driver gradient is a common feature, e.g. in water or light response curves conceptualized in ecophysiological reaction norms. Preferential sampling as realized in these simulations might appear unrealistic in practice when a priori knowledge of the underlying response shape is entirely missing. However, such a priori knowledge can be obtained from pre-studies or literature, allowing for the design of preferential sampling strategies also under real-world conditions. Different levels of stochasticity in the response variable were tested for all sampling procedures and strategies applied to the different response shapes by allowing the sampled response values (y) at each sampled location to scatter around the ‘true’ response values with a normal distribution corresponding to 20% (i.e. c. 85% percentile) or 100% (100% percentile) of the absolute response value at this location (cf. Kreyling et al. 2018). Information about the levels of random noise in ‘real-world’ data is extremely rare. Richardson et al. (2012) estimated random noise to reach a maximum of 23% of total variation for eddy flux measurements – a highly uncertain method in environmental science. Kelly et al. (2009) reported similar levels of random noise for assessments based on species community composition, which ranged between 3 and 22% of total variation (on average 11.3 ± 4.6%). Based on these observations, we regard our 20% noise scenario as a close-to-real-world scenario, which will be of practical use in many situations, whereas our 100% noise scenario has to be considered a very extreme, ‘high-noise’ scenario.

Analysis of simulations

We used polynomial regression for pattern prediction and interpolation.
For the unknown response shape scenario, we allowed the algorithm to choose the best fitting model for the sampled test data from a set of polynomial equations (1st to 4th order) based on minimal AIC. For the scenario where we assumed the response shape to be a priori known, we selected the prediction model which represented the respective response shape. For each test data set and selected prediction model, we checked the ability to reveal the true underlying response shape by quantifying prediction accuracy through plotting predicted against true response values. We quantified prediction accuracy by using the two methods of Kreyling et al. (2018) and Chalcraft (2019) plus an additional measure of prediction accuracy based on root mean square error (RMSE) and compared the three approaches. Following Kreyling et al. (2018), we quantified prediction accuracy as the deviation between the response shape obtained from the predicted response based on the different sampling strategies and procedures and the ‘true’, known underlying response shape, i.e. multiple R² of a linear regression between predicted and ‘true’ response values based on a fixed number of 1000 equidistantly spaced locations. Multiple R² thereby measures how closely predicted values fall along a line, which describes how predicted and true values covary with each other. Chalcraft (2019) criticized multiple R² for not measuring the degree to which predicted and true values match, but only the degree to which they correlate. We therefore pursued an alternative approach and forced the regression between predicted and ‘true’ response values through zero and on a slope of one, measuring the degree to which predicted and true values perfectly match (see Chalcraft 2019 and simulation R code in the supplementary material).
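The difference between correlating and matching can be made concrete with a small Python sketch (again not the authors' R code). Here `multiple_r2` is the squared Pearson correlation between predicted and true values, which equals the multiple R² of a simple linear regression between them. `one_to_one_r2` is one plausible reading of a regression forced through zero with a slope of one: 1 minus the residual sum of squares about the 1:1 line, divided by the total sum of squares of the true values. Both function names and this exact formula for the second measure are assumptions; the supplementary R code of the paper remains the authoritative definition. An RMSE helper is included for comparison.

```python
import math

def multiple_r2(pred, true):
    """Squared Pearson correlation between predicted and true values."""
    n = len(true)
    mean_p, mean_t = sum(pred) / n, sum(true) / n
    sxy = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, true))
    sxx = sum((p - mean_p) ** 2 for p in pred)
    syy = sum((t - mean_t) ** 2 for t in true)
    return sxy ** 2 / (sxx * syy)

def one_to_one_r2(pred, true):
    """Assumed form: fit quality about the fixed line true = pred."""
    mean_t = sum(true) / len(true)
    ss_res = sum((t - p) ** 2 for p, t in zip(pred, true))
    ss_tot = sum((t - mean_t) ** 2 for t in true)
    return 1 - ss_res / ss_tot

def rmse(pred, true):
    """Root mean square error between predicted and true values."""
    return math.sqrt(sum((t - p) ** 2 for p, t in zip(pred, true)) / len(true))

# A biased prediction that is perfectly correlated with the truth.
true_vals = [1.0, 2.0, 3.0, 4.0, 5.0]
pred_vals = [2 * t + 1 for t in true_vals]

print(multiple_r2(pred_vals, true_vals))    # correlation-based measure
print(one_to_one_r2(pred_vals, true_vals))  # match-based measure
print(rmse(pred_vals, true_vals))
```

In this example the biased prediction reaches the maximum of the correlation-based measure while the match-based measure becomes strongly negative, which illustrates the distinction Chalcraft (2019) drew between the two.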
In the third approach, we quantified the root mean square error (RMSE) based on the predicted and ‘true’ response values for a fixed number of 1000 equidistantly spaced locations. For better comparison with the multiple R² and Chalcraft approaches, where higher values indicate higher prediction accuracy, we here used negative RMSE as the measure of prediction accuracy. RMSE shows a linear relation to increasing noise, which is in contrast to the non-linear behavior of multiple R² and Chalcraft’s prediction success (Supplementary Material Figure S2). To evaluate the success of revealing the known underlying response shapes, we repeated sampling 1000 times for each combination of total number of experimental units, number of locations and number of replicates. To test our hypotheses, we analyzed the output of our simulations by comparing four different models (see Table 1). Gradient length covered by sampling locations can potentially influence the quantification of the prediction accuracy, as longer gradients tend to result e.g. in higher R². Furthermore, real ecological gradient studies might lack knowledge about the total length of the driver gradient. To account for the effect of the sampled gradient length on the accuracy of predicting the response of interest, we repeated all simulations with the two ends of the driver gradient always being sampled (scenario: “with predictor extremes”). Effects of replication on prediction accuracy were tested using robust linear mixed effect models with total sample size and response shape as random effects (random intercept). Gradient length (i.e.
with and without predictor extremes) as well as the existence (or lack) of a priori knowledge of the underlying response shape were tested either as fixed effects in interaction with replication or as random effects (random intercepts expressed in the following as 1|x) as formulized in the four different models (Table 1), implemented using the lmer()- command of the lmerTest -R-package (v.3.1-3; Kuzentsova et al. 2017 ). For each model, we calculated marginal and conditional R² using the r.squaredGLMM() function of the MuMIn -R-package (v. 1.43.17; Barton 2020 ). For each model, the relative contribution of each individual predictor as well as the shared contribution of interacting predictors were quantified using the variation partitioning approach proposed by Legendre ( 2008 ) based on marginal R 2 . All simulations and analyses were performed in R (R Core Team 2021 ) with a level of significance being set to alpha = 0.05. Declarations Funding declaration No funding to declare Author Contribution The study idea was developed by AHS and JK with input by all co-authors. AHS performed the artificial data simulations and analyses with input by JK and MB. AG conducted the literature survey. AHS led the writing with significant contribution by all authors. Data Availability All data and script for data analyses will be made freely available on zenodo upon publication. For the review all data and analyses files are attached. References Bardgett, R.D. & Caruso, T. (2020). Soil microbial community responses to climate extremes: resistance, resilience and transitions to alternative states. Philosophical Transactions of the Royal Society B: Biological Sciences , 375, 20190112. Barton, K. (2020). MuMIn: Multi-Model Inference. R package. Barve, V.V., Brenskelle, L., Li, D., Stucky, B.J., Barve, N.V., Hantak, M.M., et al. (2020). Methods for broad-scale plant phenology assessments using citizen scientists’ photographs. Applications in Plant Sciences , 8, e11315. 
Berdugo, M., Delgado-Baquerizo, M., Soliveres, S., Hernández-Clemente, R., Zhao, Y., Gaitán, J.J., et al. (2020). Global ecosystem thresholds driven by aridity. Science, 367, 787–790.
Chalcraft, D.R. (2019). To replicate, or not to replicate – that should not be a question. Ecology Letters, 22, 1174–1175.
Davies, G.M. & Gray, A. (2015). Don’t let spurious accusations of pseudoreplication limit our ability to learn from natural experiments (and other messy kinds of ecological monitoring). Ecology and Evolution, 5, 5295–5304.
De Boeck, H.J., Bloor, J.M.G., Aerts, R., Bahn, M., Beier, C., Emmett, B.A., et al. (2020). Understanding ecosystems of the future will require more than realistic climate change experiments – A response to Korell et al. Global Change Biology, 26, e6–e7.
Guo, Q., Chen, A., Crockett, E.T.H., Atkins, J.W., Chen, X. & Fei, S. (2023). Integrating gradient with scale in ecological and evolutionary studies. Ecology, 104, e3982.
Hurlbert, S.H. (1984). Pseudoreplication and the Design of Ecological Field Experiments. Ecological Monographs, 54, 187–211.
Ingrisch, J., Umlauf, N. & Bahn, M. (2023). Functional thresholds alter the relationship of plant resistance and recovery to drought. Ecology, 104, e3907.
Kelly, M., Bennion, H., Burgess, A., Ellis, J., Juggins, S., Guthrie, R., et al. (2009). Uncertainty in ecological status assessments of lakes and rivers using diatoms. Hydrobiologia, 633, 5–15.
Kreyling, J., Jentsch, A. & Beier, C. (2014). Beyond realism in climate change experiments: gradient approaches identify thresholds and tipping points. Ecology Letters, 17, 125-e1.
Kreyling, J., Schweiger, A.H., Bahn, M., Ineson, P., Migliavacca, M., Morel‐Journel, T., et al. (2018). To replicate, or not to replicate – that is the question: how to tackle nonlinear responses in ecological experiments. Ecology Letters, 21, 1629–1638.
Kuznetsova, A., Brockhoff, P.B. & Christensen, R.H.B. (2017). lmerTest Package: Tests in Linear Mixed Effects Models. Journal of Statistical Software, 82, 1–26.
Legendre, P. (2008). Studying beta diversity: ecological variation partitioning by multiple regression and canonical analysis. Journal of Plant Ecology, 1, 3–8.
Li, J., Peng, P. & Zhao, J. (2020). Assessment of soil nematode diversity based on different taxonomic levels and functional groups. Soil Ecol. Lett., 2, 33–39.
Manning, P. (2019). Piling on the pressures to ecosystems. Science, 366, 801–801.
Morel-Journel, T., Thuillier, V., Pennekamp, F., Laurent, E., Legrand, D., Chaine, A.S., et al. (2020). A multidimensional approach to the expression of phenotypic plasticity. Functional Ecology, 34, 2338–2349.
Oksanen, L. (2001). Logic of experiments in ecology: is pseudoreplication a pseudoissue? Oikos, 94, 27–38.
R Core Team. (2021). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.
Richardson, A.D., Aubinet, M., Barr, A.G., Hollinger, D.Y., Ibrom, A., Lasslop, G., et al. (2012). Uncertainty Quantification. In: Eddy Covariance: A Practical Guide to Measurement and Data Analysis, Springer Atmospheric Sciences (eds. Aubinet, M., Vesala, T. & Papale, D.). Springer Netherlands, Dordrecht, pp. 173–209.
Rineau, F., Malina, R., Beenaerts, N., Arnauts, N., Bardgett, R.D., Berg, M.P., et al. (2019). Towards more predictive and interdisciplinary climate change ecosystem experiments. Nat. Clim. Chang., 9, 809–816.
Scheffer, M., Bascompte, J., Brock, W.A., Brovkin, V., Carpenter, S.R., Dakos, V., et al. (2009). Early-warning signals for critical transitions. Nature, 461, 53–59.
Scheffer, M., Hirota, M., Holmgren, M., Nes, E.H.V. & Chapin, F.S. (2012). Thresholds for boreal biome transitions. PNAS, 109, 21384–21389.
Schöps, R., Goldmann, K., Korell, L., Bruelheide, H., Wubet, T. & Buscot, F. (2020). Resident and phytometer plants host comparable rhizosphere fungal communities in managed grassland ecosystems. Sci Rep, 10, 919.
Schweiger, A.H. (2017). The complex adaptive character of spring fens as model ecosystems. Frontiers of Biogeography, 9.
Schweiger, A.H., Irl, S.D.H., Steinbauer, M.J., Dengler, J. & Beierkuhnlein, C. (2016). Optimizing sampling approaches along ecological gradients. Methods in Ecology and Evolution, 7, 463–471.
Taylor, S.D. & Guralnick, R.P. (2019). Opportunistically collected photographs can be used to estimate large-scale phenological trends. bioRxiv, 794396.

Tables
Tables 1 and 2 are available in the Supplementary Files section.

Additional Declarations
No competing interests reported.

Supplementary Files
SchweigeretalSamplingSI.docx
SchweigeretalSamplingSITabS1toS6.xlsx
SchweigeretalSamplingSITabS7.xlsx
SchweigeretalSamplingSITabS8toS10.xlsx
SchweigeretalsamplingSimulation.r
HistoryROIsamplingresults20230318wExtremes.rdata
HistoryROIsamplingresults20230318woExtremes.rdata
Tables.docx

Editorial history at journal
Editorial decision: Revision requested 18 Nov, 2025
Reviews received at journal 17 Nov, 2025
Reviewers agreed at journal 12 Nov, 2025
Reviews received at journal 01 Nov, 2025
Reviewers agreed at journal 27 Oct, 2025
Reviewers agreed at journal 27 Oct, 2025
Reviewers invited by journal 27 Oct, 2025
Editor invited by journal 27 Oct, 2025
Editor assigned by journal 24 Oct, 2025
Submission checks completed at journal 24 Oct, 2025
First submitted to journal 22 Oct, 2025
Authors and affiliations
Andreas H. Schweiger, University of Hohenheim (corresponding author)
Aron Garthen, Greifswald University
Michael Bahn, University of Innsbruck
David Chalcraft, East Carolina University
Nicolas Schtickzelle, Université catholique de Louvain
Klaus Steenberg Larsen, University of Copenhagen
Jürgen Kreyling, Greifswald University

Preprint DOI: https://doi.org/10.21203/rs.3.rs-7921667/v1
Published version: https://doi.org/10.1038/s41598-026-38541-4
Figure 1: Appropriate and inappropriate analytical approaches chosen in ecological experimentation with interest in either group comparison (contrasts) or response patterns. The analysis was based on an annually stratified random sample of 210 papers published between 1988 and 2022. Grey bars indicate an appropriate use of either contrasts or pattern analysis based on the stated hypotheses and applied statistics, while the black bar indicates papers where pattern analysis would have been the better choice according to the hypotheses, but group contrasts were compared. A detailed methodological description of the systematic literature review is provided in the Supplementary Material (Methods S1).

Figure 2: The six response shapes on which the different sampling strategies (i.e. where to place samples along the investigated gradient) and procedures (i.e. how to trade off number of sampling locations against number of replicates per location) were tested in the artificial data simulations. Besides the systematic (systematic and log systematic) and random sampling strategies, preferential sampling strategies that account for critical response points (slopes and/or extremes) were tested, which would in practice only occur with detailed, a priori knowledge of the response patterns. Theoretical sampling probabilities of these three optimized sampling strategies along the studied gradient are shown with the three blue-to-red color bars below each response shape. Additionally, the distribution of sampling locations for the six different sampling strategies along the predictor gradient is exemplified below each response shape for 12 sampling locations. Note that the sampling locations for all but the systematic and the log systematic sampling strategies differ slightly (preferential sampling) to strongly (random sampling) between single simulation runs.

Figure 3: Effect of replication on prediction accuracy for a given number of samples. Effects were quantified for prediction accuracy measures based on multiple R² (a), Chalcraft’s prediction success (Chalcraft 2019) (b) and negative RMSE (c) for 20% noise. A priori knowledge of the underlying response shape was accounted for in a linear mixed effect model (Model 3 in Table 1). Results for 100% noise are summarized in Fig. S11.

Figure 4: Decision tree for ecological experimenters based on general considerations and conclusions drawn from the integration of simulation results for all three measures of prediction accuracy (Tables S4 to S7). Individual decision trees for each of the three measures of prediction accuracy are provided in Supplementary Figures S15 to S17.
whether to put more emphasis on the number of sampling locations or experimental levels at the expense of local replication or whether to put more emphasis on replication but with fewer locations or levels sampled along the gradient of interest (Oksanen \u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e2001\u003c/span\u003e; Davies \u0026amp; Gray \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e2015\u003c/span\u003e; Schweiger et al. \u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e2016\u003c/span\u003e; Kreyling et al. \u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e2018\u003c/span\u003e; Chalcraft \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e2019\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eWhat is known from previous, quantitative studies on the role of replication in gradient studies is that without \u003cem\u003ea priori\u003c/em\u003e knowledge of the underlying response shape and random sampling locations along the gradient, unreplicated designs outperform replicated designs in their prediction accuracy for a given sampling effort (Kreyling et al. \u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e2018\u003c/span\u003e). Systematic sampling along a gradient, however, can result in an advantage of replicated designs for simple response shapes such as linear or quadratic (humped) relationships when response shape and gradient length are \u003cem\u003ea priori\u003c/em\u003e known (Chalcraft \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e2019\u003c/span\u003e). These contrasting perspectives on the importance of replication differ in three aspects: (1) the different metrics of prediction accuracy used in the two studies, (2) differences in sampling strategy, i.e. random placement of sampling locations along the investigated gradient vs. 
an equidistant, systematic sampling and (3) the assumption on whether the underlying response shape to be analyzed is \u003cem\u003ea priori\u003c/em\u003e known or unknown. These apparent differences lead to the hypothesis that sampling strategy, i.e. the decision where to locate sampling along the investigated gradient, might be decisive for defining whether replication at the expense of sampling locations has positive or negative effects on prediction accuracy. Sampling strategy is further expected to be strongly affected by \u003cem\u003ea priori\u003c/em\u003e knowledge on the underlying response shape as well as the length of the investigated gradient. However, understanding the effects of sampling strategy is so far lacking.\u003c/p\u003e\u003cp\u003eTo address this lack of knowledge and to reconcile the contrasting perspectives, we extended the artificial data simulations from the previous studies of Kreyling et al. (\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e2018\u003c/span\u003e) and Chalcraft (\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e2019\u003c/span\u003e) for different sampling approaches, combining different sampling procedures (i.e. how to trade off number of sampling locations against number of replicates per location) and sampling strategies (i.e. where to place sample locations along the investigated gradient) used in regression-type analyses (for methodological details see the Online Methods section). We tested this for a set of six response shapes representing typical shapes commonly observed in ecology (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e). We focused on the effects of different sampling approaches on the prediction accuracy in relation to the effects of replication. We furthermore tested for the effects of model assumptions, i.e. whether the underlying response shape is \u003cem\u003ea priori\u003c/em\u003e known. 
We investigated these effects for the two different measures of prediction accuracy used by Kreyling et al. (\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e2018\u003c/span\u003e) and Chalcraft (\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e2019\u003c/span\u003e) and an additional measure based on the root mean square error (RMSE) to achieve a balanced view on different perspectives on prediction accuracy. We finally investigated the effects of gradient length, i.e. whether the response of interest along the underlying environmental gradient is partly or fully sampled.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eBased on our previous evidence, we hypothesized that the sampling procedure in interaction with the sampling strategy would have significant effects on prediction accuracy in gradient studies. More specifically, we expected that replication would increase prediction accuracy when the underlying response shape is \u003cem\u003ea priori\u003c/em\u003e known and, thus, also the locations of the critical response points (for definition see methods below) along the environmental gradient are known and accounted for in the sampling approach. We expected this to be especially relevant for non-linear response shapes, i.e. shapes that cannot mathematically be described as a simple linear model in the form of \u003cem\u003ey\u0026thinsp;=\u0026thinsp;a∙x\u0026thinsp;+\u0026thinsp;b\u003c/em\u003e. However, we expected replication to yield lower prediction accuracy when the underlying response shape is unknown and, thus, critical response points are unknown and therefore not accounted for in the sampling approach. We defined critical response points as locations along the environmental gradient where sampling is crucial for an accurate prediction of the studied response along this gradient. These can be response extremes, i.e. maximum or minimum response values along the studied gradient, or regions of strong response changes, i.e. 
regions of maximum slope of the response pattern \u0026ndash; or a combination of both (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e). Based on our analyses, we derived recommendations on how to optimize sampling approaches in observational and experimental ecological research, including when to replicate and when to apply non-replicated study designs. Furthermore, we provide recommendations on optimized sampling strategies, i.e. where to best sample along gradients.\u003c/p\u003e"},{"header":"Results","content":"\u003cp\u003eComparing the three measures of prediction accuracy, multiple R\u003csup\u003e2\u003c/sup\u003e showed significantly higher sensitivity to variations in the different simulation settings than RMSE or Chalcraft’s prediction success\u0026nbsp;(Chalcraft 2019)\u0026nbsp;(see Conditional R\u003csup\u003e2\u003c/sup\u003e values for the four different models and 20% as well as 100% noise summarized in Table 2; Supplementary Material Figure S3 and S4 as well as Tables S1 to S6 for more details). In the following we will show and discuss the results obtained for all three measures of prediction success.\u003c/p\u003e\n\u003cp\u003eThe number of replicates in combination with the \u003cem\u003ea priori\u003c/em\u003e knowledge on the underlying response shape (i.e. Model 3) turned out to be the best set of explanatories for prediction accuracy based on multiple R\u003csup\u003e2\u003c/sup\u003e (see Marginal R\u003csup\u003e2\u003c/sup\u003e for 20% and 100% noise in Table 2). The other three sets of predictors describing prediction accuracy, i.e. \u003cem\u003ereplication\u003c/em\u003e (i.e. Model 1), \u003cem\u003ereplication * gradient length\u003c/em\u003e (i.e. Model 2) and \u003cem\u003ereplication* total sample size\u003c/em\u003e (i.e. Model 4) explained significantly less variation in prediction accuracy (i.e. 
multiple R\u003csup\u003e2\u003c/sup\u003e) and were not statistically different from each other. For RMSE and Chalcraft’s prediction success, all models showed very low explanatory power with no significant differences between the different models, except for Chalcraft’s prediction success at 100% noise, where the replication-only model (i.e. model 1) explained significantly less variation than the other models where replication interacted with the other explanatories. Details on the individual model statistics for the different sampling strategies and response shapes are available in Table S1 to S6.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eFor the best explanatory model based on multiple R\u003csup\u003e2\u003c/sup\u003e (i.e. Model 3), replication turned out to be the main explanatory of prediction accuracy, with 23\u0026nbsp;± 31% (arithmetic mean\u0026nbsp;±\u0026nbsp;standard deviation) of explained total variation for 20% of noise and 31\u0026nbsp;± 32% for 100% of noise (Figure S5 and S8). Explanatory power of \u003cem\u003ea priori\u003c/em\u003e knowledge of the underlying response shape and its shared predictive power with replication was significantly lower (i.e. 2.0\u0026nbsp;±\u0026nbsp;4.2% with 20% noise as well as 0.70\u0026nbsp;±\u0026nbsp;1.3% for 100% noise; in all cases p\u0026lt;0.001 in multi-comparison tests). This strong, individual effect of replication on prediction accuracy vanished when combined with gradient length (Model 2) or total sample size (Model 4) as interacting predictors (see Figures S5 and S8). For RMSE and Chalcraft’s prediction success, relative contributions of replicates, \u003cem\u003ea priori\u003c/em\u003e knowledge of the underlying response shape and their combination were all minor (see Supplementary Figures S6, S7, S9 and S10). 
\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eReplication showed significant negative effects on prediction accuracy for 94% (multiple R\u003csup\u003e2\u003c/sup\u003e), 33% (Chalcraft’s prediction success) and 39% (RMSE) of all tested cases at 20% noise (results of a linear mixed model accounting for knowledge on the underlying response shape; see Figure 3 and Table S7 for details). Positive effects of replication were detected in 0% (R\u003csup\u003e2\u003c/sup\u003e and Chalcraft) and 3% (RMSE), whereas non-significant effects were observable in 6, 58 and 67% of all tested cases for multiple R\u003csup\u003e2\u003c/sup\u003e, RMSE and Chalcraft’s prediction success, respectively. Similar patterns were observable for 100% of noise, except for Chalcraft’s prediction success, for which no significant effects of replication were observable (see Supplementary Fig. S11 and Table S7 for more details).\u003c/p\u003e\n\u003cp\u003ePrediction accuracy differed among the different sampling strategies with systematic sampling yielding highest prediction accuracy. This was consistent across the three different measures of prediction accuracy, the different noise levels as well as independent of \u003cem\u003ea priori\u003c/em\u003e knowledge of the underlying response pattern (Supplementary Table S8-10 as well as Fig. S 12-14).\u003c/p\u003e"},{"header":"Discussion","content":"\u003cp\u003eDetailed understanding of the effects of the sampling approach on prediction accuracy is paramount when aiming for scientifically sound and cost-efficient study designs. Here, we assessed how to optimize the number of sampling locations along gradients relative to the number of replicates per location for a given number of available samples. Our premise was that more samples are generally beneficial to increase statistical power, but they come at a higher cost in terms of resources and personnel. 
We furthermore investigated how to optimally place the samples along the investigated gradient to maximize prediction accuracy in gradient studies for the major response shapes commonly observed in ecology.\u003c/p\u003e\u003cp\u003ePrevious, quantitative studies on the role of replication have shown that replication is not necessary or can even have negative effects on prediction accuracy in scenarios where the underlying response patterns are unknown (Kreyling et al. \u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e2018\u003c/span\u003e). Systematic sampling along a gradient, however, can result in an advantage of replicated designs for simple response shapes such as linear or quadratic (humped) relationships when response shape and gradient length are \u003cem\u003ea priori\u003c/em\u003e known (Chalcraft \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e2019\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eThese two studies differed in their underlying assumptions, in that Kreyling et al. assumed a lack of existing \u003cem\u003ea priori\u003c/em\u003e knowledge of the underlying response shape, while Chalcraft assumed the underlying response shape to be known (see Supplement Material of Chalcraft \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e2019\u003c/span\u003e). By including \u003cem\u003ea priori\u003c/em\u003e knowledge of the response shape in our simulations, we were able to reproduce the findings of Chalcraft for the response shapes he investigated (i.e. linear and centered hump). This resolves the apparent discrepancy between the two studies by showing that existing or lacking \u003cem\u003ea priori\u003c/em\u003e knowledge on the underlying response shape is decisive for whether replication is beneficial for increasing prediction accuracy or not. Unreplicated designs yielded higher prediction accuracies in our study when the response shape was unknown or the known response shape was more complex (i.e. exponential or logistic). 
However, replication turned out to be beneficial when the underlying response shape was \u003cem\u003ea priori\u003c/em\u003e known and rather simple (i.e. linear or hump, see Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e for a summary). We detected a positive interaction between replication and knowledge on the underlying response shape, meaning that the negative effect of replication tends to be higher for unknown response shapes and the negative effect of missing \u003cem\u003ea priori\u003c/em\u003e knowledge about the underlying response shape is stronger for stronger replication. However, this interactive effect was rather weak (see Fig \u003cspan refid=\"MOESM5\" class=\"InternalRef\"\u003eS5\u003c/span\u003ea, S6a and S7a). Furthermore, systematic (equidistant) sampling along the investigated gradient turned out to be generally superior to all other tested sampling strategies. This second finding might be a major relief for researchers who are worried how to best sample responses along environmental gradients or decide upon experimental treatment levels without any \u003cem\u003ea priori\u003c/em\u003e knowledge on the underlying response shapes. Still, it might be difficult to implement systematic, equidistant sampling in practice, as scaling between the investigated driver and response might often be non-linear (e.g. metabolic rates double every 10 degrees of temperature increase, or biological/ecological responses to increasing precipitation will often be log-scaled). Under such circumstances it might not be entirely clear what equidistant exactly means and, thus, whether linear or non-linear response shapes have to be assumed or predictor values have to be transformed to obtain linear scaling. 
In such cases, random placement of samples along the investigated gradient might provide an alternative solution when the locations of critical response points are unknown and therefore cannot be accounted for in a preferential sampling strategy. Surprisingly, even if the underlying pattern is known, systematic sampling performed best, or at least not systematically worse than preferential sampling designs. Preferential sampling covering critical response points such as local extremes or parts of the gradient with strong changes (steep slopes) can become especially beneficial for prediction accuracy when the response shape is \u003cem\u003ea priori\u003c/em\u003e known (i.e. logistic response shapes in Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e). \u003cem\u003eA priori\u003c/em\u003e knowledge on the investigated response shape \u0026ndash; which can be obtained from pilot studies or might be inferred from existing literature \u0026ndash; can furthermore significantly increase sampling efficiency and prediction accuracy in subsequent studies (see Supplementary Table S8-10). Based on the advanced understanding emerging from our simulations, we derived a set of recommendations to optimize sampling in ecological research (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e).\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eReplication obviously increases the prediction accuracy for any specific location along the environmental gradient in the presence of (white) noise by providing a better estimate through averaging. This is especially crucial when aiming for contrasts between factorial groups such as two-level manipulation experiments and when data variance for specific treatment conditions (i.e. 
within different treatment groups) should be minimal to increase predictive accuracy at the single location (Chalcraft \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e2019\u003c/span\u003e; Li et al. \u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e2020\u003c/span\u003e). Such classical, replicated experiments are inevitable whenever binary environmental drivers are tested such as presence or absence of specific species or functional groups or the effects of sites or management schemes differing non-continuously or along unknown gradients (Kreyling et al. \u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e2018\u003c/span\u003e). However, classical experimental and analytical approaches, such as a two-level manipulation of environmental factors, are highlighted as inappropriate when aiming for characterizing such non-linear processes regulating ecosystem responses to multifactor drivers of global change (Rineau et al. \u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e2019\u003c/span\u003e) or to quantify phenotypic plasticity (Morel-Journel et al. \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e2020\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eGradient designs and regression-type, analytical approaches are furthermore better suited when aiming for the characterization of response patterns along environmental gradients. Besides their advantages for mechanistic understanding and model extrapolation, gradient designs capable of capturing non-linear responses are paramount when studying tipping points and thresholds of ecosystems approaching critical regime shifts (Scheffer et al. \u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e2009\u003c/span\u003e, \u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e2012\u003c/span\u003e; Bardgett \u0026amp; Caruso \u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e2020\u003c/span\u003e; Berdugo et al. 
\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e2020\u003c/span\u003e; Ingrisch et al. \u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e2023\u003c/span\u003e). Unreplicated gradient designs can furthermore help to avoid pseudo-replication in less controlled settings such as field experiments or other empirical investigations along environment gradients (Sch\u0026ouml;ps et al. \u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e2020\u003c/span\u003e), a common criticism of replicated designs under such conditions (Hurlbert \u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e1984\u003c/span\u003e). Gradient designs will be especially suitable for large-scale citizen science projects where people collect vast amounts of biological information along environmental gradients covering entire countries or continents such as iNaturalist (Taylor \u0026amp; Guralnick \u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e2019\u003c/span\u003e; Barve et al. \u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e2020\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eBy tackling a wide range of different settings in our simulations, we consider our analyses representative for settings of sampling strategies and procedures commonly used in ecological research focusing on gradient analyses. Depending on the length of the gradient along which responses are measured and analyzed, different response shapes might be identified as most accurate to describe the underlying response. This phenomenon of contrasting response patterns being identified for the same underlying process is reported for well-known, functional relationships such as the biodiversity-productivity relationship and can cause misguided discussions about the underlying mechanisms (Guo et al. \u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e2023\u003c/span\u003e). Although our simulations covered variation in gradient length (i.e. 
scenarios with and without predictor extremes), we did not explicitly tackle the topic of gradient sampling beyond commonly considered ranges \u0026ndash; a fact that calls for future studies in this direction. One empirical approach to resolve this phenomenon of incomplete gradient sampling might be to enlarge the range of investigated, environmental conditions into extreme conditions even beyond the biological limits of the studied responses (e.g. species-specific mortality; Kreyling et al. \u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e2014\u003c/span\u003e; De Boeck et al. \u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e2020\u003c/span\u003e). Usually, experiments keep the range of investigated conditions within conservative boundaries, presumably because more extreme (although potentially realistic) treatments may have a catastrophic impact on a studied organism or ecosystem, which potentially results in the loss of costly replicated samples due to e.g. death of organisms when physiological limits are crossed under extreme environmental conditions (Rineau et al. \u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e2019\u003c/span\u003e). Unreplicated gradient designs will allow for such extensions into the extremes without losing too many samples (cf. Kreyling et al. \u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e2018\u003c/span\u003e). Gradient studies that realize a wide range of environmental conditions will furthermore provide understanding on how far a certain response of a specific organism or ecosystem is situated relative to its lower or upper tolerance limit (Rineau et al. 
\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e2019\u003c/span\u003e).\u003c/p\u003e"},{"header":"Conclusions","content":"\u003cp\u003eHigh prediction success in gradient analyses is determined by two factors: (1) identification of the response shape and (2) precise estimation of model parameters to maximize predictive accuracy across a wide range of environmental conditions. To achieve this, sampling has to be optimized under limited resources, thus, limited total sample size. For gradient studies, we have shown that available resources should be invested into increasing the number of sampling locations at the expense of replication when the underlying response shape is unknown or complex; nevertheless, replication can be beneficial in gradient studies when the response shapes are simple and known. Unreplicated designs will serve for covering investigated gradients more densely and for pushing experimental systems beyond historical and forecasted extremes. The latter will be decisive for global change impact research, as it enhances our understanding of stressor\u0026ndash;response relationships and thresholds in state and impact beyond already realized environmental conditions. Our simulations furthermore show that systematic sampling along the gradient of interest generally outperforms all other tested sampling strategies, except for complex, \u003cem\u003ea priori\u003c/em\u003e known response shapes, for which preferential sampling of critical response points (i.e. local extremes or parts of the gradient with strong changes) can be beneficial. 
Such \u003cem\u003ea priori\u003c/em\u003e knowledge of the underlying response shape can be instrumental for designing the most informative experiments in the most efficient way.\u003c/p\u003e"},{"header":"Online Methods","content":"\u003cdiv id=\"Sec6\" class=\"Section2\"\u003e\u003ch2\u003eArtificial data simulations\u003c/h2\u003e\u003cp\u003eWe performed simulations based on artificial data similar to Schweiger et al. (\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e2016\u003c/span\u003e). To be fully comparable to Chalcraft (\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e2019\u003c/span\u003e), we focused on one-factorial responses. We varied four parameters in our simulations: (1) response shape (i.e. mathematical function underlying the observable response along a gradient of environmental conditions), (2) sampling procedure (i.e. total sample size and how to trade off number of sampling locations against number of replicates per location along the environmental gradient), (3) sampling strategy (i.e. where to place sampling locations along the investigated gradient), and (4) level of stochasticity in the response.\u003c/p\u003e\u003cp\u003eWe simulated six different linear to highly nonlinear response shapes representing typical and varied shapes frequently used in ecology and other disciplines of natural sciences to describe response patterns (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e). Each response shape simulates a response variable (y; e.g. any numeric biotic variable such as photosynthetic activity, species richness or population viability) in response to a numeric environmental driver (x; e.g. temperature, water availability, soil pH or nutrient availability). The specific responses of y to changes in x were formulated as linear or nonlinear functions of the form y\u003csub\u003ei\u003c/sub\u003e = f\u003csub\u003ei\u003c/sub\u003e(x\u003csub\u003ei\u003c/sub\u003e) (Schweiger et al. 
\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e2016\u003c/span\u003e; Kreyling et al. \u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e2018\u003c/span\u003e; Chalcraft \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e2019\u003c/span\u003e). For transferability and to allow general conclusions, we scaled our variables in arbitrary units (cf. Schweiger et al. \u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e2016\u003c/span\u003e; Kreyling et al. \u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e2018\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eDifferent sampling procedures were realized by systematically varying the number of total samples drawn from the underlying response shape and the number of locations at which these samples were placed. We used 6 to 96 samples to cover the range of total sample sizes commonly realized in univariate ecological experiments. This total number of possible observations can either be used for covering the study gradient with many sampling locations and, thus, reducing (or completely abolishing) replicates at each sampling location, or on the contrary for increasing the number of replicates sampled from fewer sampling locations along the gradient. A single sample at a location corresponds to no replication, irrespective of the total number of samples drawn, because we define replicates as the number of samples taken at a single sampling location along the driver gradient (cf. Schweiger et al. \u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e2016\u003c/span\u003e; Kreyling et al. \u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e2018\u003c/span\u003e). Our sampling procedures are therefore combinations of 3 to 96 locations sampled and 1 to 32 replicates per location.\u003c/p\u003e\u003cp\u003eWe considered six sampling strategies, differing in the way the number of sampling locations were placed along the driver gradient. Besides random selection of sampling locations (cf. 
Kreyling et al. \u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e2018\u003c/span\u003e) and a systematic, equidistant sampling (with the two ends of the gradient always sampled; cf. Chalcraft \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e2019\u003c/span\u003e), we applied a set of four preferential sampling strategies to account for critical response points. Criticality of sampling locations was quantified (1) as the slope of the respective response curves at the specific location (the higher the slope at a particular location the higher the criticality of the response value at this location), (2) as the extremeness of the response value at a specific location (relative to the arithmetic mean of the minimum and maximum of all response values along the studied gradient), and (3) as a combination of both factors equally weighted. These criticality values were standardized and used to quantify sampling probability of the locations along the investigated gradient in an adjusted, randomized sampling. To account for a non-linear scaling of the environmental gradient, we additionally applied a log systematic sampling strategy, where we sampled equidistantly distributed on a log-scale along the driver gradient. Such non-linear scaling of the driver gradient is a common feature e.g. in water or light response curves conceptualized in ecophysiological reaction norms. Preferential sampling as realized in these simulations might appear to be unrealistic in practice when \u003cem\u003ea priori\u003c/em\u003e knowledge on the underlying response shape is entirely missing. 
However, such \u003cem\u003ea priori\u003c/em\u003e knowledge can be obtained from pre-studies or literature, allowing for the design of preferential sampling strategies also under real-world conditions.\u003c/p\u003e\u003cp\u003eWe tested different levels of stochasticity in the response variable for all sampling procedures and strategies applied to the different response shapes. The sampled response values (y) at each sampled location were allowed to scatter around the \u0026lsquo;true\u0026rsquo; response values following a normal distribution corresponding to 20% (i.e. c. 85% percentile) or 100% (100% percentile) of the absolute response value at this location (cf. Kreyling et al. \u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e2018\u003c/span\u003e). Information about the levels of random noise in \u0026lsquo;real-world\u0026rsquo; data is very scarce. Richardson et al. (\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e2012\u003c/span\u003e) estimated random noise to reach a maximum of 23% of total variation for eddy flux measurements, a highly uncertain method in environmental science. Kelly et al. (\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e2009\u003c/span\u003e) reported similar levels of random noise for assessments based on species community composition, ranging between 3 and 22% of total variation (on average 11.3\u0026thinsp;\u0026plusmn;\u0026thinsp;4.6%). Based on these observations, we regard our 20% noise scenario as close to real-world conditions and of practical use in many situations, whereas our 100% noise scenario should be considered a very extreme, \u0026lsquo;high-noise\u0026rsquo; scenario.\u003c/p\u003e\u003c/div\u003e\n\u003ch3\u003eAnalysis of simulations\u003c/h3\u003e\n\u003cp\u003eWe used polynomial regression for pattern prediction and interpolation. 
For the unknown response shape scenario, we allowed the algorithm to choose the best fitting model for the sampled test data from a set of polynomial equations (1st to 4th order) based on minimal AIC. For the scenario where we assumed the response shape to be \u003cem\u003ea priori\u003c/em\u003e known, we selected the prediction model which represented the respective response shape. For each test data set / each selected prediction model, we checked the ability to reveal the true underlying response shape by quantifying prediction accuracy through plotting predicted against the true response values. We quantified prediction accuracy by using the two methods of Kreyling et al. (\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e2018\u003c/span\u003e) and Chalcraft (\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e2019\u003c/span\u003e) plus an additional measure for prediction accuracy based on root mean square error (RMSE) and compared the three approaches. According to Kreyling et al. (\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e2018\u003c/span\u003e), we quantified prediction accuracy as the deviation between the response shape obtained from the predicted response based on the different sampling strategies and procedures and the \u0026lsquo;true\u0026rsquo;, known underlying response shape, i.e. multiple R\u003csup\u003e2\u003c/sup\u003e of a linear regression between predicted and \u0026lsquo;true\u0026rsquo; response values based on a fixed number of 1000 equidistantly spaced locations. Multiple R\u0026sup2; thereby measures how closely predicted values fall along a line, which describes how predicted and true values covary with each other.\u003c/p\u003e\u003cp\u003eChalcraft (\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e2019\u003c/span\u003e) criticized multiple R\u0026sup2; for not measuring the degree to which predicted and true values match, but only the degree to which they correlate. 
We therefore pursued an alternative approach and forced the regression between predicted and \u0026lsquo;true\u0026rsquo; response values through zero and on a slope of one, measuring the degree to which predicted and true values perfectly match (see Chalcraft \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e2019\u003c/span\u003e and simulation R code in the supplementary material). In the third approach, we quantified the root mean square error (RMSE) based on the predicted and \u0026lsquo;true\u0026rsquo; response values for a fixed number of 1000 equidistantly spaced locations. For better comparison with the multiple R\u003csup\u003e2\u003c/sup\u003e and Chalcraft approaches, where higher values indicate higher prediction accuracy, we here used negative RMSE as the measure of prediction accuracy. RMSE shows a linear relation to increasing noise, which is in contrast to the non-linear behavior of multiple R\u003csup\u003e2\u003c/sup\u003e and Chalcraft\u0026rsquo;s prediction success (Supplementary Material Figure \u003cspan refid=\"MOESM2\" class=\"InternalRef\"\u003eS2\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eTo evaluate the success of revealing the known underlying response shapes, we repeated sampling 1000 times for each combination of total number of experimental units, number of locations and number of replicates. To test our hypotheses, we analyzed the output of our simulations by comparing four different models (see Table\u0026nbsp;1).\u003c/p\u003e\u003cp\u003eGradient length covered by sampling locations can potentially influence the quantification of the prediction accuracy, as longer gradients tend to result e.g. in higher R\u0026sup2;. Furthermore, real ecological gradient studies might lack knowledge about total length of the driver gradient. 
To account for the effect of the sampled gradient length on prediction accuracy for the response of interest, we repeated all simulations with the two ends of the driver gradient always being sampled (scenario: \u0026ldquo;with predictor extremes\u0026rdquo;). Effects of replication on prediction accuracy were tested using robust linear mixed effect models with total sample size and response shape as random effects (random intercept). Gradient length (i.e. with and without predictor extremes) as well as the existence (or lack) of \u003cem\u003ea priori\u003c/em\u003e knowledge of the underlying response shape were tested either as fixed effects in interaction with replication or as random effects (random intercepts expressed in the following as 1|x) as formalized in the four different models (Table\u0026nbsp;1), implemented using the lmer() function of the \u003cem\u003elmerTest\u003c/em\u003e R package (v. 3.1-3; Kuznetsova et al. \u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e2017\u003c/span\u003e). For each model, we calculated marginal and conditional R\u0026sup2; using the r.squaredGLMM() function of the \u003cem\u003eMuMIn\u003c/em\u003e R package (v. 1.43.17; Barton \u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2020\u003c/span\u003e). For each model, the relative contribution of each individual predictor as well as the shared contribution of interacting predictors were quantified using the variation partitioning approach proposed by Legendre (\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e2008\u003c/span\u003e) based on marginal R\u003csup\u003e2\u003c/sup\u003e. 
All simulations and analyses were performed in R (R Core Team \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e2021\u003c/span\u003e) with the significance level set to alpha\u0026thinsp;=\u0026thinsp;0.05.\u003c/p\u003e"},{"header":"Declarations","content":"\u003ch2\u003eFunding declaration\u003c/h2\u003e\u003cp\u003eNo funding to declare\u003c/p\u003e\u003ch2\u003eAuthor Contribution\u003c/h2\u003e\u003cp\u003eThe study idea was developed by AHS and JK with input from all co-authors. AHS performed the artificial data simulations and analyses with input from JK and MB. AG conducted the literature survey. AHS led the writing with significant contributions from all authors.\u003c/p\u003e\u003ch2\u003eData Availability\u003c/h2\u003e\u003cp\u003eAll data and scripts for data analysis will be made freely available on Zenodo upon publication. For review, all data and analysis files are attached.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eBardgett, R.D. \u0026amp; Caruso, T. (2020). Soil microbial community responses to climate extremes: resistance, resilience and transitions to alternative states. \u003cem\u003ePhilosophical Transactions of the Royal Society B: Biological Sciences\u003c/em\u003e, 375, 20190112.\u003c/li\u003e\n\u003cli\u003eBarton, K. (2020). MuMIn: Multi-Model Inference. R package.\u003c/li\u003e\n\u003cli\u003eBarve, V.V., Brenskelle, L., Li, D., Stucky, B.J., Barve, N.V., Hantak, M.M., \u003cem\u003eet al.\u003c/em\u003e (2020). Methods for broad-scale plant phenology assessments using citizen scientists\u0026rsquo; photographs. \u003cem\u003eApplications in Plant Sciences\u003c/em\u003e, 8, e11315.\u003c/li\u003e\n\u003cli\u003eBerdugo, M., Delgado-Baquerizo, M., Soliveres, S., Hern\u0026aacute;ndez-Clemente, R., Zhao, Y., Gait\u0026aacute;n, J.J., \u003cem\u003eet al.\u003c/em\u003e (2020). Global ecosystem thresholds driven by aridity. 
\u003cem\u003eScience\u003c/em\u003e, 367, 787\u0026ndash;790.\u003c/li\u003e\n\u003cli\u003eChalcraft, D.R. (2019). To replicate, or not to replicate \u0026ndash; that should not be a question. \u003cem\u003eEcology Letters\u003c/em\u003e, 22, 1174\u0026ndash;1175.\u003c/li\u003e\n\u003cli\u003eDavies, G.M. \u0026amp; Gray, A. (2015). Don\u0026rsquo;t let spurious accusations of pseudoreplication limit our ability to learn from natural experiments (and other messy kinds of ecological monitoring). \u003cem\u003eEcology and Evolution\u003c/em\u003e, 5, 5295\u0026ndash;5304.\u003c/li\u003e\n\u003cli\u003eDe Boeck, H.J., Bloor, J.M.G., Aerts, R., Bahn, M., Beier, C., Emmett, B.A., \u003cem\u003eet al.\u003c/em\u003e (2020). Understanding ecosystems of the future will require more than realistic climate change experiments \u0026ndash; A response to Korell et al. \u003cem\u003eGlobal Change Biology\u003c/em\u003e, 26, e6\u0026ndash;e7.\u003c/li\u003e\n\u003cli\u003eGuo, Q., Chen, A., Crockett, E.T.H., Atkins, J.W., Chen, X. \u0026amp; Fei, S. (2023). Integrating gradient with scale in ecological and evolutionary studies. \u003cem\u003eEcology\u003c/em\u003e, 104, e3982.\u003c/li\u003e\n\u003cli\u003eHurlbert, S.H. (1984). Pseudoreplication and the Design of Ecological Field Experiments. \u003cem\u003eEcological Monographs\u003c/em\u003e, 54, 187\u0026ndash;211.\u003c/li\u003e\n\u003cli\u003eIngrisch, J., Umlauf, N. \u0026amp; Bahn, M. (2023). Functional thresholds alter the relationship of plant resistance and recovery to drought. \u003cem\u003eEcology\u003c/em\u003e, 104, e3907.\u003c/li\u003e\n\u003cli\u003eKelly, M., Bennion, H., Burgess, A., Ellis, J., Juggins, S., Guthrie, R., \u003cem\u003eet al.\u003c/em\u003e (2009). Uncertainty in ecological status assessments of lakes and rivers using diatoms. \u003cem\u003eHydrobiologia\u003c/em\u003e, 633, 5\u0026ndash;15.\u003c/li\u003e\n\u003cli\u003eKreyling, J., Jentsch, A. \u0026amp; Beier, C. (2014). 
Beyond realism in climate change experiments: gradient approaches identify thresholds and tipping points. \u003cem\u003eEcology Letters\u003c/em\u003e, 17, 125-e1.\u003c/li\u003e\n\u003cli\u003eKreyling, J., Schweiger, A.H., Bahn, M., Ineson, P., Migliavacca, M., Morel-Journel, T., \u003cem\u003eet al.\u003c/em\u003e (2018). To replicate, or not to replicate \u0026ndash; that is the question: how to tackle nonlinear responses in ecological experiments. \u003cem\u003eEcology Letters\u003c/em\u003e, 21, 1629\u0026ndash;1638.\u003c/li\u003e\n\u003cli\u003eKuznetsova, A., Brockhoff, P.B. \u0026amp; Christensen, R.H.B. (2017). lmerTest Package: Tests in Linear Mixed Effects Models. \u003cem\u003eJournal of Statistical Software\u003c/em\u003e, 82, 1\u0026ndash;26.\u003c/li\u003e\n\u003cli\u003eLegendre, P. (2008). Studying beta diversity: ecological variation partitioning by multiple regression and canonical analysis. \u003cem\u003eJournal of Plant Ecology\u003c/em\u003e, 1, 3\u0026ndash;8.\u003c/li\u003e\n\u003cli\u003eLi, J., Peng, P. \u0026amp; Zhao, J. (2020). Assessment of soil nematode diversity based on different taxonomic levels and functional groups. \u003cem\u003eSoil Ecol. Lett.\u003c/em\u003e, 2, 33\u0026ndash;39.\u003c/li\u003e\n\u003cli\u003eManning, P. (2019). Piling on the pressures to ecosystems. \u003cem\u003eScience\u003c/em\u003e, 366, 801.\u003c/li\u003e\n\u003cli\u003eMorel-Journel, T., Thuillier, V., Pennekamp, F., Laurent, E., Legrand, D., Chaine, A.S., \u003cem\u003eet al.\u003c/em\u003e (2020). A multidimensional approach to the expression of phenotypic plasticity. \u003cem\u003eFunctional Ecology\u003c/em\u003e, 34, 2338\u0026ndash;2349.\u003c/li\u003e\n\u003cli\u003eOksanen, L. (2001). Logic of experiments in ecology: is pseudoreplication a pseudoissue? \u003cem\u003eOikos\u003c/em\u003e, 94, 27\u0026ndash;38.\u003c/li\u003e\n\u003cli\u003eR Core Team. (2021). R: A language and environment for statistical computing. 
R Foundation for Statistical Computing, Vienna, Austria.\u003c/li\u003e\n\u003cli\u003eRichardson, A.D., Aubinet, M., Barr, A.G., Hollinger, D.Y., Ibrom, A., Lasslop, G., \u003cem\u003eet al.\u003c/em\u003e (2012). Uncertainty Quantification. In: \u003cem\u003eEddy Covariance: A Practical Guide to Measurement and Data Analysis\u003c/em\u003e, Springer Atmospheric Sciences (eds. Aubinet, M., Vesala, T. \u0026amp; Papale, D.). Springer Netherlands, Dordrecht, pp. 173\u0026ndash;209.\u003c/li\u003e\n\u003cli\u003eRineau, F., Malina, R., Beenaerts, N., Arnauts, N., Bardgett, R.D., Berg, M.P., \u003cem\u003eet al.\u003c/em\u003e (2019). Towards more predictive and interdisciplinary climate change ecosystem experiments. \u003cem\u003eNat. Clim. \u003c/em\u003e\u003cem\u003eChang.\u003c/em\u003e, 9, 809\u0026ndash;816.\u003c/li\u003e\n\u003cli\u003eScheffer, M., Bascompte, J., Brock, W.A., Brovkin, V., Carpenter, S.R., Dakos, V., \u003cem\u003eet al.\u003c/em\u003e (2009). Early-warning signals for critical transitions. \u003cem\u003eNature\u003c/em\u003e, 461, 53\u0026ndash;59.\u003c/li\u003e\n\u003cli\u003eScheffer, M., Hirota, M., Holmgren, M., Nes, E.H.V. \u0026amp; Chapin, F.S. (2012). Thresholds for boreal biome transitions. \u003cem\u003ePNAS\u003c/em\u003e, 109, 21384\u0026ndash;21389.\u003c/li\u003e\n\u003cli\u003eSch\u0026ouml;ps, R., Goldmann, K., Korell, L., Bruelheide, H., Wubet, T. \u0026amp; Buscot, F. (2020). Resident and phytometer plants host comparable rhizosphere fungal communities in managed grassland ecosystems. \u003cem\u003eSci Rep\u003c/em\u003e, 10, 919.\u003c/li\u003e\n\u003cli\u003eSchweiger, A.H. (2017). The complex adaptive character of spring fens as model ecosystems. \u003cem\u003eFrontiers of Biogeography\u003c/em\u003e, 9.\u003c/li\u003e\n\u003cli\u003eSchweiger, A.H., Irl, S.D.H., Steinbauer, M.J., Dengler, J. \u0026amp; Beierkuhnlein, C. (2016). Optimizing sampling approaches along ecological gradients. 
\u003cem\u003eMethods in Ecology and Evolution\u003c/em\u003e, 7, 463\u0026ndash;471.\u003c/li\u003e\n\u003cli\u003eTaylor, S.D. \u0026amp; Guralnick, R.P. (2019). Opportunistically collected photographs can be used to estimate large-scale phenological trends. \u003cem\u003ebioRxiv\u003c/em\u003e, 794396.\u003c/li\u003e\n\u003c/ol\u003e"},{"header":"Tables","content":"\u003cp\u003eTables 1 to 2 are available in the Supplementary Files section.\u003c/p\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":true,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"scientific-reports","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"scirep","sideBox":"Learn more about [Scientific Reports](http://www.nature.com/srep/)","snPcode":"","submissionUrl":"","title":"Scientific Reports","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"Scientific Reports","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"global change experiments, ecological models, experimental design, non-linear responses, polynomial fits, prediction success","lastPublishedDoi":"10.21203/rs.3.rs-7921667/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-7921667/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eA major aim of experimental ecology is to quantify responses to environmental change. Study designs which optimally capture response patterns are currently debated. A key point in the discussion is how a limited total number of samples should ideally be allocated to replication versus the number of locations along the environmental gradient. Here, we assess how to optimally allocate sampling effort for maximizing prediction accuracy in gradient designs. For this we performed artificial data simulations for different sampling approaches with or without \u003cem\u003ea priori\u003c/em\u003e knowledge of the underlying patterns, and applied a set of commonly observed response shapes. Overall, unreplicated sampling with equidistant, systematic placement along the gradient of interest at as many locations or levels as affordable turned out to be the best approach for unknown response shapes. Replication was found to be beneficial when \u003cem\u003ea priori\u003c/em\u003e knowledge exists about the underlying, simple (e.g. 
linear or humped) response shape.\u003c/p\u003e","manuscriptTitle":"How to optimally allocate sampling effort in experimental ecology","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-11-06 06:30:21","doi":"10.21203/rs.3.rs-7921667/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"decision","content":"Revision requested","date":"2025-11-18T07:32:39+00:00","index":"","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-11-17T17:55:52+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"126323240832332332679227820598139343197","date":"2025-11-12T19:33:14+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-11-01T22:44:08+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"3282930411663922781151503709496710726","date":"2025-10-27T13:18:52+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"56453746861614644283954483048466727879","date":"2025-10-27T12:53:37+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2025-10-27T12:04:25+00:00","index":"","fulltext":""},{"type":"editorInvited","content":"","date":"2025-10-27T11:47:10+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2025-10-24T09:38:24+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2025-10-24T09:38:19+00:00","index":"","fulltext":""},{"type":"submitted","content":"Scientific Reports","date":"2025-10-22T09:01:53+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"scientific-reports","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"scirep","sideBox":"Learn more about [Scientific Reports](http://www.nature.com/srep/)","snPcode":"","submissionUrl":"","title":"Scientific Reports","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"Scientific Reports","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"55f328cc-aac5-407b-a18d-afc8b5d74764","owner":[],"postedDate":"November 6th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"published-in-journal","subjectAreas":[{"id":57280944,"name":"Biological sciences/Ecology"},{"id":57280945,"name":"Earth and environmental sciences/Ecology"},{"id":57280946,"name":"Physical sciences/Mathematics and computing"}],"tags":[],"updatedAt":"2026-02-16T16:00:02+00:00","versionOfRecord":{"articleIdentity":"rs-7921667","link":"https://doi.org/10.1038/s41598-026-38541-4","journal":{"identity":"scientific-reports","isVorOnly":false,"title":"Scientific Reports"},"publishedOn":"2026-02-13 15:57:15","publishedOnDateReadable":"February 13th, 2026"},"versionCreatedAt":"2025-11-06 06:30:21","video":"","vorDoi":"10.1038/s41598-026-38541-4","vorDoiUrl":"https://doi.org/10.1038/s41598-026-38541-4","workflowStages":[]},"version":"v1","identity":"rs-7921667","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-7921667","identity":"rs-7921667","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}