{"paper_id":"1f2183fe-3445-4894-bf88-217f40edd00c","body_text":"Causal factors of coastal chlorophyll-a dynamics and near future forecasting\n\nTitle:\nExploring causal factors of coastal chlorophyll-a dynamics and their potential contributions to near future forecasting\n\nWord count: 5969 words from Introduction to Conclusions, excluding the reference list\n\nAuthors:\nSuixuan Huang1,*, Masayuki Ushio1,*\n\nAuthor Affiliations:\n1Department of Ocean Science, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong SAR, China\n\nCorresponding author: *Suixuan Huang, Email: shuangch@connect.ust.hk, *Masayuki Ushio, Department of Ocean Science, The Hong Kong University of Science and Technology, Hong Kong SAR, Email: ong8181@gmail.com, Tel: +852-3469-2938 (ORCID iD = 0000-0003-4831-7181)\n\nbioRxiv preprint doi: https://doi.org/10.64898/2026.01.07.698127; this version posted January 8, 2026. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.\n\nAbstract (≤ 250 words)\nHarmful algal blooms have caused significant damage worldwide, and Hong Kong is no exception. To understand the drivers of algal bloom formation and to forecast the dynamics of chlorophyll-a (Chl-a), a proxy for algal abundance, in Hong Kong waters, this study applied a nonlinear time series analysis framework, empirical dynamic modeling (EDM), to in situ measurements and remote sensing data. We first conducted EDM causality tests to identify environmental factors influencing Chl-a at different sites. 
For the in situ measurement data, salinity was the strongest causal factor among the environmental factors examined. However, adding the causal factors to the forecasting model did not greatly improve the forecasting performance for Chl-a, suggesting that factors not included in the current dataset, such as wind direction and current speed, may play a more critical role in Chl-a dynamics. For the remote sensing data, sea surface temperature (SST) showed a significant causal effect on Chl-a at most sites, and the multivariate forecasting model including Chl-a and SST outperformed the univariate model at most sites. This study is the first to employ EDM to investigate Chl-a dynamics in Hong Kong waters, showcasing its potential to identify causal factors and improve forecasting accuracy. The findings provide scientific insights into Chl-a dynamics and into water quality monitoring and modeling in a coastal region.\n\nKeywords (3–10 words): Causality test; Chlorophyll-a dynamics; Empirical dynamic modeling; Hong Kong waters; Time series analysis; Water quality monitoring\n\n1. Introduction\nHarmful algal blooms (HABs), commonly known as “red tides,” refer to the discoloration of coastal waters caused by the rapid growth and accumulation of harmful microscopic algae (phytoplankton) (Zohdi & Abbaspour, 2019). 
These blooms can produce toxins that trigger foodborne illnesses, such as amnesic shellfish poisoning and ciguatera fish poisoning (Lopes et al., 2019; Pradhan et al., 2022; Wang, 2008). Zooplankton, filter-feeding shellfish, and herbivorous fish consume these phytoplankton and act as mediators of toxin transfer within the food web, ultimately affecting humans. Even non-toxic blooms can indirectly harm marine life by depleting oxygen during decomposition (Flewelling et al., 2005). Because HABs affect public health, the seafood industry, and tourism, the resulting economic losses are considerable. In Florida, for instance, red tides have been observed since the 1840s and the annual losses reach millions of dollars (Kirkpatrick et al., 2004). To prevent such damage caused by HABs in coastal ecosystems, it is essential to understand the mechanisms behind red tides and to predict their outbreaks.\nThe concentration of chlorophyll-a (Chl-a) is a common proxy for phytoplankton abundance. Thus, identifying the drivers of Chl-a dynamics contributes not only to developing effective HAB monitoring plans but also to predicting HAB dynamics. Previous attempts to identify drivers of the dynamics of Chl-a or phytoplankton community composition mainly relied on correlation-based methods, for example, linear regression, Canonical Correlation Analysis (CCA), and Redundancy Analysis (RDA) (Deconinck et al., 2025; Li et al., 2023; Yin, 2003). These methods assume that environmental variables are independent and that the effects of such variables are separable from each other, which are features of linear systems. However, the effectiveness of these methods can be limited when studying nonlinear systems such as marine ecosystems, where variables are interdependent (Glaser et al., 2014). 
Moreover, correlation may misidentify causal variables, as “correlation does not imply causation” (Berkeley, 1988). That is, correlation can occur without causation, and causation may also occur in the absence of correlation.\nAs an alternative non-parametric method, Empirical Dynamic Modeling (EDM) was developed to analyze complex dynamics found in natural ecosystems (Anderson et al., 2021; Glaser et al., 2014; Sugihara et al., 2012; Sugihara & May, 1990). Instead of deriving equations to describe how the variables are related or how the system evolves, EDM delineates a “trajectory” of the system evolution in a high-dimensional state space (i.e., a manifold) (for example, see figures in Ye et al., 2015). EDM was originally developed to make near-future forecasts for deterministic complex systems in a state space built from lagged time series of a single variable (Sugihara & May, 1990). Subsequently, several EDM tools were developed to analyze the relationships among multiple variables within the same system (in HAB studies, these would be time series of Chl-a and environmental factors), enabling detection of causality among variables (Sugihara et al., 2012) and quantification of interaction strengths (Deyle et al., 2016). EDM tools have since been applied to various ecological time series to understand and forecast ecosystem dynamics (Deyle et al., 2022; Tsai et al., 2024; Ushio, 2022; Ushio et al., 2018). 
\nHong Kong is a coastal city with a long historical record of HABs. Located to the east of the Pearl River Estuary (PRE) and to the north of the South China Sea (Fig. 1a), Hong Kong is affected by both fresh water and sea water (Wong et al., 2007). As a result, there are significant spatial and seasonal variations in water circulation, stratification, salinity, temperature, and nutrient levels. In this study, we aim to detect causal factors of Chl-a dynamics in Hong Kong waters and to improve near-future forecasting performance using EDM tools. Our work builds on the long-term in situ observations of Chl-a and environmental factors at 76 sites (Fig. 1a) conducted by the Environmental Protection Department (EPD) of the Hong Kong government. Additionally, remote sensing measurements collected by sensors such as the Sea-Viewing Wide Field-of-View Sensor (SeaWiFS) and the Moderate Resolution Imaging Spectroradiometer (MODIS) support monitoring of Chl-a and other parameters with high coverage and fine spatial resolution (Chen et al., 2013). In Hong Kong, MODIS data at 4-kilometer resolution include 176 sites in total, of which 101 are located over water (Fig. 1b). The temporal resolution of remote sensing data is also fine, allowing for daily or weekly observations. However, the frequent occurrence of missing data at daily or finer temporal resolution complicates the understanding of spatiotemporal variations (Zhang et al., 2025). Monthly-averaged data may mitigate the effect of missing data, but the reliability of remote sensing data in coastal regions such as Hong Kong waters, where suspended particles significantly affect remote sensing images, has not been thoroughly investigated for Chl-a monitoring, although the monthly remote sensing data are generally well correlated with the in situ measurement data (Fig. 1c). 
Comparing the in situ measurement data and remote sensing data gives us an opportunity to test the reliability of the remote sensing data for Chl-a monitoring in coastal regions.\nHere, using an in situ measurement dataset collected by EPD and a remote sensing dataset collected by MODIS, we apply two EDM tools to Chl-a dynamics in Hong Kong waters: unified information-theoretic causality (UIC) (Osada et al., 2023) to detect causal environmental factors, and multiview-distance regularized S-map (MDR S-map) (Chang et al., 2021) to quantify the interaction strengths of the causal factors and to forecast Chl-a. We first identified causal variables using UIC and quantified the causal effects using MDR S-map. Then, we conducted near-future forecasts of Chl-a dynamics in Hong Kong waters using the time series of Chl-a and the causal variables. For in situ measurement data, salinity showed the strongest causal effect on Chl-a among the environmental factors. However, adding causal variables to the forecast model (i.e., the multivariate model) did not greatly improve the performance compared to the Chl-a-only model (i.e., the univariate model). For remote sensing data, sea surface temperature (SST) showed a significant causal effect on Chl-a at most of the sites, and adding SST improved the performance of the forecast model compared to the univariate model using Chl-a only. 
Lastly, we compared the results of the in situ monitoring data and the remote sensing data and discussed the reliability of the remote sensing data for Chl-a monitoring in Hong Kong waters. This study is the first to quantify the causal effect of environmental factors on Chl-a dynamics in Hong Kong coastal waters using EDM, providing perspectives for real-world environmental monitoring, management, and forecasting.\n\nFigure 1. Study sites of (a) in situ measurements conducted by EPD and (b) remote sensing collected by MODIS, represented by black points. In (a), there are ten water control zones declared by EPD, namely ①Tolo Harbour and Channel, ②Southern, ③Port Shelter, ④Junk Bay, ⑤Deep Bay, ⑥Mirs Bay, ⑦North Western, ⑧Western, ⑨Eastern and ⑩Victoria Harbour (https://cd.epic.epd.gov.hk/EPICRIVER/marine/?lang=en). (c) Comparison of Chl-a concentration (mg m-3) in the southern Hong Kong waters. The red line indicates data from an in situ monitoring site, and the dashed line indicates remote sensing data from a site close to the in situ site. Their locations are 22.1917˚N, 114.0790˚E and 22.18750˚N, 114.1042˚E. The remote sensing data show a similar trend to the in situ monitoring data.\n\n
2. Materials and methods\n2.1 In situ measurement data\nMonthly time series of Chl-a and environmental factors, including temperature, salinity, pH, turbidity, total nitrogen (TN), total phosphorus (TP), and silica (as SiO2), were collected by the Environmental Protection Department (EPD) at 76 sites in Hong Kong waters. Data were downloaded from the EPD website (https://www.epd.gov.hk/epd/english/top.html) as of January 2024. Each time series started in August 2002 and continued until July 2022, containing 240 time points at each site. Sampling was carried out onboard a scientific vessel once a month, and samples were collected at 1 m below the sea surface. According to EPD, temperature, salinity, pH, and turbidity were measured on site by a CTD profiler (SEACAT19+ Conductivity Temperature Depth, Sea-Bird Scientific, US). Chl-a, TN (the sum of Kjeldahl nitrogen, nitrite nitrogen, and nitrate nitrogen), TP, and silica were measured in the laboratory as described in the Annual Marine Water Quality Reports of EPD (https://www.epd.gov.hk/epd/sites/default/files/epd/english/environmentinhk/water/hkwqrc/files/waterquality/annual-report/marinereport2024.pdf). The N/P ratio was defined as TN divided by TP.\n\n2.2 Remote sensing data\nRemote sensing data of Chl-a and sea surface temperature (SST) were derived from MODIS (Moderate Resolution Imaging Spectroradiometer) onboard the Aqua satellite platform of NASA (National Aeronautics and Space Administration). 
Data were downloaded from the official website of NASA Ocean Color (https://oceancolor.gsfc.nasa.gov/) as of January 2024. The data used were monthly Level 3 data at a resolution of 4 km. In total, 176 sites were extracted between latitudes 22.13°N and 22.58°N and longitudes 113.82°E and 114.52°E, among which 101 marine sites were selected. Each time series started in August 2002 and continued until July 2022, consisting of 240 time points for each site. Missing values of Chl-a concentration were replaced with 0 because missing values usually indicate that the Chl-a concentration was too low.\n\n2.3 Unified information-theoretic causality (UIC)\nTo detect causal effects of environmental factors on Chl-a, unified information-theoretic causality (UIC) was applied (Osada et al., 2023). UIC is a time series-based, nonparametric causality test that incorporates the advantages of both convergent cross-mapping (CCM; Sugihara et al., 2012), a causality test in EDM, and transfer entropy (TE; Schreiber, 2000), an information theory-based causality test. Here, TE from process y to process x, $TE_{y \\to x}$, assesses how much uncertainty in predicting the future value of y is reduced, given knowledge of past values of x. This shares a similar idea with CCM, which regards y as a cause of x if the neighborhood relationship of time series x is able to predict time series y by nearest neighbor regression in the time-delay embedding. 
UIC quantifies information flow between variables in the form of TE determined by conditional probabilities (we call this measure TE, but the mathematical definition of $TE_{y \\to x}$ is different from that of Schreiber; see Osada et al. 2021 for details), which in this case amounts to a comparison of the model performance of cross mapping:\n$$TE_{y \\to x} = \\frac{1}{N} \\sum_{t=1}^{N} \\log \\left[ \\frac{p\\left(y_{t+tp} \\mid x_t, x_{t-\\tau}, \\ldots, x_{t-(E-1)\\tau}, z_t\\right)}{p\\left(y_{t+tp} \\mid x_{t-\\tau}, x_{t-2\\tau}, \\ldots, x_{t-(E-1)\\tau}, z_t\\right)} \\right],$$\nwhere x, y, and z are a potential effect variable, a causal variable, and a conditional variable (if available), respectively. In our case, x, y, and z could be Chl-a, an environmental factor, and the other potential causal factors, respectively. $p(A \\mid B)$ denotes the conditional probability of A given B. N is the length of the library (i.e., the number of vectors in the state space) and E is the optimal embedding dimension when conducting one-step forward forecasts in the state space. t, tp, and τ are the time point, the time lag between the effect variable and the causal variable, and the time lag of the time series, respectively. To identify the delayed effects of environmental factors on Chl-a, five time lags, i.e., tp = 0, –1, –2, –3, and –4, were tested. These lags correspond to causal effects that occur within the same month or one, two, three, or four months earlier, respectively. For example, when the optimal embedding dimension is five, there is one causal environmental variable, and tp = 0, the state vector is represented by the information of an environmental factor of the current month (unlagged), one month ago, two months ago, three months ago, and four months ago, which are used to predict the current state of Chl-a (the numerator in the above equation, which is equivalent to “cross-mapping” in CCM, but is adjusted by the denominator in UIC). 
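As a rough illustration of the lag structure described above, the following Python sketch builds the delay vectors (x_t, x_{t-τ}, ..., x_{t-(E-1)τ}) from one series and pairs each of them with the value of a second series at t + tp. It is not the rUIC implementation: the function names `delay_vectors` and `cross_map_pairs` and the toy numbers are hypothetical, and the conditional variable z and the probability estimation itself are left out.

```python
def delay_vectors(series, E, tau=1):
    # Map each valid time index t to (s[t], s[t-tau], ..., s[t-(E-1)*tau]).
    start = (E - 1) * tau
    return {
        t: tuple(series[t - k * tau] for k in range(E))
        for t in range(start, len(series))
    }


def cross_map_pairs(embedded_series, target_series, E, tau=1, tp=0):
    # Pair each delay vector at time t with the target value at time t + tp.
    pairs = []
    for t, vector in delay_vectors(embedded_series, E, tau).items():
        target_index = t + tp
        if 0 <= target_index < len(target_series):
            pairs.append((t, vector, target_series[target_index]))
    return pairs


# Toy monthly series (made-up numbers, not data from the study).
x = [10, 11, 12, 13, 14, 15, 16, 17]
y = [0, 1, 2, 3, 4, 5, 6, 7]

# E = 5 with tp = 0: five consecutive monthly values paired with the same month.
pairs_tp0 = cross_map_pairs(x, y, E=5, tau=1, tp=0)
# tp = -2: the same delay vectors paired with the target two months earlier.
pairs_tp2 = cross_map_pairs(x, y, E=5, tau=1, tp=-2)

print(pairs_tp0[0])
print(pairs_tp2[0])
```

With E = 5 and τ = 1, the first usable time index is t = 4, so each series of 240 monthly points would yield at most 236 such vectors per site.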
\nTo avoid misidentification of causality caused by seasonality (i.e., synchronized dynamics), a seasonal surrogate test was carried out. For each Chl-a time series, 1000 surrogate time series were generated by computing a seasonal trend (= yearly trend) and shuffling the residuals. If the TE from an environmental factor to the original Chl-a time series exceeds the TE of more than 950 surrogate time series, the environmental factor is regarded as having a significant causal effect on Chl-a (i.e., p < 0.05).\nFor EPD data, causal effects of temperature, salinity, pH, turbidity, N/P ratio, and silica on Chl-a were examined. For MODIS data, only the causal effect of SST on Chl-a was examined because of the limited availability of environmental variables. To ensure that all variables with different units have the same level of magnitude for comparison and to avoid reconstructing a distorted state space, the time series of all variables were normalized to have a mean of 0 and a standard deviation of 1. This data preprocessing approach differed from that used for our MDR S-map, where the first-differenced time series were normalized (see the following section). Using the first-differenced and normalized time series for UIC showed qualitatively the same results (see Tables S1–S4), but we show the UIC results of the normalized time series in the main text and figures because the interpretation is more straightforward. 
The computation was conducted using the R package “rUIC” (version 0.9.13) (Osada & Ushio, 2021).\n\n2.4 Multiview-distance regularized S-map (MDR S-map)\nTo conduct the near-future forecast of Chl-a dynamics, multiview-distance regularized S-map (MDR S-map; Chang et al., 2021) was applied. We analyzed the first-differenced and normalized time series to maximize the forecasting accuracy of Chl-a dynamics and to mitigate issues arising from temporal autocorrelation. Before conducting MDR S-map, the time series of Chl-a were first-differenced and normalized, and UIC was conducted again to detect causal environmental factors for the differenced Chl-a at each site (Tables S1–S4). Here, the significance test was conducted by a random-shuffle surrogate method (1000 surrogate time series were used to calculate the significance), as the first-differenced time series did not show clear seasonality. Only sites with significant causal environmental factors were selected for further MDR S-map analysis.\nMDR S-map links two existing EDM methods, multiview embedding (Ye & Sugihara, 2016) and regularized S-map (Cenci et al., 2019), and was proposed to reconstruct large interaction networks when the number of causal variables exceeds the optimal embedding dimension (Chang et al., 2021). The first step of MDR S-map is to determine the “multiview distance” describing the “true” neighboring relationship in a high-dimensional 
state space by ensembling various distances measured in numerous low-dimensional state spaces at the optimal embedding dimension (Ye & Sugihara, 2016). The Euclidean distance between every pair of vectors in a low-dimensional state space, which should be reasonably reliable, is calculated. This procedure is repeated for numerous low-dimensional state spaces, and “ensembled” Euclidean distances among the vectors are obtained as the weighted average of all these distances (the weight is based on the forecast performance of each embedding). The ensembled distance is a good approximation of the “true” distance in the high-dimensional system state.\nThe second step of MDR S-map is to construct a local linear model, the sequential locally weighted global linear map (S-map, a fundamental tool of EDM; Sugihara, 1994), to fit the time series and make a near-future forecast by including the causal effects of multiple variables,\n$$\\hat{y}(t^* + tp) = C_0 + \\hat{C}_1 Y_1(t^*) + \\hat{C}_2 Y_2(t^*) + \\ldots + \\hat{C}_E Y_E(t^*),$$\nwhere t*, E, and tp represent the target time point, the embedding dimension, and the forecasting time step, respectively. $\\hat{y}$, $Y_j$, and $C_0$ represent the predicted value of y, the jth element of the embedded time series (e.g., lagged Chl-a time series, environmental variables, and so on), and the intercept of the local linear model, respectively. $(\\hat{C}_1, \\hat{C}_2, \\ldots, \\hat{C}_E)$ are the local linear coefficients. Such Jacobians of the locally approximated linear functions can be interpreted as the causal effect (or interaction strength). 
To avoid overfitting when the dimension (the number of variables in the linear model) is larger than the time series length, we used regularization (e.g., ridge, lasso, or elastic-net; Cenci et al., 2019) to estimate the coefficients for each time point, $\\hat{C} = (\\hat{C}_1, \\hat{C}_2, \\ldots, \\hat{C}_E)$, as follows:\n$$\\hat{C} = \\arg\\min_{C} \\left\\{ \\left\\| \\sqrt{\\mathbf{W}} \\left( Y(t + tp) - \\mathbf{Y}(t) C \\right) \\right\\|_2^2 + \\lambda \\left[ \\alpha \\|C\\|_2^2 + (1 - \\alpha) \\|C\\|_1 \\right] \\right\\},$$\nwhere C represents the local linear coefficients to be solved, and λ is the penalty factor, selected from 0, 0.001, 0.01, 0.1, 0.5, 1, and 2. α is the parameter balancing the regularization between the L1 norm ($\\|\\cdot\\|_1$) and the L2 norm ($\\|\\cdot\\|_2$) of the parameter vector; it was set to 0, which means we used ridge regression. t is the time point, and $\\mathbf{W}$ is the local weight matrix based on the multiview distances. $\\mathbf{Y}(t) = (Y_1(t), Y_2(t), \\ldots, Y_E(t))$ is an N × E data matrix (N is the number of time points and E is the number of nodes, i.e., the optimal dimension) collecting the time series of all network nodes, and $Y(t + tp) = (y(t_1 + tp), y(t_2 + tp), \\ldots, y(N + tp))^T$ is an N × 1 vector representing the tp-step forward time series data. The solution of the equation depends on the parameters λ and α. All combinations of the parameter values are tested (i.e., grid search), and the best one for each site is determined by the normalized mean square error (NMSE) of the one-step forward forecast.\nIn practice, univariate models are defined as models with only the Chl-a time series (and its time-lagged values). 
In this case, the state space is built from Chl-a and its lagged values. Multivariate models are defined as models with Chl-a and environmental factors (and their time-lagged values). For in situ measurement data, the multivariate models tested included Chl-a and temperature (“Chl-a + Temp”), Chl-a and salinity (“Chl-a + Sal”), Chl-a and pH (“Chl-a + pH”), Chl-a and turbidity (“Chl-a + Turb”), Chl-a and N/P ratio (“Chl-a + N/P”), and Chl-a and silica (“Chl-a + Sil”). For remote sensing data, the multivariate model tested was built from Chl-a and sea surface temperature (“Chl-a + SST”). The model performance was then evaluated by NMSE. The time series of all variables were normalized to have a mean of 0 and a standard deviation of 1. The computation was conducted using the R package “macam” (version 0.1.10) (Ushio, 2025).\n\n2.5 Data and code availability\nAll data used in this study were downloaded from public databases. All scripts and formatted data used in this study are available on GitHub (https://github.com/sxhuang00/causality_forecast_chl).\n\n3. Results\n3.1 Causal effects of temperature on Chl-a for in situ measurement and remote sensing data\nFor in situ measurement data, several sites in Victoria Harbor showed significant and strong causal effects of water temperature on the Chl-a dynamics (Fig. 2a; p < 0.05). Some specific sites in the western, southern, and eastern regions and Tolo Harbor also showed significant causal effects of temperature. For remote sensing data, sea surface temperature (SST) exerted significant causal effects on Chl-a dynamics at many monitoring sites (Fig. 2b; p < 0.05). Some sites in the south and the east that are far from land showed stronger 
causality of temperature on Chl-a.\n\nFigure 2. The spatial pattern of the causal effect of seawater temperature on Chl-a for (a) in situ measurement data (EPD) and (b) remote sensing data (MODIS). Open circles indicate that the causal effect is not significant and filled circles indicate that the causal effect is significant. The circle size indicates the significance of the causal effect, with a larger size representing a more significant effect. The circle color indicates the strength of the causal effect (transfer entropy; TE), with a lighter color representing a stronger effect of temperature on the Chl-a dynamics.\n\n3.2 Causal effects of salinity, pH, turbidity, and nutrients on Chl-a for in situ measurement data\n\nWe examined the causal effects of other environmental factors for in situ measurement data only, as such data were not available for the remote sensing data. The strengths of the causal effects are summarized in Fig. 3a and the spatial patterns of the causal effects of the environmental factors are shown in Fig. 3b-f.\n\nFirst, we found that salinity causally influenced the Chl-a dynamics in most monitoring sites in the southern regions and Victoria Harbor (Fig. 3b; p < 0.05). The causal effects of salinity, measured by TE, were generally stronger than those of the other environmental factors (average TE = 0.14 for salinity, compared with 0.09, 0.05, 0.04, 0.07, and 0.05 for temperature, pH, turbidity, N/P ratio, and silica, respectively; Fig. 3a). The causal effects of pH were predominantly found in the monitoring sites that were close to shorelines and were located far from oligotrophic regions, such as Mirs Bay in the eastern region (Fig. 3c; p < 0.05); however, the causal effects of pH were generally weak (Fig. 3a). The causal effects of turbidity were also generally weak but particularly strong in Deep Bay (Fig. 3d; p < 0.05). Causal effects of the N/P ratio were common, especially in Tolo Harbor and Mirs Bay (Fig. 3e; p < 0.05). The causal effects of silica showed a similar spatial pattern to those of salinity, particularly in Victoria Harbor and the southern region, but the effect strength was much weaker than that of salinity (Fig. 3f; p < 0.05).\n\nFigure 3. (a) The strength of significant causal effects (transfer entropy, TE) of environmental factors on Chl-a and the spatial pattern of causal effects of environmental factors on Chl-a for in situ measurement data: (b) salinity, (c) pH, (d) turbidity, (e) N/P ratio, and (f) silica. Open circles indicate that the causal effect (TE) is not significant and filled circles indicate that the causal effect is significant. The circle size indicates the significance of the causal effect, with a larger size representing a more significant effect. The circle color indicates the strength of the causal effect, with a lighter color representing a stronger effect of the corresponding environmental factor on the Chl-a dynamics.\n\n3.3 Forecasting Chl-a dynamics with univariate and multivariate models using in situ measurement data\n\nUsing the first-differenced Chl-a time series, we tried to maximize the forecasting accuracy of Chl-a dynamics using MDR S-map. For in situ measurement data, multivariate MDR S-map models with different combinations of embedding variables showed similar forecast performance to the univariate model (Fig. 4). Mean NMSEs of “Chl-a”, “Chl-a + Temp”, “Chl-a + Sal”, “Chl-a + pH”, “Chl-a + Tur”, “Chl-a + N/P”, and “Chl-a + Sil” were 0.660, 0.663, 0.666, 0.708, 0.656, 0.708, and 0.690, respectively. Although some sites showed better forecast performance using the multivariate model than using the univariate model, the improvement in the forecast performance when including causal environmental factors was limited.\n\n3.4 Forecasting Chl-a dynamics with univariate and multivariate models using remote sensing data\n\nAs for remote sensing data, we analyzed sites showing a significant causal effect of SST on differenced Chl-a, and most of them exhibited better forecast performance with the multivariate S-map model than with the univariate S-map model (Fig. 5). The mean NMSEs of the univariate and multivariate models were 0.632 and 0.574, respectively. 
Importantly, compared to the univariate model, the multivariate model was better at predicting the high peaks (i.e., Chl-a concentration increases to a high level during algal bloom) and low peaks (i.e., Chl-a concentration decreases to a low level after algal bloom) of Chl-a dynamics without delay or lead (e.g., see Fig. 6).\n\nFigure 4. Forecast performance of different MDR S-map models using in situ measurement data. The first column indicates the univariate model with differenced Chl-a time series input only. The other columns indicate different combinations of multivariate models with time series of Chl-a and lagged environmental factors as input. Each point indicates a monitoring site. Note that the univariate model includes all in situ sites, while multivariate models only include sites showing a significant causal effect of the specific environmental factor on differenced Chl-a, and thus, the number of points for the multivariate models is smaller than that for the univariate model.\n\nFigure 5. Comparison of forecast performance of different MDR S-map models using remote sensing data. (a) Jitter plot of NMSE of the univariate model (with Chl-a time series input only) and the multivariate model (with time series of Chl-a and lagged SST input). Note that only the sites with a significant causal effect of SST on differenced Chl-a are shown. (b) Scatterplot of NMSE of the univariate model and the multivariate model. The solid line indicates the 1:1 line. Points below the 1:1 line indicate that the NMSE of the multivariate model is smaller than that of the univariate model, that is, these sites have better forecast performance using the multivariate model.\n\nFigure 6. An example of predictions of the multivariate model and the univariate model for a scaled, first-differenced Chl-a concentration time series, with NMSE of 0.411 and 0.580, respectively. The original time series belongs to a site located at Port Shelter (22.31250°N, 114.3125°E). To clearly present the data, only the first 100 time points of the time series are displayed here.\n\n4. Discussion\n\nIn this study, we performed two statistical analyses to gain deeper insights into Chl-a dynamics in Hong Kong waters using Empirical Dynamic Modeling (EDM) tools: (1) identifying causal factors influencing Chl-a dynamics and (2) improving the forecasting accuracy of Chl-a dynamics by incorporating these causal factors. Also, we used two datasets, in situ measurements and remote sensing data, to briefly assess the reliability of remote sensing data in a coastal region characterized by highly dynamic conditions and high turbidity, which often interfere with data accuracy.\n\n4.1 Causal effects of temperature on Chl-a for in situ measurement and remote sensing data\n\nFirst, based on the in situ measurement data, we found that a causal effect of temperature on Chl-a occurs in some inner corners of semi-closed bays, for example, Deep Bay, Mirs Bay, and Tolo Harbor. Temperature dynamics might reflect the occurrence of downwelling, which induces weak exchange between these water bodies and open water and a long residence time of the bay water (Harrison et al., 2008). Under such circumstances, these water bodies could be considered as an incubator allowing sufficient time for phytoplankton to respond to local nutrient inputs and change the dynamics of Chl-a. Additionally, we found that the strength and spatial patterns of the causal effects of temperature on the Chl-a dynamics were different between in situ measurement and remote sensing data (Fig. 2). This discrepancy might be because the monthly in situ measurement and remote sensing data were collected in different ways. EPD conducted monitoring at irregular intervals, collecting data on different dates in each month. In our analysis, these in situ data with irregular intervals were regarded as “regular interval data,” where each data point represents one month. In contrast, monthly remote sensing data were created by averaging daily measurements, potentially providing a more accurate representation of monthly conditions. Thus, while the accuracy of temperature measurement should be higher in the in situ data, the constant monitoring intervals of the remote sensing data would be better suited for assessing the impact of temperature on Chl-a dynamics, as EDM requires time series data of consistent intervals. 
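The contrast between the two ways of building a monthly series can be illustrated with a minimal Python sketch. It is not the study's code: the daily values are synthetic, and the function names (monthly_mean, single_sample) are hypothetical helpers introduced only for illustration. Under these assumptions, it shows that one visit per month on a varying date can yield different monthly values for the same underlying month, whereas averaging the available daily values gives a single, date-independent value.

```python
import statistics


def monthly_mean(daily_values):
    # Average all available daily values in a month, skipping missing (None) days.
    valid = [v for v in daily_values if v is not None]
    return statistics.fmean(valid) if valid else None


def single_sample(daily_values, day_index):
    # Mimic one monitoring visit per month: keep only the value on the sampled day.
    return daily_values[day_index]


# Synthetic (assumed) daily temperatures for one 30-day month with a warming trend.
daily_temp = [20.0 + 0.1 * d for d in range(30)]

early_visit = single_sample(daily_temp, 2)   # visit on the 3rd day of the month
late_visit = single_sample(daily_temp, 27)   # visit on the 28th day of the month
averaged = monthly_mean(daily_temp)          # daily values averaged over the month

print('early visit:', early_visit)
print('late visit:', late_visit)
print('monthly mean:', averaged)
```

In this toy month the two single-visit values differ from each other and from the monthly mean purely because of the sampling date, which is the kind of timing noise that an irregularly sampled series treated as regular would carry.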
\n\n4.2 Causal effects of salinity, pH, turbidity, and nutrients on Chl-a for in situ measurement data\n\nRegarding the other environmental variables, we used the in situ data only because of the limited data availability. The causal effects of salinity were concentrated in the southern regions and Victoria Harbor (Fig. 3b), consistent with a previous study that showed the establishment of a stable water column by the intrusion of the Pearl River freshwater mass, which could promote algal blooms in these areas (Yin, 2003). The effect of salinity on Chl-a dynamics is stronger than that of other factors (Fig. 3a), suggesting the importance of freshwater discharge and physical processes in algal bloom formation in Hong Kong waters.\n\npH is related to the availability of inorganic carbon (HCO3−), which is necessary for photosynthesis, and thus, changes in pH could affect algal growth (Liu et al., 2016). Causal effects of pH are generally found in sites close to the shorelines (Fig. 3c). This might be because pH levels near shorelines are less stable, owing to more intensive biological activities, than pH levels in the open ocean (Duarte et al., 2013).\n\nPrimary production of phytoplankton is directly proportional to light availability, which is influenced by turbidity (Domingues et al., 2011). 
Our results demonstrate that, compared to other environmental factors, turbidity generally has a weaker and less common causal effect on Chl-a dynamics (Fig. 3a and d). This might suggest that light availability is a less significant bottom-up factor for phytoplankton dynamics than nutrients, at least in Hong Kong waters. However, in Deep Bay, an estuarine oyster farming zone, turbidity exhibits a notably stronger causal effect. As filter feeders, oysters consume phytoplankton and are more efficient at filtering in clear water (Meeuwig et al., 1998). In oyster farming areas, the impact of turbidity on Chl-a dynamics could be influenced not only by light penetration but also by oyster herbivory, which might make this effect more pronounced in Deep Bay than in other regions.\n\nThe causal effect of the N/P ratio was evident not only in eutrophic areas such as Tolo Harbor but also in oligotrophic regions such as Mirs Bay and Port Shelter (Fig. 3e). This indicates that nutrient limitation, which is indicated by the N/P ratio, may be a factor controlling the Chl-a dynamics in these regions. The similarity between the spatial patterns of the causal effects of silica and salinity (Fig. 3b and f) could suggest that the availability of silica is affected by freshwater river discharge. A previous experimental study demonstrated that available silicon in benthic sediments can be released into the overlying water column for plankton uptake in estuarine and continental shelf environments with lower salinity (Qin & Weng, 2006).\n\nOverall, the causal effects of environmental factors on Chl-a are site-dependent. The hydrodynamic conditions in each area should be taken into consideration when trying to explain the Chl-a dynamics. 
As the climate in Hong Kong exhibits distinct dry and wet seasons, the effects of rainfall and freshwater discharge (with salinity serving as an indicator) on Chl-a may be temporally dynamic (Lee et al., 2006). Therefore, future studies could focus on how changes in rainfall and freshwater discharge in different seasons control algal blooms. Additionally, water column stability is related to tidal flushing and to upwelling and downwelling induced by strong wind (Yin, 2003). However, tidal current speed, wind speed, and wind direction were not explicitly included in our analysis because of limited access to such data. Thus, the possible causality of horizontal or vertical movement of water induced by tides or monsoons could not be directly evaluated here. Further studies could use datasets such as those of the Hong Kong Observatory (https://www.hko.gov.hk/en/index.html) for meteorological parameters or the Hong Kong Tidal Stream Prediction System (https://current.hydro.gov.hk/main/download.php?lang=en) for oceanographic parameters to draw a fuller picture of algal bloom mechanisms in different water districts.\n\n4.3 Forecasting Chl-a dynamics with univariate and multivariate models\n\nWe combined the UIC-based causality detection and MDR S-map to conduct near-future forecasting of Chl-a dynamics. 
For the in situ measurement data, multivariate models (i.e., models with Chl-a and other environmental variables) showed similar forecast performance to the univariate model (i.e., the Chl-a only model) (Fig. 4). Possible explanations for this result include: 1) The monthly time series data could not effectively capture the fluctuations in nutrient levels and the corresponding responses of Chl-a (or phytoplankton). It has been shown that nutrients delivered by tidal currents and atmospheric inorganic nitrogen can be rapidly consumed by phytoplankton within a timescale of just a few hours (Lo et al., 2025), suggesting that data with finer temporal resolution are necessary for more accurate near-future forecasting. 2) Although physical processes (e.g., stratification induced by river discharge and downwelling caused by monsoons) usually last for a few months, neither salinity nor temperature is a direct indicator of these processes. Incorporating “indirect” indicators of these physical processes might not be sufficient to improve on the forecasting accuracy of the univariate model. Future efforts could explore various combinations of factors across different sites at different time scales, considering that the mechanisms behind algal bloom formation may differ among water bodies.\n\nFor the remote sensing data, we found a significant improvement in forecast performance of the multivariate MDR S-map compared to the univariate MDR S-map in most of the sites (Fig. 5), which highlights the potential of EDM that utilizes multiple variables and remote sensing data for monitoring and forecasting Chl-a dynamics. The multivariate model with SST input was also better at capturing the high peaks and low peaks than the univariate model (Fig. 6), suggesting that algal bloom outbreaks are affected by temperature dynamics. 
The clear improvement in the multivariate model performance for remote sensing data compared to in situ measurement data may be attributed to the robustness of monthly average data derived from daily observations, even in the presence of missing values. In contrast, the irregular monitoring interval of the in situ time series (EPD conducted sampling on an irregular date of each month) may hinder the performance of EDM.\n\n4.4 Comparison of the in situ measurement data and remote sensing data, and perspectives for future Chl-a monitoring and forecasting\n\nAlthough Chl-a dynamics of in situ measurement data and remote sensing data at some neighboring sites showed similar trends (Fig. 1c), the causal effect of temperature on Chl-a showed different spatial patterns using these two datasets (Fig. 2). The significant causal effect of temperature was clearly stronger and more common for remote sensing data. Also, the MDR S-map showed improved forecast performance only for remote sensing data. This was expected, since EDM was developed to recover the trajectories of variables coupled with each other and is expected to show higher forecast performance when these variables have stronger causal links. Our results reveal that different methods of collecting monthly data can lead to different patterns. Therefore, a consistent sampling interval is recommended to improve forecast performance when applying EDM. 
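The forecast comparisons above are reported as NMSE. As a reading aid, the following minimal Python sketch shows one common way to compute such a score and to express the difference between two mean NMSE values as a relative change. The definition used here (mean squared error divided by the variance of the observations) is an assumption, since this section does not restate the formula, and the function names are hypothetical; only the two mean NMSE values, 0.632 and 0.574, are taken from Section 3.4.

```python
def nmse(observed, predicted):
    # Assumed definition: mean squared error divided by the variance of the observations.
    # A value of 0 means perfect predictions; 1 means no better than predicting the mean.
    n = len(observed)
    mean_obs = sum(observed) / n
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n
    var_obs = sum((o - mean_obs) ** 2 for o in observed) / n
    return mse / var_obs


def relative_improvement(nmse_univariate, nmse_multivariate):
    # Fractional reduction in NMSE when moving from the univariate to the multivariate model.
    return (nmse_univariate - nmse_multivariate) / nmse_univariate


# Mean NMSE values reported for the remote sensing data (Section 3.4).
gain = relative_improvement(0.632, 0.574)
print('relative reduction in mean NMSE:', round(gain, 3))
```

Under this assumed definition, the reported means correspond to a reduction in mean NMSE of roughly 9% for the multivariate model on the remote sensing data.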
\n\nFuture efforts should focus on integrating other data resources and analysis methods, including chlorophyll types and/or algae species information and neural network-based algorithms. First, our analysis did not include any functional and/or species information of algae. Different HAB species may have different physiological and population-level characteristics (Chen et al., 2023), and including them in the model could provide better forecasting performance and more detailed information about the algal dynamics (Xi et al., 2021). In addition, an advanced deep-learning model that utilized Chl-a data across a broad spatiotemporal scale in a coastal ocean (Zhang et al., 2025) has provided a potential solution for resolving the missing observation issue of remote sensing data. Further, we could integrate time series data of Chl-a concentrations with historical algal bloom incidents. By employing classifiers such as support vector machines, a machine learning method (Keerthi et al., 2001), we can identify patterns of Chl-a and environmental factors prior to algal bloom outbreaks. This approach could ultimately enhance our ability to predict algal bloom events.\n\n5. Conclusions\n\nIn the present study, we revealed causal effects of environmental factors on Chl-a and their potential for improving the performance of forecasting Chl-a in Hong Kong waters, utilizing a nonlinear time series analysis called Empirical Dynamic Modeling (EDM) and two parallel sets of twenty-year time series data from in situ measurements (provided by the Environmental Protection Department; EPD) and remote sensing data (from MODIS). For in situ measurement data, salinity exhibited the strongest causal effect on Chl-a compared to other environmental factors, suggesting the importance of oceanographic processes, such as stratification induced by freshwater discharge, in algal bloom formation. As for remote sensing data, SST showed a significant causal effect on Chl-a at most sites, and a multivariate model including Chl-a and SST outperformed the univariate model at most sites, highlighting the potential of the multivariate models of EDM. Although the in situ measurement data and remote sensing data showed similar Chl-a dynamics in Hong Kong waters, our causal analysis and forecasting model revealed several differences between the two datasets. These findings suggest that accounting for data characteristics (e.g., monitoring intervals) is essential for achieving more efficient and effective monitoring. Overall, this study demonstrates the application of nonlinear time series analysis, EDM, to monthly Chl-a dynamics derived from in situ measurements and remote sensing and shows how such approaches can provide insights into Chl-a dynamics in Hong Kong waters. To enhance water quality monitoring and improve the forecasting of HAB occurrence in Hong Kong waters, future studies may consider incorporating physical dynamics, developing methods to mitigate the effects of irregular sampling intervals, including species and/or population characteristics, and exploring the potential of other data analysis approaches such as deep learning. 
\n\nReferences\n\nAnderson, D. M., Fensin, E., Gobler, C. J., Hoeglund, A. E., Hubbard, K. A., Kulis, D. M., Landsberg, J. H., Lefebvre, K. A., Provoost, P., Richlen, M. L., Smith, J. L., Solow, A. R., & Trainer, V. L. (2021). Marine harmful algal blooms (HABs) in the United States: History, current status and future trends. Harmful Algae, 102, 101975. https://doi.org/10.1016/j.hal.2021.101975\nBerkeley, G. (1988). Principles of human knowledge (1710). Penguin Classics.\nCenci, S., Sugihara, G., & Saavedra, S. (2019). Regularized S-map for inference and forecasting with noisy ecological time series. Methods in Ecology and Evolution, 10(5), 650–660. https://doi.org/10.1111/2041-210X.13150\nChang, C.-W., Miki, T., Ushio, M., Ke, P.-J., Lu, H.-P., Shiah, F.-K., & Hsieh, C. (2021). Reconstructing large interaction networks from empirical time series data. Ecology Letters, 24(12), 2763–2774. https://doi.org/10.1111/ele.13897\nChen, D., Shi, Z., Li, R., Li, X., Cheng, Y., & Xu, J. (2023). Hydrodynamics drives shifts in phytoplankton community composition and carbon-to-chlorophyll a ratio in the northern South China Sea. Frontiers in Marine Science, 10. https://doi.org/10.3389/fmars.2023.1293354\nChen, J., Zhang, M., Cui, T., & Wen, Z. (2013). A Review of Some Important Technical Problems in Respect of Satellite Remote Sensing of Chlorophyll-a Concentration in Coastal Waters. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 6(5), 2275–2289. https://doi.org/10.1109/JSTARS.2013.2242845\nDeconinck, D., Chan, L. L., Wang, P., & Qiu, J.-W. (2025). Long-term spatio-temporal analysis of red tides in Hong Kong and their environmental drivers and ecological implications. Marine Pollution Bulletin, 214, 117785. https://doi.org/10.1016/j.marpolbul.2025.117785\nDeyle, E. R., Bouffard, D., Frossard, V., Schwefel, R., Melack, J., & Sugihara, G. (2022). A hybrid empirical and parametric approach for managing ecosystem complexity: Water quality in Lake Geneva under nonstationary futures. Proceedings of the National Academy of Sciences, 119(26), e2102466119. https://doi.org/10.1073/pnas.2102466119\nDeyle, E. R., May, R. M., Munch, S. B., & Sugihara, G. (2016). Tracking and forecasting ecosystem interactions in real time. Proceedings of the Royal Society B: Biological Sciences, 283(1822), 20152258. https://doi.org/10.1098/rspb.2015.2258\nDomingues, R. B., Anselmo, T. P., Barbosa, A. B., Sommer, U., & Galvão, H. M. (2011). Light as a driver of phytoplankton growth and production in the freshwater tidal zone of a turbid estuary. Estuarine, Coastal and Shelf Science, 91(4), 526–535. https://doi.org/10.1016/j.ecss.2010.12.008\nDuarte, C. M., Hendriks, I. E., Moore, T. S., Olsen, Y. S., Steckbauer, A., Ramajo, L., Carstensen, J., Trotter, J. A., & McCulloch, M. (2013). Is Ocean Acidification an Open-Ocean Syndrome? Understanding Anthropogenic Impacts on Seawater pH. Estuaries and Coasts, 36(2), 221–236. https://doi.org/10.1007/s12237-013-9594-3\nFlewelling, L. J., Naar, J. P., Abbott, J. P., Baden, D. G., Barros, N. B., Bossart, G. D., Bottein, M.-Y. D., Hammond, D. G., Haubold, E. M., Heil, C. A., Henry, M. S., Jacocks, H. M., Leighfield, T. A., Pierce, R. H., Pitchford, T. D., Rommel, S. A., Scott, P. S., Steidinger, K. A., Truby, E. W., … Landsberg, J. H. (2005). Red tides and marine mammal mortalities. Nature, 435(7043), 755–756. https://doi.org/10.1038/nature435755a\nGlaser, S. M., Fogarty, M. J., Liu, H., Altman, I., Hsieh, C.-H., Kaufman, L., MacCall, A. D., Rosenberg, A. A., Ye, H., & Sugihara, G. (2014). Complex dynamics may limit prediction in marine fisheries. Fish and Fisheries, 15(4), 616–633. https://doi.org/10.1111/faf.12037\nHarrison, P. J., Yin, K., Lee, J. H. W., Gan, J., & Liu, H. (2008). Physical–biological coupling in the Pearl River Estuary. Continental Shelf Research, 28(12), 1405–1415. https://doi.org/10.1016/j.csr.2007.02.011\nKeerthi, S. S., Shevade, S. K., Bhattacharyya, C., & Murthy, K. R. K. (2001). Improvements to Platt’s SMO Algorithm for SVM Classifier Design. Neural Computation, 13(3), 637–649. https://doi.org/10.1162/089976601300014493\nKirkpatrick, B., Fleming, L. E., Squicciarini, D., Backer, L. C., Clark, R., Abraham, W., Benson, J., Cheng, Y. S., Johnson, D., Pierce, R., Zaias, J., Bossart, G. D., & Baden, D. G. (2004). Literature review of Florida red tide: Implications for human health effects. Harmful Algae, 3(2), 99–115. https://doi.org/10.1016/j.hal.2003.08.005\nLee, J. H. W., Harrison, P. J., Kuang, C., & Yin, K. (2006). Eutrophication Dynamics in Hong Kong Coastal Waters: Physical and Biological Interactions. https://repository.hkust.edu.hk/ir/Record/1783.1-11432\nLi, Y., Li, Y., Yang, H., Hong, Q., Huang, G., & Kattel, G. (2023). Century-Scale Environmental Evolution of a Typical Subtropical Reservoir in the Guangdong–Hong Kong–Macao Greater Bay Area. Water, 15(20), Article 20. https://doi.org/10.3390/w15203639\nLiu, N., Yang, Y., Li, F., Ge, F., & Kuang, Y. (2016). Importance of controlling pH-depended dissolved inorganic carbon to prevent algal bloom outbreaks. Bioresource Technology, 220, 246–252. https://doi.org/10.1016/j.biortech.2016.08.059\nLo, H. W., Yu, X., Chen, H., Chu, W. C., Chung, N. M., Lau, S. W., Li, J., Liang, S., Liao, K., Thomas, H. C. J., Wang, Z., Zhang, Z., Yu, J. Z., & Thibodeau, B. (2025). Tidal currents and atmospheric inorganic nitrogen contribute to diurnal variation of dissolved nutrients and chlorophyll a concentrations in Mirs Bay, Hong Kong. Regional Studies in Marine Science, 81, 103941. https://doi.org/10.1016/j.rsma.2024.103941\nLopes, V. M., Costa, P. R., & Rosa, R. (2019). Effects of Harmful Algal Bloom Toxins on Marine Organisms. In Ecotoxicology of Marine Organisms. CRC Press.\nMeeuwig, J. J., Rasmussen, J. B., & Peters, R. H. (1998). Turbid waters and clarifying mussels: Their moderation of empirical chl:nutrient relations in estuaries in Prince Edward Island, Canada. Marine Ecology Progress Series, 171, 139–150. https://doi.org/10.3354/meps171139\nOsada, Y., & Ushio, M. (2021). rUIC (Version v0.1.5) [Computer software]. Zenodo. https://doi.org/10.5281/zenodo.5163234\nOsada, Y., Ushio, M., & Kondoh, M. (2023). Unified understanding of nonparametric causality detection in time series (p. 2023.04.20.537743). bioRxiv. https://doi.org/10.1101/2023.04.20.537743\nPradhan, B., Kim, H., Abassi, S., & Ki, J.-S. (2022). Toxic Effects and Tumor Promotion Activity of Marine Phytoplankton Toxins: A Review. Toxins, 14(6), Article 6. https://doi.org/10.3390/toxins14060397\nQin, Y.-C., & Weng, H.-X. (2006). Silicon release and its speciation distribution in the surficial sediments of the Pearl River Estuary, China. Estuarine, Coastal and Shelf Science, 67(3), 433–440. https://doi.org/10.1016/j.ecss.2005.11.015\nSchreiber, T. (2000). Measuring Information Transfer. Physical Review Letters, 85(2), 461–464. https://doi.org/10.1103/PhysRevLett.85.461\nSugihara, G. (1994). Nonlinear forecasting for the classification of natural time series. Philosophical Transactions of the Royal Society of London. Series A: Physical and Engineering Sciences, 348(1688), 477–495. https://doi.org/10.1098/rsta.1994.0106\nSugihara, G., & May, R. M. (1990). Nonlinear forecasting as a way of distinguishing chaos from measurement error in time series. Nature, 344(6268), 734–741. https://doi.org/10.1038/344734a0\nSugihara, G., May, R., Ye, H., Hsieh, C., Deyle, E., Fogarty, M., & Munch, S. (2012). Detecting Causality in Complex Ecosystems. Science, 338(6106), 496–500. https://doi.org/10.1126/science.1227079\nTsai, C.-H., Munch, S. B., Masi, M. D., & Stevens, M. H. (2024). Empirical dynamic modeling for sustainable benchmarks of short-lived species. ICES Journal of Marine Science, 81(7), 1209–1220. https://doi.org/10.1093/icesjms/fsae080\nUshio, M. (2022). Interaction capacity as a potential driver of community diversity. Proceedings of the Royal Society B: Biological Sciences, 289(1969), 20212690. https://doi.org/10.1098/rspb.2021.2690\nUshio, M. (2025). ong8181/macam: V0.1.10 (Version v0.1.10) [Computer software]. Zenodo. https://doi.org/10.5281/zenodo.15622451\nUshio, M., Hsieh, C., Masuda, R., Deyle, E. R., Ye, H., Chang, C.-W., Sugihara, G., & Kondoh, M. (2018). Fluctuating interaction network and time-varying stability of a natural fish community. Nature, 554(7692), 360–363. https://doi.org/10.1038/nature25504\nWang, D.-Z. (2008). Neurotoxins from Marine Dinoflagellates: A Brief Review. Marine Drugs, 6(2), Article 2. https://doi.org/10.3390/md6020349\nWong, K. T. M., Lee, J. H. W., & Hodgkiss, I. J. (2007). A simple model for forecast of coastal algal blooms. Estuarine, Coastal and Shelf Science, 74(1), 175–196. https://doi.org/10.1016/j.ecss.2007.04.012\nXi, H., Losa, S. N., Mangin, A., Garnesson, P., Bretagnon, M., Demaria, J., Soppa, M. A., Hembise Fanton d’Andon, O., & Bracher, A. (2021). Global Chlorophyll a Concentrations of Phytoplankton Functional Types With Detailed Uncertainty Assessment Using Multisensor Ocean Color and Sea Surface Temperature Satellite Products. Journal of Geophysical Research: Oceans, 126(5), e2020JC017127. https://doi.org/10.1029/2020JC017127\nYe, H., Beamish, R. J., Glaser, S. M., Grant, S. C. H., Hsieh, C., Richards, L. J., Schnute, J. T., & Sugihara, G. (2015). Equation-free mechanistic ecosystem forecasting using empirical dynamic modeling. Proceedings of the National Academy of Sciences, 112(13), E1569–E1576. https://doi.org/10.1073/pnas.1417063112\nYe, H., & Sugihara, G. (2016). Information leverage in interconnected ecosystems: Overcoming the curse of dimensionality. Science, 353(6302), 922–925. https://doi.org/10.1126/science.aag0863\nYin, K. (2003). Influence of monsoons and oceanographic processes on red tides in Hong Kong waters. Marine Ecology Progress Series, 262, 27–41. https://doi.org/10.3354/meps262027\nZhang, F., Kung, H., Zhang, F., Yang, C., & Gan, J. (2025). AI-powered spatiotemporal imputation and prediction of chlorophyll-a concentration in coastal ecosystems. Nature Communications, 16(1), 7656. https://doi.org/10.1038/s41467-025-62901-9\nZohdi, E., & Abbaspour, M. (2019). Harmful algal blooms (red tide): A review of causes, impacts and approaches to monitoring and prediction. International Journal of Environmental Science and Technology, 16(3), 1789–1806. https://doi.org/10.1007/s13762-018-2108-x\n\nData and Code Accessibility: All data used in this study were downloaded from public databases. All scripts and formatted data used in this study are available on GitHub (https://github.com/sxhuang00/causality_forecast_chl). 
\n\nDeclaration of generative AI use: We used generative AI tools to polish the English language and improve the clarity of the text. All AI-generated suggestions were manually reviewed and verified for accuracy and clarity.\n\nAcknowledgments: We thank Takamitsu Ohigashi and Yining Xu for their assistance in data analysis. We thank Mengqiu Wang for her valuable advice on the use of remote sensing data. This research was supported by The Hong Kong University of Science and Technology Startup Fund to MU.\n\nAuthor contributions: SH and MU conceived research; SH and MU designed research; SH analyzed the data with help from MU; MU wrote a custom function to perform the MDR S-map; SH and MU wrote the first draft, discussed the results, and completed the manuscript.\n\nConflicts of Interest declaration: The authors declare no conflict of interest.","source_license":"CC-BY-4.0","license_restricted":false}