19 December 2022 Mapping the probability of freshwater algal blooms with various spectral indices and sources of training data
Tyler King, Stephen Hundt, Konrad Hafen, Victoria Stengel, Scott Ducar

Algal blooms are pervasive in many freshwater environments and can pose risks to the health and safety of humans and other organisms. However, monitoring and tracking of potentially harmful blooms often relies on in-person observations by the public. Remote sensing has proven useful in augmenting in situ observations of algal concentration, but many hurdles hinder efficient application by end users. First, numerous approaches to estimate aquatic chlorophyll-a are available and can produce inconsistent results. Second, lack of quantitative in situ observations limits opportunities to train models for specific waterbodies, such that models developed for other systems must be used instead. We (1) implement univariate and multivariate logistic regression models to estimate the probability that aquatic chlorophyll-a concentrations exceed an accepted threshold beyond which harmful effects become likely and (2) evaluate the use of visually classified bloom/no-bloom satellite imagery to augment in situ training data. Using a binary classification of aquatic chlorophyll-a exceeding 10 μg / L, we found that (1) logistic regression models were ∼80 % accurate, (2) univariate models trained with visually classified data produce nearly the same accuracy (79%) as models trained with in situ observations (80%), and (3) augmenting in situ chlorophyll-a observations with visual classifications outperformed (82% accuracy) models trained on in situ observations alone (80% accuracy). These results provide a framework for evaluating multiple spectral indices in retrieving algal bloom presence or absence and illustrate that training data derived directly from satellite imagery can be useful in augmenting in situ observations.



Freshwater algal blooms are a global concern,1–4 and there is evidence that they are becoming more common in response to climate change.1,5–7 Because algal blooms can adversely affect public health, economies, and ecosystem services by degrading water quality,8,9 early identification of algal blooms can improve public safety and mitigate economic concerns.

Algal blooms are often identified via visual inspection of a waterbody,10 with reports from both waterbody managers and the public playing a fundamental role in algal bloom monitoring for state environmental monitoring agencies.11–16 Although visual inspection by water quality agencies and public health departments is a relatively accurate way to identify the presence of algal blooms,10 the number of waterbodies that can be monitored this way is limited. Further, visual inspection results can be subjective and conclusions might differ between individuals, even when optical recording devices are used.17 As a result, some public health agencies maintain reactive stances toward algal bloom monitoring, waiting for a bloom to be reported before investigating, analyzing, and providing public health guidance.18,19 This stance can result in incomplete monitoring coverage (e.g., omission of algal bloom events unless reported) and delays in public health notices that can have real-world implications for human health and socioeconomics.19

Remote sensing has the potential to augment in situ visual inspection while increasing the spatial scale of coverage. In the past 50 years, considerable research attention has been devoted to developing remote sensing techniques for identifying and tracking algal blooms.20,21 Remote sensing of water quality for inland, freshwater systems has lagged marine applications partially due to the optical complexity of inland waters.22 Despite this lag, nearly 30 years of studies have focused on the development of methods to derive water quality metrics from spectral signatures.22 In the past 20 years, a shift toward operationalizing freshwater water quality remote sensing has occurred.22,23

Identifying cyanobacterial blooms has been the focus of significant investment in remote sensing, with particular focus on the ocean and land color instrument (OLCI) on board the Sentinel-3A and Sentinel-3B satellites.24–27 By focusing on spectral features at 665 and 681 nm, this body of work relies on a well-characterized two-step approach to identify the presence of phycocyanin and to then quantify the strength of the signal.24,28,29 OLCI collects imagery with a nominal 300-m ground sampling distance, allowing for the monitoring of larger waterbodies and the production of operational cyanobacterial index products at a large spatial scale. However, these products do not have sufficient spatial resolution to monitor the near-shore environment or the narrow waterbodies that are common in the intermountain west, where deep river valleys have been dammed to create reservoirs that produce hydropower and supply irrigation and drinking water.

Satellite-based sensors with spatial resolution sufficient to resolve narrow waterbodies [e.g., the operational land imager (OLI) on Landsat-8 and Landsat-9, and the multispectral instrument (MSI) on Sentinel-2A and Sentinel-2B] do not have the spectral resolution required to implement the cyanobacteria index approach listed above.29,30 Instead, work with these images to identify algal conditions has focused on retrieving chlorophyll-a,31,32 which has been demonstrated to serve as a robust surrogate for cyanobacterial concentrations in conditions dominated by cyanobacteria.33 Focusing on chlorophyll-a precludes differentiation between harmful algal blooms dominated by cyanobacteria and other aquatic photosynthetic growth.28,34,35 This lack of specificity leads to a bias toward public health protection when noncyanobacterial blooms are identified. Further, the 10-m spatial resolution delivered by Sentinel-2 imagery used in this study allows waterbody managers and public health officials to monitor relatively small waterbodies, narrow portions of larger waterbodies (e.g., bays), and near-shore environments where blooms can accumulate due to wind driven transport.36,37 In this work, we evaluate the ability to classify chlorophyll concentrations using higher spatial- but lower spectral- and temporal-resolution imagery from the MSI on board the Sentinel-2A and Sentinel-2B satellites.

Multiple spectral indices have been developed to retrieve chlorophyll-a conditions from a range of passive optical sensors and are presented in the literature.30,38–46 However, none of these approaches have been shown to consistently outperform the others in retrieving chlorophyll-a concentrations. Additionally, we typically lack water quality observations for any given waterbody that are coincident with satellite imagery despite large-scale projects to compile such matchups.47 As such, two distinct challenges must be addressed when using satellite imagery to estimate water quality: (1) identifying spectral indices that describe water quality metrics of interest and (2) relating these spectral indices to water quality metrics in the absence of in situ observations. First, we hypothesize that incorporating multiple spectral indices will describe water quality more robustly than selecting a single spectral index. We test this hypothesis by evaluating the accuracy of univariate logistic regression models for each spectral index against multivariate logistic regression models that incorporate multiple spectral indices. Second, we hypothesize that algal blooms can be identified directly from true color composite satellite imagery, obviating the need for in situ observations. We test this hypothesis by training univariate and multivariate logistic regression models of algal bloom presence with bloom observations identified via visual interpretation of satellite imagery. We evaluate the performance of the logistic regression models calibrated with the visual interpretation dataset relative to those calibrated with in situ samples to determine the efficacy of generating training data from satellite imagery. The work presented here differs from previous efforts by combining bloom presence and absence data with logistic regression models to produce bloom presence probabilities from multivariate models.




Study Site

This work was conducted in Brownlee Reservoir, located on the Idaho-Oregon border (Fig. 1). It is the largest reservoir in the Hells Canyon Complex of hydroelectric reservoirs at 61 km² in surface area, 93 km in length, and 1.8 km³ in volume, with a maximum depth of nearly 100 m near the dam.50 The reservoir is 650 m wide, on average, and is surrounded by hills with 20% to 30% slopes. Brownlee Reservoir has designated beneficial uses of cold-water aquatic life, primary contact recreation, domestic water supply, industrial water supply, irrigation water, livestock watering, salmonid rearing and spawning, resident fish and aquatic life, wildlife and hunting, fishing, boating, aesthetics, and hydropower.51 Brownlee Reservoir is listed as impaired for excess nutrients associated with nuisance algae growth and has a history of cyanobacteria blooms.51 The reservoir is an active recreation destination with 20,000 nights of camping along the shore of the reservoir in 2013.11 Additionally, discharge from Brownlee Reservoir flows into the Hells Canyon National Recreational Area, which has been estimated to have more than 50,000 boaters visit per year, making it a significant economic resource where the populace can be impacted by water quality.50

Fig. 1

Study area map: (a) overview of Brownlee Reservoir (blue polygon) with a Sentinel 2 true color background (red, band 4; green, band 3; and blue, band 2) from July 5, 2019.48 (b) Locations where in situ samples were collected. Red “+” symbols and blue “x” symbols indicate sites where chlorophyll-a concentrations were above or below 10  μg/L, respectively. (c) Locations of manual digitization of bloom presence and absence. Red “+” symbols and blue “x” symbols represent “bloom” and “no-bloom” classifications, respectively.49



Field Data Collection

Water samples were collected by Idaho Power Company personnel from Brownlee Reservoir. Samples were collected from predetermined locations within the reservoir with known coordinates to match sample collection locations with pixels in the associated satellite imagery. Samples were collected within 2 m of the surface, immediately placed on ice, and delivered to the analysis laboratory within 24 h. Samples were spectrophotometrically analyzed for total chlorophyll-a, corrected for pheophytin following standard method 10200H.2.52 Only results from samples collected on the same date as Sentinel-2 satellite imagery were included in this analysis.

The World Health Organization has identified chlorophyll-a concentrations exceeding 10  μg/L to be associated with a transition from slight to moderate risk of adverse health effects from primary contact in cases where Microcystis dominates the chlorophyll-a concentration.53 Although the dominant taxa are not identified in this work, “bloom” and “bloom conditions” are defined in this work to represent chlorophyll-a concentrations greater than or equal to 10  μg/L.


Visual Bloom Identification

To evaluate the efficacy of developing training datasets directly from satellite imagery, points representing the distinct presence or absence of an algal bloom were visually interpreted and digitized (Fig. 2) from a series of 26 Sentinel-2 satellite images obtained from the Copernicus Application Programming Interface.48,49 Digitization was conducted in the Geographic Information System ArcMAP 10.8.1 from Environmental Systems Research Institute, Inc. (Redlands, California) where true-color (red, band 4; green, band 3; and blue, band 2) Sentinel-2 images were displayed. Minimum and maximum values in the visualization were set to the equivalent of 0% and 100% reflectance, respectively. Locations associated with algal blooms were visually identified as pixels with elevated reflectance in the green band arranged in continuous shapes associated with algal blooms [Figs. 2(b) and 2(d)]. Points representing no-bloom conditions were identified based on low reflectance in the red, green, and blue bands to provide class balance in training data [Figs. 2(c) and 2(e)]. Bloom and no-bloom conditions were assigned without knowledge of in situ observations to reduce identification bias. Incorporating these data in the evaluation of spectral indices leverages the information that is readily available within historic satellite imagery via conventional image interpretation and is similar to approaches used to develop training data for pixel-based supervised image classification of land cover.54,55

Fig. 2

Example Sentinel-2 imagery48 of (a) Brownlee Reservoir visualized in true color (red, band 4; green, band 3; and blue, band 2) with manually digitized points representing (b), (d) bloom and (c), (e) no-bloom conditions. The northern red box in (a) corresponds with the extent of (b) and (c). The southern red box in (a) corresponds with the extent of (d) and (e).



Satellite Imagery

Level 1C top of atmosphere imagery collected with the multispectral instrument (MSI) sensors on the Sentinel-2A and Sentinel-2B satellites for tile 11TMK was obtained from the European Space Agency through the Copernicus Application Programming Interface.48 Top of atmosphere imagery was atmospherically corrected using the dark spectrum fitting algorithm approach implemented in the Atmospheric Correction for OLI ‘lite’ generic processor version (v20190326.0) to produce aquatic reflectance products.56 See Ref. 57 for a full description of the dark spectrum fitting approach. Default settings were used in the atmospheric correction with the exception that waterbody elevation was set to 610 m above sea level to account for atmospheric path length.

At each location where an in situ or visually identified observation was made, aquatic reflectance values for each band were extracted from all pixels with centers within 50 m of the observation’s location.49 A 50-m buffer was used to spatially smooth reflectance values and to account for potential positional error in sample collection location. The median reflectance values within the 50-m buffer were used to represent each band’s value at the specified location. The median statistic was used rather than the mean to reduce the impact of outliers on the resulting aquatic reflectance values.
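The buffer-and-median extraction described above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation (the analysis was conducted in R); the function name, the input layout, and the assumption that coordinates are in a projected system with units of meters are all hypothetical.

```python
import math
from statistics import median

def median_reflectance(pixels, obs_xy, radius_m=50.0):
    """Per-band median of pixels whose centers fall within radius_m of obs_xy.

    pixels: list of (x, y, [band values]) tuples in projected meters
            (hypothetical input layout, assumed for illustration).
    Returns None when no pixel center lies inside the buffer.
    """
    inside = [
        bands for x, y, bands in pixels
        if math.hypot(x - obs_xy[0], y - obs_xy[1]) <= radius_m
    ]
    if not inside:
        return None
    # zip(*inside) regroups the extracted values band by band
    return [median(band) for band in zip(*inside)]

# Toy example with one bright outlier pixel inside the buffer
pixels = [
    (0.0, 0.0, [0.010, 0.020]),
    (10.0, 0.0, [0.012, 0.022]),
    (40.0, 0.0, [0.900, 0.950]),   # outlier, still within 50 m
    (100.0, 0.0, [0.500, 0.500]),  # outside the buffer, ignored
]
result = median_reflectance(pixels, (0.0, 0.0))
```

In this toy case the outlier pixel would pull a per-band mean far above the two typical values, whereas the median stays at a typical value, which is the behavior the median was chosen for.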


Spectral Index Evaluation

Seventeen spectral indices that were expected to be sensitive to chlorophyll-a concentrations were selected from the literature and evaluated30,38–46 (Table 1). Spectral indices developed for sensors other than the MSI sensor used in this work were selected if the central wavelengths of all bands used in index development [i.e., bands from OLI or the medium resolution imaging spectrometer (MERIS)] fell within unique MSI bands. MSI bands were defined for this work by their central wavelengths and full-width half-maximums.60

Table 1

Sentinel-2 spectral indices evaluated.

Index ID | Index name | Equation using Sentinel-2 bands | Citation | Sensor used in citation
S01 | Be162Bsub | b5 - b4 | Beck et al.39 | MSI
S02 | BR23 | b2 / b3 | O’Reilly et al.44 | OLI and MSI
S03 | BR54 | b5 / b4 | Gons et al.40 | MSI
S04 | BR8a4 | b8a / b4 | Tebb et al.45 | OLI
S05 | Go04MCI | b5 - b6 | Gower et al.58 | MERIS
S06 | KIVU | (b2 - b4) / b3 | Beck et al.30 | OLI
S07 | L83BDA | (1/b2 - 1/b4) * b3 | Beck et al.30 | OLI
S08 | FLHviolet | b3 - (b4 + (b1 - b4)) | Beck et al.30 | OLI
S09 | MCI | b5 - b4 - (b6 - b4) * ((704.1 - 664.6) / (740.5 - 664.6)) | Le et al.41 | MERIS
S10 | Moses3b | (1/b4 - 1/b5) * b6 | Moses et al.43 | MSI
S11 | NDCI54 | (b5 - b4) / (b5 + b4) | Mishra et al.42 | MSI
S12 | NDCI8a4 | (b8a - b4) / (b8a + b4) | Beck et al.30 | OLI
S13 | S23BDA | (1/b4 - 1/b5) * b8a | Beck et al.30 | MSI
S14 | FLHblue | b3 - (b4 + (b2 - b4)) | Beck et al.30 | MSI
S16 | Toming | b5 - (b4 + b6) / 2 | Toming et al.46 | MSI
S17 | ZhFLH | b8a - (b5 + (b4 - b5)) | Zhao et al.59 | MERIS
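As an illustration of how indices of this kind are computed from band reflectances, the sketch below evaluates three of them for a single pixel. The functional forms (a normalized difference, a three-band reciprocal difference, and a peak height above a two-band baseline) are written out here as assumptions based on the commonly published forms of these index families; the function names and the reflectance values are hypothetical.

```python
def ndci(b4, b5):
    """Normalized difference of the red-edge (b5) and red (b4) bands."""
    return (b5 - b4) / (b5 + b4)

def three_band(b4, b5, b_nir):
    """Reciprocal red minus reciprocal red edge, scaled by a NIR band."""
    return (1.0 / b4 - 1.0 / b5) * b_nir

def toming(b4, b5, b6):
    """Height of band 5 above the mean of its two neighboring bands."""
    return b5 - (b4 + b6) / 2.0

# Hypothetical aquatic reflectance values for one bloom-like pixel
refl = {"b4": 0.02, "b5": 0.04, "b6": 0.02}

index_values = {
    "NDCI54": ndci(refl["b4"], refl["b5"]),
    "Moses3b": three_band(refl["b4"], refl["b5"], refl["b6"]),
    "Toming": toming(refl["b4"], refl["b5"], refl["b6"]),
}
```

All three respond to the same feature, elevated band 5 reflectance relative to band 4, which is one reason such indices tend to be strongly correlated with one another.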


Binary Logistic Regression

The probability that an algal bloom was present for each pixel in a Sentinel-2 image was determined by relating the presence (chlorophyll-a concentration >10  μg/L) or absence (chlorophyll-a concentration <10  μg/L) of an algal bloom to the value for one or more spectral indices using a binary logistic regression approach. Binary logistic regression was implemented as follows:

Eq. (1)

$$p = \frac{e^{\beta_0 + \beta_1 X_1 + \cdots + \beta_n X_n}}{1 + e^{\beta_0 + \beta_1 X_1 + \cdots + \beta_n X_n}},$$
where p is the probability that chlorophyll-a concentration exceeded 10  μg/L, β0 is an intercept calibration term, and β1 through βn are the parameter effects for spectral indices X1 through Xn. To address class imbalances in calibration data (i.e., more observations of bloom versus nonbloom conditions), weights were applied to the observations as

Eq. (2)


Eq. (3)

where Wp and #P are the weights for and number of bloom condition observations, respectively, and WN and #N are the weights for and number of nonbloom condition observations, respectively.
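A minimal sketch of the logistic probability in Eq. (1), together with one possible class-weighting scheme, is given below. The weighting shown is a common inverse-frequency scheme chosen purely for illustration and is an assumption; it is not claimed to be the form of Eqs. (2) and (3). Function names and coefficient values are hypothetical.

```python
import math

def bloom_probability(intercept, coefficients, indices):
    """Logistic probability e^z / (1 + e^z) for z = b0 + sum(b_i * x_i)."""
    z = intercept + sum(b * x for b, x in zip(coefficients, indices))
    if z >= 0:
        # Algebraically identical form that avoids overflow for large z
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def balanced_weights(n_bloom, n_nonbloom):
    """Assumed inverse-frequency weights: each class sums to half the total."""
    total = n_bloom + n_nonbloom
    return total / (2.0 * n_bloom), total / (2.0 * n_nonbloom)

# Hypothetical coefficients for a two-index model
p = bloom_probability(-1.0, [4.0, 2.0], [0.3, 0.1])

# Example class counts: 109 bloom and 86 no-bloom points
w_bloom, w_nonbloom = balanced_weights(109, 86)
```

Under this assumed scheme the minority class receives the larger per-observation weight, so both classes contribute equally to the weighted likelihood.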

Univariate and multivariate logistic regression models were developed to assess performance of individual spectral indices and combinations of spectral indices to identify algal blooms. Additionally, logistic regression models were trained and tested with different combinations of in situ and visually identified observations to evaluate the impact of different training data sources on model performance (Table 2).

Table 2

Logistic model regression scenarios for the univariate and multivariate models.

Calibration scenario | Training data | Testing data
Gaged | 80% in situ observations | 20% in situ observations
Ungaged | 100% manually classified points | 20% in situ observations
Augmented | 80% in situ observations + 100% manually classified points | 20% in situ observations

The “gaged” calibration scenario reflects a widely used approach to model calibration using in situ observations.30 The “ungaged” scenario evaluates the efficacy of training a model based on visually identified bloom occurrences when in situ observations are too sparse or not available. The “augmented” scenario evaluates the utility of augmenting in situ observations with visually identified blooms.
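The three calibration scenarios differ only in how the training set is assembled; the testing set is always a held-out 20% of the in situ observations. A minimal sketch, assuming a simple random shuffle and rounding of the 80% share (both assumptions, as are the function and variable names), might look like this:

```python
import random

def split_scenario(scenario, in_situ, visual, rng, train_fraction=0.8):
    """Return (train, test) lists for one random split of a scenario."""
    shuffled = list(in_situ)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    train_in_situ, test = shuffled[:n_train], shuffled[n_train:]
    if scenario == "gaged":
        train = train_in_situ
    elif scenario == "ungaged":
        train = list(visual)            # no in situ data used for training
    elif scenario == "augmented":
        train = train_in_situ + list(visual)
    else:
        raise ValueError(f"unknown scenario: {scenario}")
    return train, test

# Placeholder observation IDs: 24 in situ samples and 195 digitized points
in_situ = [("in_situ", i) for i in range(24)]
visual = [("visual", i) for i in range(195)]

rng = random.Random(0)  # fixed seed so the split is repeatable
splits = {
    name: split_scenario(name, in_situ, visual, rng)
    for name in ("gaged", "ungaged", "augmented")
}
```

Because the test set never contains visually classified points, accuracy in every scenario is judged against measured chlorophyll-a only.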

Each modeling scenario and the associated training and testing data are described in the following sections. Performance between and among univariate and multivariate models was evaluated using the accuracy metrics described at the end of this section. All analyses were conducted in version 3.6.0 of the R statistical programming language 61 using RStudio v1.2.1335.62


Univariate logistic regression models

Logistic regression models were developed for each of the 17 spectral indices listed in Table 1 and each of the calibration scenarios listed in Table 2. Performance of the resulting univariate models was quantified by assessing the accuracy of each individual index in identifying algal blooms; these results provided benchmarks to compare multivariate models.


Multivariate logistic regression models

Multivariate logistic regressions were produced to test the hypothesis that classifications based on multiple spectral indices are more robust than classifications from single spectral indices. Three multivariate logistic regression models were developed, one for each of the gaged, ungaged, and augmented scenarios in Table 2, to assess how in situ and visually identified training data affect accuracy of algal bloom identification from Sentinel-2 imagery.

Multivariate logistic regressions were produced from the spectral indices listed in Table 1 using a three-step approach. First, highly correlated spectral indices were identified based on their variance inflation factor (VIF) values and removed one at a time to achieve a subset of spectral indices where the VIF for each index was <10.63,64 This was done by removing the index with the highest VIF, recomputing VIF for all remaining indices and removing the subsequent index with the highest VIF. This process was repeated until no indices had VIF values above ten. Second, the scenario-specific training dataset identified in Table 2 was selected. Third, multivariate logistic regressions were calibrated using all spectral indices identified through the VIF-based variable selection process. During the calibration procedure, parsimonious multivariate models were identified using stepwise variate selection with the objective of minimizing the Akaike information criterion (AIC).65 This procedure was repeated for all three calibration scenarios in Table 2.
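The iterative VIF screening in the first step can be sketched in plain Python. This sketch relies on the identity that the VIF of each variable equals the corresponding diagonal element of the inverse of the predictors' correlation matrix. It is an illustrative reimplementation under that identity, not the authors' R code; all names and the toy data are hypothetical, and exactly collinear columns (a singular matrix) are not handled.

```python
import math

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def vifs(columns):
    """VIF per column: diagonal of the inverse correlation matrix."""
    corr = [[pearson(a, b) for b in columns] for a in columns]
    inverse = invert(corr)
    return [inverse[i][i] for i in range(len(columns))]

def drop_high_vif(data, threshold=10.0):
    """Remove the highest-VIF variable until every VIF is below threshold."""
    kept = dict(data)
    while len(kept) > 1:
        names = list(kept)
        values = vifs([kept[name] for name in names])
        worst = max(range(len(names)), key=lambda i: values[i])
        if values[worst] < threshold:
            break
        del kept[names[worst]]  # VIFs are recomputed on the next pass
    return list(kept)

# Toy data: "b" nearly duplicates "a"; "c" is only weakly related to either
data = {
    "a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "b": [1.1, 2.1, 2.9, 3.9, 4.9, 5.9, 7.1, 8.1],
    "c": [2.0, 7.0, 1.0, 8.0, 2.0, 8.0, 1.0, 8.0],
}
kept = drop_high_vif(data)
```

Removing one variable at a time matters: once one member of a near-duplicate pair is dropped, the VIF of the other typically falls sharply, so it can be retained.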


Accuracy Assessment

The accuracy of the logistic regression models was evaluated using a 10-fold cross validation approach with an 80% calibration, 20% validation split (Table 2). For each iteration, 80% of the in situ data were randomly selected as the training dataset, and the remaining 20% were used to test model accuracy. Performance was evaluated using four metrics: precision, recall, F1 score, and overall accuracy. Precision is a measure of how many of a model’s positive predictions (e.g., above threshold) were correct [Eq. (4)], whereas recall measures how many of the positive observations were identified as such in the model [Eq. (5)]. These are given as

Eq. (4)

$$\text{precision} = \frac{\#TP}{\#TP + \#FP},$$

Eq. (5)

$$\text{recall} = \frac{\#TP}{\#TP + \#FN},$$

where #TP is the number of true positives, #FP is the number of false positives, and #FN is the number of false negatives. The F1 statistic was used as a multiple-criterion metric to evaluate the performance of logistic regressions that accounts for the trade-off between precision and recall.66 The F1 statistic was computed as

Eq. (6)

$$F_1 = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}.$$

Accuracy, defined here as the percent of observations that were correctly classified, was calculated to provide a more intuitive and familiar evaluation of model performance. Accuracy was calculated as the number of true positive and true negative results divided by total number of observations in the validation dataset.67

An exceedance probability of 50% (0.5) was used to classify model output as exceeding 10 μg/L. Figure 3 provides a graphical example of the four possible outcomes (true positive, true negative, false positive, and false negative) for each validation data point relative to the 50% and 10 μg/L thresholds.
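The four outcomes and the resulting metrics can be computed as in the sketch below. It assumes that an observation counts as a bloom at 10 μg/L or higher and that a modeled probability of 0.5 or higher is classified as a bloom; the handling of values exactly at either threshold is an assumption, and the function name and toy data are hypothetical.

```python
def classification_metrics(chl, prob, chl_threshold=10.0, prob_threshold=0.5):
    """Confusion counts and precision, recall, F1, and accuracy."""
    tp = fp = tn = fn = 0
    for c, p in zip(chl, prob):
        observed = c >= chl_threshold     # bloom in the observed data
        predicted = p >= prob_threshold   # bloom according to the model
        if observed and predicted:
            tp += 1
        elif predicted:
            fp += 1
        elif observed:
            fn += 1
        else:
            tn += 1
    nan = float("nan")
    precision = tp / (tp + fp) if tp + fp else nan
    recall = tp / (tp + fn) if tp + fn else nan
    f1 = 2 * precision * recall / (precision + recall) if tp else nan
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn, "precision": precision,
            "recall": recall, "f1": f1, "accuracy": accuracy}

# Hypothetical validation points: chlorophyll-a (ug/L) and modeled probability
chl = [2, 8, 12, 30, 9, 15, 11, 3]
prob = [0.1, 0.7, 0.4, 0.9, 0.2, 0.8, 0.3, 0.05]
metrics = classification_metrics(chl, prob)
```

In this toy set the two false negatives are observations only slightly above 10 μg/L, the kind of near-threshold case where misclassification is most likely.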

Fig. 3

Schematic of result quadrant to illustrate the four possible outcomes for each validation data point. Observed data are classified as those that fall above or below 10  μg/L of chlorophyll-a. Model results are divided between those predicting more or less than 50% probability of exceeding 10  μg/L.





Field Observations

Twenty-four in situ observations from 15 sites along Brownlee Reservoir were used in the analysis (Fig. 1). Chlorophyll-a concentrations in these samples ranged from 1.2 to 241  μg/L with a median value of 7  μg/L. There were 10 observations (42%) with concentrations of 10  μg/L or higher, indicating relative parity in observations above and below the 10  μg/L threshold. An additional 195 points were manually digitized from 26 Sentinel-2 images (Fig. 1). Of the manually digitized points, 109 (56%) were classified as blooming conditions. Data are available in Ref. 49.

Reflectance spectra extracted from imagery at bloom locations showed elevated reflectance in bands 3 (560 nm) and 5 (704 nm) for both in situ and visually identified bloom locations (Fig. 4). The reflectance values were similar between visually identified and in situ observations for the nonbloom conditions, whereas under bloom conditions reflectance values were higher for bands 3 (560 nm), 5 (704 nm), 6 (740 nm), 7 (783 nm), 8 (833 nm), and 8a (865 nm) in the visually identified data than in the in situ data.

Fig. 4

Reflectance profiles for (a) visually identified points that were manually digitized and (b) in situ monitoring locations, separated into bloom (black) and no-bloom (white) conditions.



Univariate Model Performance

With the gaged calibration approach, the relationships between four spectral indices (S02, S08, S10, and S13) and chlorophyll concentration exceeding 10 μg/L were statistically significant (p<0.05). Of these four models, the univariate models based on S10 and S13 had the highest classification accuracies of 80% and F1 scores of 0.74 (Table 3). The univariate models established with the gaged calibration approach that used indices S09 and S16 were the highest performing, with clear separation in exceedance probability between concentrations above and below the 10 μg/L threshold (Fig. 5), but were not found to be statistically significant (i.e., the model β1 term had p>0.05). Misclassified observations for models based on S10 and S13 had concentrations within 2.5 μg/L of the 10 μg/L threshold on average, illustrating that for the best performing models, cases of misclassification were limited to conditions near the 10 μg/L threshold (Fig. 5).

Table 3

Univariate model performance using the gaged calibration approach.


ᵃ Index developed for MSI (Table 1).

Fig. 5

Modeled probability of exceeding 10  μg/L (y axis) for each observed chlorophyll-a concentration (x axis) from the 10-fold cross validation for models calibrated with the gaged approach. The vertical black line at 10  μg/L represented the classification threshold of bloom versus no-bloom in the observed data. The dashed horizontal line at 0.5 exceedance probability represents the threshold of bloom versus no-bloom in the remotely sensed data. Points in the upper right are true positives (TPs), upper left are false positives (FPs), bottom left are true negatives (TNs), and bottom right are false negatives (FNs).


When training with the visually identified dataset and testing on the in situ observations in the ungaged approach, all spectral indices, except those based on S06 and S17, had statistically significant relationships (p<0.05) with the probability of bloom occurrence. Of these models, those based on S08 and S14 were the highest performers with accuracy rates of 79% and F1 scores of 0.67 (Table 4). However, separation in exceedance probabilities across concentrations was less clear (Fig. 6) when compared to the gaged calibration approach (Fig. 5). Misclassified observations for models based on S08 and S14 had concentrations within 4  μg/L of the 10  μg/L threshold on average, suggesting that cases of misclassification were limited to conditions near the 10  μg/L threshold for the best-performing models (Fig. 6).

Table 4

Univariate model performance using the ungaged calibration approach.

Index | Model ID | β0 | β1 | β1 p-value | Accuracy | Precision | Recall | F1

ᵃ Index developed for MSI (Table 1).

Fig. 6

Modeled probability of exceeding 10  μg/L (y axis) for each observed chlorophyll-a concentration (x axis) from the 10-fold cross validation for models calibrated with the ungaged approach. The vertical black line at 10  μg/L represented the classification threshold of bloom versus no-bloom in the observed data. The dashed horizontal line at 0.5 exceedance probability represents the threshold of bloom versus no-bloom in the remotely sensed data. Points in the upper right are true positives (TPs), upper left are false positives (FPs), bottom left are true negatives (TNs), and bottom right are false negatives (FNs).


When in situ observations are augmented with visually identified observations in the augmented calibration approach, all models except those based on S06 and S17 had statistically significant relationships with the probability of bloom occurrence (p<0.05). Models based on S05, S08, and S14 had the highest F1 scores (0.58) and the highest accuracy (74%, Table 5). Accuracy and F1 values for these highly correlated indices were lower than for the top performing models under the gaged and ungaged calibration approaches because of decreases in precision driven by an increase in false positives (Fig. 7). Misclassified observations for models based on S05, S08, and S14 had concentrations within 3 μg/L of the 10 μg/L threshold on average, indicating that for the best performing models in the augmented calibration approach the cases of misclassification are limited to conditions near the 10 μg/L threshold (Fig. 7).

Table 5

Univariate model performance using the augmented calibration approach.

Index | Model ID | β0 | β1 | β1 p-value | Accuracy | Precision | Recall | F1

ᵃ Index developed for MSI (Table 1).

Fig. 7

Modeled probability of exceeding 10  μg/L (y axis) for each observed chlorophyll-a concentration (x axis) from the 10-fold cross validation for models calibrated with the augmented approach. The vertical black line at 10  μg/L represented the classification threshold of bloom versus no-bloom in the observed data. The dashed horizontal line at 0.5 exceedance probability represents the threshold of bloom versus no-bloom in the remotely sensed data. Points in the upper right are true positives (TPs), upper left are false positives (FPs), bottom left are true negatives (TNs), and bottom right are false negatives (FNs).



Multivariate Model Performance

Of the 17 spectral indices examined for Sentinel-2 (Table 1), S01, S03, S04, S05, S09, S10, S11, S12, S14, S16, and S17 were found to be the most highly correlated with other indices (Fig. 8) and were removed in the stepwise, VIF-based variable selection process. The remaining six indices, S02, S06, S07, S08, S11, and S15, had VIF values <10 at the end of the stepwise removal process and were selected for evaluation in the multivariate regression approach.

Fig. 8

Spectral index correlation matrix. Pearson correlation coefficients are provided in the upper right. Positive correlation values are in blues, negatives are in reds. Indices with “*” annotations had VIF values <10 at the end of the stepwise removal process and were included in the multivariate model calibration process.


The best performing multivariate models for the gaged (MG), ungaged (MU), and augmented (MA) model calibration approaches had accuracies of 0.80, 0.79, and 0.82, respectively (Table 6). For the multivariate models, the augmented calibration approach also had the highest F1 statistic (0.73), although it is rather similar to the F1 score of 0.72 for the gaged calibration approach. Misclassified observations for the gaged, ungaged, and augmented multivariate models had concentrations within 3  μg/L of the 10  μg/L threshold, on average, suggesting that for all multivariate models the cases of misclassification are limited to conditions near the 10  μg/L threshold (Fig. 9).

Table 6

Performance of the multivariate and top performing univariate models.

Calibration approach | Model | Accuracy | Precision | Recall | F1
Gaged | S10, S13 | 0.80 | 0.64 | 0.88 | 0.74
Ungaged | S08, S14 | 0.79 | 0.50 | >0.99 | 0.67
Augmented | S05, S08, S14 | 0.74 | 0.41 | >0.99 | 0.58
MG, MU, and MA refer to top performing multivariate models calibrated with the gaged, ungaged, and augmented approaches, respectively.
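As a consistency check, the F1 values reported for the top performing univariate models follow from the tabulated precision and recall through the harmonic-mean definition of F1. The worked arithmetic below takes the recall values reported as >0.99 to be approximately 1, which is an assumption made only for this illustration.

```latex
\begin{align*}
F_1^{\text{gaged}} &= \frac{2(0.64)(0.88)}{0.64 + 0.88} = \frac{1.1264}{1.52} \approx 0.74,\\
F_1^{\text{ungaged}} &\approx \frac{2(0.50)(1.00)}{0.50 + 1.00} = \frac{1.00}{1.50} \approx 0.67,\\
F_1^{\text{augmented}} &\approx \frac{2(0.41)(1.00)}{0.41 + 1.00} = \frac{0.82}{1.41} \approx 0.58.
\end{align*}
```

The pattern of near-perfect recall with low precision in the ungaged and augmented rows indicates that those models rarely miss a bloom but flag a substantial number of sub-threshold observations as blooms.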

Fig. 9

Modeled probability of exceeding 10  μg/L (y axis) for each observed chlorophyll-a concentration (x axis) for multivariate models calibrated with the (a) gaged, (b) ungaged, and (c) augmented approaches. The vertical black line at 10  μg/L represented the classification threshold of bloom versus no-bloom in the observed data. The dashed horizontal line at 0.5 exceedance probability represents the threshold of bloom versus no-bloom in the remotely sensed data. Points in the upper right are true positives (TPs), upper left are false positives (FPs), bottom left are true negatives (TNs), and bottom right are false negatives (FNs).


The spectral indices included in the best performing multivariate models varied by calibration scenario (Table 7). The multivariate model calibrated with the gaged approach (MG) selected two model members. The stepwise parameter selection process for the ungaged multivariate model calibration approach (MU) resulted in a univariate model (S08), as a balance between model parsimony and maximum model likelihood. The multivariate model calibrated with the augmented dataset (MA) incorporated all potential spectral indices except for S02 and S06. In all cases, the models with the lowest AIC also had the highest F1 scores.

Table 7

Multivariate model parameters for each calibration approach.

Term | Index name | Gaged (MG) | Ungaged (MU) | Augmented (MA)

a Index developed for MSI.



We developed models that can be applied to identify algal blooms from satellite imagery by evaluating different data sources describing the presence and absence of algal bloom conditions against multiple spectral indices designed to identify chlorophyll presence.


Correlation of Spectral Indices

Although 17 spectral indices were identified in the literature and evaluated in this work, many were found to be highly correlated. Through an iterative index removal process, 11 indices were removed before the remaining indices had VIFs <10. This result indicates that only six of the evaluated 17 spectral indices are required to represent the observed variability in chlorophyll-a concentrations. Reducing the search space by more than 60% is valuable as it reduces the number of indices that require evaluation.
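The iterative removal can be sketched in Python as follows. This is a minimal illustration, not the exact procedure used in this study: it assumes that the index with the highest variance inflation factor (VIF) is dropped at each step, it computes VIFs as the diagonal of the inverse correlation matrix (equivalent to 1 / (1 - R^2) for each index regressed on the others), and the data and column names are synthetic.

```python
import numpy as np


def vif(X):
    """VIF of each column, taken from the inverse of the correlation matrix."""
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))


def prune_by_vif(X, names, max_vif=10.0):
    """Drop the highest-VIF column until every remaining VIF is below max_vif."""
    X = np.asarray(X, dtype=float)
    names = list(names)
    while X.shape[1] > 1:
        values = vif(X)
        worst = int(np.argmax(values))
        if values[worst] < max_vif:
            break
        X = np.delete(X, worst, axis=1)
        names.pop(worst)
    return names


# Synthetic example: index_c is nearly a copy of index_a (hypothetical names).
rng = np.random.default_rng(0)
index_a = rng.normal(size=200)
index_b = rng.normal(size=200)
index_c = index_a + 0.01 * rng.normal(size=200)
kept = prune_by_vif(np.column_stack([index_a, index_b, index_c]),
                    ["index_a", "index_b", "index_c"])
print(kept)  # one of the two nearly collinear indices is removed
```

With perfectly collinear columns the correlation matrix is singular and cannot be inverted, so exact duplicates would need to be removed before applying this sketch.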


Spectral Indices

As expected, spectral indices developed for and evaluated on MSI imagery outperformed those developed for other sensors when used in isolation in the univariate models calibrated with in situ observations (Tables 1 and 3). Specifically, the top-performing univariate models with the gaged calibration approach, S10 and S13, were developed for the MSI.30,43 Both also focus on band 5 (704 nm) relative to band 4 (665 nm), normalized by band 6 (740 nm) or band 8a (865 nm), illustrating the importance of the “red edge” and red bands for retrieving a chlorophyll signal, in agreement with previous work.68 However, indices developed for other sensors joined the top performers when model calibration included visually identified data and when indices were combined in multivariate models. Specifically, S08, developed for OLI,30 and S14, developed for the MSI, were the top-performing univariate models in the ungaged calibration scenario (Table 4). Indices S08 and S14 focus on the reflectance peak for band 3 (560 nm), illustrating the influence of “green” light on the visual identification of algal blooms in RGB color composites. Index S05, developed for MERIS58 and focused on the “red edge,” joined S08 and S14 as a top performer for the augmented calibration scenario (Table 5). The improved performance from the augmented calibration scenarios (Table 6) highlights the value of using visible, spatial, and infrared cues to identify algal blooms.


Univariate versus Multivariate Results

The multivariate model performed just as well as all the statistically significant univariate models for the gaged and ungaged calibration scenarios. The multivariate model under the augmented calibration scenario was the highest performing of the statistically significant models overall. This increase in performance could be due to the incorporation of multiple spectral features present in algal blooms (Fig. 4), as the ensemble model calibrated with in situ data augmented with spectra extracted from satellite imagery focused on bands 2–5 and 8a (Table 1). This result is similar to previous studies69,70 and is consistent with our hypothesis that “incorporating multiple spectral indices is more robust than selecting a single spectral index.” The improvement in accuracy is attributable to an increase in precision associated with a reduction in false positives as well as an increase in recall. These results suggest that the multivariate models were more skilled in identifying observed bloom conditions (Table 6).


Incorporating Image Derived Training Data

The univariate and multivariate models trained on visually identified training data alone were nearly as accurate (79% accuracy) as those trained on in situ observations (80% accuracy). This is a notable finding because it implies that training datasets can be built for waterbodies lacking in situ data by extracting the necessary information from the satellite images themselves. Further, the multivariate approach calibrated on the augmented observations provided the highest accuracy overall, with a mean accuracy of 82%, indicating a benefit of including visually identified end-member spectra even in cases where in situ data are available.
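To make the three calibration sets concrete, the Python sketch below builds a gaged set (labels from in situ chlorophyll-a exceeding 10 μg/L), a visually classified set, and their concatenation as the augmented set, and then fits a univariate logistic model to the gaged and augmented sets. It is an illustration only: the index values, concentrations, and labels are hypothetical, and the Newton solver (with a very small ridge term for numerical stability) is an assumption rather than the fitting routine used in this study.

```python
import numpy as np


def fit_logistic(x, y, n_iter=25, ridge=1e-6):
    """Fit p = 1 / (1 + exp(-(b0 + b1 * x))) by Newton's method.

    Returns the array [b0, b1]. The tiny ridge term keeps the Hessian invertible.
    """
    X = np.column_stack([np.ones(len(x)), np.asarray(x, dtype=float)])
    y = np.asarray(y, dtype=float)
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        gradient = X.T @ (y - p) - ridge * beta
        hessian = X.T @ (X * (p * (1.0 - p))[:, None]) + ridge * np.eye(2)
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta


def predict(beta, x):
    """Exceedance probability for index value(s) x."""
    x = np.asarray(x, dtype=float)
    return 1.0 / (1.0 + np.exp(-(beta[0] + beta[1] * x)))


# Hypothetical in situ matchups: spectral index value and chlorophyll-a (ug/L).
insitu_index = np.array([-0.2, -0.1, 0.0, 0.05, 0.1, 0.2, 0.3])
insitu_chl = np.array([2.0, 4.0, 12.0, 6.0, 9.0, 15.0, 40.0])
insitu_label = (insitu_chl > 10.0).astype(int)

# Hypothetical visually classified pixels: index value, bloom (1) or no-bloom (0).
visual_index = np.array([-0.3, -0.25, 0.4, 0.5])
visual_label = np.array([0, 0, 1, 1])

# Augmented calibration set: in situ labels plus visual labels.
aug_index = np.concatenate([insitu_index, visual_index])
aug_label = np.concatenate([insitu_label, visual_label])

beta_gaged = fit_logistic(insitu_index, insitu_label)
beta_augmented = fit_logistic(aug_index, aug_label)
print(predict(beta_augmented, [-0.3, 0.5]))
```

A set made up only of clearly separated end-members can be perfectly separable, in which case unpenalized logistic coefficients grow without bound; a stronger penalty or a dedicated statistics package would be a safer choice for that case.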

The multivariate model calibrated on visually identified data (MU) had near perfect model recall, meaning that nearly all the observed bloom conditions in the in situ observation dataset were identified in the resulting model. However, this same model had relatively low precision due to the presence of numerous false positives. The high recall and low precision indicate that classification with visually identified data is best suited to cases where decision makers tend to be more tolerant of false positives than false negatives. Notably, the probability (50%) and concentration (10  μg/L) thresholds can, and likely should, be adjusted in this approach to fit end-user communication and reporting needs. In fact, it can be seen in Fig. 9(c) that selecting a slightly higher chlorophyll-a threshold (15  μg/L) would result in perfect classification.
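The effect of adjusting the thresholds can be explored with a short Python sketch such as the one below. The function name and sample values are hypothetical, the >= tie rule for the probability threshold is an assumption, and returning 1.0 when a ratio is undefined (no predicted or no observed blooms) is a convention chosen only for this example.

```python
def precision_recall(chl_a, probs, chl_threshold=10.0, prob_threshold=0.5):
    """Precision and recall for a given concentration and probability threshold."""
    tp = fp = fn = 0
    for conc, p in zip(chl_a, probs):
        observed = conc > chl_threshold
        predicted = p >= prob_threshold
        if predicted and observed:
            tp += 1
        elif predicted:
            fp += 1
        elif observed:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall


# Hypothetical observations and modeled probabilities (not study data).
example_chl = [3.0, 7.0, 9.0, 25.0, 60.0]
example_prob = [0.3, 0.6, 0.7, 0.8, 0.95]

for threshold in (0.5, 0.75, 0.9):
    print(threshold, precision_recall(example_chl, example_prob,
                                      prob_threshold=threshold))
```

In this toy case, raising the probability threshold first removes false positives and then begins to miss observed blooms, which is the trade-off a decision maker would weigh when choosing a reporting threshold.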

Figure 4 shows that the visually identified bloom locations had higher NIR reflectance than pixels identified as bloom conditions via in situ observations. This may reflect a bias in the visual interpretation toward identifying floating algae that would have higher NIR reflectance than submerged algae. Further, a robust analysis of the consistency and repeatability of manually classified training data in Brownlee and other waterbodies could improve classification.

The ungaged model results indicate that the use of image-derived spectra for training models could be useful in cases where in situ observations are limited. The reasonable accuracy obtained with the ungaged multivariate calibration (79% for MU) and the increased accuracy of the augmented multivariate model (MA) relative to the gaged multivariate model (MG) are consistent with our hypothesis that “satellite imagery itself contains information useful for evaluating spectral indices.”


Spatial Patterns in Model Results

In addition to the correct identification of conditions at observation locations, the spatial patterns of model results can be examined qualitatively to confirm agreement with features visible in satellite imagery. In Fig. 10(a), an algal bloom is clearly seen in the true color composite. A sample collected from within this feature had a chlorophyll-a concentration of 86.6 μg/L, verifying the feature as an algal bloom. The ribbon-like features of the algal bloom are well described by some of the models (Fig. 10). However, some univariate models do not appear to be sensitive to the presence of the algal mass, returning nearly uniform exceedance probabilities for all pixels in the image. Although this is not a quantitative assessment, examining the models’ abilities to reproduce spatial patterns of algal blooms provides insight into an index’s general performance.
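Maps like those in Fig. 10 come from applying a fitted logistic model to every pixel. A minimal Python sketch of that step is given below, assuming the spectral indices have already been computed as rasters; the function name, the coefficients, and the 2 × 2 example raster are hypothetical and do not correspond to the fitted parameters of this study.

```python
import numpy as np


def exceedance_probability(index_stack, intercept, coefs):
    """Per-pixel probability of exceeding the chlorophyll-a threshold.

    index_stack: array of shape (n_indices, rows, cols)
    coefs: one logistic coefficient per index (length n_indices)
    """
    index_stack = np.asarray(index_stack, dtype=float)
    coefs = np.asarray(coefs, dtype=float)
    # Weighted sum over the index axis gives a (rows, cols) linear predictor.
    linear = intercept + np.tensordot(coefs, index_stack, axes=1)
    return 1.0 / (1.0 + np.exp(-linear))


# Hypothetical single-index raster and coefficients (not fitted values).
stack = np.array([[[0.0, 0.5],
                   [1.0, -0.5]]])
probability = exceedance_probability(stack, intercept=-1.0, coefs=[4.0])
bloom_mask = probability >= 0.5

# A small range across the scene would flag the nearly uniform maps noted above.
print(bloom_mask)
print(probability.max() - probability.min())
```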

Fig. 10

Classification of a bloom visible in (a) Sentinel-2 true color imagery (RGB; red, band 4; green, band 3; and blue, band 2) from September 3, 2020,48 for (b)–(r) each individual spectral index (S1–S17) and the multivariate models for the (s) gaged (MG), (t) ungaged (MU), and (u) augmented (MA) scenarios. The bloom seen in these images was sampled on September 3, 2020 [“+” in (a)] and had a chlorophyll-a concentration of 86.6 μg/L.



Sources of Uncertainty

The approach taken here is subject to multiple sources of uncertainty, including but not necessarily limited to the atmospheric correction procedure, interfering effects of sediment and other nonchlorophyll-a containing substances on the chlorophyll-a signal, the presence of nonalgal plants (e.g., submerged aquatic vegetation or sloughed macrophyte mats) obfuscating interpretation of the chlorophyll-a signal as an algal bloom, error rates associated with the visual identification process, the effects of wind-driven sun glint, the use of chlorophyll-a that is not corrected for degradation byproducts like pheophytin, adjacency effects, bottom reflectance, and potential temporal and spatial mismatch between in situ observations and extracted aquatic reflectance values. The limited number of in situ observations likely also contributed to calibration uncertainty, exemplifying the very common challenge of calibrating semiempirical approaches with limited data. Notably, the ungaged calibration approach removes the uncertainty associated with temporal and spatial mismatch as the signals are derived from imagery directly. This, in addition to a larger validation dataset, may have contributed to more univariate models with statistically significant calibrations under the ungaged approach relative to the gaged approach. Despite many potential sources of error, the achieved accuracies of 80% and higher indicate that the algal bloom signal is large in comparison with the noise associated with all these potential sources of uncertainty. The encouraging results reported herein notwithstanding, addressing each of these potential sources of uncertainty could improve model accuracy.


Future Applications

Our intent in introducing this approach is to provide an additional tool for public health and natural resource managers to identify potentially harmful conditions that warrant in situ monitoring. Providing timely situational awareness of algal bloom extent has the potential to increase resource efficiency by guiding field staff to priority sampling locations. These methods also afford the potential to identify nascent blooms in remote areas before they would be identified otherwise. Finally, historic satellite imagery contains information on algal bloom dynamics. Reanalysis of these images could provide information on spatial and temporal trends that might yield insight regarding potential drivers of algal blooms.



Multivariate models were as accurate as univariate indices in classifying aquatic chlorophyll-a relative to a 10-μg/L threshold. Manually digitized observations of end-member conditions (e.g., bloom and nonbloom) were used to calibrate aquatic chlorophyll-a retrieval in the absence of in situ observations with reasonable accuracy (79%), nearly equal to that obtained using in situ observations only (80%). Augmenting in situ observations with these manually digitized observations improved remote sensing accuracy to 82%. These results suggest that image interpretation might be suitable for deriving training data for algal bloom classification, either in the absence of in situ observations matched with Sentinel-2 satellite imagery or to augment them.


