Open Access
19 December 2022
Mapping the probability of freshwater algal blooms with various spectral indices and sources of training data
Tyler King, Stephen Hundt, Konrad Hafen, Victoria Stengel, Scott Ducar
Abstract

Algal blooms are pervasive in many freshwater environments and can pose risks to the health and safety of humans and other organisms. However, monitoring and tracking of potentially harmful blooms often rely on in-person observations by the public. Remote sensing has proven useful in augmenting in situ observations of algal concentration, but many hurdles hinder efficient application by end users. First, numerous approaches to estimate aquatic chlorophyll-a are available and can produce inconsistent results. Second, lack of quantitative in situ observations limits opportunities to train models for specific waterbodies, such that models developed for other systems must be used instead. We (1) implement univariate and multivariate logistic regression models to estimate the probability that aquatic chlorophyll-a concentrations exceed an accepted threshold beyond which harmful effects become likely and (2) evaluate the use of visually classified bloom/no-bloom satellite imagery to augment in situ training data. Using a binary classification of aquatic chlorophyll-a exceeding 10 μg/L, we found that (1) logistic regression models were ∼80% accurate, (2) univariate models trained with visually classified data produced nearly the same accuracy (79%) as models trained with in situ observations (80%), and (3) models trained on in situ chlorophyll-a observations augmented with visual classifications outperformed (82% accuracy) models trained on in situ observations alone (80% accuracy). These results provide a framework for evaluating multiple spectral indices in retrieving algal bloom presence or absence and illustrate that training data derived directly from satellite imagery can be useful in augmenting in situ observations.

1. Introduction

Freshwater algal blooms are a global concern,1–4 and there is evidence that they are becoming more common in response to climate change.1,5–7 Because algal blooms can adversely affect public health, economies, and ecosystem services by degrading water quality,8,9 early identification of algal blooms can improve public safety and mitigate economic concerns.

Algal blooms are often identified via visual inspection of a waterbody,10 with reports from both waterbody managers and the public playing a fundamental role in algal bloom monitoring for state environmental monitoring agencies.11–16 Although visual inspection by water quality agencies and public health departments is a relatively accurate way to identify the presence of algal blooms,10 the number of waterbodies that can be monitored this way is limited. Further, visual inspection results can be subjective, and conclusions might differ between individuals, even when optical recording devices are used.17 As a result, some public health agencies maintain a reactive stance toward algal bloom monitoring, waiting for a bloom to be reported before investigating, analyzing, and providing public health guidance.18,19 This stance can result in incomplete monitoring coverage (e.g., omission of algal bloom events unless reported) and delays in public health notices that can have real-world implications for human health and socioeconomics.19

Remote sensing has the potential to augment in situ visual inspection while increasing the spatial scale of coverage. In the past 50 years, considerable research attention has been devoted to developing remote sensing techniques for identifying and tracking algal blooms.20,21 Remote sensing of water quality for inland, freshwater systems has lagged marine applications partially due to the optical complexity of inland waters.22 Despite this lag, nearly 30 years of studies have focused on the development of methods to derive water quality metrics from spectral signatures.22 In the past 20 years, a shift toward operationalizing freshwater water quality remote sensing has occurred.22,23

Identifying cyanobacterial blooms has been the focus of significant investment in remote sensing, with particular focus on the ocean and land color instrument (OLCI) on board the Sentinel-3A and Sentinel-3B satellites.24–27 By focusing on spectral features at 665 and 681 nm, this body of work relies on a well-characterized two-step approach to identify the presence of phycocyanin and to then quantify the strength of the signal.24,28,29 OLCI collects imagery with a nominal 300-m ground sampling distance, allowing for the monitoring of larger waterbodies and the production of operational cyanobacterial index products at a large spatial scale. However, these products do not have sufficient spatial resolution to monitor the near-shore environment or the narrow waterbodies that are common in the intermountain west, where deep river valleys have been dammed to create reservoirs that produce hydropower and supply irrigation and drinking water.

Satellite-based sensors with spatial resolution sufficient to resolve narrow waterbodies [e.g., the operational land imager (OLI) on Landsat-8 and Landsat-9, and the multispectral instrument (MSI) on Sentinel-2A and Sentinel-2B] do not have the spectral resolution required to implement the cyanobacteria index approach listed above.29,30 Instead, work with these images to identify algal conditions has focused on retrieving chlorophyll-a,31,32 which has been demonstrated to serve as a robust surrogate for cyanobacterial concentrations in conditions dominated by cyanobacteria.33 Focusing on chlorophyll-a precludes differentiation between harmful algal blooms dominated by cyanobacteria and other aquatic photosynthetic growth.28,34,35 This lack of specificity leads to a bias toward public health protection when noncyanobacterial blooms are identified. Further, the 10-m spatial resolution delivered by Sentinel-2 imagery used in this study allows waterbody managers and public health officials to monitor relatively small waterbodies, narrow portions of larger waterbodies (e.g., bays), and near-shore environments where blooms can accumulate due to wind driven transport.36,37 In this work, we evaluate the ability to classify chlorophyll concentrations using higher spatial- but lower spectral- and temporal-resolution imagery from the MSI on board the Sentinel-2A and Sentinel-2B satellites.

Multiple spectral indices have been developed to retrieve chlorophyll-a conditions from a range of passive optical sensors and are presented in the literature.30,38–46 However, none of these approaches has been shown to consistently outperform the others in retrieving chlorophyll-a concentrations. Additionally, we typically lack water quality observations for any given waterbody that are coincident with satellite imagery despite large-scale projects to compile such matchups.47 As such, two distinct challenges must be addressed when using satellite imagery to estimate water quality: (1) identifying spectral indices that describe water quality metrics of interest and (2) relating these spectral indices to water quality metrics in the absence of in situ observations. First, we hypothesize that incorporating multiple spectral indices will describe water quality more robustly than selecting a single spectral index. We test this hypothesis by evaluating the accuracy of univariate logistic regression models for each spectral index against multivariate logistic regression models that incorporate multiple spectral indices. Second, we hypothesize that algal blooms can be identified directly from true color composite satellite imagery, obviating the need for in situ observations. We test this hypothesis by training univariate and multivariate logistic regression models of algal bloom presence with bloom observations identified via visual interpretation of satellite imagery. We evaluate the performance of the logistic regression models calibrated with the visual interpretation dataset relative to those calibrated with in situ samples to determine the efficacy of generating training data from satellite imagery. The work presented here differs from previous efforts by combining bloom presence and absence data with logistic regression models to produce bloom presence probabilities from multivariate models.

2. Methods

2.1. Study Site

This work was conducted in Brownlee Reservoir, located on the Idaho-Oregon border (Fig. 1). It is the largest reservoir in the Hells Canyon Complex of hydroelectric reservoirs at 61 km² in surface area, 93 km in length, and 1.8 km³ in volume, with a maximum depth of nearly 100 m near the dam.50 The reservoir is 650 m wide, on average, and is surrounded by hills with 20% to 30% slopes. Brownlee Reservoir has designated beneficial uses of cold-water aquatic life, primary contact recreation, domestic water supply, industrial water supply, irrigation water, livestock watering, salmonid rearing and spawning, resident fish and aquatic life, wildlife and hunting, fishing, boating, aesthetics, and hydropower.51 Brownlee Reservoir is listed as impaired for excess nutrients associated with nuisance algae growth and has a history of cyanobacteria blooms.51 The reservoir is an active recreation destination, with 20,000 nights of camping along its shore in 2013.11 Additionally, discharge from Brownlee Reservoir flows into the Hells Canyon National Recreational Area, which is estimated to receive more than 50,000 boaters per year, making it a significant economic resource where the populace can be impacted by water quality.50

Fig. 1

Study area map: (a) overview of Brownlee Reservoir (blue polygon) with a Sentinel-2 true color background (red, band 4; green, band 3; and blue, band 2) from July 5, 2019.48 (b) Locations where in situ samples were collected. Red “+” symbols and blue “x” symbols indicate sites where chlorophyll-a concentrations were above or below 10 μg/L, respectively. (c) Locations of manual digitization of bloom presence and absence. Red “+” symbols and blue “x” symbols represent “bloom” and “no-bloom” classifications, respectively.49

2.2. Field Data Collection

Water samples were collected by Idaho Power Company personnel from Brownlee Reservoir. Samples were collected from predetermined locations within the reservoir with known coordinates to match sample collection locations with pixels in the associated satellite imagery. Samples were collected within 2 m of the surface, immediately placed on ice, and delivered to the analysis laboratory within 24 h. Samples were spectrophotometrically analyzed for total chlorophyll-a, corrected for pheophytin following standard method 10200H.2.52 Only results from samples collected on the same date as Sentinel-2 satellite imagery were included in this analysis.

The World Health Organization has identified chlorophyll-a concentrations exceeding 10  μg/L to be associated with a transition from slight to moderate risk of adverse health effects from primary contact in cases where Microcystis dominates the chlorophyll-a concentration.53 Although the dominant taxa are not identified in this work, “bloom” and “bloom conditions” are defined in this work to represent chlorophyll-a concentrations greater than or equal to 10  μg/L.

2.3. Visual Bloom Identification

To evaluate the efficacy of developing training datasets directly from satellite imagery, points representing the distinct presence or absence of an algal bloom were visually interpreted and digitized (Fig. 2) from a series of 26 Sentinel-2 satellite images obtained from the Copernicus Application Programming Interface.48,49 Digitization was conducted in the Geographic Information System ArcMap 10.8.1 from Environmental Systems Research Institute, Inc. (Redlands, California), where true-color (red, band 4; green, band 3; and blue, band 2) Sentinel-2 images were displayed. Minimum and maximum values in the visualization were set to the equivalents of 0% and 100% reflectance, respectively. Locations associated with algal blooms were visually identified as pixels with elevated reflectance in the green band arranged in continuous shapes associated with algal blooms [Figs. 2(b) and 2(d)]. Points representing no-bloom conditions were identified based on low reflectance in the red, green, and blue bands to provide class balance in training data [Figs. 2(c) and 2(e)]. Bloom and no-bloom conditions were assigned without knowledge of in situ observations to reduce identification bias. Incorporating these data in the evaluation of spectral indices leverages the information that is readily available within historic satellite imagery via conventional image interpretation and is similar to approaches used to develop training data for pixel-based supervised image classification of land cover.54,55

Fig. 2

Example Sentinel-2 imagery48 of (a) Brownlee Reservoir visualized in true color (red, band 4; green, band 3; and blue, band 2) with manually digitized points representing (b), (d) bloom and (c), (e) no-bloom conditions. The northern red box in (a) corresponds with the extent of (b) and (c). The southern red box in (a) corresponds with the extent of (d) and (e).

2.4. Satellite Imagery

Level 1C top of atmosphere imagery collected with the multispectral instrument (MSI) sensors on the Sentinel-2A and Sentinel-2B satellites for tile 11TMK was obtained from the European Space Agency through the Copernicus Application Programming Interface.48 Top of atmosphere imagery was atmospherically corrected using the dark spectrum fitting algorithm approach implemented in the Atmospheric Correction for OLI ‘lite’ generic processor version (v20190326.0) to produce aquatic reflectance products.56 See Ref. 57 for a full description of the dark spectrum fitting approach. Default settings were used in the atmospheric correction with the exception that waterbody elevation was set to 610 m above sea level to account for atmospheric path length.

At each location where an in situ or visually identified observation was made, aquatic reflectance values for each band were extracted from all pixels with centers within 50 m of the observation’s location.49 A 50-m buffer was used to spatially smooth reflectance values and to account for potential positional error in sample collection location. The median reflectance values within the 50-m buffer were used to represent each band’s value at the specified location. The median statistic was used rather than the mean to reduce the impact of outliers on the resulting aquatic reflectance values.
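The buffered extraction described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `buffered_median`, the tuple layout of `pixels`, and the toy reflectance values are all assumptions, and pixel coordinates are assumed to be in a projected system in metres so that distances are Euclidean.

```python
from statistics import median

def buffered_median(pixels, x0, y0, radius=50.0):
    """Median reflectance of pixels whose centers lie within `radius` metres.

    pixels: iterable of (x, y, reflectance) tuples for one band
            (hypothetical layout; coordinates in metres).
    Returns None when no pixel center falls inside the buffer.
    """
    inside = [r for x, y, r in pixels
              if (x - x0) ** 2 + (y - y0) ** 2 <= radius ** 2]
    return median(inside) if inside else None

# Toy 10-m grid of pixel centers; one hypothetical outlier pixel near the sample.
pixels = [(5.0 + 10 * i, 5.0 + 10 * j, 0.90 if (i, j) == (10, 10) else 0.02)
          for i in range(20) for j in range(20)]

value = buffered_median(pixels, 100.0, 100.0)
```

In this toy case the single bright pixel leaves the median at the background value, whereas a mean over the same buffer would be pulled upward, which is the motivation given above for preferring the median.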

2.5. Spectral Index Evaluation

Seventeen spectral indices that were expected to be sensitive to chlorophyll-a concentrations were selected from the literature and evaluated30,38–46 (Table 1). Spectral indices developed for sensors other than the MSI sensor used in this work were selected if the central wavelengths of all bands used in index development [i.e., bands from OLI or the medium resolution imaging spectrometer (MERIS)] fell within unique MSI bands. MSI bands were defined for this work by their central wavelengths and full widths at half maximum.60

Table 1

Sentinel-2 spectral indices evaluated.

Index ID | Index name | Equation using Sentinel-2 bands | Citation | Sensor used in citation
S01 | Be162Bsub | b5 − b4 | Beck et al.39 | MSI
S02 | BR23 | b2/b3 | O’Reilly et al.44 | OLI and MSI
S03 | BR54 | b5/b4 | Gons et al.40 | MSI
S04 | BR8a4 | b8a/b4 | Tebb et al.45 | OLI
S05 | Go04MCI | b5 − b6 | Gower et al.58 | MERIS
S06 | KIVU | (b2 − b4)/b3 | Beck et al.30 | OLI
S07 | L83BDA | (1/b2 − 1/b4)*b3 | Beck et al.30 | OLI
S08 | FLHviolet | b3 − (b4 + (b1 − b4)) | Beck et al.30 | OLI
S09 | MCI | b5 − b4 − (b6 − b4)*((704.1 − 664.6)/(740.5 − 664.6)) | Le et al.41 | MERIS
S10 | Moses3b | (1/b4 − 1/b5)*b6 | Moses et al.43 | MSI
S11 | NDCI54 | (b5 − b4)/(b5 + b4) | Mishra et al.42 | MSI
S12 | NDCI8a4 | (b8a − b4)/(b8a + b4) | Beck et al.30 | OLI
S13 | S23BDA | (1/b4 − 1/b5)*8 | Beck et al.30 | MSI
S14 | FLHblue | b3 − (b4 + (b2 − b4)) | Beck et al.30 | MSI
S15 | SABI | (b8a − b4)/(b2 + b3) | Alawadi38 | OLI
S16 | Toming | b5 − (b4 + b6)/2 | Toming et al.46 | MSI
S17 | ZhFLH | b8a − (b5 + (b4 − b5)) | Zhao et al.59 | MERIS
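As an illustration of how such indices are computed from per-band aquatic reflectance, the sketch below evaluates two of them. It assumes the commonly used normalized-difference form for NDCI54 (S11) and a simple ratio for BR54 (S03); the function names and the reflectance values are hypothetical and are not taken from the study data.

```python
def ndci(b4, b5):
    """Normalized difference chlorophyll index, assumed form (b5 - b4) / (b5 + b4).

    b4: red reflectance (MSI band 4); b5: red-edge reflectance (MSI band 5).
    """
    return (b5 - b4) / (b5 + b4)

def band_ratio(numerator, denominator):
    """Simple two-band ratio, e.g., b5 / b4 for BR54 (assumed form)."""
    return numerator / denominator

# Hypothetical reflectances for a pixel with an elevated red-edge signal.
b4, b5 = 0.02, 0.05
s11 = ndci(b4, b5)        # about 0.43
s03 = band_ratio(b5, b4)  # about 2.5
```

Both indices increase as band 5 rises relative to band 4, which is why they are expected to be strongly correlated with one another.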

2.6. Binary Logistic Regression

The probability that an algal bloom was present for each pixel in a Sentinel-2 image was determined by relating the presence (chlorophyll-a concentration ≥10 μg/L) or absence (chlorophyll-a concentration <10 μg/L) of an algal bloom to the value of one or more spectral indices using a binary logistic regression approach. Binary logistic regression was implemented as follows:

Eq. (1)

$$p = \frac{e^{\beta_0 + \beta_1 X_1 + \cdots + \beta_n X_n}}{1 + e^{\beta_0 + \beta_1 X_1 + \cdots + \beta_n X_n}},$$
where p is the probability that chlorophyll-a concentration exceeded 10  μg/L, β0 is an intercept calibration term, and β1 through βn are the parameter effects for spectral indices X1 through Xn. To address class imbalances in calibration data (i.e., more observations of bloom versus nonbloom conditions), weights were applied to the observations as

Eq. (2)

Wp=1,

Eq. (3)

WN=#N/#P,
where Wp and #P are the weights for and number of bloom condition observations, respectively, and WN and #N are the weights for and number of nonbloom condition observations, respectively.
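A minimal sketch of Eqs. (1) to (3) is given below. It only evaluates the fitted model and the observation weights; it does not fit coefficients. The function names, coefficient values, and observation counts are illustrative assumptions rather than values from the calibrated models.

```python
import math

def bloom_probability(intercept, coefs, indices):
    """Eq. (1): p = e^z / (1 + e^z), with z = b0 + b1*X1 + ... + bn*Xn."""
    z = intercept + sum(b * x for b, x in zip(coefs, indices))
    # Algebraically identical branches, chosen to avoid overflow in exp().
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def class_weights(n_bloom, n_nonbloom):
    """Eqs. (2)-(3): W_P = 1 and W_N = #N / #P."""
    return 1.0, n_nonbloom / n_bloom

# Hypothetical univariate model: z = -3.0 + 6.0 * X1.
p_mid = bloom_probability(-3.0, [6.0], [0.5])   # z = 0, so p = 0.5
p_high = bloom_probability(-3.0, [6.0], [1.0])  # z = 3, p is about 0.95

# Illustrative counts: 10 bloom and 14 nonbloom observations.
w_p, w_n = class_weights(10, 14)                # (1.0, 1.4)
```

A linear predictor of zero corresponds to an exceedance probability of exactly 0.5, so the sign of z determines which side of a 50% cutoff a pixel falls on.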

Univariate and multivariate logistic regression models were developed to assess performance of individual spectral indices and combinations of spectral indices to identify algal blooms. Additionally, logistic regression models were trained and tested with different combinations of in situ and visually identified observations to evaluate the impact of different training data sources on model performance (Table 2).

Table 2

Logistic regression model scenarios for the univariate and multivariate models.

Calibration scenario | Training data | Testing data
Gaged | 80% in situ observations | 20% in situ observations
Ungaged | 100% manually classified points | 20% in situ observations
Augmented | 80% in situ observations + 100% manually classified points | 20% in situ observations

The “gaged” calibration scenario reflects a widely used approach to model calibration using in situ observations.30 The “ungaged” scenario evaluates the efficacy of training a model based on visually identified bloom occurrences when in situ observations are too sparse or not available. The “augmented” scenario evaluates the utility of augmenting in situ observations with visually identified blooms.
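The three scenarios in Table 2 can be assembled from a single random split of the in situ data, as sketched below. The function name `build_scenarios`, the use of Python's `random` module, and the placeholder observation lists are assumptions made for illustration; the study itself was carried out in R.

```python
import random

def build_scenarios(in_situ, visual, train_fraction=0.8, seed=0):
    """Return {scenario: (training, testing)} following Table 2.

    in_situ, visual: lists of observations (any record type).
    Every scenario is tested on the same held-out in situ observations.
    """
    rng = random.Random(seed)
    shuffled = list(in_situ)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    train, test = shuffled[:n_train], shuffled[n_train:]
    return {
        "gaged": (train, test),
        "ungaged": (list(visual), test),
        "augmented": (train + list(visual), test),
    }

# Placeholder records standing in for 24 in situ and 195 digitized points.
in_situ = [("in_situ", i) for i in range(24)]
visual = [("visual", i) for i in range(195)]
scenarios = build_scenarios(in_situ, visual)
```

Holding the test set fixed across scenarios means that differences in accuracy reflect the training data source rather than differences in the evaluation sample.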

Each modeling scenario and the associated training and testing data are described in the following sections. Performance between and among univariate and multivariate models was evaluated using the accuracy metrics described at the end of this section. All analyses were conducted in version 3.6.0 of the R statistical programming language 61 using RStudio v1.2.1335.62

2.6.1. Univariate logistic regression models

Logistic regression models were developed for each of the 17 spectral indices listed in Table 1 and each of the calibration scenarios listed in Table 2. Performance of the resulting univariate models was quantified by assessing the accuracy of each individual index in identifying algal blooms; these results provided benchmarks to compare multivariate models.

2.6.2. Multivariate logistic regression models

Multivariate logistic regressions were produced to test the hypothesis that classifications based on multiple spectral indices are more robust than classifications from single spectral indices. Three multivariate logistic regression models were developed, one for each of the gaged, ungaged, and augmented scenarios in Table 2, to assess how in situ and visually identified training data affect accuracy of algal bloom identification from Sentinel-2 imagery.

Multivariate logistic regressions were produced from the spectral indices listed in Table 1 using a three-step approach. First, highly correlated spectral indices were identified based on their variance inflation factor (VIF) values and removed one at a time to achieve a subset of spectral indices where the VIF for each index was <10.63,64 This was done by removing the index with the highest VIF, recomputing VIF for all remaining indices, and removing the subsequent index with the highest VIF. This process was repeated until no index had a VIF value of 10 or more. Second, the scenario-specific training dataset identified in Table 2 was selected. Third, multivariate logistic regressions were calibrated using all spectral indices identified through the VIF-based variable selection process. During the calibration procedure, parsimonious multivariate models were identified using stepwise variable selection with the objective of minimizing the Akaike information criterion (AIC).65 This procedure was repeated for all three calibration scenarios in Table 2.
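The VIF-based elimination in the first step can be sketched as follows. This is an illustrative standard-library implementation, not the R code used in the study. It relies on the identity that the VIF of predictor j equals the j-th diagonal element of the inverse of the predictors' correlation matrix, and it assumes no predictor is constant or exactly collinear with the others. All function names and the toy data are hypothetical.

```python
from statistics import mean

def _correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length columns."""
    centered = [[v - mean(col) for v in col] for col in columns]
    norms = [sum(v * v for v in c) ** 0.5 for c in centered]
    k = len(columns)
    return [[sum(a * b for a, b in zip(centered[i], centered[j]))
             / (norms[i] * norms[j]) for j in range(k)] for i in range(k)]

def _invert(matrix):
    """Gauss-Jordan inverse with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def vif(columns):
    """VIF_j = 1 / (1 - R_j^2), read from the inverse correlation matrix."""
    if len(columns) == 1:
        return [1.0]
    inv = _invert(_correlation_matrix(columns))
    return [inv[j][j] for j in range(len(columns))]

def prune_by_vif(data, threshold=10.0):
    """Drop the highest-VIF index, recompute, repeat until all VIF < threshold.

    data: dict mapping index name -> list of index values.
    """
    kept = dict(data)
    while len(kept) > 1:
        names = list(kept)
        scores = vif([kept[n] for n in names])
        worst = max(range(len(names)), key=lambda j: scores[j])
        if scores[worst] < threshold:
            break
        del kept[names[worst]]
    return list(kept)

# Toy data: A and B are nearly collinear; C is only weakly related to either.
toy = {
    "A": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "B": [1.1, 1.9, 3.05, 4.0, 4.95, 6.1, 7.0, 7.9],
    "C": [2.0, -1.0, 3.0, 0.0, 1.0, -2.0, 4.0, 1.0],
}
retained = prune_by_vif(toy)
```

In the toy case one member of the collinear pair is dropped and the weakly related index is retained. Recomputing VIF after each removal matters because removing one index lowers the VIF of every index that was correlated with it.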

2.7. Accuracy Assessment

The accuracy of the logistic regression models was evaluated using a 10-fold cross validation approach with an 80% calibration, 20% validation split (Table 2). For each iteration, 80% of the in situ data were randomly selected as the training dataset, and the remaining 20% were used to test model accuracy. Performance was evaluated using four metrics: precision, recall, F1 score, and overall accuracy. Precision is a measure of how many of a model’s positive predictions (e.g., above threshold) were correct [Eq. (4)], whereas recall measures how many of the positive observations were identified as such in the model [Eq. (5)]. These are given as

Eq. (4)

precision=#TP/(#TP+#FP),

Eq. (5)

recall=#TP/(#TP+#FN),
where #TP is the number of true positives, #FP is the number of false positives, and #FN is the number of false negatives. The F1 statistic was used as a multiple-criterion metric to evaluate the performance of logistic regressions that accounts for the trade-off between precision and recall.66 The F1 statistic was computed as

Eq. (6)

$$F_1 = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}.$$

Accuracy, defined here as the percent of observations that were correctly classified, was calculated to provide a more intuitive and familiar evaluation of model performance. Accuracy was calculated as the number of true positive and true negative results divided by total number of observations in the validation dataset.67

An exceedance probability of 50% (0.5) was used to classify model output as exceeding 10 μg/L. Figure 3 provides a graphical example of the four possible outcomes (true positive, true negative, false positive, and false negative) for each validation data point relative to the 50% and 10 μg/L thresholds.
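The classification and scoring steps can be sketched as below. Observed blooms follow the ≥10 μg/L definition from Sec. 2.2. Treating a modeled probability of exactly 0.5 as a predicted bloom and returning 0 when a denominator is zero are assumptions of this sketch, and the validation values are hypothetical.

```python
def confusion_counts(probabilities, chl_a, p_cut=0.5, chl_cut=10.0):
    """Count TP, FP, TN, FN for modeled probabilities vs. observed chlorophyll-a."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for p, c in zip(probabilities, chl_a):
        predicted = p >= p_cut   # assumption: ties count as predicted bloom
        observed = c >= chl_cut  # bloom defined as >= 10 ug/L
        if predicted and observed:
            counts["TP"] += 1
        elif predicted:
            counts["FP"] += 1
        elif observed:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts

def scores(counts):
    """Precision, recall, F1 [Eqs. (4)-(6)], and overall accuracy."""
    tp, fp, tn, fn = (counts[k] for k in ("TP", "FP", "TN", "FN"))
    precision = tp / (tp + fp) if tp + fp else 0.0  # assumption: 0 if undefined
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {"precision": precision, "recall": recall, "F1": f1, "accuracy": accuracy}

# Hypothetical validation points: modeled probability and observed concentration.
probs = [0.9, 0.7, 0.6, 0.2, 0.4]
chl = [50.0, 12.0, 8.0, 3.0, 11.0]
counts = confusion_counts(probs, chl)
result = scores(counts)
```

Here the third point is a false positive (probability above 0.5, concentration below 10 μg/L) and the fifth is a false negative, giving precision, recall, and F1 of 2/3 and an accuracy of 0.6.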

Fig. 3

Schematic of result quadrant to illustrate the four possible outcomes for each validation data point. Observed data are classified as those that fall above or below 10  μg/L of chlorophyll-a. Model results are divided between those predicting more or less than 50% probability of exceeding 10  μg/L.

3. Results

3.1. Field Observations

Twenty-four in situ observations from 15 sites along Brownlee Reservoir were used in the analysis (Fig. 1). Chlorophyll-a concentrations in these samples ranged from 1.2 to 241  μg/L with a median value of 7  μg/L. There were 10 observations (42%) with concentrations of 10  μg/L or higher, indicating relative parity in observations above and below the 10  μg/L threshold. An additional 195 points were manually digitized from 26 Sentinel-2 images (Fig. 1). Of the manually digitized points, 109 (56%) were classified as blooming conditions. Data are available in Ref. 49.

Reflectance spectra extracted from imagery at bloom locations showed elevated reflectance in bands 3 (559 nm) and 5 (704 nm) for both in situ and visually identified bloom locations (Fig. 4). For nonbloom conditions, reflectance values were similar between visually identified and in situ observations. Under bloom conditions, reflectance values in bands 3 (560 nm), 5 (704 nm), 6 (740 nm), 7 (783 nm), 8 (833 nm), and 8a (865 nm) were higher for the visually identified data than for the in situ data.

Fig. 4

Reflectance profiles for (a) visually identified points that were manually digitized and (b) in situ monitoring locations, separated into bloom (black) and no-bloom (white) conditions.

3.2. Univariate Model Performance

With the gaged calibration approach, the relationships between four spectral indices (S02, S08, S10, and S13) and chlorophyll concentration exceeding 10 μg/L were statistically significant (p<0.05). Of these four models, the univariate models based on S10 and S13 had the highest classification accuracies of 80% and F1 scores of 0.74 (Table 3). The univariate models established with the gaged calibration approach that used indices S09 and S16 were the highest performing, with clear separation in exceedance probability between concentrations above and below the 10 μg/L threshold (Fig. 5), but were not found to be statistically significant (i.e., the model β1 term had p>0.05). Misclassified observations for models based on S10 and S13 had concentrations within 2.5 μg/L of the 10 μg/L threshold on average, illustrating that, for the best performing models, cases of misclassification were limited to conditions near the 10 μg/L threshold (Fig. 5).

Table 3

Univariate model performance using the gaged calibration approach.

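Index | β0 | β1 | β1 p-value | Accuracy | Precision | Recall | F1
S01a | 0.05 | 22.2 | 0.79 | 0.52 | 0.00 | 0.00 | 0
S02a | 9.61 | −16.1 | 0.02 | 0.74 | 0.50 | 0.85 | 0.63
S03a | −20.13 | 20.5 | 0.06 | 0.76 | 0.55 | 0.86 | 0.67
S04 | 2.00 | −3.0 | 0.20 | 0.62 | 0.36 | 0.62 | 0.46
S05 | −2.22 | 648.1 | 0.10 | 0.74 | 0.45 | 0.91 | 0.61
S06 | 0.05 | −2.70 | 0.64 | 0.40 | 0.05 | 0.10 | 0.06
S07 | 0.03 | 0.7 | 0.70 | 0.44 | 0.05 | 0.13 | 0.07
S08 | −3.72 | 201.8 | 0.04 | 0.72 | 0.50 | 0.79 | 0.61
S09 | −4.87 | 4642.1 | 0.07 | 0.84 | 0.68 | 0.94 | 0.79
S10a | 0.61 | 29.7 | 0.04 | 0.80 | 0.64 | 0.88 | 0.74
S11a | 0.42 | 39.3 | 0.06 | 0.76 | 0.55 | 0.86 | 0.67
S12 | −1.07 | −4.9 | 0.16 | 0.64 | 0.41 | 0.64 | 0.50
S13a | 0.68 | 35.6 | 0.03 | 0.80 | 0.64 | 0.88 | 0.74
S14a | −3.57 | 364.4 | 0.06 | 0.70 | 0.41 | 0.82 | 0.55
S15 | −0.67 | −5.8 | 0.36 | 0.56 | 0.23 | 0.50 | 0.31
S16a | −5.30 | 5577.5 | 0.09 | 0.84 | 0.68 | 0.94 | 0.79
S17 | −0.76 | −166.0 | 0.19 | 0.66 | 0.36 | 0.73 | 0.48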

aIndex developed for MSI (Table 1).

Fig. 5

Modeled probability of exceeding 10 μg/L (y axis) for each observed chlorophyll-a concentration (x axis) from the 10-fold cross validation for models calibrated with the gaged approach. The vertical black line at 10 μg/L represents the classification threshold of bloom versus no-bloom in the observed data. The dashed horizontal line at 0.5 exceedance probability represents the threshold of bloom versus no-bloom in the remotely sensed data. Points in the upper right are true positives (TPs), upper left are false positives (FPs), bottom left are true negatives (TNs), and bottom right are false negatives (FNs).

When training with the visually identified dataset and testing on the in situ observations in the ungaged approach, all spectral indices, except those based on S06 and S17, had statistically significant relationships (p<0.05) with the probability of bloom occurrence. Of these models, those based on S08 and S14 were the highest performers with accuracy rates of 79% and F1 scores of 0.67 (Table 4). However, separation in exceedance probabilities across concentrations was less clear (Fig. 6) when compared to the gaged calibration approach (Fig. 5). Misclassified observations for models based on S08 and S14 had concentrations within 4  μg/L of the 10  μg/L threshold on average, suggesting that cases of misclassification were limited to conditions near the 10  μg/L threshold for the best-performing models (Fig. 6).

Table 4

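Univariate model performance using the ungaged calibration approach.

Index | Model ID | β0 | β1 | β1 p-value | Accuracy | Precision | Recall | F1
S01a | UU1 | 0.26 | 153.6 | <0.01* | 0.58 | 0.20 | 0.50 | 0.29
S02a | UU2 | −8.84 | −18.7 | <0.01* | 0.75 | 0.51 | 0.84 | 0.63
S03a | UU3 | −23.15 | 18.4 | <0.01* | 0.75 | 0.40 | 1.00 | 0.57
S04 | UU4 | 1.57 | 2.0 | 0.01 | 0.47 | 0.10 | 0.21 | 0.14
S05 | UU5 | −17.82 | 1659.6 | 0.01 | 0.72 | 0.33 | 1.00 | 0.50
S06 | UU6 | 0.25 | −15.67 | 0.08 | 0.49 | 0.07 | 0.19 | 0.10
S07 | UU7 | −1.20 | 4.8 | <0.01 | 0.54 | 0.10 | 0.33 | 0.15
S08 | UU8 | −26.69 | 1005.3 | 0.01 | 0.79 | 0.50 | 1.00 | 0.67
S09 | UU9 | −11.53 | 1347.3 | <0.01 | 0.71 | 0.30 | 1.00 | 0.46
S10a | UU10 | −5.99 | 37.6 | <0.01 | 0.74 | 0.38 | 1.00 | 0.55
S11a | UU11 | −5.12 | 45.5 | <0.01 | 0.75 | 0.40 | 1.00 | 0.57
S12 | UU12 | 0.64 | 3.1 | 0.01 | 0.49 | 0.19 | 0.31 | 0.24
S13a | UU13 | −6.76 | 50.7 | <0.01 | 0.67 | 0.20 | 1.00 | 0.33
S14a | UU14 | −16.91 | 933.5 | 0.002 | 0.79 | 0.50 | 1.00 | 0.67
S15 | UU15 | 0.71 | 6.0 | <0.01 | 0.54 | 0.20 | 0.40 | 0.27
S16a | UU16 | −11.34 | 1341.0 | <0.01 | 0.71 | 0.30 | 1.00 | 0.46
S17 | UU17 | 0.21 | −7.8 | 0.74 | 0.71 | 0.82 | 0.61 | 0.70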

aIndex developed for MSI (Table 1).

Fig. 6

Modeled probability of exceeding 10 μg/L (y axis) for each observed chlorophyll-a concentration (x axis) from the 10-fold cross validation for models calibrated with the ungaged approach. The vertical black line at 10 μg/L represents the classification threshold of bloom versus no-bloom in the observed data. The dashed horizontal line at 0.5 exceedance probability represents the threshold of bloom versus no-bloom in the remotely sensed data. Points in the upper right are true positives (TPs), upper left are false positives (FPs), bottom left are true negatives (TNs), and bottom right are false negatives (FNs).

When in situ observations were augmented with visually identified observations in the augmented calibration approach, all models except those based on S06 and S17 had statistically significant relationships with the probability of bloom occurrence (p<0.05). Models based on S05, S08, and S14 had the highest F1 scores (0.58) and the highest accuracy (74%, Table 5). Accuracy and F1 values for these highly correlated indices were lower than for the top performing models under the gaged and ungaged calibration approaches because of decreases in precision driven by an increase in false positives (Fig. 7). Misclassified observations for models based on S05, S08, and S14 had concentrations within 3 μg/L of the 10 μg/L threshold on average, indicating that, for the best performing models in the augmented calibration approach, cases of misclassification were limited to conditions near the 10 μg/L threshold (Fig. 7).

Table 5

Univariate model performance using the augmented calibration approach.

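Index | Model ID | β0 | β1 | β1 p-value | Accuracy | Precision | Recall | F1
S01a | UA1 | 0.06 | 153.5 | <0.01 | 0.58 | 0.14 | 0.60 | 0.22
S02a | UA2 | 9.22 | 4.4 | <0.01* | 0.72 | 0.45 | 0.91 | 0.51
S03a | UA3 | −15.04 | 12.2 | <0.01 | 0.68 | 0.27 | 1.00 | 0.43
S04 | UA4 | −1.25 | −0.6 | 0.06 | 0.48 | 0.05 | 0.17 | 0.07
S05 | UA5 | −5.03 | 579.4 | <0.01 | 0.74 | 0.41 | 1.00 | 0.58
S06 | UA6 | −0.82 | 0.00 | 0.11 | 0.46 | 0.09 | 0.22 | 0.13
S07 | UA7 | −1.09 | 4.0 | <0.01 | 0.56 | 0.09 | 0.50 | 0.15
S08 | UA8 | −10.27 | 411.6 | <0.01 | 0.74 | 0.41 | 1.00 | 0.58
S09 | UA9 | −4.04 | 588.9 | <0.01 | 0.68 | 0.27 | 1.00 | 0.43
S10a | UA10 | −2.92 | 20.9 | <0.01 | 0.68 | 0.27 | 1.00 | 0.43
S11a | UA11 | −2.86 | 28.9 | <0.01 | 0.68 | 0.27 | 1.00 | 0.43
S12 | UA12 | 0.32 | 2.3 | 0.02 | 0.48 | 0.14 | 0.30 | 0.19
S13a | UA13 | −2.92 | 25.6 | <0.01 | 0.66 | 0.23 | 1.00 | 0.37
S14a | UA14 | −7.74 | 476.8 | <0.01 | 0.74 | 0.41 | 1.00 | 0.58
S15 | UA15 | 0.41 | 5.1 | <0.01 | 0.54 | 0.14 | 0.43 | 0.21
S16a | UA16 | −4.00 | 590.6 | <0.01 | 0.68 | 0.27 | 1.00 | 0.43
S17 | UA17 | −0.06 | −15.4 | 0.50 | 0.84 | 0.95 | 0.75 | 0.84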

aIndex developed for MSI (Table 1).

Fig. 7

Modeled probability of exceeding 10 μg/L (y axis) for each observed chlorophyll-a concentration (x axis) from the 10-fold cross validation for models calibrated with the augmented approach. The vertical black line at 10 μg/L represents the classification threshold of bloom versus no-bloom in the observed data. The dashed horizontal line at 0.5 exceedance probability represents the threshold of bloom versus no-bloom in the remotely sensed data. Points in the upper right are true positives (TPs), upper left are false positives (FPs), bottom left are true negatives (TNs), and bottom right are false negatives (FNs).

3.3. Multivariate Model Performance

Of the 17 spectral indices examined for Sentinel-2 (Table 1), S01, S03, S04, S05, S09, S10, S12, S13, S14, S16, and S17 were found to be the most highly correlated with other indices (Fig. 8) and were removed in the stepwise, VIF-based variable selection process. The remaining six indices, S02, S06, S07, S08, S11, and S15, had VIF values <10 at the end of the stepwise removal process and were selected for evaluation in the multivariate regression approach.

Fig. 8

Spectral index correlation matrix. Pearson correlation coefficients are provided in the upper right. Positive correlation values are in blues, negatives are in reds. Indices with “*” annotations had VIF values <10 at the end of the stepwise removal process and were included in the multivariate model calibration process.

The best performing multivariate models for the gaged (MG), ungaged (MU), and augmented (MA) model calibration approaches had accuracies of 0.80, 0.79, and 0.82, respectively (Table 6). For the multivariate models, the augmented calibration approach also had the highest F1 statistic (0.73), although it is rather similar to the F1 score of 0.72 for the gaged calibration approach. Misclassified observations for the gaged, ungaged, and augmented multivariate models had concentrations within 3  μg/L of the 10  μg/L threshold, on average, suggesting that for all multivariate models the cases of misclassification are limited to conditions near the 10  μg/L threshold (Fig. 9).

Table 6

Performance of the multivariate and top performing univariate models.

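Calibration approach | Model | Accuracy | Precision | Recall | F1
--- | --- | --- | --- | --- | ---
Gaged | S10, S13 | 0.80 | 0.64 | 0.88 | 0.74
Gaged | MG | 0.80 | 0.59 | 0.93 | 0.72
Ungaged | S08, S14 | 0.79 | 0.50 | >0.99 | 0.67
Ungaged | MU | 0.79 | 0.50 | >0.99 | 0.67
Augmented | S05, S08, S14 | 0.74 | 0.41 | >0.99 | 0.58
Augmented | MA | 0.82 | 0.57 | >0.99 | 0.73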
MG, MU, and MA refer to top performing multivariate models calibrated with the gaged, ungaged, and augmented approaches, respectively.
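The F1 values in Table 6 follow from the reported precision and recall as their harmonic mean. The sketch below checks the two gaged rows, the only rows with an exact recall value. It also shows how all four statistics derive from confusion-matrix counts. The function names and the example counts are hypothetical and are not taken from the study.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def metrics_from_counts(tp, fp, tn, fn):
    """Return (accuracy, precision, recall, F1) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return accuracy, precision, recall, f1_score(precision, recall)


# Gaged rows of Table 6: (label, precision, recall, reported F1)
for label, precision, recall, reported in [
    ("S10, S13", 0.64, 0.88, 0.74),
    ("MG", 0.59, 0.93, 0.72),
]:
    print(label, round(f1_score(precision, recall), 2), reported)
```

Rows reporting recall as ">0.99" cannot be checked exactly, but substituting 0.99 gives values close to the tabulated F1 scores.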

Fig. 9

Modeled probability of exceeding 10 μg/L (y axis) for each observed chlorophyll-a concentration (x axis) for multivariate models calibrated with the (a) gaged, (b) ungaged, and (c) augmented approaches. The vertical black line at 10 μg/L represents the classification threshold of bloom versus no-bloom in the observed data. The dashed horizontal line at 0.5 exceedance probability represents the threshold of bloom versus no-bloom in the remotely sensed data. Points in the upper right are true positives (TPs), upper left are false positives (FPs), bottom left are true negatives (TNs), and bottom right are false negatives (FNs).

The spectral indices included in the best performing multivariate models varied by calibration scenario (Table 7). The multivariate model calibrated with the gaged approach (MG) selected two model members. The stepwise parameter selection process for the ungaged multivariate model calibration approach (MU) resulted in a univariate model (S08), as a balance between model parsimony and maximum model likelihood. The multivariate model calibrated with the augmented dataset (MA) incorporated all potential spectral indices except for S02 and S06. In all cases, the models with the lowest AIC also had the highest F1 scores.
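The balance between parsimony and likelihood can be illustrated with the standard AIC definition, AIC = 2k - 2 ln(L), where k is the number of fitted parameters and L is the maximized likelihood. The sketch below is illustrative only: the candidate names and log-likelihood values are hypothetical and are not results from the study.

```python
def aic(log_likelihood, n_parameters):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_parameters - 2 * log_likelihood


# Hypothetical candidates: name -> (maximized log-likelihood, parameters incl. intercept)
candidates = {
    "S08 only": (-40.0, 2),
    "S08 + S15": (-39.5, 3),
}

# The extra term raises the likelihood slightly, but not by enough to
# offset the penalty of 2 per added parameter.
best = min(candidates, key=lambda name: aic(*candidates[name]))
for name, (log_likelihood, k) in candidates.items():
    print(name, aic(log_likelihood, k))
print("lowest AIC:", best)
```

With these made-up numbers the single-index model is preferred, which mirrors how a stepwise search can settle on a univariate model.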

Table 7

Multivariate model parameters for each calibration approach.

TermIndex nameGaged (MG)Ungaged (MU)Augmented (MA)
Coefficientp-valueCoefficientp-valueCoefficientp-value
Intercept−1.20.0336−26.69<0.01−5.4<0.01
S02aBR23
S06KIVU
S07L83BDA−4.5<0.01
S08FLHviolet1057.90.01160.6<0.01
S11aNDCI54300.03724.5<0.01
S15SABI−130.1−12.2<0.01

a Index developed for MSI.
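As a worked example of applying fitted terms, the sketch below evaluates the ungaged model (MU) with the intercept (-26.69) and S08 coefficient (1057.9) reported in Table 7. It assumes the standard logistic form and assumes that the S08 input is in the same units as the calibration data. The function names and the sample S08 values are hypothetical.

```python
import math

# Ungaged multivariate model (MU) terms as reported in Table 7.
MU_INTERCEPT = -26.69
MU_S08_COEFFICIENT = 1057.9


def exceedance_probability(s08, intercept=MU_INTERCEPT,
                           coefficient=MU_S08_COEFFICIENT):
    """Logistic probability that chlorophyll-a exceeds the 10 ug/L threshold."""
    z = intercept + coefficient * s08
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)  # numerically stable branch for negative z
    return exp_z / (1.0 + exp_z)


def s08_at_probability(p, intercept=MU_INTERCEPT,
                       coefficient=MU_S08_COEFFICIENT):
    """S08 value at which the modeled probability equals p (0 < p < 1)."""
    return (math.log(p / (1.0 - p)) - intercept) / coefficient


# S08 value on the 0.5 decision boundary, then two hypothetical inputs.
print(round(s08_at_probability(0.5), 4))
print(exceedance_probability(0.02), exceedance_probability(0.03))
```

Because the coefficient is large relative to the intercept, the modeled probability switches from near 0 to near 1 over a narrow range of S08 values around the boundary.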

4.

Discussion

We developed models that can be applied to identify algal blooms from satellite imagery by evaluating different data sources describing the presence and absence of algal bloom conditions against multiple spectral indices designed to identify chlorophyll presence.

4.1.

Correlation of Spectral Indices

Although 17 spectral indices were identified in the literature and evaluated in this work, many were found to be highly correlated. Through an iterative index removal process, 11 indices were removed before the remaining indices had VIFs <10. This result indicates that only six of the evaluated 17 spectral indices are required to represent the observed variability in chlorophyll-a concentrations. Removing 11 of the 17 indices (about 65%) is valuable because it narrows the set of candidate indices that must be evaluated in multivariate model calibration.

4.2.

Spectral Indices

As expected, spectral indices developed for and evaluated on MSI imagery outperformed those developed for other sensors when used in isolation in the univariate models calibrated with in situ observations (Tables 1 and 3). Specifically, the top-performing univariate models under the gaged calibration approach, S10 and S13, were developed for the MSI.30,43 Both focus on band 5 (704 nm) relative to band 4 (665 nm), normalized by band 6 (740 nm) or band 8a (865 nm), illustrating the importance of the “red edge” and red bands for retrieving a chlorophyll signal, in agreement with previous work.68 However, indices developed for other sensors joined the top performers when model calibration included visually identified data and in multivariate models. Specifically, S08, developed for OLI,30 and S14, developed for the MSI, were the top performing univariate models in the ungaged calibration scenario (Table 4). Indices S08 and S14 focus on the reflectance peak for band 3 (560 nm), illustrating the influence of “green” light when RGB color composites are used to identify algal blooms. Index S05, developed for MERIS58 and focused on the “red edge,” joined S08 and S14 as a top performer for the augmented calibration scenario (Table 5). The improved performance from the augmented calibration scenarios (Table 6) highlights the value of using visible, spatial, and infrared cues to identify algal blooms.
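To make the red-edge versus red contrast concrete, the sketch below computes the normalized difference chlorophyll index (NDCI, listed as S11 in Table 7) in its commonly cited form from Ref. 42, using MSI band 5 (704 nm) and band 4 (665 nm). It is shown only as a familiar example of a band 5 to band 4 comparison; it is not the formula for S10 or S13, which are defined in Table 1. The function name and reflectance values are hypothetical.

```python
def ndci(band5_704nm, band4_665nm):
    """Normalized difference of red-edge (band 5) and red (band 4) reflectance."""
    denominator = band5_704nm + band4_665nm
    if denominator == 0:
        return float("nan")  # undefined when both reflectances are zero
    return (band5_704nm - band4_665nm) / denominator


# Hypothetical aquatic reflectance values (unitless).
print(ndci(0.03, 0.02))  # red-edge reflectance above red: positive index
print(ndci(0.01, 0.02))  # red-edge reflectance below red: negative index
```

A positive value indicates a reflectance peak near 704 nm relative to the chlorophyll absorption feature near 665 nm.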

4.3.

Univariate versus Multivariate Results

The multivariate model performed just as well as all the statistically significant univariate models for the gaged and ungaged calibration scenarios. The multivariate model under the augmented calibration scenario was the highest performing of the statistically significant models overall. This increase in performance could be due to the incorporation of multiple spectral features present in algal blooms (Fig. 4), as the ensemble model calibrated with in situ data augmented with spectra extracted from satellite imagery focused on bands 2–5 and 8a (Table 1). This result is similar to previous studies69,70 and is consistent with our hypothesis that “incorporating multiple spectral indices is more robust than selecting a single spectral index.” The improvement in accuracy is attributable to an increase in precision associated with a reduction in false positives as well as an increase in recall. These results suggest that the multivariate models were more skilled in identifying observed bloom conditions (Table 6).

4.4.

Incorporating Image Derived Training Data

The univariate and multivariate models trained on visually identified training data alone were nearly as accurate (79% accuracy) as those trained on in situ observations (80% accuracy). This is a notable finding because it implies that training datasets can be built for waterbodies lacking in situ data by extracting the necessary information from the satellite images themselves. Further, the multivariate approach calibrated on the augmented observations provided the highest accuracy overall, with a mean accuracy of 82%, indicating a benefit of including visually identified end-member spectra even in cases where in situ data are available.

The multivariate model calibrated on visually identified data (MU) had near perfect model recall, meaning that nearly all the observed bloom conditions in the in situ observation dataset were identified in the resulting model. However, this same model had relatively low precision due to the presence of numerous false positives. The high recall and low precision indicate that classification with visually identified data is best suited to cases where decision makers tend to be more tolerant of false positives than false negatives. Notably, the probability (50%) and concentration (10  μg/L) thresholds can, and likely should, be adjusted in this approach to fit end-user communication and reporting needs. In fact, it can be seen in Fig. 9(c) that selecting a slightly higher chlorophyll-a threshold (15  μg/L) would result in perfect classification.
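The effect of adjusting the probability threshold on precision and recall can be sketched as follows. The validation points are entirely hypothetical, chosen to mimic a model with high recall and several false positives, and the function name is illustrative.

```python
def precision_recall(points, probability_threshold=0.5, chla_threshold=10.0):
    """Return (precision, recall) for (observed chl-a, modeled probability) pairs."""
    tp = fp = fn = 0
    for chla, probability in points:
        observed = chla > chla_threshold
        predicted = probability > probability_threshold
        if predicted and observed:
            tp += 1
        elif predicted:
            fp += 1
        elif observed:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else float("nan")
    recall = tp / (tp + fn) if tp + fn else float("nan")
    return precision, recall


# Hypothetical validation points: (observed chl-a in ug/L, modeled probability)
points = [(2.0, 0.10), (6.0, 0.55), (8.0, 0.65),
          (12.0, 0.70), (25.0, 0.90), (40.0, 0.99)]

for threshold in (0.5, 0.68):
    print(threshold, precision_recall(points, probability_threshold=threshold))
```

In this toy case, raising the probability threshold removes the false positives without losing any observed blooms; with real data the trade-off depends on how the modeled probabilities are distributed around the chosen thresholds.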

Figure 4 shows that the visually identified bloom locations had higher NIR reflectance than pixels identified as bloom conditions via in situ observations. This may reflect a bias in the visual interpretation toward identifying floating algae that would have higher NIR reflectance than submerged algae. Further, a robust analysis of the consistency and repeatability of manually classified training data in Brownlee and other waterbodies could improve classification.

The ungaged model results indicate that the use of image-derived spectra for training models could be useful in cases where in situ observations are limited. The reasonable accuracy obtained with the ungaged multivariate calibration (79% for MU) and the increased accuracy of the augmented multivariate model (MA) relative to the gaged multivariate model (MG) are consistent with our hypothesis that “satellite imagery itself contains information useful for evaluating spectral indices.”

4.5.

Spatial Patterns in Model Results

In addition to the correct identification of conditions at observation locations, the spatial patterns of model results can be examined qualitatively to confirm agreement with features visible in satellite imagery. In Fig. 10(a), an algal bloom is clearly seen in the true color composite. A sample collected from within this feature had a chlorophyll-a concentration of 86.6 μg/L, verifying the feature as an algal bloom. The ribbon-like features of the algal bloom are well described by some of the models (Fig. 10). However, some univariate models do not appear to be sensitive to the presence of the algal mass, returning nearly uniform exceedance probabilities for all pixels in the image. Although this is not a quantitative assessment, examining the models’ abilities to reproduce spatial patterns of algal blooms provides insight into an index’s general performance.

Fig. 10

Classification of a bloom visible in (a) Sentinel-2 true color (RGB) imagery (red, band 4; green, band 3; and blue, band 2) from September 3, 2020 (Ref. 48) for (b)–(r) each individual spectral index (S01–S17) and for the multivariate models for the (s) gaged (MG), (t) ungaged (MU), and (u) augmented (MA) scenarios. The bloom seen in these images was sampled on September 3, 2020 [“+” in (a)] and had a chlorophyll-a concentration of 86.6 μg/L.

4.6.

Sources of Uncertainty

The approach taken here is subject to multiple sources of uncertainty, including but not necessarily limited to the atmospheric correction procedure, interfering effects of sediment and other nonchlorophyll-a containing substances on the chlorophyll-a signal, the presence of nonalgal plants (e.g., submerged aquatic vegetation or sloughed macrophyte mats) obfuscating interpretation of the chlorophyll-a signal as an algal bloom, error rates associated with the visual identification process, the effects of wind-driven sun glint, the use of chlorophyll-a that is not corrected for degradation byproducts like pheophytin, adjacency effects, bottom reflectance, and potential temporal and spatial mismatch between in situ observations and extracted aquatic reflectance values. The limited number of in situ observations likely also contributed to calibration uncertainty, exemplifying the very common challenge of calibrating semiempirical approaches with limited data. Notably, the ungaged calibration approach removes the uncertainty associated with temporal and spatial mismatch because the signals are derived from imagery directly. This, in addition to a larger validation dataset, may have contributed to more univariate models with statistically significant calibrations under the ungaged approach relative to the gaged approach. Despite many potential sources of error, the achieved accuracies of 80% and higher indicate that the algal bloom signal is large in comparison with the noise associated with all these potential sources of uncertainty. The encouraging results reported herein notwithstanding, addressing each of these potential sources of uncertainty could improve model accuracy.

4.7.

Future Applications

Our intent in introducing this approach is to provide an additional tool for public health and natural resource managers to identify potentially harmful conditions that warrant in situ monitoring. Providing timely situational awareness of algal bloom extent has the potential to increase resource efficiency by guiding field staff to priority sampling locations. These methods also afford the potential to identify nascent blooms in remote areas before they would be identified otherwise. Finally, historic satellite imagery contains information on algal bloom dynamics. Reanalysis of these images could provide information on spatial and temporal trends that might yield insight regarding potential drivers of algal blooms.

5.

Conclusion

Multivariate models were as accurate as univariate indices in classifying aquatic chlorophyll-a relative to a 10-μg/L threshold. Manually digitized observations of end-member conditions (e.g., bloom and nonbloom) were used to calibrate aquatic chlorophyll-a retrieval in the absence of in situ observations with reasonable accuracy (79%) that is nearly equal to that of using in situ observations only (80%). Augmenting in situ observations with manually digitized observations of end-member conditions (e.g., bloom and nonbloom) improved remote sensing accuracy to 82%. These results suggest that image interpretation might be suitable for deriving training data for algal bloom classification in the absence of or to augment in situ observations matched with Sentinel-2 satellite imagery.

Acknowledgements

This research was supported by Idaho Power Company. The authors would like to extend gratitude to Brian Hoelscher and Nick Gastelecutto at Idaho Power Company for support in data collection. Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. government. The Article was prepared solely by employees of the United States federal government as part of the employees' official duties. It is an official U.S. government publication, and is not subject to copyright protection within the United States.

Data Availability

The data used in this study are available in Ref. 49.

References

1. 

J. C. Ho, A. M. Michalak and N. Pahlevan, “Widespread global increase in intense lake phytoplankton blooms since the 1980s,” Nature, 574 (7780), 667–670 (2019). https://doi.org/10.1038/s41586-019-1648-7

2. 

Z. Namsaraev et al., “Algal bloom occurrence and effects in Russia,” Water, 12 (1), 285 (2020). https://doi.org/10.3390/w12010285

3. 

L. L. Ndlela et al., “An overview of cyanobacterial bloom occurrences and research in Africa over the last decade,” Harmful Algae, 60, 11–26 (2016). https://doi.org/10.1016/j.hal.2016.10.001

4. 

F. R. Pick, “Blooming algae: a Canadian perspective on the rise of toxic cyanobacteria,” Can. J. Fish. Aquat. Sci., 73 (7), 1149–1158 (2016). https://doi.org/10.1139/cjfas-2015-0470

5. 

C. J. Gobler, “Climate change and harmful algal blooms: insights and perspective,” Harmful Algae, 91, 101731 (2020). https://doi.org/10.1016/j.hal.2019.101731

6. 

H. W. Paerl and J. Huisman, “CLIMATE: blooms like it hot,” Science, 320 (5872), 57–58 (2008). https://doi.org/10.1126/science.1155398

7. 

H. W. Paerl and J. Huisman, “Climate change: a catalyst for global expansion of harmful cyanobacterial blooms,” Environ. Microbiol. Rep., 1 (1), 27–37 (2009). https://doi.org/10.1111/j.1758-2229.2008.00004.x

8. 

C. B. Lopez et al., Scientific Assessment of Freshwater Harmful Algal Blooms, Interagency Working Group on Harmful Algal Blooms, Hypoxia, and Human Health of the Joint Subcommittee on Ocean Science and Technology, Washington, DC (2008).

9. 

USEPA, Nutrient Criteria Technical Guidance Manual: Lakes and Reservoirs EPA 822-B00-001, United States Environmental Protection Agency, Office of Water, Washington, DC (2000).

10. 

USEPA, Recommendations for Cyanobacteria and Cyanotoxin Monitoring in Recreational Waters: EPA 823-R-19-001, Washington, DC (2019).

11. 

Idaho Power Company, Section 401 Water-Quality Certification Application Hells Canyon Complex FERC No. 1971, Idaho Power Company (2018).

12. 

Montana DEQ, Harmful Algal Bloom (HAB) Guidance Document for Montana, Montana Department of Environmental Quality (2021).

13. 

New Hampshire DES, New Hampshire Department of Environmental Services CyanoHAB Response Protocol for Public Water Supplies, New Hampshire Department of Environmental Services (2020).

14. 

New York DEC, Harmful Algal Bloom Action Plan Skaneateles Lake, New York Department of Environmental Conservation (2022).

15. 

Ohio EPA, Public Water System Harmful Algal Bloom Response Strategy, Ohio Environmental Protection Agency (2014).

16. 

Oregon Health Authority, Oregon Harmful Algae Bloom Surveillance (HABS) Program Recreational Use Public Health Advisory Guidelines Cyanobacterial Blooms in Freshwater Bodies, Oregon Health Authority Public Health Division Center for Health Protection (2019).

17. 

T. J. Malthus, R. Ohmsen and H. J. van der Woerd, “An evaluation of citizen science smartphone apps for inland water quality assessment,” Remote Sens., 12 (10), 1578 (2020). https://doi.org/10.3390/rs12101578

18. 

H. Rashidi et al., “Monitoring, managing, and communicating risk of Harmful Algal Blooms (HABs) in recreational resources across Canada,” Environ. Health Insights, 15, 117863022110144 (2021). https://doi.org/10.1177/11786302211014401

19. 

S. Stroming et al., “Quantifying the human health benefits of using satellite information to detect cyanobacterial harmful algal blooms and manage recreational advisories in U.S. Lakes,” GeoHealth, 4 (9), e2020GH000254 (2020). https://doi.org/10.1029/2020GH000254

20. 

M. Gholizadeh, A. Melesse and L. Reddi, “A comprehensive review on water quality parameters estimation using remote sensing techniques,” Sensors, 16 (8), 1298 (2016). https://doi.org/10.3390/s16081298

21. 

R. M. Khan et al., “A meta-analysis on harmful algal bloom (HAB) detection and monitoring: a remote sensing perspective,” Remote Sens., 13 (21), 4347 (2021). https://doi.org/10.3390/rs13214347

22. 

S. N. Topp et al., “Research trends in the use of remote sensing for inland water quality science: moving towards multidisciplinary applications,” Water, 12 (1), 169 (2020). https://doi.org/10.3390/w12010169

23. 

B. A. Schaeffer et al., “Mobile device application for monitoring cyanobacteria harmful algal blooms using Sentinel-3 Satellite Ocean and land colour instruments,” Environ. Modell. Software, 109, 93–103 (2018). https://doi.org/10.1016/j.envsoft.2018.08.015

24. 

M. M. Coffer et al., “Quantifying national and regional cyanobacterial occurrence in US lakes using satellite remote sensing,” Ecol. Indic., 111, 105976 (2020). https://doi.org/10.1016/j.ecolind.2019.105976

25. 

S. Mishra et al., “Measurement of cyanobacterial bloom magnitude using satellite remote sensing,” Sci. Rep., 9 (1), 18310 (2019). https://doi.org/10.1038/s41598-019-54453-y

26. 

N. Pahlevan, S. Ackleson and B. Schaeffer, “Toward a satellite-based monitoring system for water quality,” EOS, 99 (2018). https://doi.org/10.1029/2018EO093913

27. 

P. Whitman et al., “A validation of satellite derived cyanobacteria detections with state reported events and recreation advisories across U.S. lakes,” Harmful Algae, 115, 102191 (2022). https://doi.org/10.1016/j.hal.2022.102191

28. 

R. S. Lunetta et al., “Evaluation of cyanobacteria cell count detection derived from MERIS imagery across the eastern USA,” Remote Sens. Environ., 157, 24–34 (2015). https://doi.org/10.1016/j.rse.2014.06.008

29. 

M. W. Matthews, S. Bernard and L. Robertson, “An algorithm for detecting trophic status (chlorophyll-a), cyanobacterial-dominance, surface scums and floating vegetation in inland and coastal waters,” Remote Sens. Environ., 124, 637–652 (2012). https://doi.org/10.1016/j.rse.2012.05.032

30. 

R. Beck et al., “Comparison of satellite reflectance algorithms for estimating chlorophyll-a in a temperate reservoir using coincident hyperspectral aircraft imagery and dense coincident surface observations,” Remote Sens. Environ., 178, 15–30 (2016). https://doi.org/10.1016/j.rse.2016.03.002

31. 

M. G. Allan et al., “Empirical and semi-analytical chlorophyll a algorithms for multi-temporal monitoring of New Zealand lakes using Landsat,” Environ. Monit. Assess., 187 (6), 364 (2015). https://doi.org/10.1007/s10661-015-4585-4

32. 

F. Watanabe et al., “Estimation of chlorophyll-a concentration and the trophic state of the Barra Bonita hydroelectric reservoir using OLI/Landsat-8 images,” IJERPH, 12 (9), 10391–10417 (2015). https://doi.org/10.3390/ijerph120910391

33. 

R. P. Stumpf et al., “Challenges for mapping cyanotoxin patterns from remote sensing of cyanobacteria,” Harmful Algae, 54, 160–173 (2016). https://doi.org/10.1016/j.hal.2016.01.005

34. 

S. G. H. Simis, S. W. M. Peters and H. J. Gons, “Remote sensing of the cyanobacterial pigment phycocyanin in turbid inland water,” Limnol. Oceanogr., 50 (1), 237–245 (2005). https://doi.org/10.4319/lo.2005.50.1.0237

35. 

T. T. Wynne et al., “Relating spectral shape to cyanobacterial blooms in the Laurentian Great Lakes,” Int. J. Remote Sens., 29 (12), 3665–3672 (2008). https://doi.org/10.1080/01431160802007640

36. 

R. P. Stumpf et al., “Hydrodynamic accumulation of Karenia off the west coast of Florida,” Cont. Shelf Res., 28 (1), 189–213 (2008). https://doi.org/10.1016/j.csr.2007.04.017

37. 

Y. Zhang et al., “A view of physical mechanisms for transporting harmful algal blooms to Massachusetts Bay,” Mar. Pollut. Bull., 154, 111048 (2020). https://doi.org/10.1016/j.marpolbul.2020.111048

38. 

F. Alawadi, “Detection of surface algal blooms using the newly developed algorithm surface algal bloom index (SABI),” Proc. SPIE, 7825, 782506 (2010). https://doi.org/10.1117/12.862096

39. 

R. Beck et al., “Comparison of satellite reflectance algorithms for estimating phycocyanin values and cyanobacterial total biovolume in a temperate reservoir using coincident hyperspectral aircraft imagery and dense coincident surface observations,” Remote Sens., 9 (6), 538 (2017). https://doi.org/10.3390/rs9060538

40. 

H. J. Gons, “A chlorophyll-retrieval algorithm for satellite imagery (medium resolution imaging spectrometer) of inland and coastal waters,” J. Plankton Res., 24 (9), 947–951 (2002). https://doi.org/10.1093/plankt/24.9.947

41. 

C. Le et al., “Evaluation of chlorophyll-a remote sensing algorithms for an optically complex estuary,” Remote Sens. Environ., 129, 75–89 (2013). https://doi.org/10.1016/j.rse.2012.11.001

42. 

S. Mishra and D. R. Mishra, “Normalized difference chlorophyll index: a novel model for remote estimation of chlorophyll-a concentration in turbid productive waters,” Remote Sens. Environ., 117, 394–406 (2012). https://doi.org/10.1016/j.rse.2011.10.016

43. 

W. J. Moses et al., “Operational MERIS-based NIR-red algorithms for estimating chlorophyll-a concentrations in coastal waters – the Azov Sea case study,” Remote Sens. Environ., 121, 118–124 (2012). https://doi.org/10.1016/j.rse.2012.01.024

44. 

J. E. O’Reilly et al., “Ocean color chlorophyll algorithms for SeaWiFS,” J. Geophys. Res., 103 (C11), 24937–24953 (1998). https://doi.org/10.1029/98JC02160

45. 

E. J. Tebbs, J. J. Remedios and D. M. Harper, “Remote sensing of chlorophyll-a as a measure of cyanobacterial biomass in Lake Bogoria, a hypertrophic, saline–alkaline, flamingo lake, using Landsat ETM+,” Remote Sens. Environ., 135, 92–106 (2013). https://doi.org/10.1016/j.rse.2013.03.024

46. 

K. Toming et al., “First experiences in mapping lake water quality parameters with Sentinel-2 MSI imagery,” Remote Sens., 8 (8), 640 (2016). https://doi.org/10.3390/rs8080640

47. 

M. R. V. Ross et al., “AquaSat: a data set to enable remote sensing of water quality for inland waters,” Water Resour. Res., 55 (11), 10012–10025 (2019). https://doi.org/10.1029/2019WR024883

48. 

European Space Agency (ESA), “Copernicus Open Access Hub,” (2022). https://scihub.copernicus.eu/

49. 

T. V. King and K. C. Hafen, “Chlorophyll-a concentrations and algal bloom condition paired with Sentinel-2 aquatic reflectance values collected for Brownlee Reservoir, ID from 2015 through 2020,” U.S. Geological Survey (2022).

50. 

Idaho Power Company, Recreational Use Associated with the Snake River in the Hells Canyon National Recreation Area, Idaho Power Company (2002).

51. 

IDEQ and ODEQ, Snake River–Hells Canyon Total Maximum Daily Load (TMDL), 710 Idaho Department of Environmental Quality, Boise, Idaho (2004).

52. 

APHA, Standard Methods for the Examination of Water and Wastewater, American Public Health Association, Washington DC (1999).

53. 

V. Romano Spica et al., Guidelines for Safe Recreational Water Environments. Volume 1, Coastal and Fresh Water, Antonio Delfino Editore (2010).

54. 

R. Khatami, G. Mountrakis and S. V. Stehman, “A meta-analysis of remote sensing research on supervised pixel-based land-cover image classification processes: general guidelines for practitioners and future research,” Remote Sens. Environ., 177, 89–100 (2016). https://doi.org/10.1016/j.rse.2016.02.028

55. 

J. A. Richards, “Supervised classification techniques,” Remote Sensing Digital Image Analysis, 247–318, Springer Berlin Heidelberg, Berlin, Heidelberg (2013).

56. 

Q. Vanhellemont, “Adaptation of the dark spectrum fitting atmospheric correction for aquatic applications of the Landsat and Sentinel-2 archives,” Remote Sens. Environ., 225, 175–192 (2019). https://doi.org/10.1016/j.rse.2019.03.010

57. 

Q. Vanhellemont and K. Ruddick, “Atmospheric correction of metre-scale optical satellite data for inland and coastal water applications,” Remote Sens. Environ., 216, 586–597 (2018). https://doi.org/10.1016/j.rse.2018.07.015

58. 

J. Gower et al., “Detection of intense plankton blooms using the 709 nm band of the MERIS imaging spectrometer,” Int. J. Remote Sens., 26 (9), 2005–2012 (2005). https://doi.org/10.1080/01431160500075857

59. 

D. Zhao et al., “The relation of chlorophyll-a concentration with the reflectance peak near 700 nm in algae-dominated waters and sensitivity of fluorescence algorithms for detecting algal bloom,” Int. J. Remote Sens., 31 (1), 39–48 (2010). https://doi.org/10.1080/01431160902882512

60. 

European Space Agency (ESA), Sentinel-2 User Handbook, European Space Agency (ESA) (2015).

61. 

R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria (2019).

62. 

RStudio Team, RStudio: Integrated Development Environment for R, RStudio, PBC., Boston, Massachusetts (2020).

63. 

N. Hamzehpour, H. Shafizadeh-Moghadam and R. Valavi, “Exploring the driving forces and digital mapping of soil organic carbon using remote sensing and soil texture,” CATENA, 182, 104141 (2019). https://doi.org/10.1016/j.catena.2019.104141

64. 

M.-C. Tu, P. Smith and A. M. Filippi, “Hybrid forward-selection method-based water-quality estimation via combining Landsat TM, ETM+, and OLI/TIRS images and ancillary environmental data,” PLoS One, 13 (7), e0201255 (2018). https://doi.org/10.1371/journal.pone.0201255

65. 

H. Akaike, “Information theory and an extension of the maximum likelihood principle,” Selected Papers of Hirotugu Akaike, 199–213, Springer, New York (1998).

66. 

C. J. V. Rijsbergen, Information Retrieval, 2nd ed., Butterworth-Heinemann (1979).

67. 

N. Kerle, L. L. Janssen and G. C. Huurneman, “Principles of Remote Sensing,” 250 The Netherlands (2004).

68. 

J. Bramich, C. J. S. Bolch and A. Fischer, “Improved red-edge chlorophyll-a detection for Sentinel 2,” Ecol. Indic., 120, 106876 (2021). https://doi.org/10.1016/j.ecolind.2020.106876

69. 

K. T. Peterson et al., “Machine learning-based ensemble prediction of water-quality variables using feature-level and decision-level fusion with proximal remote sensing,” Photogramm. Eng. Remote Sens., 85 (4), 269–280 (2019). https://doi.org/10.14358/PERS.85.4.269

70. 

M. Xu et al., “Implementation strategy and spatiotemporal extensibility of multipredictor ensemble model for water quality parameter retrieval with multispectral remote sensing data,” IEEE Trans. Geosci. Remote Sens., 60, 1–16 (2022). https://doi.org/10.1109/TGRS.2020.3045921

Biographies of the authors are not available.

CC BY: © The Authors. Published by SPIE under a Creative Commons Attribution 4.0 International License. Distribution or reproduction of this work in whole or in part requires full attribution of the original publication, including its DOI.
Tyler King, Stephen Hundt, Konrad Hafen, Victoria Stengel, and Scott Ducar "Mapping the probability of freshwater algal blooms with various spectral indices and sources of training data," Journal of Applied Remote Sensing 16(4), 044522 (19 December 2022). https://doi.org/10.1117/1.JRS.16.044522
Received: 3 October 2022; Accepted: 28 November 2022; Published: 19 December 2022
KEYWORDS
Calibration

Data modeling

Education and training

Visualization

Satellites

Satellite imaging

Visual process modeling
