Patrick Gray (patrick.c.gray at duke) - https://github.com/patrickcgray

Chapter 7: Download Imagery from Google Earth Engine for Time Series Analysis

Here we will be downloading imagery from Google Earth Engine to look at sea surface temperature variations over time.

Google Earth Engine combines a multi-petabyte catalog of satellite imagery and geospatial datasets with planetary-scale analysis capabilities and makes it available for scientists, researchers, and developers to detect changes, map trends, and quantify differences on the Earth's surface.

This platform lets you take advantage of datasets already in the cloud, process them in parallel, and either run the analysis in the cloud or pull the data down to analyze locally. Here our goal is to download a stack of MODIS (Moderate Resolution Imaging Spectroradiometer) data where each band is a daily value of sea surface temperature (SST) measured by the satellite.
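Since each band holds one day of data, a band number can be mapped back to a calendar date once the start date of the stack is known. Here is a minimal sketch, assuming one band per consecutive day with no skipped dates. The helper name band_to_date and the start date are hypothetical, so substitute the date range used in your own export:

```python
from datetime import date, timedelta

def band_to_date(band_index, start_date):
    """Return the date of a 1-based band index, assuming one band per consecutive day."""
    if band_index < 1:
        raise ValueError("band indexes start at 1")
    return start_date + timedelta(days=band_index - 1)

# Hypothetical start date for illustration only, not taken from the actual export.
stack_start = date(2016, 1, 1)
first_day = band_to_date(1, stack_start)
last_day_of_year = band_to_date(366, stack_start)  # 2016 is a leap year
```

If the export skipped days with no imagery, this mapping would not hold, and the dates would need to come from the band names or from the image collection itself.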

You will need an account, which is freely available from Google. You can then access the code used to generate these temporal stacks of SST here: https://code.earthengine.google.com/bbf2eac7fd8f59e85ba54e5914dedc3c. This data is already available in the data/ directory, but we highly recommend you run through the GEE code and download from there in order to understand the full workflow.

After visualizing and cleaning this data we'll run a quick harmonic time series analysis using the Python package seasonal (https://github.com/welch/seasonal).

Inspecting MODIS SST Data

Assuming you have these files located in data/, with the three images called pamlicoStackedSST.tif, gulfstreamStackedSST.tif, and hatterasGulfStackedSST.tif, let's get started!

First let's pull in the SST image dataset for the Pamlico Sound. As usual let's check out the metadata for this raster:

Now let's visualize it, knowing that it won't look totally normal since this is a 500m resolution image of just sea surface temperature.

Now let's pull in the SST image dataset for the section of the Gulf Stream just off Cape Lookout, North Carolina.

Again we'll visualize.

Let's check where these rasters are located just to ensure we have the context of the environment we're inspecting:

This looks good! Do note, though, that these are the minimum bounding boxes of each raster and not the actual footprints. The Pamlico Sound raster, for example, does not include land.
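One way to draw a raster's extent on a map is to turn its bounds into a closed ring of corner coordinates. The sketch below assumes bounds in (left, bottom, right, top) order, which is the order rasterio uses for a dataset's bounds. The helper names and the two boxes are hypothetical and are not the real extents of the rasters in this chapter:

```python
def bounds_to_ring(left, bottom, right, top):
    """Return the corners of a bounding box as a closed ring of (x, y) pairs."""
    return [(left, bottom), (left, top), (right, top), (right, bottom), (left, bottom)]

def bounds_overlap(a, b):
    """True if two (left, bottom, right, top) boxes share any interior area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

# Hypothetical longitude/latitude boxes, for illustration only.
box_a = (-77.0, 34.5, -75.5, 36.0)
box_b = (-76.0, 33.5, -74.5, 35.0)
ring_a = bounds_to_ring(*box_a)
boxes_touch = bounds_overlap(box_a, box_b)
```

A ring like this describes only the rectangle around the data. Pixels inside it can still be empty, which is why the bounding box can cover land even when the raster holds no land values.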

Now let's plot the data across time.
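To plot across time, the image stack first has to be collapsed to one value per timestep. Here is a minimal sketch, assuming the stack is a NumPy array shaped (bands, rows, cols), as returned by rasterio's read(), and that missing pixels have already been set to NaN. The function name and the tiny synthetic stack are illustrative only:

```python
import warnings
import numpy as np

def stack_to_series(stack):
    """Collapse a (bands, rows, cols) stack to one mean value per band, ignoring NaN pixels."""
    flat = stack.reshape(stack.shape[0], -1)
    with warnings.catch_warnings():
        # A band with no valid pixels yields NaN; silence the "mean of empty slice" warning.
        warnings.simplefilter("ignore", category=RuntimeWarning)
        return np.nanmean(flat, axis=1)

# Synthetic stand-in for the real stack: 4 daily bands of 2x2 pixels.
demo_stack = np.array([
    [[20.0, 22.0], [np.nan, 21.0]],
    [[np.nan, np.nan], [np.nan, np.nan]],  # a fully clouded day
    [[18.0, 18.0], [18.0, 18.0]],
    [[np.nan, 25.0], [np.nan, np.nan]],
])
demo_series = stack_to_series(demo_stack)
```

The resulting vector has one entry per band and can be passed straight to a line plot. Days with no valid pixels stay as NaN and show up as breaks in the line.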

But as we can see above and here, we have lots of missing data:

Quite a lot of missing data. We can assume most of this is from cloud coverage:
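It can help to put a number on "quite a lot" before deciding how to fill the holes. The sketch below assumes the per-timestep values are in a 1-D array with NaN marking missing days. Both helper names are hypothetical:

```python
import numpy as np

def missing_fraction(series):
    """Fraction of timesteps that have no valid value (NaN)."""
    series = np.asarray(series, dtype=float)
    return float(np.isnan(series).mean())

def longest_gap(series):
    """Length of the longest run of consecutive missing timesteps."""
    longest = current = 0
    for is_missing in np.isnan(np.asarray(series, dtype=float)):
        current = current + 1 if is_missing else 0
        longest = max(longest, current)
    return longest

# Synthetic series for illustration only.
demo = np.array([21.0, np.nan, 18.0, np.nan, np.nan, 19.5, 20.0, np.nan])
frac = missing_fraction(demo)
gap = longest_gap(demo)
```

The longest gap matters as much as the overall fraction, since interpolating across a long run of missing days is a much stronger assumption than bridging a single cloudy day.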

We can fill this in by simply interpolating across the vector to fill in holes with nearby data:
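A minimal sketch of this kind of gap filling uses linear interpolation with NumPy, assuming a 1-D series with NaN for missing timesteps. The function name is hypothetical, and this is one simple way to do it rather than the only one:

```python
import numpy as np

def fill_gaps(series):
    """Linearly interpolate NaN values from the nearest valid neighbors.

    Leading and trailing gaps are filled with the first and last valid value.
    """
    series = np.asarray(series, dtype=float)
    idx = np.arange(series.size)
    valid = ~np.isnan(series)
    if not valid.any():
        raise ValueError("series has no valid values to interpolate from")
    return np.interp(idx, idx[valid], series[valid])

# Synthetic series for illustration only.
gappy = np.array([np.nan, 10.0, np.nan, np.nan, 16.0, np.nan])
filled = fill_gaps(gappy)
```

Note that np.interp does not extrapolate, so gaps at the very start or end of the series are simply held at the nearest valid value.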

Time Series Analysis

Bringing in seasonal now, let's look at a simple toy example of data with some seasonality to it, and extract the trend and the residual error.
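To make the idea of splitting a series into trend, seasonal cycle, and residual concrete, here is a self-contained sketch that uses an ordinary least-squares harmonic fit in NumPy. This is a stand-in for illustration and is not the algorithm or the API of the seasonal package. It also assumes the period is known in advance and that the trend is a straight line:

```python
import numpy as np

def fit_harmonic(values, period):
    """Least-squares fit of a linear trend plus one sine/cosine pair of a known period."""
    values = np.asarray(values, dtype=float)
    t = np.arange(values.size)
    angle = 2.0 * np.pi * t / period
    # Columns: intercept, slope, cosine term, sine term.
    design = np.column_stack([np.ones_like(angle), t, np.cos(angle), np.sin(angle)])
    coeffs, *_ = np.linalg.lstsq(design, values, rcond=None)
    trend = design[:, :2] @ coeffs[:2]
    seasonal = design[:, 2:] @ coeffs[2:]
    residual = values - trend - seasonal
    return trend, seasonal, residual, coeffs

# Toy data: three years of daily values with a slow rise, an annual cycle, and noise.
rng = np.random.default_rng(0)
days = np.arange(3 * 365)
toy = 20.0 + 0.01 * days + 5.0 * np.sin(2 * np.pi * days / 365) + rng.normal(0, 0.5, days.size)

trend, seasonal, residual, coeffs = fit_harmonic(toy, period=365)
amplitude = float(np.hypot(coeffs[2], coeffs[3]))  # size of the fitted annual cycle
```

The three returned components add back up to the input, so plotting them one above the other shows how much of the signal each part explains.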

For more details on this specific harmonic analysis technique check out https://github.com/welch/seasonal

Let's give it a try on our data now, starting with the SST image covering three years of Pamlico Sound:

Now let's pull in a much longer time series of a slightly smaller area in the Gulf Stream, again letting Google Earth Engine do the heavy lifting in filtering through the data and feeding us only our small spatial area of interest at each timestep. This may take a moment since it is such a dense time series.

While small spatially, this image has over 6000 bands! Let's take a look at a random band:

As before, let's convert this to a vector by taking the mean of each timestep, then interpolate across the missing timesteps.

Now let's check out the trend in this data:
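As a quick cross-check on any fitted trend, a rolling mean over one full year averages out the annual cycle and leaves the slower change behind. Here is a minimal sketch, assuming a gap-filled daily series and a 365-day window. It is independent of the seasonal package, and the function name and synthetic data are illustrative only:

```python
import numpy as np

def rolling_mean(values, window):
    """Moving average that only returns positions where the full window fits."""
    values = np.asarray(values, dtype=float)
    if window > values.size:
        raise ValueError("window is longer than the series")
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="valid")

# Synthetic six-year daily series: slow rise plus an annual cycle, no noise.
days = np.arange(6 * 365)
demo = 20.0 + 0.01 * days + 5.0 * np.sin(2 * np.pi * days / 365)
smooth = rolling_mean(demo, window=365)
```

The output is shorter than the input by window - 1 values, and each value belongs to the center of its window. Any NaN inside a window makes that output NaN, which is why the gaps need to be filled first.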

Pretty cool!

Final Wrap-up and Next Steps

Congrats on making it this far! Here we downloaded MODIS SST imagery from Google Earth Engine, cleaned this data, and analyzed it for seasonality and trends.

In the final chapter of this series we'll explore xarray, a package for large N-dimensional array manipulation, and hvplot, a package with a powerful set of visualization tools: (link to webpage or Notebook).