Patrick Gray (patrick.c.gray at duke) - https://github.com/patrickcgray

Chapter 2: Your first remote sensing vegetation index

Introduction

Now that we can read our data into the computer, let's calculate some vegetation indices.

The Normalized Difference Vegetation Index (NDVI) is so ubiquitous that it even has a Wikipedia entry. If you're here for learning how to do remote sensing image processing this is a classic that you need to know. If you need a refresher, please visit the Wikipedia page for NDVI.

This chapter will be very straightfoward. We've already seen how we can read our imagery into a NumPy array -- this chapter will simply extend these tools by showing how to do simple calculations on NumPy objects.

First we have an two additional package we need to install that isn't included in our Docker image. You may already have folium installed in which case you can ignore this block but if you've based your environment on the Dockerfile you'll need to run this:

Now let's bring up our previous code for opening our image and reading in the data:

We're going to be using a Sentinal-2 image because they're publicly available and amazing quality. They come with each band as a separate file so we'll load up bands 2,3,4 and 8. Blue, green, red, and NIR respectively.

We'll clip this image slightly just in case of memory constraints on potential user's computers and for faster visualization.

Let's take a look at the image:

Here we're going to use the rasterio show function which is a wrapper on matplotlib and allows us to plot a raster easily. We also are going to subset the clipped_img to just look at the first three bands so it will be an RGB image.

Looks good!

We can see this is a coastal area with some barrier islands, the town of Beaufort in the upper center and the Duke Marine Lab in the top left.

Note: This image is already 8-bit aka 0-255 pixel values. If it were not we would have to stretch the image appropriately or it wouldn't show up right during plotting.

Now let's plot a histogram of pixel values to check what is going on with the different bands:

Even from simple visualizations we can see the contrast between the red and the near-infared (NIR) bands. Also note that the peak at 255 is simply from areas with no data which are labeled 255 on all bands. For example:


NDVI

To calculate NDVI, we can simply use standard arithmetic operators in Python because these operations in NumPy are vectorized. Just like MATLAB, R, and other higher level languages, NEVER loop over a NumPy array unless you can't avoid it.

Let's check out some stats on our NDVI

Since this is simply a 2D array, with an NDVI value for each pixel, we can plot it like an image.

Note that the vegetation has high NDVI values and the sand and water have low NDVI.

We're plotting here with matplotlib instead of the rasterio specific plotting functions because this is just a normal matrix. We'll show a lot more plotting in the next chapter.

A lot of red for the water and green for the vegetation, so this looks correct.

Speaking of looking correct, the next chapter (link to webpage or Notebook) will demonstrate how you can visualize your results using actual plots!