Preparing Aerial Imagery for Crop Classification

    Preparing Aerial Imagery for Crop Classification

    The first time I had an idea of creating this article, was when I started analyzing satellite and aerial images with the help of Computer Vision. At that time I did not find any complete guide on the topic or even data description and processing methods for further analysis. To this day there is a long and tedious search awaiting Data Scientists who want to dive deeper into this sort of analysis. They need to find out what are GeoTIFF and GeoJSON, how to generate a binary segmentation mask or convert raster and vector to the same WGS. In this article, I will try to shed light on these issues and many more.

    All the code is available at Google Colab and GitHub. Feel free to try it out yourself!


    All of my explanations are centered around the current Zindi: Farm Pin Crop Detection Challenge. Let’s get started.

    First of all, we need to download the data. There are two archives that we will use: containing GeoJSON and 201–01– containing GeoTIFF files. All of it is open-source data from the Sentinel-2 satellite. Inside you will find .jp2 files.


    Let’s find out what GeoTIFF is. Wikipedia says:

    GeoTIFF is a public domain metadata standard that allows georeferencinginformation to be embedded within a TIFF file. The potential additional information includes map projectioncoordinate systemsellipsoidsdatums, and everything else necessary to establish the exact spatial reference for the file. The GeoTIFF format is fully compliant with TIFF 6.0, so software incapable of reading and interpreting the specialized metadata will still be able to open a GeoTIFF format file.

    You can explore geocoding and other technical details here. But, to put it simply, GeoTIFF is a Raster image with geographical reference. This means each pixel in the image is tied to a real point on the world map.

    Working with GeoTIFF


    Rasters usually have large size, to open them, you need to use specialized software like ArcGis, or its opensource brother QGIS. You can find QGIS installation guide following the link:


    There are two main libraries to work with georeferenced images in the Python ecosystem. They are GDAL and Rasterio. GDAL is an older C++ library, which is faster and has more capabilities. But Rasterio is much easier to use for Python developers, that’s why we will use it in Rasterio.

    Reading raster files

    Let’s open the .jp2 file:

    After executing this block you must see raster image shape and meta:

    Metadata contains the number of bands in the raster, coordinate system, dtype, driver nodata and transform — all of these fields are straightforward except for transform. “Transform” is an Affine object which creates image geographical referencing. Again, simply put, it is an affine transformation of image coordinates to a world geographical system.

    Reshaping rasters

    Rasterio reads raster in a differing from usual (OpenCV, Skimage) channel order (num_channels, width, height) instead of (width, height, num_channels). But luckily Rasterio has a function that transforms it into a familiar and nice shape. Let’s use it and check the results:

    Opening the labels

    And now, when we got a basic understanding of what a Raster is, let’s check how the labeling looks like. Usually the labelling is stored in .geojson or .shp formats. GeoPandas will help us reading it.

    The main difference in Pandas vs GeoPandas is GeoDataframe has to have a ‘geometry’ column. This column contains a Shapely geometry object like Point, LineString, Polygon, Multipolygon. Shapely is a library to work with vector data — applying operations like cut, intersect, unite, etc, and doing it very efficiently, because of the vector format of data.

    Data masking using Rasterio

    As you can see the Polygon (2467881.175041331 …) coordinates are in the World Geographical System. Keep in mind that we need to solve a classification task. Let’s prepare the data, cutting each field from a map. For this purpose, Rasterio has a function called the mask.

    Rasterio failed to mask the data. The problem is that .shp CRS doesn’t match the CRS of the raster. We need to convert them manually. First, let’s determine the current projection of Polygons. Take some point from polygon as an example:

    1. Go to
    2. We know the data came from South Africa let’s zoom into it.
    3. Use it with any coordinates (for example, X: 2467881.175041331 Y: -3352032.059296422) from the GeoDataframe and check the output. We are looking for a place in South Africa, that has a river — we can see it in our image. Looking through the results we will see one that fits:

    EPSG:3395 Name: WGS 84 / World Mercator.

    Now let’s check the masking again:

    Now you see only 150 masks weren’t masked well. Those fields came from neighboring tile, to make them also work you need to download the file: 2017–01– Merge it with the current raster and repeat procedure.

    Now you have a folder with a dataset similar to MNIST, which consists of raw images for train and labels in traindf. Looks like we are done. It is time to take your favorite framework and train some neural networks.

    Data preparation

    If you want to make semantic segmentation not to just classify the fields, but also find it’s boundaries, you can do a field/not field binary segmentation or a more complex multiclass segmentation for each class binary masks. Let’s prepare data for semantic segmentation! One solution for this is to convert vector polygons to raster and then cut it to tiles. We will use function rasterio.rasterize.

    Let’s save this binary mask. Here you must remember that the new raster has 1 channel instead of 3 and update .meta respectively.

    I will leave preparing the data for multiclass segmentation to the Reader as a small home task =)

    And now you can cut TCI raster and mask to tiles and use them to train your segmentation model.


    In this article, I tried to provide the Reader with some basics on preparing aerial/satellite images to some Computer Vision processing. Gave some knowledge on how to work with most-used data types as .shp, .geojson, .tiff, .jp2 using satellite data competition as an example. Also, I described some major pitfalls, which the newcomers can meet. I hope it will help you start off faster with the new data type.

    • #Computer Science
    • #Computer Vision
    • #Custom software development
    • #GIS
    • #Quantum