Geographic ground-based image classification

Kevin Sparks, Jonathan Nelson

Brief introduction

In this analysis, we developed Support Vector Machine (SVM) models that predict spatially linked, content-independent information (land cover type) for on-the-ground Flickr images of landscapes, using content-dependent information (visual attributes) extracted from the images themselves as input.

Overview

We adapt the standard two-part approach to image classification. First, we extract visual signatures from on-the-ground Flickr images to form our feature space. Second, we construct SVM models from training data that include predictor variables (visual signatures) and well-defined labels (land cover types). To generate the land cover label for each image, we spatially joined each geolocated Flickr image to the National Land Cover Dataset (NLCD) class at its location.

Visual explanation of the process of spatially joining image data to NLCD land cover classes for the geolocated Flickr images.
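As a rough illustration of this kind of spatial join, the sketch below looks up the raster cell that contains each photo location and attaches that cell's class code to the photo. It is not our actual pipeline: the tiny grid, the origin and cell-size constants, and the names sample_land_cover, GRID, X_MIN, Y_MAX and CELL are all hypothetical, and the photo coordinates are assumed to already be in the same projected coordinate system as the raster (geolocated longitude/latitude values would need to be reprojected first).

```python
import math

# Toy raster: rows run north to south, columns run west to east.
# Values are NLCD-style class codes (11 = Open Water, 21 = Developed,
# Open Space, 41 = Deciduous Forest). The layout itself is made up.
GRID = [
    [41, 41, 11],
    [41, 21, 11],
    [21, 21, 11],
]
X_MIN = 0.0   # x coordinate of the raster's west edge (hypothetical)
Y_MAX = 90.0  # y coordinate of the raster's north edge (hypothetical)
CELL = 30.0   # cell size in metres (NLCD is distributed at 30 m)


def sample_land_cover(x, y, grid=GRID, x_min=X_MIN, y_max=Y_MAX, cell=CELL):
    """Return the class code of the cell containing (x, y), or None if outside."""
    col = math.floor((x - x_min) / cell)
    row = math.floor((y_max - y) / cell)
    if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
        return grid[row][col]
    return None


# Hypothetical photo locations, already in raster coordinates.
photos = [
    {"id": "a", "x": 10.0, "y": 80.0},
    {"id": "b", "x": 75.0, "y": 45.0},
    {"id": "c", "x": 200.0, "y": 10.0},  # falls outside the toy raster
]

# Attach a land cover code to every photo record.
labeled = [dict(p, land_cover=sample_land_cover(p["x"], p["y"])) for p in photos]
```

Photos that fall outside the raster receive None here, so they can be dropped before training rather than being given an arbitrary label.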

We created an image montage of on-the-ground images grouped by land cover type and sorted from dark to bright, left to right. This approach is useful because it shows both the global distribution of photos across land cover types and the local variability of photos within each land cover type. At full resolution, the montage can be zoomed to the level of individual photographs, which makes it easy to scan quickly for representative images and potential outliers.

See a higher-resolution portion of the image montage here.

An image montage of the geolocated Flickr images, distributed across the 16 land cover classes.
Examples of prototypical Forest (left) and Water (right) images.
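The ordering behind such a montage can be sketched in a few lines. The example below only computes the layout (one row per land cover type, photos sorted dark to bright within each row); tiling the actual thumbnails into one image is left out. The record fields id, land_cover and brightness and the function name montage_rows are hypothetical, and the brightness values are made up.

```python
from collections import defaultdict


def montage_rows(photos):
    """Group photo ids by land cover type, sorted dark to bright within each group."""
    groups = defaultdict(list)
    for photo in photos:
        groups[photo["land_cover"]].append(photo)
    return {
        label: [p["id"] for p in sorted(group, key=lambda p: p["brightness"])]
        for label, group in sorted(groups.items())
    }


# Hypothetical records: one brightness value per photo.
photos = [
    {"id": "p1", "land_cover": "Forest", "brightness": 80},
    {"id": "p2", "land_cover": "Water", "brightness": 150},
    {"id": "p3", "land_cover": "Forest", "brightness": 40},
    {"id": "p4", "land_cover": "Water", "brightness": 95},
    {"id": "p5", "land_cover": "Forest", "brightness": 120},
]

rows = montage_rows(photos)
```

Each value in rows is one montage row read left to right, so the darkest photo of a land cover type comes first.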

Once each image had been assigned its visual signature features and a land cover class label, we performed a series of SVM classifications. To reiterate, the visual signatures are the predictors and the land cover class is the target.

An example of the classification process workflow. Each individual gray block represents a unique SVM model, classifying between the two land cover classes in the blue blocks.
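To make the idea of one model per pair of land cover classes concrete, here is a minimal sketch of a single pairwise classifier. It is an illustration under stated assumptions, not the models from our experiments: it uses a linear SVM trained by plain stochastic sub-gradient descent on the regularised hinge loss (no kernel and no SVM library), the two-number feature vectors stand in for real visual signatures, and the names pair_dataset, train_linear_svm and predict are hypothetical.

```python
import random


def pair_dataset(features, labels, negative, positive):
    """Keep only the two requested classes and encode them as -1 / +1."""
    X, y = [], []
    for x, label in zip(features, labels):
        if label == negative:
            X.append(x)
            y.append(-1)
        elif label == positive:
            X.append(x)
            y.append(1)
    return X, y


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=100, seed=0):
    """Minimise lam/2 * |w|^2 + hinge loss by stochastic sub-gradient descent."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            # The regularisation term shrinks the weights on every step.
            w = [wj * (1.0 - lr * lam) for wj in w]
            # The hinge loss contributes only when the sample is inside the margin.
            if y[i] * score < 1.0:
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b


def predict(model, x, negative, positive):
    """Return the class label on the side of the hyperplane where x falls."""
    w, b = model
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return positive if score >= 0 else negative


# Made-up, already-centred feature vectors standing in for visual signatures.
features = [
    [-1.0, 0.2], [-0.8, -0.1], [-1.2, 0.0],  # Forest
    [1.0, 0.1], [0.9, -0.2], [1.1, 0.0],     # Water
    [0.1, 1.5],                              # Developed (ignored by this pair)
]
labels = ["Forest", "Forest", "Forest", "Water", "Water", "Water", "Developed"]

X, y = pair_dataset(features, labels, "Forest", "Water")
model = train_linear_svm(X, y)

truth = [label for label in labels if label in ("Forest", "Water")]
predictions = [predict(model, x, "Forest", "Water") for x in X]
accuracy = sum(p == t for p, t in zip(predictions, truth)) / len(truth)
```

Repeating pair_dataset and train_linear_svm for other pairs of classes would give one independent model per pair. Training accuracy on toy data like this says nothing about real performance, which should be measured on held-out images.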

SVM accuracy was reduced by messy data such as the examples below. Many images contain no visible land cover information and therefore act as noise.

Examples of ideal, clean data (tinted green, left) and messy, undesirable data (tinted red, right).

Summary

We pruned undesirable, messy data using median brightness values, but images such as those depicted above still remained. We expect that with cleaner data, the accuracy of all SVM models in this series of experiments would increase substantially. Most importantly, even with messy data, the visual signature features proved useful, especially in distinguishing Forest images from Water images. We believe it is important to first demonstrate the usefulness of visual signatures empirically. As we continue this work, we will incorporate a pixel-by-pixel feature space and explore the usefulness of the tags associated with the Flickr photos.
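The brightness-based pruning step can be sketched as follows. This is a minimal example under assumptions: each image is represented as a flat list of (R, G, B) tuples, brightness is approximated with the Rec. 601 luma weights, and the cut-off values LOW and HIGH are hypothetical placeholders, not the thresholds used in our experiments.

```python
from statistics import median

LOW = 30.0    # hypothetical lower cut-off: drop very dark images
HIGH = 225.0  # hypothetical upper cut-off: drop washed-out images


def median_brightness(pixels):
    """Median luma of an image given as an iterable of (R, G, B) tuples."""
    return median(0.299 * r + 0.587 * g + 0.114 * b for r, g, b in pixels)


def prune_by_brightness(images, low=LOW, high=HIGH):
    """Keep the ids of images whose median brightness lies within [low, high]."""
    return [
        image_id
        for image_id, pixels in images.items()
        if low <= median_brightness(pixels) <= high
    ]


# Tiny made-up "images" with only a few pixels each.
images = {
    "night_shot": [(5, 5, 5), (8, 6, 4), (3, 3, 3)],
    "forest": [(34, 139, 34), (60, 120, 60), (200, 220, 255)],
    "overexposed": [(250, 250, 250), (255, 255, 255), (240, 245, 250)],
}

kept = prune_by_brightness(images)
```

A filter like this only removes extreme exposures. Images with ordinary brightness but no visible land cover pass through unchanged, which is consistent with the messy examples that remained after pruning.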