Geographic ground-based image classification
Kevin Sparks, Jonathan Nelson
In our geographic ground-based image classification analysis, we developed Support Vector Machine (SVM) models that predict spatially linked, content-independent information (land cover type) for on-the-ground Flickr images of landscapes, using content-dependent information (visual attributes) extracted from the images as input.
We adapted the standard two-part approach to image classification. First, we extracted visual signatures from on-the-ground Flickr images to form our feature space. Second, we constructed SVM models from training data that include predictor variables (visual signatures) and well-defined labels (land cover types). To generate the land cover label for each image, we spatially joined each Flickr image to its corresponding National Land Cover Dataset (NLCD) classification.
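The spatial join boils down to mapping each photo's coordinates onto the NLCD cell that contains it. A minimal sketch of that lookup, using a small hypothetical grid of NLCD class codes in place of the real raster (the grid values, bounding box, and function name are illustrative assumptions, not our actual data):

```python
import numpy as np

# Hypothetical 4x4 NLCD-style grid covering a small bounding box.
# Cell values are NLCD class codes (e.g. 41 = Deciduous Forest, 11 = Open Water).
nlcd_grid = np.array([
    [41, 41, 11, 11],
    [41, 42, 11, 11],
    [52, 52, 21, 21],
    [52, 71, 21, 21],
])
# Bounding box of the grid: (min_lon, min_lat, max_lon, max_lat) -- made up here.
bounds = (-78.0, 40.0, -77.0, 41.0)

def land_cover_label(lon, lat, grid=nlcd_grid, bounds=bounds):
    """Return the NLCD class code of the grid cell containing (lon, lat)."""
    min_lon, min_lat, max_lon, max_lat = bounds
    n_rows, n_cols = grid.shape
    # Convert coordinates to row/column indices; row 0 is the northern edge.
    col = int((lon - min_lon) / (max_lon - min_lon) * n_cols)
    row = int((max_lat - lat) / (max_lat - min_lat) * n_rows)
    # Clamp points on the boundary into the grid.
    col = min(max(col, 0), n_cols - 1)
    row = min(max(row, 0), n_rows - 1)
    return int(grid[row, col])
```

In practice the same lookup is done against the full NLCD raster (e.g. by sampling it at each photo's coordinates with a GIS library), but the index arithmetic is the core of the join.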
We created an image montage of on-the-ground images grouped by land cover type and sorted dark to bright, left to right. This approach is useful because it lets us visualize both the global distribution of photos across land cover types and the local variability of photos within certain land cover types. At full resolution, the montage can be zoomed to the individual-photograph level, making it easy to quickly scan for representative images, potential outliers, and so on.
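The layout of such a montage amounts to grouping photos by label and ordering each row by median pixel brightness. A sketch of that ordering step, using small random arrays as stand-ins for photo thumbnails (the photo names and labels are fabricated for illustration):

```python
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(42)
# Hypothetical photos: name -> (land cover label, grayscale thumbnail array).
photos = {
    f"img{i}": (rng.choice(["Forest", "Water"]), rng.integers(0, 256, (16, 16)))
    for i in range(6)
}

def montage_rows(photos):
    """Group photos by land cover, then sort each group dark-to-bright
    by median pixel brightness (the montage's left-to-right order)."""
    rows = defaultdict(list)
    for name, (label, img) in photos.items():
        rows[label].append((np.median(img), name))
    return {label: [name for _, name in sorted(items)]
            for label, items in rows.items()}
```

Rendering the actual montage is then just pasting the thumbnails in this order, one row per land cover class.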
See a higher-resolution portion of the image montage here.
Once each image was assigned its visual signature features and a land cover class label, a series of SVM classifications was performed. To reiterate, the visual signatures are used to predict the land cover class.
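The classification step itself follows the standard supervised pattern: fit an SVM on labeled visual signatures, then score it on held-out images. A minimal sketch with scikit-learn, using synthetic four-dimensional "visual signatures" in place of our real features (the feature values and class separation here are invented for illustration):

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Hypothetical visual signatures (e.g. mean R, G, B and brightness per image).
# Forest-like images skew green/dark; water-like images skew blue/bright.
n = 200
forest = rng.normal([0.2, 0.5, 0.2, 0.3], 0.05, size=(n, 4))
water = rng.normal([0.3, 0.4, 0.7, 0.6], 0.05, size=(n, 4))
X = np.vstack([forest, water])
y = np.array(["Forest"] * n + ["Water"] * n)

# Hold out a quarter of the images for evaluation.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = SVC(kernel="rbf").fit(X_tr, y_tr)
accuracy = clf.score(X_te, y_te)
```

With real Flickr signatures the classes overlap far more than in this toy setup, which is exactly why data quality (discussed next) drives the accuracy we observe.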
SVM accuracy was influenced by messy data, examples of which are shown below. Many images contain no environmental land cover information at all, and thus contribute noise.
We pruned undesirable, messy data by median brightness values, but images like those depicted above still remained. We expect that with cleaner data, the accuracy of all SVM models in this series of experiments would increase significantly. Most importantly, even in the face of messy data, the visual signature features proved their validity, especially in distinguishing Forest images from Water images. We believe it is important to first empirically demonstrate the usefulness of visual signatures in the classification process. As we continue this work, we will incorporate a pixel-by-pixel feature space and explore the usefulness of the tags associated with the Flickr photos.
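The brightness-based pruning can be sketched as a simple threshold filter: photos whose median brightness is extreme (nearly black or blown out) are unlikely to depict outdoor land cover, so they are dropped before training. The thresholds and photo names below are illustrative assumptions, not the values used in our experiments:

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical grayscale thumbnails (0-255): two noise candidates and one keeper.
photos = {
    "night_shot": np.full((8, 8), 5),      # nearly black
    "overexposed": np.full((8, 8), 250),   # nearly white
    "landscape": rng.integers(60, 180, (8, 8)),
}

def prune_by_brightness(photos, low=20, high=235):
    """Keep only photos whose median brightness falls in [low, high].
    The thresholds here are illustrative, not our tuned values."""
    return {name: img for name, img in photos.items()
            if low <= np.median(img) <= high}
```

As the results above show, this kind of filter removes the obvious extremes but cannot catch well-exposed photos with no land cover content, which is why some noise remained.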