21st Century Paleontology with Machine Learning

Unlock Your Inner Jurassic Park Fan and Learn to Hunt Dinosaurs with Computer Vision

Published in Intel Analytics Software · 6 min read · Nov 8, 2022

Let’s build an AI fossil-hunting tool based on PyTorch and the Intel AI Analytics Toolkit. This tutorial shows how to decompose an image classification problem, like dinosaur fossil hunting, into four key components: building context from data, representing the data properly for our model, defining and training the model, and producing actionable insights from model predictions. Please visit the Jurassic repository to run the Jupyter notebooks discussed in this article.

Building Context from Data

Our model will be trained on aerial photos rather than satellite images, which are typically captured from ~120,000 feet and do not have sufficient resolution. Resolution must be high enough to find specific dinosaur bone fragments. The model we train will focus on the landscape’s colors, textures, and shapes (Figure 1).

Figure 1. Aerial images of Brushy Basin Member, Dinosaur National Monument Area

As the images pass through our convolutional neural network (CNN), each layer extracts progressively higher-level features until the model can recognize the critical ones (Figure 2). Our model will learn to differentiate the colors, textures, and shapes of the depositional environment at Utah’s Dinosaur National Monument sites.

Figure 2. CNN low-level feature map source

Proper Representation of Data to Our Model

As previously mentioned, we will work with aerial imagery to build our training and testing data. We will use the Google Earth Engine SDK to extract images at specific locations and altitudes. The Jupyter notebook can be found here. After retrieving the aerial imagery, we need to process the images into a format that our model expects (Figure 3).

Figure 3. Images are indexed, named, and partitioned into 224 x 224 pixel slices.
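The slicing step can be sketched in a few lines. This is a minimal illustration, not the notebook’s actual code: the function names, the file-naming scheme, and the choice to drop partial tiles at the right and bottom edges are all assumptions.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def tile_boxes(width: int, height: int, tile: int = 224) -> List[Box]:
    """Return crop boxes for non-overlapping tile x tile slices, row by row.

    Assumption: partial tiles at the right and bottom edges are dropped.
    """
    boxes = []
    for top in range(0, height - tile + 1, tile):
        for left in range(0, width - tile + 1, tile):
            boxes.append((left, top, left + tile, top + tile))
    return boxes


def tile_name(source: str, index: int) -> str:
    """Hypothetical naming scheme: <source>_<zero-padded index>.png."""
    return f"{source}_{index:04d}.png"


# A 1000 x 500 pixel aerial image yields a 4 x 2 grid of full tiles.
boxes = tile_boxes(1000, 500)
names = [tile_name("site_a", i) for i in range(len(boxes))]
```

Each box uses the (left, top, right, bottom) order that Pillow’s Image.crop accepts, so the boxes could be passed to it directly to write the slices to disk.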

The images are then manually labeled based on the following criteria and classes (Figure 4):

Class 0 — Non-bone locations within a few miles of bone sites (no bones possible)
Class 1 — Any region with similar depositional environments for which bones have not been positively identified (bones are possible, but have not been found)
Class 2 — Bone locations have been identified via GPS locations and mapped to image coordinates (verified fossil sites)

Figure 4. Data separation scheme for training and validation

Upon completing the manual labeling process, the labeled data is separated into training and validation folders for the next step in our tutorial.
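The article does not say how the split is made, so the following is only a sketch of one common approach: a stratified split that holds out the same fraction of every class. The 80/20 ratio, the seed, and the toy file names are assumptions for illustration.

```python
import random
from collections import Counter, defaultdict
from typing import List, Tuple

Sample = Tuple[str, int]  # (file name, class label 0, 1, or 2)


def stratified_split(
    samples: List[Sample], val_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[Sample], List[Sample]]:
    """Split samples into train/validation, holding out val_fraction per class."""
    by_label = defaultdict(list)
    for sample in samples:
        by_label[sample[1]].append(sample)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, val = [], []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        n_val = max(1, round(len(group) * val_fraction))
        val.extend(group[:n_val])
        train.extend(group[n_val:])
    return train, val


# Toy data: 30 hypothetical tiles, 10 per class.
samples = [(f"tile_{i:04d}.png", i % 3) for i in range(30)]
train, val = stratified_split(samples)
train_counts = Counter(label for _, label in train)
val_counts = Counter(label for _, label in val)
```

Counting labels per split, as done in the last two lines, gives the same per-class totals that a label histogram would plot.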

Let’s look at our data distributions. These histograms show the total number of samples per label in our training (Figure 5a) and validation (Figure 5b) datasets.

Figure 5a. Distributions of labels in our training data

Figure 5b. Distributions of labels in our validation data

Now, let’s spot-check some of the images (Figures 6a and 6b). The augmented samples in our training dataset seem to have negative space along the edges. This is fine for our model, but in the future, we could try pixel rotations rather than rotating the entire image to avoid this negative space. From these sample images, ignoring the vegetation, we can start to see patterns of textures and colors emerge. In Figure 6a, we see that labels 1 and 2 have distinct light-colored streaks mixed in with darker shades, while the label 0 samples in Figure 6b have more homogeneous coloring.

Figure 6a. Example of labels 1 and 2 in our training dataset

Figure 6b. Example of label 0 in our training dataset
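The negative space mentioned in the spot-check comes from rotating a square tile inside a frame of the same size: the corners rotate out of the frame and empty corners rotate in. One alternative to the pixel rotations suggested above (a different technique, shown here only as an illustration) is to center-crop the rotated tile to the largest axis-aligned square that contains no empty pixels. For a square of side s rotated by θ, that side is s / (|cos θ| + |sin θ|). The function name below is hypothetical.

```python
import math


def largest_inner_square(side: int, angle_deg: float) -> int:
    """Side (in whole pixels) of the largest centered, axis-aligned square
    that fits entirely inside a side x side square rotated by angle_deg."""
    theta = math.radians(angle_deg)
    return math.floor(side / (abs(math.cos(theta)) + abs(math.sin(theta))))


no_rotation = largest_inner_square(224, 0)   # nothing is lost
worst_case = largest_inner_square(224, 45)   # 45 degrees loses the most
```

The trade-off is resolution: at 45 degrees, a 224-pixel tile keeps only a 158-pixel square, which would then need to be resized back to the input size the model expects.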

The patterns present in labels 1 and 2 indicate the Brushy Basin Member (Figure 1), the target bench of the Morrison Formation, where we expect to find fossil-bearing rocks. These distinctions between images can be detected by visual inspection, but our goal is to build a model that can perform this analysis for thousands of images in seconds.

Model Definition and Training

Using our labeled dinosaur hunting dataset, we will transfer learn a pretrained ResNet model through a process called “domain adaptation.” Transfer learning allows us to leverage the pretrained weights in an existing model for new tasks (Figure 7).

Figure 7. Traditional machine learning (learning from scratch) vs. transfer learning (Source)

We could train our model from scratch, yielding a dedicated fit-for-purpose model, but this would require significantly more data. Let’s unpack this methodology:

Learning from scratch is your traditional deep learning scheme where you initialize the weights of your network as zeros, randomly, or with some predefined value. The model then uses backpropagation to update the weights of your model based on some objective function and optimizer. This type of training is generally more computationally expensive, requiring accelerators like GPUs and HPUs, and can require hours, days, or weeks to complete.
Domain Adaptive Transfer Learning is when we start with a model pre-trained on an original dataset and introduce a completely different dataset and use it to retrain the model. This adapts our model to a new problem and transfers the learnings from a previous dataset. In this tutorial, we start with a pre-trained ResNet model and domain adapt it for a fossil likelihood classification task.
Fine-tuning is another sub-category of transfer learning that doesn’t switch domains but updates an existing model with new data. For example, we could acquire new data from a fossil location to update our dinosaur hunting model after the initial training. For this to make sense, our original dataset would have to be significantly larger than the new data so that we can justify preserving some of the weights through transfer learning techniques like layer freezing or reduced learning rates. If the two datasets are similar in size, it might be worth training a new model from scratch.
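Layer freezing, mentioned in the list above, can be sketched with plain PyTorch. To keep the example small and free of downloads, a tiny convolutional stack stands in for the pretrained ResNet backbone; the layer sizes, learning rate, and random batch are assumptions, and the article does not state which layers, if any, the authors froze.

```python
import torch
from torch import nn

NUM_CLASSES = 3  # classes 0, 1, and 2 from the labeling scheme

# Stand-in for a pretrained backbone (in practice, a ResNet without its head).
backbone = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)
head = nn.Linear(8, NUM_CLASSES)  # new, randomly initialized classifier
model = nn.Sequential(backbone, head)

# Freeze the backbone so only the new head is updated during training.
for param in backbone.parameters():
    param.requires_grad = False

# Give the optimizer only the parameters that are still trainable.
optimizer = torch.optim.SGD(
    [p for p in model.parameters() if p.requires_grad], lr=0.01
)

# One training step on a random batch of four 224 x 224 RGB tiles.
images = torch.randn(4, 3, 224, 224)
labels = torch.tensor([0, 1, 2, 1])
loss = nn.functional.cross_entropy(model(images), labels)
loss.backward()
optimizer.step()

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
frozen = sum(p.numel() for p in model.parameters() if not p.requires_grad)
```

Because gradients are computed only for the head, each step is cheaper than full training, which is part of why this kind of workflow is practical on a CPU.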

Bypassing “learning from scratch” gives us more flexibility when considering the type of hardware we need. In this case, transfer learning enables us to perform training and inference directly on the CPU, eliminating the need for additional accelerator hardware.

We will use the Intel Extension for PyTorch (IPEX) to accelerate training and inference. IPEX allows us to apply channels last, graph optimization, operator optimization, and auto-mixed precision, all via an easy-to-use Python API. If you’d like to learn more about implementing IPEX, please visit the IPEX GitHub or IPEX PyTorch Documentation.

By adding the two lines of code below, we can immediately unlock the latest Intel hardware optimizations for PyTorch:

# Assumes `import torch` and `import intel_extension_for_pytorch as ipex`
model = self.model.to(memory_format=torch.channels_last)
model, self.optimizer = ipex.optimize(
    model, optimizer=self.optimizer, dtype=torch.float32
)

After training our model for enough epochs to stabilize and achieve satisfactory accuracy without overfitting (Figure 8), we can begin testing our model on unseen data.

Figure 8. Loss vs epochs and accuracy vs epochs for our training and validation datasets
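Deciding when training has run for “enough epochs” can be automated. The article does not say how the authors chose the stopping point, so this is only a sketch of one common rule, early stopping with patience: stop once the validation loss has not improved for a set number of epochs and keep the best epoch seen. The function name and loss values are made up for illustration.

```python
from typing import List


def best_epoch(val_losses: List[float], patience: int = 3) -> int:
    """Return the index of the best epoch seen before early stopping.

    Training is considered stopped once `patience` consecutive epochs
    pass without a new lowest validation loss.
    """
    best_idx = 0
    for idx, loss in enumerate(val_losses):
        if loss < val_losses[best_idx]:
            best_idx = idx
        elif idx - best_idx >= patience:
            break  # no improvement for `patience` epochs
    return best_idx


# Hypothetical validation losses: improvement stalls after epoch 3.
val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.53, 0.56, 0.40]
stop_at = best_epoch(val_losses)
```

Here the rule stops at epoch 6 and keeps the weights from epoch 3, so the later value of 0.40 is never reached. A larger patience would trade extra training time for the chance to find it.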

Producing Actionable Insights from Model Predictions

Now that we have successfully transfer-learned our model, we can predict the labels of unseen data (Figures 9a and 9b) and stitch together a dinosaur fossil probability map.

Figure 9a. A batch of five predictions on unseen data

Figure 9b. A batch of five predictions on unseen data

Let’s combine our images and apply a gradient color overlay representative of our model’s predicted labels (Figure 10). The green gradients indicate a higher probability of finding fossils in that location. Now we can put on our boots and begin our dinosaur hunting adventure, starting with the areas of higher fossil likelihood predicted by our model.

Figure 10. Dinosaur fossil likelihood map
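One way to turn per-tile predictions into a map like this is to convert each tile’s model outputs into a single score and place it in a grid at the tile’s position. The scoring rule below (the expected class index under the softmax probabilities, scaled to 0 to 1) is an assumption; the article does not specify how predicted labels map to the color gradient, and the function names and logits are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw model outputs to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fossil_score(logits: Sequence[float]) -> float:
    """Assumed score: expected class index scaled to [0, 1].

    0 means confidently class 0 (no bones), 1 means confidently class 2.
    """
    probs = softmax(logits)
    return sum(i * p for i, p in enumerate(probs)) / (len(probs) - 1)


def likelihood_grid(
    tile_logits: Dict[Tuple[int, int], Sequence[float]], rows: int, cols: int
) -> List[List[float]]:
    """Place each tile's score at its (row, col) position in the mosaic."""
    grid = [[0.0] * cols for _ in range(rows)]
    for (row, col), logits in tile_logits.items():
        grid[row][col] = fossil_score(logits)
    return grid


# Hypothetical logits for a 2 x 2 mosaic of tiles.
tile_logits = {
    (0, 0): [5.0, 0.0, 0.0],  # confidently class 0
    (0, 1): [0.0, 0.0, 0.0],  # model is undecided
    (1, 0): [0.0, 0.0, 5.0],  # confidently class 2
    (1, 1): [0.0, 5.0, 0.0],  # confidently class 1
}
grid = likelihood_grid(tile_logits, rows=2, cols=2)
```

The resulting grid of values between 0 and 1 can be handed to any plotting library’s colormap to produce the gradient overlay.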

Concluding Remarks

We explored a fascinating application of computer vision to the discipline of paleontology. One important thing to highlight is that the workflow described in this article and the underlying Jupyter notebooks can be applied to any scenario where aerial photographs are used to delineate the properties of a region. Some future examples we could explore are forest fire likelihood, coral reef bleaching detection, and agriculture crop performance. If you are interested in evaluating the extended work performed with the Intel Distribution of OpenVINO Toolkit, check out the notebooks on CPU and iGPU inference.