Tin-Tungsten Prospecting with Machine Learning in Northeast Tasmania, Australia

Geological Setting

The oldest basement rocks in northeastern Tasmania belong to a thick sequence of Ordovician-Devonian quartz-rich turbidites, the Mathinna Supergroup. Mathinna sediments were deformed during the Tabberabberan Orogeny, a major Devonian tectonic event that affected much of southeastern Australia and coincided with voluminous granitic magmatism. Devonian-aged granites cover ~6% of the Tasmanian landmass and are an important source of much of Tasmania’s mineral wealth, which in northeastern Tasmania includes a series of stockwork and greisen Sn-W deposits. Figure 1 below presents a map of the study area for context.

Figure 1. Context map taken from QGIS using Google’s satellite imagery. The red box in the right-hand plot highlights the study area discussed in the article, while the red dots represent locations with known Sn-W mineralisation.

Mineral System Model

Greisen and stockwork type Sn-W deposits are typically associated with upper parts of evolved granitoid plutons where mineralising fluids exsolved from cooling magmas have been focussed and/or ponded (Fig. 2). Aspects of the mineral system exploitable by geophysics include:

  • Granite domes/cupolas are easily targeted as pronounced gravity lows
  • Sn-W granites are typically depleted in compatible elements which can give rise to low Fe-Ti oxide mineral content, making for magnetically ‘quiet’ granite targets
  • Enrichment in incompatible elements can result in strong radiometric U and K responses when exposed at surface
  • Contact aureoles around granites may have different weathering properties giving rise to distinct morphological characteristics visible in topography data
Figure 2. Model of Sn-W stockwork and greisen deposits. Scale is very approximate. Image from Blevin (1998).

Conceptual Modelling Approach

The prospectivity modelling exercise presented herein relies on two types of data: point data representing the spatial location of known Sn-W mineralisation occurrences, and gridded raster data sets containing geological, geophysical and remote sensing information surrounding the known Sn-W occurrences. The goal of modelling is to train an ensemble decision tree model on a two-class problem in which classes contain raster data from pixels that are either proximal or distal to known occurrences. The hope is that these models are sensitive to the multivariate signature of mineralisation in the raster data, and thus can be applied to all parts of the data sets in order to generate a prospectivity data layer of comparable resolution.

Data Sets

A total of 222 Sn-W occurrences were extracted from the mineral occurrences data set compiled by Mineral Resources Tasmania. These were filtered to ensure only in situ occurrences were included, as transported alluvial deposits would likely introduce spurious signals into the modelling procedure.

Table 1. Information regarding the evidence layers used in Sn-W prospectivity analyses.
Figure 3. Data layers with Sn-W occurrences plotted as red stars. Colour stretches have been clipped to 95% of the data range for each layer.

Principal Component Analysis

Figure 4 below presents principal component eigenvectors for each data band. It is clear that the radiometric and Landsat data bands are highly positively correlated with one another, given the similar magnitude and direction of their eigenvectors. The inclusion of highly correlated data layers such as these in ensemble decision tree modelling workflows can contribute a degree of redundant information that may adversely affect the performance of the models. It is for this reason that a feature extraction procedure was applied to the raster data prior to modelling.

Figure 4. Principal component eigenvectors for each data layer. Note that radiometric and Landsat layers have similar variance and are positively correlated.
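
As a rough illustration of how redundancy between bands shows up in a principal component analysis, the sketch below builds a few synthetic, deliberately correlated bands (not the study data) and computes the eigenvectors of their correlation matrix with NumPy. The band labels and noise levels are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pixels = 5000

# Synthetic stand-ins for evidence layers (NOT the study data): three bands
# share a common signal and are therefore strongly correlated, while a fourth
# band is independent of the others.
common = rng.normal(size=n_pixels)
bands = np.column_stack([
    common + 0.2 * rng.normal(size=n_pixels),  # hypothetical "band_1"
    common + 0.2 * rng.normal(size=n_pixels),  # hypothetical "band_2"
    common + 0.2 * rng.normal(size=n_pixels),  # hypothetical "band_3"
    rng.normal(size=n_pixels),                 # hypothetical independent layer
])

# Correlation matrix between bands (each column is one variable).
corr = np.corrcoef(bands, rowvar=False)

# Eigen-decomposition of the symmetric correlation matrix. eigh returns
# eigenvalues in ascending order, so reorder to put PC1 first.
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

explained = eigvals / eigvals.sum()
pc1 = eigvecs[:, 0]
print("explained variance ratio:", np.round(explained, 3))
print("PC1 loadings:", np.round(pc1, 3))
```

The three correlated bands load onto PC1 with the same sign and similar magnitude, which is the pattern described for the radiometric and Landsat layers, while the independent band contributes almost nothing to that component.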

Feature Extraction

Feature extraction involved a linear dimensionality reduction procedure in which three principal components were derived from the six Landsat layers, and two principal components from the four radiometric layers. Figure 5 below presents each of the three Landsat principal components as an RGB image. Despite using a Landsat product compiled over 30 years so as to represent the ‘barest earth’ Landsat signal, principal components tend to be sensitive to vegetation signals in NE Tasmania.

Figure 5. Landsat principal components plotted as RGB colour bands with Sn-W occurrences overlain as white stars. Red colours represent cultivated land, while dark blue colours represent dense wet eucalypt forest. Low density sclerophyll forest appears as light to dark green colours, and coastal sand dunes appear white. Yellow colours in the south of the data set probably represent bare earth in the drier parts of the Fingal valley near Avoca.
Figure 6. Radiometric principal components.
Figure 7. Data layers after dimensionality reduction ready for prospectivity modelling.
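
The dimensionality reduction described above can be sketched as follows. This is a minimal NumPy version of PCA via singular value decomposition applied to a stack of co-registered rasters. The function name `pca_reduce`, the array shapes, the per-band standardisation and the random input are all assumptions for illustration, and the original workflow may well have used a different implementation.

```python
import numpy as np

def pca_reduce(stack, n_components):
    """Project a (bands, rows, cols) raster stack onto its leading principal components.

    Returns an array of shape (n_components, rows, cols) and the fraction of
    total variance explained by each retained component.
    """
    n_bands, n_rows, n_cols = stack.shape
    # One row per pixel, one column per band.
    pixels = stack.reshape(n_bands, -1).T.astype(float)
    # Standardise each band to zero mean and unit variance before PCA.
    pixels = (pixels - pixels.mean(axis=0)) / pixels.std(axis=0)
    # SVD of the standardised data: the rows of vt are the principal axes.
    _, s, vt = np.linalg.svd(pixels, full_matrices=False)
    scores = pixels @ vt[:n_components].T
    explained = (s ** 2) / np.sum(s ** 2)
    return scores.T.reshape(n_components, n_rows, n_cols), explained[:n_components]

# Random placeholder data standing in for six Landsat and four radiometric bands.
rng = np.random.default_rng(42)
landsat = rng.normal(size=(6, 50, 60))
radiometrics = rng.normal(size=(4, 50, 60))

landsat_pcs, landsat_var = pca_reduce(landsat, n_components=3)
radiometric_pcs, radiometric_var = pca_reduce(radiometrics, n_components=2)
print(landsat_pcs.shape, radiometric_pcs.shape)  # (3, 50, 60) (2, 50, 60)
```

Real rasters usually contain nodata pixels, which would need to be masked out before the decomposition; that step is omitted here for brevity.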

Prospectivity Modelling with CatBoost

Prospectivity modelling relied on the Python implementation of the CatBoostClassifier algorithm, part of the CatBoost family of gradient boosted decision tree modelling algorithms. I use CatBoost because I am familiar with it, but there are alternative algorithms similarly suited to this application, including Random Forests, XGBoost and LightGBM. The workflow comprised the following steps:

  1. scale the input rasters to their respective unit variance
  2. loop through Sn-W occurrence locations, holding out the current occurrence as well as all other occurrences within 2 km
  3. extract evidence layer data from pixels within some box surrounding the occurrences not being held out; this is the proximal data class
  4. extract at random an equivalent number of pixels outside the boxes surrounding occurrences; this is the distal data class
  5. shuffle the classes and train a CatBoostClassifier model on 70% of the data, evaluate the model against 30% of the data and shrink to the iteration with the best evaluation metric (accuracy in this case)
  6. apply the model to every pixel in the evidence layer data set to get a raster with pixel values representing the model’s confidence that the given pixel is in the proximal class
  7. average all prediction rasters generated for each holdout iteration into a single output raster
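
The steps above can be sketched roughly as follows. This is not the original code: the helper names, the box half-width, the pixel size and the random input rasters are all hypothetical, and the choice to draw distal pixels outside the boxes of every occurrence (including the held-out ones) is an assumption. The CatBoost call uses `use_best_model=True` with an evaluation set, which keeps only the trees up to the best validation iteration.

```python
import numpy as np

def held_out(xy, i, radius_m=2000.0):
    """Mask of occurrences held out with occurrence i (itself plus any
    neighbour within radius_m; xy is in projected metres)."""
    return np.hypot(*(xy - xy[i]).T) <= radius_m

def box_mask(shape, rows, cols, half_width):
    """Boolean raster that is True inside a square box around each (row, col)."""
    mask = np.zeros(shape, dtype=bool)
    for r, c in zip(rows, cols):
        mask[max(r - half_width, 0):r + half_width + 1,
             max(c - half_width, 0):c + half_width + 1] = True
    return mask

def build_training_set(stack, rows, cols, keep, half_width, rng):
    """Balanced, shuffled proximal (1) / distal (0) samples for one holdout iteration."""
    shape = stack.shape[1:]
    proximal = box_mask(shape, rows[keep], cols[keep], half_width)
    # Assumption: distal pixels lie outside the boxes of ALL occurrences.
    any_box = box_mask(shape, rows, cols, half_width)
    flat = stack.reshape(stack.shape[0], -1).T
    prox_idx = np.flatnonzero(proximal.ravel())
    dist_idx = rng.choice(np.flatnonzero(~any_box.ravel()),
                          size=prox_idx.size, replace=False)
    idx = np.concatenate([prox_idx, dist_idx])
    y = np.concatenate([np.ones(prox_idx.size, dtype=int),
                        np.zeros(dist_idx.size, dtype=int)])
    order = rng.permutation(idx.size)
    return flat[idx[order]], y[order]

def train_and_predict(stack, X, y, seed=0):
    """Fit on 70% of the samples, validate on 30%, then score every pixel."""
    from catboost import CatBoostClassifier  # lazy import; needs the catboost package
    n_train = int(0.7 * len(y))
    model = CatBoostClassifier(eval_metric="Accuracy", use_best_model=True,
                               random_seed=seed, verbose=False)
    model.fit(X[:n_train], y[:n_train], eval_set=(X[n_train:], y[n_train:]))
    flat = stack.reshape(stack.shape[0], -1).T
    # Column 1 of predict_proba is the probability of the proximal class.
    return model.predict_proba(flat)[:, 1].reshape(stack.shape[1:])

def run_holdouts(stack, rows, cols, xy, half_width, rng):
    """Steps 2-7: one prediction raster per held-out occurrence, then the mean."""
    rasters = []
    for i in range(len(xy)):
        X, y = build_training_set(stack, rows, cols, ~held_out(xy, i), half_width, rng)
        rasters.append(train_and_predict(stack, X, y))
    return np.mean(rasters, axis=0)

# Small synthetic demonstration of steps 1-4 for a single holdout iteration.
rng = np.random.default_rng(0)
stack = rng.normal(size=(5, 80, 80))
stack = stack / stack.std(axis=(1, 2), keepdims=True)  # step 1: unit variance per layer
rows = rng.integers(0, 80, size=20)
cols = rng.integers(0, 80, size=20)
xy = np.column_stack([cols, rows]) * 500.0             # hypothetical 500 m pixels

keep = ~held_out(xy, 0)
X, y = build_training_set(stack, rows, cols, keep, half_width=2, rng=rng)
print(X.shape, y.mean())
```

Calling `run_holdouts(stack, rows, cols, xy, 2, rng)` would then produce the averaged prediction raster, provided CatBoost is installed. The sketch does not handle the edge case where every occurrence falls within the holdout radius.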

Holdout Results

Figure 8 below presents Sn-W occurrences plotted on a digital elevation model and coloured by the averaged model predictions for surrounding pixels in the case where the occurrence was held out. Here, 99 of the 222 occurrences are correctly classified as being proximal to an occurrence, giving a false negative rate of 55%. The models are largely insensitive to Sn-W occurrences outside of the three main mining districts in which there is a high spatial density of occurrences: Rossarden-Storeys Creek in the south, Blue Tier in the north and Great Pyramid in the east.

Figure 8. Sn-W occurrences overlain onto DEM. Occurrences are coloured by the mean probabilities describing the holdout model’s confidence that the surrounding pixels are proximal to mineralisation.
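
The hit rate quoted above follows from simple counting. The sketch below reproduces the arithmetic from the two numbers given in the text (99 detected occurrences out of 222) and shows how such counts might be derived from per-occurrence mean probabilities. The 0.5 decision threshold and the `holdout_rates` helper are assumptions, since the threshold used in the original analysis is not stated, and the probabilities are synthetic.

```python
import numpy as np

def holdout_rates(mean_probs, threshold=0.5):
    """Return (hits, true positive rate, false negative rate) for held-out occurrences."""
    mean_probs = np.asarray(mean_probs, dtype=float)
    hits = int(np.sum(mean_probs >= threshold))
    tpr = hits / mean_probs.size
    return hits, tpr, 1.0 - tpr

# Arithmetic check using the counts reported in the text.
n_total, n_hits = 222, 99
print(f"detected: {n_hits / n_total:.1%}, missed: {1 - n_hits / n_total:.1%}")
# detected: 44.6%, missed: 55.4%

# Synthetic probabilities arranged to give the same counts.
probs = np.concatenate([np.full(99, 0.8), np.full(123, 0.2)])
hits, tpr, fnr = holdout_rates(probs)
```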

Feature Importance Results

Feature importance values describe the degree to which changes in the feature values affect the prediction values. The larger the feature importance value, the greater the effect variation in this feature has on the model’s prediction outputs. All of the 222 holdout models saved feature importance data, which are summarised in the box plot shown in Figure 9 below.

Figure 9. Feature importance box plots showing feature importance of each data layer across all 222 models.
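
To make the box plot summary concrete, the sketch below computes the median and quartiles per feature from a table of importances with one row per holdout model. The importance values and layer names here are random placeholders rather than the study's results; with CatBoost, each row could be obtained from a fitted model's `get_feature_importance()` method.

```python
import numpy as np

def summarise_importances(importances, names):
    """Per-feature (name, Q1, median, Q3) across models, sorted by descending median.

    importances: array of shape (n_models, n_features).
    """
    q1, med, q3 = np.percentile(importances, [25, 50, 75], axis=0)
    order = np.argsort(med)[::-1]
    return [(names[j], q1[j], med[j], q3[j]) for j in order]

rng = np.random.default_rng(1)
names = ["layer_a", "layer_b", "layer_c"]  # hypothetical layer names
# Placeholder importances for 222 models, normalised to sum to 100 per model.
raw = rng.gamma(shape=[2.0, 5.0, 1.0], scale=1.0, size=(222, 3))
importances = 100 * raw / raw.sum(axis=1, keepdims=True)

summary = summarise_importances(importances, names)
for name, q1, med, q3 in summary:
    print(f"{name}: median={med:.1f} (IQR {q1:.1f} to {q3:.1f})")
```

A narrow interquartile range across the holdout models suggests a layer's influence is stable regardless of which occurrences are withheld, whereas a wide range suggests it depends on particular training subsets.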

Averaged Prediction Results

A final prospectivity map was generated by averaging all 222 model prediction rasters into a single raster layer with pixel values representing the average probability that pixels are proximal to Sn-W mineralisation. Figure 10 presents this layer with occurrences overlain as cyan coloured stars. If you would like to view this raster layer yourself, you can download a GDA94 UTM zone 55 projected version here and a WGS84 Mercator projected version here.

Figure 10. Averaged probabilities that pixels are proximal to a Sn-W occurence from 222 holdout models.

A Cheeky Squizz At Current Exploration Leases

The plot below (Fig. 11) presents the final averaged probability output as a semitransparent overlay on Google satellite imagery. Pink to yellow colours represent pixels classified as being proximal to mineralisation. Current metals exploration leases taken from Mineral Resources Tasmania are overlain as green polygons and Sn-W occurrences used in model training are overlain as red stars.

Figure 11. Final averaged probability raster overlain onto Google satellite imagery in QGIS. Red stars are Sn-W occurrences while green polygons are current metals exploration tenements. Yellow to pink colours in the prospectivity raster represent areas of high prospectivity.

Summary & Conclusion

Machine learning is a powerful tool in the kit of the mineral explorationist. However, it is not magic and does not give you the equivalent of X-ray vision for mineral systems. In this case, holdout models predicted known mineralisation 45% of the time and struggled to identify occurrences outside of the major mining camps, so there are probably a number of prospective areas not identified by this prospectivity exercise. This could potentially be improved by incorporating geochemical information in the form of stream sediment geochemical analyses into the modelling procedure, something that may be released in future iterations of the project.

Future Work

  • Incorporate geochemical information from stream sediment analyses in some sophisticated way
  • Follow up some of the prospective areas with a database search to see whether they have been tested by historic drilling
  • Repeat the entire workflow on the same raster data for structurally hosted gold deposits



Thomas Ostersen

Geophysicist and Data Scientist at Datarock. Mineral exploration, EM geophysics and perceptually uniform colourmaps float my boat.