Sparse Forests with FIL

Introduction

Using Sparse Forests with FIL

The storage_type parameter of ForestInference.load() controls which kind of forest FIL creates. It accepts three values:
  • 'DENSE' to create a dense forest,
  • 'SPARSE' to create a sparse forest,
  • 'AUTO' (the default) to let FIL decide, which currently always creates a dense forest.
from cuml import ForestInference
import sklearn.datasets

# Load the classifier previously saved with xgboost save_model()
model_path = 'xgb.model'
fm = ForestInference.load(model_path, output_class=True,
                          storage_type='SPARSE')

# Generate random sample data
X_test, y_test = sklearn.datasets.make_classification()

# Generate predictions (as a GPU array)
fil_preds_gpu = fm.predict(X_test.astype('float32'))

Implementation

Figure 1. Storing sparse forests in FIL
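The idea behind the figure can be pictured with a small sketch: a sparse tree keeps only the nodes that actually exist in one flat array, and each internal node stores the index of its left child, with the right child stored right next to it. The Python below is a hypothetical illustration of that layout only, not FIL's actual C++/CUDA data structures.

from dataclasses import dataclass
from typing import Optional

@dataclass
class SparseNode:
    feature: int = -1            # feature index to test (unused for leaves)
    threshold: float = 0.0       # split threshold (unused for leaves)
    left: Optional[int] = None   # index of the left child; None marks a leaf
    output: float = 0.0          # leaf output value

def predict_one(nodes, x, root=0):
    """Walk one sparse tree stored as a flat array of nodes."""
    i = root
    while nodes[i].left is not None:
        if x[nodes[i].feature] < nodes[i].threshold:
            i = nodes[i].left        # left child
        else:
            i = nodes[i].left + 1    # right child is stored next to the left one
    return nodes[i].output

# A tiny example tree: the root splits on feature 0, both children are leaves.
tree = [
    SparseNode(feature=0, threshold=0.5, left=1),
    SparseNode(output=0.0),   # left leaf  (x[0] <  0.5)
    SparseNode(output=1.0),   # right leaf (x[0] >= 0.5)
]
print(predict_one(tree, [0.7]))   # -> 1.0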

Benchmarks

We benchmark two cases, depending on whether the scikit-learn model is trained with a depth limit (a rough sketch of the setup follows the list):
  1. With a depth limit, set to either 10 or 20; in this case, either a dense or a sparse FIL forest fits into GPU memory.
  2. Without a depth limit; in this case, the model trained by SKLearn contains very deep trees. In our benchmark runs, the trees usually have a depth between 30 and 50. Trying to create a dense FIL forest runs out of memory, but a sparse forest can be created without issue.
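The benchmark scripts themselves are not reproduced here; the following is only a rough sketch of how such a comparison could be set up, assuming the model is imported from scikit-learn via ForestInference.load_from_sklearn. The dataset size, n_estimators, and max_depth values are placeholders, not the exact benchmark configuration.

import time
import sklearn.datasets
import sklearn.ensemble
from cuml import ForestInference

# Train a scikit-learn random forest (depth limited to 10, as in case 1 above)
X, y = sklearn.datasets.make_classification(n_samples=100_000, n_features=28)
X = X.astype('float32')
skl_model = sklearn.ensemble.RandomForestClassifier(n_estimators=100, max_depth=10)
skl_model.fit(X, y)

# Time CPU inference with SKLearn
start = time.perf_counter()
skl_model.predict(X)
print('SKLearn:', time.perf_counter() - start)

# Import the same model into FIL and time dense vs. sparse storage
for storage_type in ('DENSE', 'SPARSE'):
    fm = ForestInference.load_from_sklearn(skl_model, output_class=True,
                                           storage_type=storage_type)
    start = time.perf_counter()
    fm.predict(X)
    print(storage_type, ':', time.perf_counter() - start)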
Figure 2. Benchmark results for FIL (dense and sparse trees) and SKLearn

Conclusion
