Benchmarking LightGBM: how fast is LightGBM vs xgboost?

Laurae
Jan 9, 2017 · 4 min read

This post benchmarks LightGBM against xgboost (exact method) on a customized Bosch data set. I had seen xgboost run about 10 times slower than LightGBM during the Bosch competition, and now we are back with concrete numbers to compare. Our next benchmark will cover the Fast Histogram method of xgboost. The setup is identical to the first Bosch + xgboost benchmarks I made.

Cross-posted on Imploding Gradients.

Global overview

Let’s go straight to the chart; this should alleviate all the impatience!

[Chart: global timing comparison — view interactively online: https://plot.ly/~Laurae/9/]

As we can see, on average LightGBM (binned) is between 11x and 15x faster than xgboost (without binning).

We also notice the ratio shrinks as more threads are used: this is expected, since when the threads cannot be kept 100% busy, threading inefficiency kicks in (some threads sit idle because the next task is not scheduled fast enough).
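For reference, here is a minimal sketch of the kind of timing run behind these numbers (the parameter values and object names are illustrative assumptions, not the exact benchmark code; only the 50-round budget and the exact-vs-binned setup come from this series):

```r
library(xgboost)
library(lightgbm)

# `dtrain_xgb` (xgb.DMatrix) and `dtrain_lgb` (lgb.Dataset) are assumed to
# already hold the Bosch training data and its binary response.
n_threads <- 12

# xgboost, exact (non-binned) tree method
time_xgb <- system.time(
  xgb.train(params  = list(objective   = "binary:logistic",
                           tree_method = "exact",
                           nthread     = n_threads),
            data    = dtrain_xgb,
            nrounds = 50)
)

# LightGBM, histogram (binned) method
time_lgb <- system.time(
  lgb.train(params  = list(objective   = "binary",
                           num_threads = n_threads),
            data    = dtrain_lgb,
            nrounds = 50)
)

time_xgb["elapsed"] / time_lgb["elapsed"]  # speed ratio (xgboost / LightGBM)
```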

1–12 Threads

Let’s take a look at the first 12 threads.

[Chart: boosting time for 1 to 12 threads]
[Table: boosting time for 1 to 12 threads]

What we notice for xgboost is that going beyond the 6 physical cores still yields performance gains: using 12 logical cores cuts the time by about 28.3%, from 577.9 seconds to 414.3 seconds.

Is this the same for LightGBM? Yes! We dropped from 45.1 seconds to 33.6 seconds, which is a massive performance gain (25.5%).

Conclusion for this part: use all logical cores for threading; it helps tremendously. If you want your machine learning training pipeline to finish about 25% faster (this varies by CPU, obviously), you now know what to do: set the thread count to the number of logical cores, not the number of physical cores.
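As a small illustration (the object names here are assumptions, not from the original benchmark), the thread count can be set to the logical core count like this:

```r
library(parallel)

n_threads <- detectCores(logical = TRUE)  # 12 on the 6-core, hyperthreaded machine used here

params_xgb <- list(objective = "binary:logistic", tree_method = "exact", nthread = n_threads)
params_lgb <- list(objective = "binary", num_threads = n_threads)
```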

13–24 Threads

What if we look specifically at 13 to 24 threads? We include the 12-thread result as a reference for comparison.

[Chart: boosting time for 13 to 24 threads (12 threads included for reference)]
[Table: boosting time for 13 to 24 threads (12 threads included for reference)]

We can notice quickly:

  • No improvement for xgboost, just more or less noisy variance
  • Negative returns for LightGBM, with boosting time increasing (from 33.6 seconds up to 38+ seconds)

Therefore, a quick conclusion: do not allocate more threads than you have logical cores; it is not good practice. Keep the thread count at the number of logical cores, and do not go over that number.
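In practice, that recommendation boils down to a simple guard, sketched below (illustrative only):

```r
library(parallel)

requested <- 24                                           # e.g. an over-eager setting
n_threads <- min(requested, detectCores(logical = TRUE))  # capped at 12 on this machine
```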

Quick look at LightGBM specifically

Let’s take a quick look at the LightGBM curve.

[Chart: LightGBM boosting time vs number of threads]

This looks like a linear improvement: from 202 seconds (1 physical core, 1 thread) we dropped to 33.6 seconds (6 physical cores, 12 threads), a roughly 6x speedup on 6 cores, i.e. nearly 100% multithreading efficiency. Once we hit the wall with more threads, the multithreading efficiency drops drastically and we see those negative returns.
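For those who want to check the arithmetic, using the figures from the table above:

```r
t_1  <- 202    # seconds with 1 thread (1 physical core)
t_12 <- 33.6   # seconds with 12 threads (6 physical cores)

speedup    <- t_1 / t_12   # ~6.0x
efficiency <- speedup / 6  # ~1.0, i.e. ~100% efficiency per physical core
```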

Data RAM efficiency?

A quick look at the RAM usage shows the following, using gc() twice after the creation of the matrices:

[Chart: RAM efficiency of data types]
  • Initial data (dense, unused): approx. 8,769 MB (the original dgCMatrix is only 27.9% of this dense size)
  • Original data (dgCMatrix): approx. 2,448 MB (100% vs original)
  • xgboost (xgb.DMatrix): approx. 1,701 MB (69.5% vs original)
  • LightGBM (lgb.Dataset): approx. 2,512 MB (102.6% vs original)

It seems LightGBM has a higher memory footprint than xgboost.
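For completeness, a minimal sketch of how these four representations could be built before measuring (object names such as `bosch_sparse` and `labels` are illustrative, not from the original code):

```r
library(Matrix)
library(xgboost)
library(lightgbm)

# `bosch_sparse` (dgCMatrix of features) and `labels` are assumed to exist.
dense <- as.matrix(bosch_sparse)                           # dense copy, unused for training
dxgb  <- xgb.DMatrix(data = bosch_sparse, label = labels)  # xgboost container
dlgb  <- lgb.Dataset(data = bosch_sparse, label = labels)  # LightGBM container
lgb.Dataset.construct(dlgb)                                # force binning so its RAM is allocated

invisible(gc()); gc()  # two collections before reading memory usage, as described above
```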

Training RAM efficiency

We use 12 threads to check the RAM efficiency, measured at the end of the 50 boosting iterations, calling gc before boosting but not after boosting:

  • xgboost: approx. 1684 MB
  • LightGBM: approx. 1425 MB (84.6% of xgboost memory usage)

We can see that LightGBM uses less RAM during training, at the cost of higher RAM usage for the data held in memory. The R LightGBM package could potentially be modified to store the data more efficiently.
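A hedged sketch of the measurement protocol described above (garbage is collected right before boosting, and RAM is read at the end of the 50 iterations without forcing another collection, e.g. from an OS process monitor):

```r
invisible(gc())  # collect garbage just before boosting

model <- lgb.train(params  = list(objective = "binary", num_threads = 12),
                   data    = dtrain_lgb,
                   nrounds = 50)

# Read process RAM here, before calling gc() again, so memory still held by
# the boosting run is counted.
```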

Next benchmark?

The next benchmark will come when xgboost’s fast histogram method is usable in R; it is currently up and running, but not yet usable from R. This would be the closest apples-to-apples comparison between xgboost and LightGBM.

We will also be comparing the logarithmic loss of xgboost and LightGBM.
