
Accelerate Your scikit-learn Applications

Faster Experimentation with Predictable Behavior

Oleksandr Pavlyk
Jun 26, 2020 · 4 min read

Oleksandr Pavlyk (Intel Corporation) and Olivier Grisel (INRIA)

The Intel Distribution for Python (IDP), part of the Intel AI Analytics Toolkit, includes an optimized scikit-learn that accelerates a selection of common estimators (e.g., logistic regression, singular value decomposition, principal component analysis). These estimators are built on top of the Intel Data Analytics Acceleration Library (DAAL), so they achieve performance close to that of equivalent C++ programs. The DAAL-powered estimators are implemented in the daal4py package.

DAAL’s performance comes from efficient use of multiple CPU cores, cache-friendly blocking, and effective use of processor instruction sets. It is tuned to run best on Intel processors. Faster scikit-learn benefits users through shorter model development iterations and lower training cost. Improved software engineering in the library also results in a smaller memory footprint, which allows users to tackle larger machine learning problems with their existing hardware.

Figures 1 and 2 show speedups of the accelerated scikit-learn over the base library, as measured with scikit-learn_bench. Figure 1 compares the multithreaded, accelerated scikit-learn against the best performance of the base scikit-learn between n_jobs=1 and n_jobs=-1. The scikit-learn user needs to know the algorithm details to use the n_jobs setting to improve performance for some functions (e.g., using a non-default value of n_jobs for LogisticRegression is detrimental to performance). Scikit-learn developers are working to improve the user experience.
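As a rough sketch of how the Figure 1 baseline is chosen, the snippet below takes the faster of the two base runs (n_jobs=1 and n_jobs=-1) and divides it by the accelerated time. The function name and the timings are hypothetical placeholders for illustration, not measurements from scikit-learn_bench.

```python
def multithreaded_speedup(base_seconds_by_n_jobs, accelerated_seconds):
    """Speedup of the accelerated run over the best (fastest) base run."""
    best_base = min(base_seconds_by_n_jobs.values())
    return best_base / accelerated_seconds


# Placeholder timings in seconds, invented for illustration only.
# Keys are the n_jobs settings tried with the base scikit-learn.
base_timings = {1: 8.0, -1: 10.0}

speedup = multithreaded_speedup(base_timings, accelerated_seconds=2.0)
print(f"speedup over best base configuration: {speedup:.1f}x")
```

Taking the minimum over both settings means the base library is never penalized for a poor n_jobs choice.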

Figure 1. Multithreaded speedup of the accelerated scikit-learn over the base scikit-learn

Running the accelerated scikit-learn sequentially shows that many algorithms in the base scikit-learn have room for performance improvement, notably training and inference of SVC as well as training of LinearRegression (Figure 2).

Figure 2. Sequential speedup of the accelerated scikit-learn over the base scikit-learn

The pursuit of performance can sacrifice correctness if the developer isn’t careful. To ensure that the accelerated scikit-learn lives up to the high standards of the scikit-learn user community, the accelerated version is required to pass the scikit-learn test suite. A system to run these tests was developed in collaboration with scikit-learn core developers Olivier Grisel and Jérémie du Boisberranger.

Testing is done with the currently released scikit-learn and the current master sources. The status of these tests is displayed on the landing page of github.com/IntelPython/daal4py.

The testing is performed in CircleCI, so interested users have easy access to the testing logs for further inspection.

Special attention is paid to ensuring deterministic, mathematical equivalence between the accelerated scikit-learn and the base version. Mathematical equivalence means that the solutions obtained by both versions agree within the tolerance specification of the solver. This cross-checking has also produced feedback that improved scikit-learn’s own test suite, e.g., scikit-learn#12738, #12263, and #13992.
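To make the idea of agreement within a tolerance concrete, here is a minimal sketch using only the Python standard library. The helper name, the coefficient values, and the tolerance are hypothetical; the actual checks live in the scikit-learn test suite and depend on each solver's own tolerance.

```python
import math


def solutions_agree(base_solution, accelerated_solution, tol):
    """Return True if two solution vectors match element-wise within tol."""
    if len(base_solution) != len(accelerated_solution):
        return False
    return all(
        math.isclose(b, a, rel_tol=tol, abs_tol=tol)
        for b, a in zip(base_solution, accelerated_solution)
    )


# Hypothetical coefficients from the base and accelerated estimators.
base_coef = [0.5, -1.25, 3.0]
accel_coef = [0.50004, -1.24997, 3.0001]

# The two vectors count as equivalent at a solver tolerance of 1e-4.
agree = solutions_agree(base_coef, accel_coef, tol=1e-4)
print("equivalent within tolerance:", agree)
```

Tightening the tolerance well below the solver's own setting would make the same pair of solutions fail the check, which is why the comparison is tied to the solver tolerance rather than to exact equality.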

This collaboration has also given scikit-learn developers a better understanding of the performance of their implementation. For instance, @jeremiedbb completely refactored the k-means implementation using Cython to improve multithreaded scalability (scikit-learn#11950). This work is now part of the 0.23 release of scikit-learn.

To accelerate your own scikit-learn installation, you need to install daal4py. This is easy to do with the conda package manager, typically with a command such as:

conda install daal4py -c intel

pip users can install daal4py as follows:

pip install daal4py

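Once daal4py is installed, you can accelerate your scikit-learn installation (version >=0.19) in either of two ways. You can run your script through the daal4py module from the command line:

python -m daal4py my_application.py

which is great for running tests and for quick experimentation. You can also do so explicitly in your script by calling patch_sklearn from daal4py.sklearn.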
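The explicit approach might look like the sketch below. It relies on the patch_sklearn function in daal4py.sklearn; the enable_acceleration wrapper and its fallback to stock scikit-learn when daal4py is missing are assumptions added here for illustration.

```python
import importlib


def enable_acceleration():
    """Try to patch scikit-learn with daal4py; return True on success."""
    try:
        # Equivalent to `import daal4py.sklearn`, deferred so that the
        # script still runs when daal4py is not installed.
        daal4py_sklearn = importlib.import_module("daal4py.sklearn")
    except ImportError:
        return False
    # Swap supported scikit-learn estimators for DAAL-backed versions.
    daal4py_sklearn.patch_sklearn()
    return True


# Call this before importing estimators from scikit-learn.
accelerated = enable_acceleration()
print("scikit-learn acceleration enabled:", accelerated)
```

Patching before the estimators are imported keeps the rest of the script unchanged, so the same code runs with or without the acceleration.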

Patching is accompanied by an informational message confirming that the DAAL solvers for scikit-learn are enabled.

We invite you to try accelerating your scikit-learn workloads with daal4py and the Intel AI Analytics Toolkit to see the performance improvements for yourself.
