Optimizing Machine Learning Performance at NetSuite with GraalVM and NVIDIA GPUs

Alina Yurenko
Published May 13, 2020


In this blog post we explore how GraalVM Python and grCUDA were used to build fast and highly accurate machine learning models at NetSuite.

Creating a next generation recommendation system at NetSuite

NetSuite provides a set of cloud-based business management services encompassing ERP, Financials, CRM, and e-commerce for more than 19,000 organizations. The NetSuite technology stack is based on Java SE and Oracle Database, which is deployed in Oracle data centers all over the globe.

One of the most popular NetSuite products is SuiteCommerce, an e-commerce platform combining ERP, financials, order management, and more. The NetSuite engineering team has been working on its next-generation recommendation system, which targets a few different use cases. One is a general recommendation widget: a “Recommended for You” block of the kind you often see on e-commerce or video streaming platforms. Another use case is “searchandising,” which augments search results using ML algorithms.


Recommender systems typically provide recommendations based on either collaborative filtering or content-based filtering. Content-based filtering creates recommendations by comparing the attributes of the items with the user's profile. For example, if you select science fiction as your favorite movie genre, or have watched many science fiction movies in the past, your streaming service will likely recommend more movies of that genre.
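To make the content-based idea concrete, here is a minimal sketch in plain Python. The item names, the three genre features, and the function names are all hypothetical and chosen only for illustration; they are not taken from the NetSuite system. The sketch averages the feature vectors of what a user has watched into a profile, then ranks unseen items by cosine similarity to that profile.

```python
import math

# Hypothetical item features, ordered as (sci-fi, comedy, drama).
ITEM_FEATURES = {
    "space_opera": (1.0, 0.0, 0.2),
    "time_travel": (0.9, 0.3, 0.0),
    "romcom": (0.0, 1.0, 0.3),
    "courtroom": (0.0, 0.1, 1.0),
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_profile(watched):
    """Average the feature vectors of the items the user has watched."""
    vectors = [ITEM_FEATURES[name] for name in watched]
    return tuple(sum(column) / len(vectors) for column in zip(*vectors))


def recommend_by_content(watched, top_n=2):
    """Rank unseen items by similarity to the user's content profile."""
    profile = build_profile(watched)
    candidates = [name for name in ITEM_FEATURES if name not in watched]
    candidates.sort(key=lambda name: cosine(profile, ITEM_FEATURES[name]), reverse=True)
    return candidates[:top_n]


if __name__ == "__main__":
    print(recommend_by_content(["space_opera"], top_n=1))
```

A user who has only watched the sci-fi heavy "space_opera" gets "time_travel" ranked first, because its feature vector points in nearly the same direction as the profile.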

Collaborative filtering methods are based on gathering and analyzing information about user preferences and behavior from many users, and predicting what the user will like based on their similarity to others.
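As a rough illustration of the collaborative idea, the sketch below scores items a user has not seen by how similar the users who did interact with them are. The user names, items, and the choice of Jaccard similarity are assumptions made for this example only; production systems, including the one described in this post, use more scalable techniques.

```python
# Hypothetical implicit feedback: the set of items each user interacted with.
INTERACTIONS = {
    "ann": {"laptop", "mouse", "keyboard"},
    "bob": {"laptop", "mouse", "monitor"},
    "cat": {"desk", "chair"},
}


def jaccard(a, b):
    """Overlap between two sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_from_neighbours(user, top_n=3):
    """Score unseen items by the similarity of the users who chose them."""
    seen = INTERACTIONS[user]
    scores = {}
    for other, items in INTERACTIONS.items():
        if other == user:
            continue
        weight = jaccard(seen, items)
        for item in items - seen:
            scores[item] = scores.get(item, 0.0) + weight
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, score in ranked if score > 0][:top_n]


if __name__ == "__main__":
    print(recommend_from_neighbours("ann"))
```

Here "ann" shares two of four distinct items with "bob" and nothing with "cat", so only bob's extra item ("monitor") receives a positive score.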

To create a better recommendation system, the team adopted one of the most common collaborative filtering techniques: matrix factorization.

Matrix Factorization (Pulaparthi, N.V.)

Matrix factorization algorithms work by decomposing the user-item interaction matrix into the product of rectangular matrices of lower dimensionality (rank) that represent users and items. The version implemented by the NetSuite team uses implicit feedback, such as clicks and transactions. Once the factor matrices are computed, it’s easy to retrieve predictions for user-item relevancy (personal recommendations) or item-item similarity (helpful in cross-sell recommendations).
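The decomposition can be sketched in a few lines of plain Python. This is a deliberately simplified illustration, not the team's implementation: it fits a tiny made-up 0/1 interaction matrix with plain stochastic gradient descent on squared error, whereas implicit-feedback libraries typically use confidence-weighted variants. All names and hyperparameters (`factorize`, `rank`, `lr`, `reg`) are assumptions for this example.

```python
import math
import random


def predict(P, Q, u, i):
    """Predicted relevance of item i for user u: dot product of their factors."""
    return sum(p * q for p, q in zip(P[u], Q[i]))


def factorize(matrix, rank=2, epochs=500, lr=0.05, reg=0.01, seed=0):
    """Approximate a users x items matrix as P * Q^T using plain SGD."""
    rng = random.Random(seed)
    n_users, n_items = len(matrix), len(matrix[0])
    P = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(epochs):
        for u in range(n_users):
            for i in range(n_items):
                err = matrix[u][i] - predict(P, Q, u, i)
                for k in range(rank):
                    p_uk, q_ik = P[u][k], Q[i][k]
                    # Gradient step with L2 regularization on both factors.
                    P[u][k] += lr * (err * q_ik - reg * p_uk)
                    Q[i][k] += lr * (err * p_uk - reg * q_ik)
    return P, Q


def squared_error(matrix, P, Q):
    """Total squared reconstruction error over every cell of the matrix."""
    return sum(
        (matrix[u][i] - predict(P, Q, u, i)) ** 2
        for u in range(len(matrix))
        for i in range(len(matrix[0]))
    )


def item_similarity(Q, i, j):
    """Cosine similarity between two item factor vectors."""
    dot = sum(a * b for a, b in zip(Q[i], Q[j]))
    norm = math.sqrt(sum(a * a for a in Q[i])) * math.sqrt(sum(b * b for b in Q[j]))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    # Made-up implicit feedback: 1 = the user clicked or bought the item.
    R = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
    P, Q = factorize(R)
    print("error:", squared_error(R, P, Q))
    print("user 0 / item 0:", predict(P, Q, 0, 0))
    print("items 0 and 1:", item_similarity(Q, 0, 1))
```

After training, `predict` gives the user-item relevancy used for personal recommendations, and `item_similarity` compares item factors, which is the kind of lookup that cross-sell recommendations rely on.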

After deciding to go with the matrix factorization approach, the team started exploring prototypes in Java, Scala, Spark and Python, and narrowed the solution down to two libraries: Implicit by Ben Frederickson and LightFM by Maciej Kula.


Both of the libraries mentioned above are fast and work well, but they are written in Python, whereas the NetSuite stack is based on Java. The team decided to go with GraalVM, since it supports both Java and Python along with other languages. For Python, GraalVM offers high performance and easy, consistent language interoperability.

Another part of the solution is GPU acceleration using grCUDA — an open-source language binding that allows developers to share data between NVIDIA GPUs and GraalVM languages (R, Python, JavaScript), and also launch GPU kernels. The team implemented the performance critical components in CUDA for the GPU, and used grCUDA from Python to exchange data with the GPU and to invoke the GPU kernels.

Recommender system prototype architecture


To deploy the prototypes, the team used the following setup:

Model computation time for different implementations

The graph above compares computation time of two models: a Java model created with EJML, and a Python & grCUDA model. As you can see, the matrix factorization is significantly faster when using GraalVM and grCUDA.


If you’re exploring ways to use Python machine learning models in your applications, it’s worth checking out GraalVM Python and grCUDA for best performance.

As always, we would love to hear from you! If you have feedback or feature requests for our Python implementation or for GraalVM in general, please create an issue in our GitHub repository or talk to us on Twitter: @graalvm.

This project was first presented by Lukas Stadler and Radek Obořil at Devoxx Belgium. View the recording to learn more: https://www.youtube.com/watch?v=a1ZLEDN9BMc. It was also presented at NVIDIA’s GPU Technology Conference. Register for GTC Digital and view the session recording on-demand here: Simplifying GPU Access: A Polyglot Binding for GPUs with GraalVM [S21269].


