Advanced In-Database Analytics on the GPU

With Version 6.0, Kinetica introduces user-defined functions (UDFs), enabling GPU-accelerated data science logic to power advanced business analytics, on a single database platform.

User-defined functions (UDFs) enable compute as well as data-processing, within the database. Such ‘in-database processing’ is available on several high-end databases such as Oracle, Teradata, Vertica and others, but this is the first time such functionality has been made available on a database that fully utilizes the parallel compute power of the GPU on a distributed platform. In-database processing in Kinetica creates a highly flexible means of doing advanced compute-to-grid analytics.

This industry-first functionality stands to help democratize data science. Until now, organizations have typically needed to extract data to specialized environments to take advantage of GPU acceleration for data science workloads, such as machine learning and deep learning. Kinetica now makes it possible for sophisticated data science models to be developed and made available on the same database platform as is used for business analytics.

UDFs and the associated orchestration API enable data to be processed with custom code that can draw on the power of distributed GPUs. UDFs have direct access to CUDA APIs, and can take full advantage of the distributed architecture of Kinetica. Because Kinetica is designed from the ground up to take full advantage of the GPU, users have an advanced mechanism for distributed computation.

UDFs are able to receive filtered data, do arbitrary computations, and then save output to a separate table. Such computations might include linear interpolation, anomaly detection, clustering, regressions, or risk simulations such as Monte Carlo analysis. The brute-force parallel compute power of the GPU delivers fast response which makes it more suitable for interactive analytics and experimentation.

GPUs are particularly well suited for the types of vector and matrix operations found in machine learning and deep learning systems. With in-database processing, custom functions will be able to call machine learning/artificial intelligence libraries such as TensorFlow, BIDMach, Caffe, Torch and others to work directly on data within Kinetica.

The orchestration API is available with C++ and Java bindings. Additional languages, such as Python will be available soon.

Posted on 7wData.be.

One clap, two clap, three clap, forty?

By clapping more or less, you can signal to us which stories really stand out.