When to use DSSTNE?

So you have heard about DSSTNE and are wondering if it's a match for your machine learning problem? Let me help you then.

How to pronounce this?

Like “destiny” :) .

What is DSSTNE?

DSSTNE (Deep Scalable Sparse Tensor Network Engine) was recently open sourced by Amazon. It is a C++ library for training and running deep learning neural networks on laaarge sparse input vectors, using GPU hardware.

What kind of problems does DSSTNE solve?

Running deep learning on laaarge sparse input vectors across a cluster of GPU hardware.

In human language: assume you have many very large input records, most of whose values are empty, and you need to run a machine learning algorithm on this data in a reasonable time. In that situation your regular machine learning solution running on a regular CPU-based machine may be too slow for your needs. So you need a cluster of computers with GPUs, as GPUs are blazing fast when it comes to arithmetic operations. DSSTNE can do the heavy lifting for you :) .
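To make "many empty values" a bit more concrete, here is a minimal Python sketch. It is not DSSTNE code, and the names `to_sparse` and `sparse_dot` and the weights are made up for illustration. It stores only the non-zero entries of a record and computes a dot product that touches just those entries, which is the basic idea that makes sparse data cheaper to process.

```python
def to_sparse(dense):
    """Keep only the non-zero entries as {index: value}."""
    return {i: v for i, v in enumerate(dense) if v != 0}


def sparse_dot(sparse_vec, weights):
    """Dot product that visits only the stored (non-zero) entries."""
    return sum(v * weights[i] for i, v in sparse_vec.items())


record = [0, 0, 3.0, 0, 0, 0, 1.5, 0]   # mostly empty input record
weights = [0.5] * len(record)           # hypothetical layer weights

sparse = to_sparse(record)
print(sparse)                       # {2: 3.0, 6: 1.5}
print(sparse_dot(sparse, weights))  # 2.25
```

With eight values the saving is trivial, but with millions of features per record, skipping the zeros is what keeps memory and compute manageable.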

Is DSSTNE for me?

If you operate on really large, sparse feature vectors (i.e. vectors with many zeros and only a handful of values) and your Apache Spark ML solution doesn’t scale so well for this kind of data, you should consider using DSSTNE.
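If you are not sure how sparse your data really is, a quick check like the one below can help. This is only a sketch in plain Python. The `density` helper and the example numbers are hypothetical and have nothing to do with the DSSTNE API.

```python
def density(vector):
    """Fraction of entries that are non-zero (0.0 for an empty vector)."""
    if not vector:
        return 0.0
    nonzero = sum(1 for v in vector if v != 0)
    return nonzero / len(vector)


# Hypothetical example: 1,000,000 features, only 20 of them set.
features = [0] * 1_000_000
for i in range(20):
    features[i * 50_000] = 1

print(density(features))  # 2e-05
```

The closer this number is to zero across your records, the more your data looks like the kind of input described above.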

DSSTNE hello world

You can run DSSTNE as a Docker container (which hides all the twisted C++ library complexity from you) or as a regular C++ library.

I’m personally really impressed by how easy it is to run DSSTNE in Docker. That could be a perfect way to wire it into your Kubernetes infrastructure.

There is also Python support coming to DSSTNE, to make it more civilized ;) .

Server hosting with GPU support

I’m a DigitalOcean fanboy, but unfortunately DigitalOcean doesn’t support GPU droplets yet (I hope it's just "yet"). Please don’t cry however, as Amazon EC2 allows you to create GPU instances.

Wanna learn more?