When a deep learning network has been trained, the work of bringing it to production has only just begun. Here at MoonVision, different projects have very different production environments: some are cloud hosted, some run in the data centers of our customers, and others run as appliances on embedded industrial computers.

What do all these environments have in common? They run Docker!

But what about the differences? In some environments we have NVIDIA GPUs for CUDA hardware acceleration; in others we don't.

Why we re-engineered our container build pipeline

As you might know, getting the most out of CUDA requires CUDA-specific versions of various tools and libraries. In the past, we solved that by maintaining two versions of most of our Dockerfiles: one with CUDA acceleration and one for CPU-only execution. …
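One common way to collapse such duplicated Dockerfiles into a single file is Docker's build arguments, which let the base image be chosen at build time. Below is a minimal sketch of that idea; the image names, paths, and file layout are illustrative placeholders, not our actual pipeline.

# Choose the base image at build time; defaults to a CPU-only image.
ARG BASE_IMAGE=ubuntu:22.04
FROM ${BASE_IMAGE}

# One shared set of build steps for both variants. A real pipeline
# would pin versions and select CUDA-enabled wheels when the base
# image provides CUDA.
RUN apt-get update \
    && apt-get install -y --no-install-recommends python3 python3-pip \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY requirements.txt .
RUN pip3 install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python3", "serve.py"]

Both variants are then built from the same file:

# CPU-only image
docker build -t vision-service:cpu .

# CUDA-accelerated image, same Dockerfile
docker build --build-arg BASE_IMAGE=nvidia/cuda:11.8.0-runtime-ubuntu22.04 \
    -t vision-service:cuda .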


Not so long ago, after training a model with a deep learning framework you were locked into using that same framework for inference as well. Considering that the inference environment often differs drastically from the training environment, this is likely not what you want. You feel the pain when you use cloud platforms with GPU clusters for training and want to deploy the resulting models on low-powered industrial PCs, on smartphones, or in the web. In those cases, using the same framework during inference as during training might not always be the best option. …
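One way out of that lock-in is exporting the trained model to an interchange format such as ONNX, so inference no longer depends on the training framework. The following is a minimal sketch assuming PyTorch for training and ONNX Runtime for inference; the resnet18 model and the input shape are stand-ins, not a real production model.

import numpy as np
import torch
import torchvision
import onnxruntime as ort

# A trained network; torchvision's resnet18 stands in for a real model.
model = torchvision.models.resnet18().eval()

# Export to ONNX, using a dummy input to fix the input shape.
dummy_input = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy_input, "model.onnx",
                  input_names=["input"], output_names=["output"])

# Inference with ONNX Runtime needs neither PyTorch nor a GPU;
# a CUDA execution provider can be selected where one is available.
session = ort.InferenceSession("model.onnx",
                               providers=["CPUExecutionProvider"])
image = np.random.randn(1, 3, 224, 224).astype(np.float32)
(scores,) = session.run(None, {"input": image})
print(scores.shape)  # (1, 1000)

The exported model file can then ship in a lightweight CPU-only container, while training stays on whatever GPU cluster produced it.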

Jakob Klepp
