Running Deepy Locally on WSL2 in Windows 11

Daniel Kornev
Jul 7


In this post I’ll share my experience running Deepy, our multiskill AI Assistant platform, on a local PC running Windows 11 with WSL2.

Last week my trusty Lenovo ThinkPad X1 Yoga 2nd Gen showed its age (it was bought in December 2017 and has been used almost non-stop across Russia and the US ever since), and I decided to move to a new PC. This time, given the pandemic and the work-from-home situation, my goal was to have a solid rig without compromises. The dream is to be able to host our AI Assistants/socialbots at least partially locally.

To give some perspective: at DeepPavlov, we have a DGX cluster with VMs running a number of NVIDIA GTX 1080 Ti GPUs. For last year's DREAM AI Assistant Demo I used one such VM with 3 GPUs; back then it was more or less enough to host the entire thing. We also have PCs with an i7-7700, 32GB RAM, and the same GPUs. At home, my X1 Yoga was equipped with a regular NVIDIA GTX 1080 GPU: a similar setup, but with a non-Ti GPU and just 16GB of RAM.

And so, my new rig is a PC (the fifth in my entire life) with an AMD Ryzen 7 5800X (16 virtual cores), 64GB RAM, the same NVIDIA GTX 1080 that was previously installed in the Thunderbolt 3 dock, and a 2TB Samsung NVMe SSD. This rig has the potential to host up to 4 GPUs (3 PCIe x16 slots + 2 Thunderbolt 3 ports enabled by the GIGABYTE VISION D-P motherboard) and up to 128GB of RAM.

This rig is currently running under Windows 11 (22000.51 at the time of writing). What I want is to be able to run our AI assistant platform locally (at least partially) and experience what you, our developers, would experience.

Note: There is no plan to run it under Ubuntu, as there are some desktop apps (mainly Microsoft Office and Visual Studio) that won't run there.

The key thing is that Deepy, as well as our DREAM AI Assistant Platform Demo and other socialbots, uses our own and third-party GPU-heavy NLP models. It makes sense to run them on my 1080 instead of the CPU.

Why WSL2?

Until recently, it was impossible in Windows to pass your host’s GPU into the virtual machines, but now this is a standard functionality of Windows 11.

As a side effect, this also allows you to run Linux GUI apps, too.

To run our platform under Windows 11, we can therefore use one of at least three possible approaches:

  • Full-blown VM (running in Hyper-V)
  • Building on Windows
  • WSL2

While it might be interesting to build our platform natively on Windows, given that it would involve checking compatibility for all of the components, it makes more sense to run things natively in Linux (e.g., Ubuntu 18.04), which we use in our data center anyway.

It turns out that if you want to use your GPU in a Hyper-V VM, you have to explicitly dismount it from your host machine and mount it in the VM. This is not an option, as my rig has only one GPU, and the host can't live without it.

Therefore, WSL2 with its host GPU sharing is the way to go. Below you can find detailed instructions enabling you to get Deepy up and running inside WSL2 on a Windows 11 machine with a compatible NVIDIA GPU.

Step-By-Step Instructions


You need admin rights to your machine. Use Windows Terminal (or any other terminal app of your choice) to begin installation.

Step 1: Install Windows Subsystem for Linux:
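The classic way is to enable the feature with DISM from an elevated terminal (on newer builds, `wsl --install` does all of this in one go):

```shell
dism.exe /online /enable-feature /featurename:Microsoft-Windows-Subsystem-Linux /all /norestart
```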

Step 2: Install Virtual Machine Platform (required for WSL2):
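Again via DISM from an elevated terminal:

```shell
dism.exe /online /enable-feature /featurename:VirtualMachinePlatform /all /norestart
```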

Step 3: Download the latest Linux kernel update: AMD64

Step 4: Make sure your WSL is set as WSL2 by default:
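A single command takes care of this; any distribution you install afterwards will run under WSL2:

```shell
wsl --set-default-version 2
```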

Step 5: Pick your Linux distribution from the Microsoft Store, e.g., Ubuntu 18.04:

  • Ubuntu 18.04 LTS (other distributions are listed here; you'll need a glibc-enabled distribution to use the NVIDIA CUDA-enabled driver in WSL, e.g., Ubuntu or Debian)

Step 6: Download and install the preview GPU driver:

NVIDIA CUDA-enabled driver for WSL

Step 7 (Optional): Use NVIDIA GeForce Experience to upgrade your driver to the latest version.

Step 8: Make sure your WSL2 Linux kernel version is 4.19.121 or higher:
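From inside your WSL2 distribution, you can check the running kernel version with:

```shell
uname -r
```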

If it’s not, use Windows Update to get the latest version.

Step 9: Follow the instructions from NVIDIA to install the NVIDIA CUDA Toolkit. I've included them here for convenience, but feel free to follow the link above:
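At the time of writing, NVIDIA's guide for Ubuntu 18.04 boiled down to roughly the following repository setup (treat the key and repo URLs as a snapshot of that guide; check the link above for the current ones):

```shell
# Fetch NVIDIA's repository signing key and register the CUDA repo for Ubuntu 18.04
sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
sudo sh -c 'echo "deb http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 /" > /etc/apt/sources.list.d/cuda.list'
sudo apt-get update
```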

Step 10: Install CUDA. We use CUDA 10 in the current version of the DeepPavlov Library, so use this command:
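A sketch assuming the CUDA 10.2 toolkit package; adjust the version suffix if you need a different CUDA 10 release:

```shell
sudo apt-get install -y cuda-toolkit-10-2
```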

Step 11: Make sure your CUDA apps can run using your host NVIDIA GPU:
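One quick check, borrowed from NVIDIA's CUDA-on-WSL guide, is to build and run one of the bundled CUDA samples (the path may differ slightly between toolkit versions):

```shell
cd /usr/local/cuda/samples/4_Finance/BlackScholes
sudo make
./BlackScholes
```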

You should get an output like this:

If this is what you see (your numbers will be different), this is great! It means you are now ready to move forward with NVIDIA Docker installation.

Step 12: Install Docker CE
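Docker's convenience script works fine inside WSL2:

```shell
curl https://get.docker.com | sh
```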

Important: currently, Docker Desktop's WSL2 backend isn't supported by the NVIDIA Container Toolkit. However, I wouldn't advise using Docker Desktop anyway.

Step 13: Install NVIDIA Container Toolkit (you can read their instructions here):

Set up the stable and experimental repositories and the GPG key. The changes to the runtime that support WSL 2 are available in the experimental repository:
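Per the NVIDIA Container Toolkit instructions at the time (the URLs below are a snapshot of that guide):

```shell
# Detect the distribution (e.g., ubuntu18.04), then register the repos and GPG key
distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
curl -s -L https://nvidia.github.io/libnvidia-container/experimental/$distribution/libnvidia-container-experimental.list | sudo tee /etc/apt/sources.list.d/libnvidia-container-experimental.list
```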

Install the NVIDIA runtime packages (and their dependencies) after updating the package listing:
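```shell
sudo apt-get update
sudo apt-get install -y nvidia-docker2
```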

Open a separate WSL 2 window and start the Docker daemon again using the following commands to complete the installation.
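Since WSL2 has no systemd, the daemon is managed via the service scripts:

```shell
sudo service docker stop
sudo service docker start
```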

Step 14: Check that your NVIDIA Docker containers can run on your WSL2 machine:

In this example, let’s run an N-body simulation CUDA sample. This example has already been containerized and is available from NGC.
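The sample container can be pulled and run straight from NGC (this is the command from NVIDIA's guide):

```shell
docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
```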

From the console, you should see an output as shown below.


Run Deepy on your new WSL2-based Linux environment

Finally, it’s time to try out Deepy!

Step 1: Clone Deepy’s repository
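A sketch, assuming the repository lives under DeepPavlov's deepmipt organization on GitHub (check Deepy's docs for the canonical URL):

```shell
git clone https://github.com/deepmipt/deepy.git
cd deepy
```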

Step 2: Build and run it
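Deepy is distributed as a set of Docker containers, so a minimal sketch, assuming a docker-compose configuration at the repository root, is:

```shell
docker-compose up --build
```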

Step 3: Try it out!

Once the whole thing has been downloaded, built, and run, you can play with Deepy on your machine.

Experiment With Deepy

There are several ways to play with Deepy:

  • via its REST APIs, available locally in this case (see the docs here)
  • via Deepy 3000 (our small UWP app originally built for our talk at NVIDIA GTC Fall 2020)
  • via our website running locally on your machine

For brevity, I’ll explain here how to get our demo website up and running:

Step 1: Clone demo2’s repository:
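Assuming the demo2 repository also lives under the deepmipt organization (check the wiki for the canonical URL):

```shell
git clone https://github.com/deepmipt/demo2.git
```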

Step 2: Install node.js (below are instructions taken from here):

Enable the NodeSource repository by running the following curl command as a user with sudo privileges:
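Node.js 14.x was the current LTS at the time; substitute the release you need:

```shell
curl -sL https://deb.nodesource.com/setup_14.x | sudo -E bash -
```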

The command will add the NodeSource signing key to your system, create an apt sources repository file, install all necessary packages, and refresh the apt cache.

Once the NodeSource repository is enabled, install Node.js and npm by typing:
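```shell
sudo apt install nodejs
```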

The nodejs package contains both the node and npm binaries.

Step 3: Install Yarn:

Enable Yarn repository:
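Import the Yarn repository's GPG key first:

```shell
curl -sL https://dl.yarnpkg.com/debian/pubkey.gpg | sudo apt-key add -
```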

Add the Yarn APT repository to your system’s software repository list by typing:
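```shell
echo "deb https://dl.yarnpkg.com/debian/ stable main" | sudo tee /etc/apt/sources.list.d/yarn.list
```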

Once the repository is added to the system, update the package list, and install Yarn, with:
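```shell
sudo apt update
sudo apt install yarn
```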

Step 4: Edit URI used to access Deepy bot:

In the editor, change URI to

Save and close the editor (Ctrl+X, Y).

Step 5: Run the website:

Go to the root of your cloned demo2 repository:
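A sketch, assuming you cloned demo2 into your home directory and the site uses yarn's standard scripts:

```shell
cd ~/demo2
yarn install
yarn start
```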

Once everything is built, you should see this:


Step 6: Open your copy of our demo2 website in the browser:


This should show you Deepy 3000 web UI:

Click on “Agree”, and you’re good to go!

Now that you’ve got your system up and running, you can follow Deepy’s wiki to learn more about how to build your own skills and annotators for your own Multiskill AI Assistant!

Best of luck and let us know what you’ve built with Deepy!
