Orchestrate Your Virtual Machines with Proxmox
In most cases, a container has been more than sufficient for developing or running the applications I need. It really has. But in some cases, mostly due to my own lack of ingenuity, there have been unavoidable circumstances where I had no choice but to go beyond the container environment. Think of a case where you cannot afford another bare-metal machine but absolutely need something as close to one as possible, e.g. for tweaking the boot sequence or for foolproof use of systemctl. Then going for a VM feels like the only choice, and frankly, for me it really was (though, again, the advance of the container ecosystem is making it more and more of an edge case where you absolutely have no choice other than a VM).
Some of the not-so-rare use cases might be when you have to spawn VMs on incoming requests, or distribute the same (or close enough) snapshot of a machine to many people.
However, for me personally, the most familiar and truly meaningful (in other words, fun) use case of a VM is that I can have as many absolutely disposable and manipulable machines at my fingertips as the host system allows, while staying away from affecting my host system configuration or even damaging it irreversibly.
The bottom line is: a VM is one of the greatest logical toys you can ever have on your host machine.
For that purpose, I know VirtualBox on Windows and Virt-Manager (QEMU/KVM) on Linux serve users handsomely.
However, besides VirtualBox being unable to handle PCI passthrough and QEMU/KVM not allowing easy bridging on a wireless interface, there can be some considerably heavy lifting if you want not only VMs but VMs across multiple host machines (VM orchestration, so to speak).
Today’s article is about Proxmox, a great tool that serves that purpose very well.
As you can see in the picture below, I chose to go with a live-media installation.
You might have to spend some time figuring out which one of the entries best suits your hardware. For me, it was the third one, with nomodeset as a kernel parameter.
You might prefer the handsome-looking graphical installer, but this one was my only choice at the time. Not that I’m complaining, but it was.
Fill out some intuitive and obvious fields and you’re done. The Proxmox installer will handle everything necessary for you.
After the installation is completed and the machine reboots, access the admin panel, which is available at https://machine_ip_address:8006 by default. This is what you get.
Let’s create our first VM. If you’re familiar with QEMU or Virt-Manager, make yourself at home; the interface won’t feel like a first encounter.
But first, we obviously need an ISO image of the guest OS to install onto our VM.
Be glad, because you can download the guest OS ISO directly by providing Proxmox with the download URL.
Now you can add the ISO.
At this stage, if you just want to try out how it works, and whether it works fine as a simple VM with no particular attachments or peripherals, just make some obvious clicks through the wizard. Then you get this: a fully functioning VM that is accessible from the web console or through an SSH session (if installed).
Now we’re going to import a VM created somewhere else, namely in VirtualBox on Windows. It doesn’t matter much which configuration you choose when exporting the VM as an OVA file, because with a recent Debian/Ubuntu release you are likely to face a weird uninitialized-network-interface problem either way.
Let’s move it to our Proxmox host.
When extracted, these are its contents.
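An OVA file is just a tar archive, so you can unpack it right on the Proxmox host; the file name below is purely a placeholder for your own export.

```
# An OVA is a tar archive: unpack it to get the .ovf descriptor and the .vmdk disk image.
# "ubuntu-server.ova" is a placeholder name; use your own export.
tar -xvf ubuntu-server.ova
ls -lh
```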
Now we’re going to create another VM, but this time with the just-imported Ubuntu Server image.
It’s important to note (obvious, but something an idiot like myself sometimes overlooks) that an installation medium is no longer needed.
At this stage, what you have to do differently, compared to the previous case where you installed the OS from scratch, is to correctly set the boot order, since the already-created root filesystem and boot partition lie in the disk image file. To do this, first remove the disk device that is useless in this case.
Go into the Proxmox host terminal (by whatever means available; I used SSH) and run the command below with the correct VM ID and disk image file name.
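The command in question is Proxmox’s qm importdisk. Here is a minimal sketch, assuming VM ID 101, a disk file named ubuntu-server-disk001.vmdk, and a storage called local-lvm; all three are placeholders for your own values.

```
# qm importdisk <vmid> <disk-image> <target-storage>
# The VM ID, file name, and storage name are assumptions; adjust them to your setup.
qm importdisk 101 ubuntu-server-disk001.vmdk local-lvm
# The disk then shows up as an "Unused Disk" in the VM's Hardware tab.
```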
It will attach the newly imported disk to our VM. I forgot to save the edit page where you make the attached device usable, but you don’t have to worry about it, because it is (like the typical Proxmox interface) intuitive and easy.
Edit the boot order as just mentioned.
Now, when you boot it up, you will likely face an empty, useless network interface, as in the screenshot below.
In this case, what you have to do (speaking from my own experience) is check the file in the default network-interface initialization path.
As you can see in the screenshot below, the target interface name should be corrected to make networking work as expected.
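On a recent Ubuntu Server, that file is usually a netplan YAML under /etc/netplan/; the exact file name varies by install, so treat the one below as an assumption. The fix is simply renaming the interface (e.g. the VirtualBox-era enp0s3) to whatever ip link shows inside the new VM, then applying the change.

```
# Find the actual interface name inside the VM (e.g. ens18 on Proxmox with virtio).
ip link

# Edit the netplan config; the file name here is an assumption, check /etc/netplan/ yourself,
# and replace the old interface name (e.g. enp0s3) with the new one (e.g. ens18).
sudo nano /etc/netplan/00-installer-config.yaml

# Apply the change.
sudo netplan apply
```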
As mentioned at the start of this article, another salient feature of Proxmox is that it can bind multiple machines together to roll out a sort of VM farm on a cluster. The one above is the original Proxmox host, and the one below is the machine being added.
Keep going along as with the first one, but sometimes you might not yet have activated VT-x and VT-d (on Intel), or AMD-V and AMD-IOMMU. If that’s the case, restart the machine and get into the BIOS or UEFI to activate them. Don’t skip the part where you activate not only VT-x or AMD-V but also the others, since it becomes a critical issue in a later stage when you try to make your VM GPU-capable.
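A quick, generic way to check whether the virtualization extensions are actually exposed to the OS is to look for the vmx (Intel) or svm (AMD) CPU flags:

```
# A non-zero count means VT-x (vmx) or AMD-V (svm) is enabled and visible to the kernel.
egrep -c '(vmx|svm)' /proc/cpuinfo
```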
On the first machine, click on Create Cluster and Proxmox will handle the initialization. But before going further, do note that Proxmox manages a cluster with a quorum-based system (Corosync), so it is advised to have an odd number of machines, say 3, 5, 7… (in this article, though, all I had available was two machines, which is practically no different from having one machine in terms of fault tolerance).
On the first machine after the cluster creation, copy the join information.
Now, on the second machine, use the Join Cluster button.
You can see now you have two nodes in a cluster.
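If you prefer the terminal, you can confirm the membership with pvecm, Proxmox’s cluster tool:

```
# Show cluster name, quorum information, and the list of member nodes.
pvecm status
pvecm nodes
```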
From now on, we’re going to see how to configure a VM to see and use a GPU plugged into the host machine, using the technique called PCI passthrough.
On the host machine, assuming you have enabled VT-d or AMD-IOMMU, open the grub configuration.
Now, a long line of kernel parameters is listed below; it worked for my case, but you might have to tweak it or even add to it. Don’t be afraid to experiment, because these parameters alone can hardly break the system.
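My exact parameter line isn’t reproduced here, but a common baseline for an Intel host looks like the sketch below (on AMD, the IOMMU is usually on by default and iommu=pt is the interesting part); treat it as an assumption to adapt, not a drop-in.

```
# /etc/default/grub -- a typical IOMMU-enabling baseline for an Intel host (assumption, adapt as needed)
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"
```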
If done editing, update the grub configuration.
Reboot after that.
Now add the kernel modules below and update the initramfs.
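The modules in question are the vfio family; a minimal sketch of the usual additions to /etc/modules, followed by the initramfs update:

```
# Append the vfio modules to /etc/modules so they load at boot.
# (On newer kernels, vfio_virqfd is built into vfio and can be omitted.)
cat <<'EOF' >> /etc/modules
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
EOF

# Rebuild the initramfs for all installed kernels.
update-initramfs -u -k all
```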
Reboot again.
Using lspci -nn (or any variant with the -nn option), check the device ID.
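For example, to list the GPU along with its vendor:device pair (the bracketed IDs are what you will need in the next step):

```
# Filter for the GPU; the bracketed [vendor:device] pair, e.g. [10de:xxxx], is the ID you need.
lspci -nn | grep -iE 'vga|nvidia'
```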
After that, execute the lines below. What they do, basically, is isolate the host GPU and prevent it from being initialized and used by the host kernel, because that would cause the VM to fail to access the device properly; in this case, only one client can use the GPU at a time.
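A minimal sketch of those lines, assuming an NVIDIA card and the placeholder IDs 10de:xxxx (GPU) and 10de:yyyy (its audio function); substitute the IDs you found with lspci -nn.

```
# Tell vfio-pci to claim the GPU (and its audio function) at boot.
# The IDs below are placeholders; use your own from `lspci -nn`.
echo "options vfio-pci ids=10de:xxxx,10de:yyyy" > /etc/modprobe.d/vfio.conf

# Keep the host's GPU drivers from grabbing the card first.
echo "blacklist nouveau" >> /etc/modprobe.d/blacklist.conf
echo "blacklist nvidia" >> /etc/modprobe.d/blacklist.conf

# Apply and reboot.
update-initramfs -u -k all
```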
There are a few essential commands for checking whether all the steps for PCI passthrough are correctly in place.
After updating the grub configuration and rebooting, use the command below to check whether IOMMU is working correctly.
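A common way to check is to grep the kernel log for IOMMU-related messages:

```
# Look for "DMAR: IOMMU enabled" (Intel) or "AMD-Vi" messages (AMD).
dmesg | grep -e DMAR -e IOMMU -e AMD-Vi
```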
After the vfio module addition, the initramfs update, and a reboot, use the command below to check whether everything is working correctly.
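Likewise, grepping the kernel log for vfio shows whether vfio-pci came up and claimed the device:

```
# Expect lines like "vfio-pci 0000:01:00.0: ..." if the module grabbed the GPU.
dmesg | grep -i vfio
```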
If everything is fine, you can see that the GPU device is now properly bound to the vfio-pci module. Use lspci -nnk or -nnv to find out.
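For instance, querying just the GPU’s slot (01:00.0 here is an assumption, use your own address):

```
# "Kernel driver in use: vfio-pci" is the line you want to see.
lspci -nnk -s 01:00.0
```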
Even if everything checks out so far, the last part of setting up a GPU-attached VM should be handled with great care, since the inner workings involved were not primarily intended for this kind of use.
Make sure your GPU is not the primary graphics renderer on your host.
Let’s create a VM. The steps to create it are a little different this time, as shown below.
Other settings might work, but I can’t guarantee it.
Set the display to the one shown below.
Do not try to attach the PCI device yet. First, boot up the machine and immediately hit ESC to enter the device manager. Here, personally, I like to disable Secure Boot so that adding a GPU driver to the VM later becomes easier.
Now, once it has booted, edit the blacklist so that the VM (also) cannot use the GPU as a graphics rendering device.
Update the initramfs inside the VM.
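A minimal sketch of both steps inside the VM, assuming an NVIDIA card and the stock nouveau driver as the thing to keep away from the GPU:

```
# Inside the VM: stop the open-source driver from binding to the passed-through GPU.
echo "blacklist nouveau" | sudo tee -a /etc/modprobe.d/blacklist.conf
echo "options nouveau modeset=0" | sudo tee -a /etc/modprobe.d/blacklist.conf

# Rebuild the initramfs so the blacklist takes effect on the next boot.
sudo update-initramfs -u
```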
Turn off the machine; now is the time to add the PCI device.
Do note that even if you have more than one available GPU sub-device (more than one “function”), adding only the main GPU device works more reliably.
Now, when it boots up again, the device will be visible to the lspci command. So the natural next step is installing the driver! (There might be a prettier approach for your GPU, but since mine is a somewhat outdated model, I had to download the driver directly from the download page.)
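In my case that meant the classic .run installer; the file name below is purely a placeholder for whatever your GPU model’s download page gives you.

```
# Build tools and kernel headers are needed for the kernel module;
# the driver file name is a placeholder, use the one you actually downloaded.
sudo apt install -y build-essential linux-headers-$(uname -r)
chmod +x NVIDIA-Linux-x86_64-xxx.xx.run
sudo ./NVIDIA-Linux-x86_64-xxx.xx.run
```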
A beautiful mix of colors, isn’t it?
Now, finally, we can use the host GPU from inside the VM.
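The quickest sanity check from inside the VM:

```
# Should list the passed-through GPU along with driver and CUDA versions.
nvidia-smi
```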
You can check that PyTorch is also working (obviously).
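A one-liner is enough for that check:

```
# Prints True if PyTorch can see the passed-through GPU.
python3 -c "import torch; print(torch.cuda.is_available())"
```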
Now, if you want to make sure this works every time, in all cases including not only a VM reboot but also a host reboot, use the following kind of script and add it to a crontab @reboot entry.
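My original script isn’t reproduced here; below is a minimal sketch of what such a script could look like using Proxmox’s qm tool, with the VM ID, PCI address, and script path all being assumptions to replace with your own.

```
#!/bin/bash
# Hypothetical /root/start-gpu-vm.sh: re-attach the GPU and start the VM after a host reboot.
# VMID and GPU_ADDR are placeholders; adjust them to your own setup.
VMID=100
GPU_ADDR=01:00.0

sleep 30                                       # give the Proxmox services time to come up
qm set "$VMID" -hostpci0 "${GPU_ADDR},pcie=1"  # (re)attach the GPU; pcie=1 assumes a q35 machine type
qm start "$VMID"
```

```
# crontab entry on the host (crontab -e):
@reboot /root/start-gpu-vm.sh
```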
In addition, my personal step to ensure stability when booting up a GPU-attached VM is to always delete the PCI device after shutting the VM down, and, after rebooting the host, to reattach the PCI device and boot the VM up again.
Let’s just say it’s an idiot’s method of clearing out all the uncertainty that might be lurking along the way, but it works anyway.
Thanks!