Image copyright ibmphoto24 CC-BY-NC-ND

KVM + iSCSI Part II: PCI-Passthrough a Virtual Function

Prerequisites

First, work through Part I to set up your iSCSI target. To follow this tutorial, you will need a few additional things:

Set up the Virtual Function

First, make sure that virtual functions are enabled. PARENT is the name of the physical device that supports virtual functions. The following command works for Intel NICs; for other cards, consult the manual to find out how to set up virtual functions:

PARENT=ens2f0 # make sure to replace this with your NIC
echo 4 | sudo tee /sys/class/net/$PARENT/device/sriov_numvfs
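Before writing to sriov_numvfs, it can help to confirm that the card advertises at least as many virtual functions as you are asking for. The following is a minimal sketch, assuming the kernel exposes a sriov_totalvfs file next to sriov_numvfs, as it does for SR-IOV capable PCI devices. The helper name can_enable_vfs is hypothetical and not part of any tool.

```shell
# Hypothetical helper: succeed only if the device directory advertises
# at least the requested number of virtual functions.
# $1 = device directory (e.g. /sys/class/net/$PARENT/device)
# $2 = number of virtual functions you intend to enable
can_enable_vfs() {
    local dev_dir=$1 want=$2 total
    # sriov_totalvfs is missing when the device or driver lacks SR-IOV support
    [ -r "$dev_dir/sriov_totalvfs" ] || return 1
    total=$(cat "$dev_dir/sriov_totalvfs")
    [ "$want" -le "$total" ]
}
```

For example, `can_enable_vfs /sys/class/net/$PARENT/device 4 || echo "cannot enable 4 VFs"` reports a problem before the write to sriov_numvfs is attempted.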

Now set a few bash variables for the next series of commands:

PCI=$(basename $(readlink /sys/class/net/$PARENT/device/virtfn0))
DRIVER=$(lspci -v -s $PCI | grep modules | awk '{print $NF}')
VENDOR=$(cat /sys/class/net/$PARENT/device/virtfn0/vendor)
DEVICE=$(cat /sys/class/net/$PARENT/device/virtfn0/device)
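The vendor and device files in sysfs hold hexadecimal IDs with a 0x prefix, while the later commands in this tutorial need the bare digits and obtain them with the ${VENDOR##*x} and ${DEVICE##*x} expansions. The sketch below shows what that expansion does. The function name strip_hex_prefix is hypothetical, and the IDs are example values that may differ from your card's.

```shell
# Hypothetical helper: remove everything up to and including the last "x",
# turning a sysfs ID such as 0x8086 into the bare hex digits 8086.
strip_hex_prefix() {
    printf '%s\n' "${1##*x}"
}

strip_hex_prefix 0x8086   # prints 8086 (Intel's PCI vendor ID)
strip_hex_prefix 0x10ed   # prints 10ed (example device ID; yours may differ)
```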

Next, bind the virtual function to the vfio-pci driver:

sudo modprobe vfio_pci
echo $PCI | sudo tee /sys/bus/pci/drivers/$DRIVER/unbind
echo ${VENDOR##*x} ${DEVICE##*x} \
| sudo tee /sys/bus/pci/drivers/vfio-pci/new_id
echo $PCI | sudo tee /sys/bus/pci/drivers/vfio-pci/bind
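To check that the binding took effect, you can inspect the symlinks that the kernel keeps in the PCI device directory. This is a minimal sketch; the helper names bound_driver and iommu_group are hypothetical, and both take the device directory as an argument so they can be pointed at /sys/bus/pci/devices/$PCI.

```shell
# Hypothetical helpers; $1 is a PCI device directory
# such as /sys/bus/pci/devices/$PCI

# Print the name of the driver currently bound to the device.
bound_driver() {
    # "driver" is a symlink to the bound driver and is absent when unbound
    [ -L "$1/driver" ] || return 1
    basename "$(readlink "$1/driver")"
}

# Print the number of the IOMMU group the device belongs to.
iommu_group() {
    # "iommu_group" is a symlink that only exists when an IOMMU is active
    [ -L "$1/iommu_group" ] || return 1
    basename "$(readlink "$1/iommu_group")"
}
```

If everything worked, `bound_driver /sys/bus/pci/devices/$PCI` should print vfio-pci, and `iommu_group /sys/bus/pci/devices/$PCI` should print a number N for which /dev/vfio/N exists. If the second call fails, the IOMMU is probably not enabled in the firmware or on the kernel command line.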

Reconfigure the Virtual Network

Put the virtual function onto its own VLAN so that you don’t have to worry about hypervisor traffic:

sudo ip link set dev $PARENT vf 0 vlan 10
sudo ip link add vlan10 link $PARENT type vlan id 10
sudo ip link set dev vlan10 up
sudo ip link set dev vlan10 master virbr0

Move the hypervisor IP from the bridge onto the VLAN device:

sudo ip addr del 192.168.10.1/24 dev virbr0
sudo ip addr add 192.168.10.1/24 dev vlan10
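If you need to back out of this network configuration, the steps can be reversed. The following is a sketch rather than a tested procedure: the function names run and undo_vlan_setup and the DRY_RUN variable are hypothetical, and the addresses and device names are the ones used in this tutorial. With DRY_RUN=1 the commands are only printed, which lets you review them first.

```shell
# Hypothetical wrapper: print the command when DRY_RUN=1, otherwise run it as root.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        sudo "$@"
    fi
}

# Hypothetical teardown; $1 is the parent NIC (the PARENT variable above).
undo_vlan_setup() {
    local parent=$1
    # Deleting vlan10 also removes its address and detaches it from virbr0
    run ip link del vlan10
    # VLAN 0 turns tagging off again for virtual function 0
    run ip link set dev "$parent" vf 0 vlan 0
    # Put the hypervisor address back on the bridge
    run ip addr add 192.168.10.1/24 dev virbr0
}
```

Calling `DRY_RUN=1 undo_vlan_setup $PARENT` prints the three commands without changing anything.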

You may find that traffic does not pass between the virtual function and the hypervisor VLAN device. If this occurs, you can tell the NIC to transmit packets to the switch and let the switch loop them back:

sudo bridge link set dev $PARENT hwmode vepa

Note that this only works if the switch is set up to carry VLAN 10 and port security is turned off.

Launch the Guest

At this point you should be able to pass the virtual function through to the guest:

sudo qemu-system-x86_64 \
-smp cpus=2 \
-display vnc=0.0.0.0:0 \
-boot order=n \
-net none \
-device vfio-pci,host=$PCI

If you connect to the VNC console, however, you will notice that iPXE is absent. There is no prebuilt iPXE ROM for your network device.

Building iPXE

You’re going to have to build iPXE from source, so grab it:

git clone git://git.ipxe.org/ipxe.git
cd ipxe

Since you are building it yourself, you can embed a script to save you from catching the iPXE prompt and typing commands manually:

cat > script.ipxe << "EOF"
#!ipxe
ifopen net0
set net0/ip 192.168.10.10
set net0/netmask 255.255.255.0
sanboot iscsi:192.168.10.1::::iqn.2016-01.com.example:cirros
EOF
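If you boot several guests this way, each one needs its own address and target, so it may be convenient to generate the script from parameters. This is a minimal sketch; the function name make_ipxe_script is hypothetical. It starts the output with the #!ipxe line that iPXE uses to recognise a script, and otherwise emits the same commands as the script above.

```shell
# Hypothetical helper: print an iPXE script for a statically addressed guest.
# $1 = guest IP, $2 = netmask, $3 = iSCSI target IP, $4 = target IQN
make_ipxe_script() {
    cat << EOF
#!ipxe
ifopen net0
set net0/ip $1
set net0/netmask $2
sanboot iscsi:$3::::$4
EOF
}

# Prints the script for the guest used in this tutorial
make_ipxe_script 192.168.10.10 255.255.255.0 192.168.10.1 \
    iqn.2016-01.com.example:cirros
```

Redirect the output to script.ipxe to use it in the build step.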

Build a ROM with the embedded script, targeted at the specific virtual function:

ID=${VENDOR##*x}${DEVICE##*x}
export EMBED=$PWD/script.ipxe
(cd src && make bin/$ID.rom)

If you get an error, iPXE may not have support for your NIC. You may be able to use the generic network driver (undionly) instead:

(cd src && make bin/undionly.rom)
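Whichever image you built, a quick sanity check before handing it to QEMU is to confirm that the file exists and begins with the two-byte 0x55 0xAA signature that marks a PCI expansion ROM. The sketch below assumes head, od, and tr are available; the function name is_option_rom is hypothetical.

```shell
# Hypothetical helper: succeed if the file starts with the 0x55 0xAA
# signature that marks a PCI expansion ROM image.
is_option_rom() {
    [ -r "$1" ] || return 1
    # Dump the first two bytes as hex and strip the whitespace od adds
    [ "$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')" = "55aa" ]
}
```

For example, `is_option_rom src/bin/$ID.rom && echo "ROM looks plausible"` can be run from the ipxe directory after the build. This only checks the header and does not prove that the driver inside works with your card.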

Launch With Custom iPXE

Start QEMU again with the custom iPXE image:

sudo qemu-system-x86_64 \
-smp cpus=2 \
-display vnc=0.0.0.0:0 \
-boot order=n \
-net none \
-device vfio-pci,host=$PCI,romfile=$PWD/src/bin/$ID.rom

If you were forced to use the generic network driver, you may have to specify it with a slightly different command line:

sudo qemu-system-x86_64 \
-smp cpus=2 \
-display vnc=0.0.0.0:0 \
-boot order=n \
-net none \
-device vfio-pci,host=$PCI \
-option-rom $PWD/src/bin/undionly.rom

In either case, you should see the network device configure itself automatically and boot the cirros image. Congratulations, you have network booted a VM using a blazing fast virtual function!