How to fix network issues after upgrading Proxmox from 7 to 8 and encountering the r8169 error?

Pattapong J.
Jul 2, 2023

Update 2023-08-28: added some suggestions from the comments.

Today, after upgrading my Proxmox from version 7 to 8, I encountered a problem where I couldn’t connect to the server after some time.

Upon investigation, I found the following error in the dmesg log just before the network went down.

[  127.084695] ------------[ cut here ]------------
[ 127.084697] NETDEV WATCHDOG: enp1s0 (r8169): transmit queue 0 timed out
[ 127.084707] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:525 dev_watchdog+0x23a/0x250
[ 127.084724] Modules linked in: ..
[ 127.084768] snd_hda_codec ...
[ 127.084823] CPU: 0 PID: 0 Comm: swapper/0 Tainted: P O 6.2.16-3-pve #1
[ 127.084825] Hardware name: HP HP ProDesk 400 G4 DM/83F3, BIOS Q23 Ver. 02.06.00 01/04/2019
[ 127.084826] RIP: 0010:dev_watchdog+0x23a/0x250
[ 127.084829] Code: 00 e9 2b ff ff ff 48 89 df c6 05 8a 6f 7d 01 01 e8 6b 08 f8 ff 44 89 f1 48 89 de 48 c7 c7 58 64 40 a1 48 89 c2 e8 06 ab 30 ff <0f> 0b e9 1c ff ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00
[ 127.084831] RSP: 0018:ffffbc16c0003e38 EFLAGS: 00010246
[ 127.084832] RAX: 0000000000000000 RBX: ffff993f00214000 RCX: 0000000000000000
[ 127.084834] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 127.084839] RBP: ffffbc16c0003e68 R08: 0000000000000000 R09: 0000000000000000
[ 127.084840] R10: 0000000000000000 R11: 0000000000000000 R12: ffff993f002144c8
[ 127.084841] R13: ffff993f0021441c R14: 0000000000000000 R15: 0000000000000000
[ 127.084842] FS: 0000000000000000(0000) GS:ffff99401f600000(0000) knlGS:0000000000000000
[ 127.084844] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 127.084845] CR2: 000000c000b99000 CR3: 0000000165c10001 CR4: 00000000003726f0
[ 127.084847] Call Trace:
[ 127.084848] <IRQ>
[ 127.084850] ? __pfx_dev_watchdog+0x10/0x10
[ 127.084853] call_timer_fn+0x29/0x160
[ 127.084856] ? __pfx_dev_watchdog+0x10/0x10
[ 127.084859] __run_timers+0x259/0x310
[ 127.084863] run_timer_softirq+0x1d/0x40
[ 127.084865] __do_softirq+0xd6/0x346
[ 127.084867] ? hrtimer_interrupt+0x11f/0x250
[ 127.084870] __irq_exit_rcu+0xa2/0xd0
[ 127.084873] irq_exit_rcu+0xe/0x20
[ 127.084875] sysvec_apic_timer_interrupt+0x92/0xd0
[ 127.084877] </IRQ>
[ 127.084878] <TASK>
[ 127.084879] asm_sysvec_apic_timer_interrupt+0x1b/0x20
[ 127.084881] RIP: 0010:cpuidle_enter_state+0xde/0x6f0
[ 127.084883] Code: 2a 97 5f e8 54 7e 4a ff 8b 53 04 49 89 c7 0f 1f 44 00 00 31 ff e8 82 86 49 ff 80 7d d0 00 0f 85 eb 00 00 00 fb 0f 1f 44 00 00 <45> 85 f6 0f 88 12 02 00 00 4d 63 ee 49 83 fd 09 0f 87 c7 04 00 00
[ 127.084884] RSP: 0018:ffffffffa1c03da8 EFLAGS: 00000246
[ 127.084886] RAX: 0000000000000000 RBX: ffffdc16bfc00008 RCX: 0000000000000000
[ 127.084887] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 127.084888] RBP: ffffffffa1c03df8 R08: 0000000000000000 R09: 0000000000000000
[ 127.084889] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffa1ec33a0
[ 127.084890] R13: 0000000000000006 R14: 0000000000000006 R15: 0000001d96d671a2
[ 127.084894] ? cpuidle_enter_state+0xce/0x6f0
[ 127.084896] cpuidle_enter+0x2e/0x50
[ 127.084898] do_idle+0x216/0x2a0
[ 127.084901] cpu_startup_entry+0x1d/0x20
[ 127.084903] rest_init+0xdc/0x100
[ 127.084905] ? acpi_enable_subsystem+0xe6/0x2a0
[ 127.084908] arch_call_rest_init+0xe/0x30
[ 127.084911] start_kernel+0x6b0/0xb80
[ 127.084913] ? load_ucode_intel_bsp+0x3d/0x80
[ 127.084916] x86_64_start_kernel+0x102/0x180
[ 127.084918] secondary_startup_64_no_verify+0xe5/0xeb
[ 127.084923] </TASK>
[ 127.084924] ---[ end trace 0000000000000000 ]---
[ 127.121363] r8169 0000:01:00.0 enp1s0: rtl_chipcmd_cond == 1 (loop: 100, delay: 100).
[ 127.123360] r8169 0000:01:00.0 enp1s0: rtl_ephyar_cond == 1 (loop: 100, delay: 10).
[ 127.125367] r8169 0000:01:00.0 enp1s0: rtl_ephyar_cond == 1 (loop: 100, delay: 10).
[ 127.127320] r8169 0000:01:00.0 enp1s0: rtl_ephyar_cond == 1 (loop: 100, delay: 10).
[ 127.129387] r8169 0000:01:00.0 enp1s0: rtl_ephyar_cond == 1 (loop: 100, delay: 10).
[ 127.131375] r8169 0000:01:00.0 enp1s0: rtl_ephyar_cond == 1 (loop: 100, delay: 10).
[ 127.133418] r8169 0000:01:00.0 enp1s0: rtl_ephyar_cond == 1 (loop: 100, delay: 10).
[ 127.157920] r8169 0000:01:00.0 enp1s0: rtl_eriar_cond == 1 (loop: 100, delay: 100).
[ 127.183123] r8169 0000:01:00.0 enp1s0: rtl_eriar_cond == 1 (loop: 100, delay: 100).
[ 127.210150] r8169 0000:01:00.0 enp1s0: rtl_eriar_cond == 1 (loop: 100, delay: 100).
[ 159.120265] net_ratelimit: 9 callbacks suppressed
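
The trace above is long, but the line that matters is the NETDEV WATCHDOG one. A minimal sketch for pulling just those lines out of a kernel log follows; the function name find_tx_timeouts is my own and not part of any tool.

```shell
# Print only the NETDEV WATCHDOG transmit-timeout lines from a kernel log read on stdin.
# find_tx_timeouts is a hypothetical helper name used for illustration.
find_tx_timeouts() {
    grep -E 'NETDEV WATCHDOG: .+ \([^)]+\): transmit queue [0-9]+ timed out'
}

# Usage on the host:
#   dmesg | find_tx_timeouts
#   journalctl -k -b -1 | find_tx_timeouts   # previous boot, if the journal is persistent
```

The second usage line can help when the server had to be power-cycled and the current dmesg buffer no longer holds the error.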

After searching for a solution by using the keywords from the error message, I found many suggestions to turn off GSO (Generic Segmentation Offload) and TSO (TCP Segmentation Offload).
However, this solution did not work in my case because my network card does not support TSO to begin with. (If you are using an e1000 card, this link may be helpful: https://forum.proxmox.com/threads/e1000-driver-hang.58284/)

Running ethtool against the card shows that its features cannot be changed:

root@pve:/var/log# ethtool -K enp1s0
Could not change any device features
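
For reference, ethtool uses lowercase -k to list offload features and uppercase -K to change them, and -K expects feature/value pairs. The sketch below filters the -k listing down to the two features the forum suggestions talk about; the function name is hypothetical, and enp1s0 is simply the interface from this post.

```shell
# Filter `ethtool -k` output (read on stdin) down to the top-level TSO and GSO lines.
# show_segmentation_offloads is a hypothetical helper name.
show_segmentation_offloads() {
    grep -E '^(tcp|generic)-segmentation-offload:'
}

# Usage on the host:
#   ethtool -k enp1s0 | show_segmentation_offloads
# The commonly suggested workaround, using ethtool's short feature names:
#   ethtool -K enp1s0 tso off gso off
```

A value marked [fixed] in the listing means the driver does not allow that feature to be toggled.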

So I suspected that the error was caused by the network card driver or a specific feature, and searched for a solution specific to my network card.

# Check ethernet hw
root@pve:~# lspci -nnk | grep -A2 Ethernet
01:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 15)
Subsystem: Hewlett-Packard Company RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [103c:83f3]
Kernel driver in use: r8169
Kernel modules: r8169

# Dmesg error
[ 127.084697] NETDEV WATCHDOG: enp1s0 (r8169): transmit queue 0 timed out
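
If the machine has several PCI devices, the same driver information can be extracted without grep -A2. This is only a sketch: the function name is made up, and it assumes the usual lspci -nnk layout in which detail lines are indented under each device line.

```shell
# Read `lspci -nnk` output on stdin and print "<pci-address> <driver>" for each
# Ethernet controller. ethernet_drivers is a hypothetical helper name.
ethernet_drivers() {
    awk '
        /^[^[:space:]]/ { dev = "" }                 # an unindented line starts a new device
        /Ethernet controller/ { dev = $1 }           # remember the PCI address of the NIC
        /Kernel driver in use:/ && dev != "" { print dev, $NF }
    '
}

# Usage on the host:
#   lspci -nnk | ethernet_drivers
```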

The dmesg error points at the “r8169” kernel driver, but the hardware is an RTL8111/8168/8411.
Searching for problems related to “r8169” and “RTL8111/8168/8411”, I found this resource, which may be helpful:
https://realtechtalk.com/Ubuntu_Debian_Linux_Mint_r8169_r8168_Network_Driver_Problem_and_Solution-2253-articles

Ubuntu Debian Linux Mint r8169 r8168 Network Driver Problem and Solution

This problem has been around forever, Linux seems to think it is fine to use the r8169 driver for an r8168 NIC but this often causes problems including the link not working at all.

In my case ethtool shows the link up and detected but it simply does not work especially on a laptop that has been resumed from suspension. Sometimes it takes several minutes for it to work or to unplug and replug the ethernet.

I tried to follow the workaround from that article:

root@pve:~# apt install r8168-dkms
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
E: Unable to locate package r8168-dkms

The package “r8168-dkms” cannot be found because it is in Debian's “non-free” component, and the sources list here only enables “main” and “contrib”.

Therefore, I need to edit the sources list to add the “non-free” component and try again.

Current /etc/apt/sources.list:

deb http://ftp.debian.org/debian bookworm main contrib
deb http://ftp.debian.org/debian bookworm-updates main contrib

Add “non-free non-free-firmware” at the end of each line:

deb http://ftp.debian.org/debian bookworm main contrib non-free non-free-firmware
deb http://ftp.debian.org/debian bookworm-updates main contrib non-free non-free-firmware
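
The same edit can be scripted instead of done by hand. The following is a rough sketch rather than a vetted tool: enable_nonfree is a name I made up, it assumes GNU sed (the default on Debian), and it only touches “deb” lines that contain “main” and do not mention non-free yet, so repository lines without a “main” component are left alone.

```shell
# Append "non-free non-free-firmware" to every "deb ... main ..." line of the given
# sources file that does not mention non-free yet. A .bak copy of the file is kept.
# enable_nonfree is a hypothetical helper name; requires GNU sed for -i and -E.
enable_nonfree() {
    sed -i.bak -E '/^deb[[:space:]].*[[:space:]]main([[:space:]]|$)/{/non-free/!s/[[:space:]]*$/ non-free non-free-firmware/;}' "$1"
}

# Usage on the host:
#   enable_nonfree /etc/apt/sources.list
```

Running it twice does not add the components a second time, because lines that already contain non-free are skipped.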

With the “non-free” component enabled, refresh the package lists and install the “r8168-dkms” package:

apt update

# some readers suggest installing pve-headers before installing the driver
apt install pve-headers

apt install r8168-dkms
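
The reason headers matter is that DKMS compiles the module against the headers of a specific kernel, which are conventionally reachable through /lib/modules/<release>/build. A small sketch for checking that before installing the driver is shown below; the function name and its second parameter are hypothetical, and the second parameter exists only so the check can be tried against a scratch directory.

```shell
# Succeed if kernel headers for the given kernel release appear to be present.
# $1: kernel release (default: the running kernel, from uname -r)
# $2: filesystem root prefix (hypothetical parameter, empty by default)
# have_kernel_headers is a hypothetical helper name.
have_kernel_headers() {
    kver="${1:-$(uname -r)}"
    root="${2:-}"
    [ -e "$root/lib/modules/$kver/build" ]
}

# Usage on the host:
#   have_kernel_headers || echo "no headers found for $(uname -r)"
```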

Once the “r8168” driver is installed, you can proceed to disable the “r8169” driver.

echo blacklist r8169 >> /etc/modprobe.d/blacklist-r8169.conf
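
Because the command above appends with >>, running it again adds a duplicate line. One way to guard against that, as a sketch under my own naming (is_blacklisted is not an existing tool), is to check the modprobe.d directory first:

```shell
# Succeed if some *.conf file in the modprobe.d directory already blacklists the module.
# $1: module name
# $2: directory to scan (defaults to /etc/modprobe.d)
# is_blacklisted is a hypothetical helper name.
is_blacklisted() {
    grep -qsE "^[[:space:]]*blacklist[[:space:]]+$1([[:space:]]|\$)" "${2:-/etc/modprobe.d}"/*.conf
}

# Usage on the host:
#   is_blacklisted r8169 || echo "blacklist r8169" >> /etc/modprobe.d/blacklist-r8169.conf
```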

By blacklisting the “r8169” driver, it will be prevented from loading during system startup, allowing you to potentially resolve the network issue related to that driver.

After rebooting the system, the problem was resolved and the network has been working properly. ethtool confirms that the “r8168” driver is now in use:

root@pve:~# ethtool -i enp1s0
driver: r8168
version: 8.051.02-NAPI
firmware-version:
expansion-rom-version:
bus-info: 0000:01:00.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: yes
supports-priv-flags: no
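
To check the active driver from a script rather than by eye, the “driver:” field can be extracted from that output. A minimal sketch, with driver_name as a made-up helper name:

```shell
# Read `ethtool -i` output on stdin and print the value of the "driver:" field.
# driver_name is a hypothetical helper name.
driver_name() {
    awk -F': *' '$1 == "driver" { print $2 }'
}

# Usage on the host:
#   [ "$(ethtool -i enp1s0 | driver_name)" = r8168 ] && echo "r8168 is active"
```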

If you are left without an Ethernet NIC after following the instructions and rebooting, the driver may not have loaded, as Mr. Koloblicin pointed out in the comments. In that case, you can try his suggestion below.

After following the guide and rebooting I was left without ethernet nic, because neither driver loaded (don’t know why). In my case dkms didn’t automatically install the r8168 driver (‘dkms status’ only showed ‘added’). So I had to:

dkms build r8168/8.051.02
dkms install r8168/8.051.02
modprobe r8168
systemctl restart networking
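
Before rebooting, it may be worth confirming that DKMS actually reached the “installed” state and not just “added”. The sketch below parses dkms status output; the function name is hypothetical, and the line format it matches (module name followed by “/” or “,”, ending in “: installed”) is an assumption based on typical dkms output, so adjust the pattern if your version prints something different.

```shell
# Read `dkms status` output on stdin; succeed if the given module has an "installed" entry.
# The matched line format is an assumption, e.g.:
#   r8168/8.051.02, 6.2.16-3-pve, x86_64: installed
# dkms_installed is a hypothetical helper name.
dkms_installed() {
    grep -qE "^$1[/,].*: installed"
}

# Usage on the host:
#   dkms status | dkms_installed r8168 || echo "r8168 is not installed for any kernel"
```

The version used in the dkms build and dkms install commands above should match whatever dkms status reports on your system.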
