Packets flowing through the Matrix, created with Bing AI

Devlog 20 — Networking land behind the mirror

Jakub Neruda

--

Welcome to a devlog for my retro arena FPS game called Rend. This devlog is one of many, and you can access the other ones from this list. They provide insight into the development process and challenges of writing a game engine in C++ from scratch. The game is available for free on Itch.io.

To save you some time in scouring the previous devlogs, here are a few relevant points to get you into the loop:

  • The game is supposed to support 4-player LAN multiplayer, so I don’t have to deal with NAT hairpinning, hole punching, or IPv4/IPv6 address translation.
  • I opted for a rollback networking model: I store the states of the past N frames and only send inputs between the peers. Only those inputs can affect the simulation, the goal being to keep the simulations synchronized across all peers. If an input doesn’t arrive on time, no stress. When it eventually arrives, I add it to the relevant stored state and resimulate from there, updating all newer states to resynchronize the simulation. If the relevant state is no longer in the window, the game must be terminated (see the sketch below this list).
  • Server-side client: The client hosting the game runs the server logic on a parallel thread. The two communicate over the loopback interface, so the hosting client has essentially zero ping.
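
To make those bullet points concrete, here is a minimal sketch of how such a rollback buffer can look. It is an illustration under the assumptions above, not Rend’s actual code; all names (RollbackBuffer, insertInput, and so on) are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <optional>

constexpr std::size_t ROLLBACK_WINDOW = 20;
constexpr std::size_t PLAYER_COUNT = 4;

struct Input { /* buttons, view angles, ... */ };
struct GameState { /* entities, physics, ... */ };

struct Frame
{
    GameState state;
    std::array<std::optional<Input>, PLAYER_COUNT> inputs; // one slot per player
};

class RollbackBuffer
{
public:
    // Called whenever an input for `frameIdx` arrives, possibly late.
    // Returns false if that frame has already left the window, at which
    // point the simulation can no longer be repaired and the game ends.
    bool insertInput(std::size_t frameIdx, std::size_t playerIdx, const Input& input)
    {
        if (frameIdx + ROLLBACK_WINDOW <= currentFrame)
            return false; // too old, the stored state is gone

        frames[frameIdx % ROLLBACK_WINDOW].inputs[playerIdx] = input;
        resimulateFrom(frameIdx); // replay every newer frame with the amended inputs
        return true;
    }

private:
    void resimulateFrom(std::size_t frameIdx)
    {
        // for each frame f in (frameIdx, currentFrame]:
        //     state(f) = tick(state(f - 1), inputs(f - 1));
    }

    std::array<Frame, ROLLBACK_WINDOW> frames;
    std::size_t currentFrame = 0;
};
```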

Last time, I was dealing with the following problem: the server-side client was lagging behind the remote peers and desynced everybody else by sending them outdated packets.

I’ve spent the past few weeks trying to fix the issue, and I am writing this devlog to save you from repeating my mistakes. And oh boy, there were many.

Mistake 1 — Uncapped server loop

I started by profiling some (seemingly) unrelated performance issues. To catch performance problems early, I resimulate the last 20 frames EVERY frame (pretending I always get an input for the farthest state in the rollback window). Playing with three AI agents can thus lead to computing way more pathfinding per frame than normal.

While checking whether this was still the case or whether there was some unrelated performance problem, I indeed found one. My server thread busy-looped without any capping. Not ideal.

This understandably led to the server eating up CPU time that should’ve been allocated to the local client. Capping the server at 120 FPS freed those resources up.
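
For illustration, capping a loop can be as simple as sleeping until the next scheduled tick. This is only a sketch of the idea, not Rend’s actual server loop; serverRunning and serverTick are hypothetical stand-ins.

```cpp
#include <chrono>
#include <thread>

bool serverRunning(); // hypothetical: true until shutdown is requested
void serverTick();    // hypothetical: one iteration of the server logic

void serverLoop()
{
    using Clock = std::chrono::steady_clock;
    constexpr auto TICK = std::chrono::nanoseconds(1'000'000'000 / 120); // 120 FPS

    auto nextTick = Clock::now() + TICK;
    while (serverRunning())
    {
        serverTick();
        std::this_thread::sleep_until(nextTick); // yield the CPU instead of busy-looping
        nextTick += TICK;
    }
}
```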

Mistake 2 — Round trip time

By running the server-side client to its full potential, I just inverted the problem. The remote client still got desynced, this time because it was receiving its own packets too late!

To explain better: each client records the input for the current frame, sends it to the server, and then discards it, waiting until the server sends it back. Since I only have two computers for testing, this lets me mock the case of a third client with a higher ping.
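
In code, the round trip looks roughly like this (hypothetical helper names; the point is that the client never consumes its own input directly):

```cpp
#include <cstddef>

struct Input { /* buttons, view angles, ... */ };

Input sampleLocalInput();                                 // hypothetical
void sendToServer(std::size_t frame, const Input& input); // hypothetical

void clientTick(std::size_t frame)
{
    const Input input = sampleLocalInput();
    sendToServer(frame, input); // the input is NOT applied locally

    // The simulation only consumes inputs echoed back by the server, so
    // even your own input travels a full round trip. That is how two
    // machines can emulate the latency of a third, slower peer.
}
```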

And it certainly succeeded in doing so. The server-side client has ~0 ping because it runs on localhost. Every other client has some nonzero latency (and twice that as a round trip), meaning it will inevitably receive outdated packets and desync.

Fun fact about local networks: if your computers are hooked to the same Wi-Fi network, the communication is not truly parallel. Your router rapidly switches between ALL the active devices in your home. Testing your netcode while your partner watches Netflix is probably the best stress test you can get.

I made a feeble attempt to fix this by dropping a bunch of frames whenever a client encountered older packets, trying to slow it down so the others could catch up, but it didn’t work. Time for something more radical.

Rollbacking to lockstep

I realized I was juggling too many variables, so I decided to build a subpar solution in order to get a better picture of what was happening in the protocol.

I restricted the application to run in lockstep mode. I store a few extra flags for each frame, denoting whether the client has received inputs from a particular peer. I then read packets until all of these flags are set for currentFrame - N, with N = 1 (effectively giving me lockstep).
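
A sketch of that gate, with hypothetical names (flagsFor, readIncomingPackets); N is the same parameter as in the text:

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t PEER_COUNT = 4;

struct FrameFlags
{
    std::array<bool, PEER_COUNT> received = {}; // one flag per peer

    bool allReceived() const
    {
        for (const bool r : received)
            if (!r)
                return false;
        return true;
    }
};

FrameFlags& flagsFor(std::size_t frame); // hypothetical lookup of a frame's flags
void readIncomingPackets();              // hypothetical: sets flags as inputs arrive

// Called once per frame, before simulating.
// N = 1 gives lockstep; N = 20 restores the full rollback window.
void waitForPeers(std::size_t currentFrame, std::size_t N)
{
    if (currentFrame < N)
        return; // nothing old enough to wait for yet

    while (!flagsFor(currentFrame - N).allReceived())
        readIncomingPackets();
}
```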

And it was a surprisingly good idea! I found out the game froze almost immediately after the start (like 3–4 frames in). By inserting a huge number of debug logs into my networking code, I discovered an off-by-one error when assigning packets from the future to the correct frames. Oops.

It certainly helped. I suddenly got to experience a few dozen frames before the game froze again. A further examination of the logs uncovered another bad mistake.

Mistake 3 — Local networks are not reliable!

From the start, I targeted four-player LAN games, and I assumed that data transfers in a LAN environment are reliable. And since working with TCP sockets is a bit more cumbersome (although SFML tries hard to make it as easy as possible!), I just used UDP without giving it a second thought.

With UDP, you only need a single socket bound to a port, waiting for data to arrive. When a datagram comes in, you get the remote address and port, so you can identify the client who sent the data and do all the relevant processing. With TCP, you need a dedicated listener that waits for new incoming connections. Each connection requires its own persistent socket, tied to a particular client, which makes the initial processing slightly more complicated.
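
For comparison, here is roughly what both receive paths look like with SFML 2.x (the port number is just an example):

```cpp
#include <SFML/Network.hpp>

// UDP: one socket serves everyone; the sender is identified per datagram.
void receiveOverUdp()
{
    sf::UdpSocket socket;
    socket.bind(54000);

    sf::Packet packet;
    sf::IpAddress remoteAddress;
    unsigned short remotePort;
    if (socket.receive(packet, remoteAddress, remotePort) == sf::Socket::Done)
    {
        // remoteAddress + remotePort tell you which client sent the data
    }
}

// TCP: a listener accepts connections; each client gets a persistent socket.
void receiveOverTcp()
{
    sf::TcpListener listener;
    listener.listen(54000);

    sf::TcpSocket client; // tied to one particular peer
    if (listener.accept(client) == sf::Socket::Done)
    {
        sf::Packet packet;
        client.receive(packet); // the socket itself identifies the client
    }
}
```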

But LANs are not reliable. Because my game requires all packets to be delivered (although not necessarily in order), I either had to implement a delivery confirmation mechanism on top of UDP or migrate to TCP. I chose the second option, as I see no point in reinventing the wheel when the old Czech game Bulanci proved years ago that you can have real-time four-player LAN multiplayer over TCP.

https://www.youtube.com/watch?v=yWa1ajKgOhI

And it worked! The framerate was choppy, which is to be expected in lockstep, but it was a step in the right direction. And since I implemented lockstep atop rollback-ready code, I only had to change the parameter N to 20, meaning the game only blocks while waiting for packets that are twenty frames old.

This parameter change immediately brought my testing environment back to comfortable framerates. If there is a network hiccup, all clients might experience choppy framerates. But from what I’ve seen so far, packets tend to lag slightly and then arrive in one bigger burst. So the communication is not constantly offset; there are just localized spikes that should be absorbed by the rollback window.

Summary

I am not convinced I’ve fully slain this challenge, but the game should be ready for some field testing. And a lot of UX tweaks, network timeouts, and all those super annoying things that users expect from a user-friendly experience :D. Wish me luck so the next devlog can announce a new version release. See you then!

--
