Chronicles of the Development of a Multiplayer Game: Part 2 — Loops and Leaks

Gary Weiss
Javascript Multiplayer GameDev
7 min read · Feb 17, 2017

This is part two of a series on multiplayer game development in JavaScript. It continues where the first part left off, describing challenges that we faced in the development process and how we solved them. The resulting game is called spaaace, and you can play it online: there is one game server running in Europe and another in the United States. The physics and networking code was split off into a separate open-source project called Lance.gg.

I won’t be offended if you fork it. The code is available here.

I think the Death Star is just around the corner!

Client-Side Prediction

One problem with multiplayer games is that it takes some time for player input to be sent from the client to the server, get registered, affect the game state and have the result sent back to the client. So if Bob turned his car to the left, he’d expect to see the results immediately on his screen, not a third of a second later. This is where client-side prediction kicks in: it tries to predict for the client what will happen to the game state once the input reaches the server. If updates arrive from the server which contradict what the client has predicted, the client must make the corrections necessary to resolve these differences. In a simple example, player Bob turns his car to the left, and because of network latency, player Alice doesn’t know about it yet. On her screen, Alice will see Bob’s car still moving forward, as predicted by her client.

Client-side prediction must do the following:

  • Generally, assume that objects continue moving in their current direction, at the current speeds. This calculation involves applying expected change of position with respect to time (a.k.a. velocity) and applying the expected change of orientation with respect to time (a.k.a. angular velocity).
  • Next, when new information arrives from the server, the client must roll-back time, and process the parameters that changed, at the exact time they changed.
  • Next, the client now re-enacts the missing steps to reach the present time.
  • Lastly, some objects will have new positions, but we cannot simply teleport them to these new positions, because that makes objects move in jittery ways. Instead, the objects’ properties must bend gradually towards their correct values. This bending must be done incrementally over multiple steps, and in proportion to the exact time elapsed on each render event.
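
The steps above can be sketched in a few lines. This is a minimal illustration and not the actual spaaace or Lance code: the function names extrapolate, reenact and bendToward, the object shape, and the bendingPerSecond parameter are all assumptions made for this example.

```javascript
// Hypothetical object shape: { x, y, angle, vx, vy, angularVelocity }.

// Step 1: advance an object by dt seconds, assuming it keeps its current
// velocity and angular velocity.
function extrapolate(obj, dt) {
  return {
    ...obj,
    x: obj.x + obj.vx * dt,
    y: obj.y + obj.vy * dt,
    angle: obj.angle + obj.angularVelocity * dt,
  };
}

// Steps 2 and 3: roll back to the state the server reported for serverStep,
// then replay the game steps up to the client's current step. A real game
// would also re-apply the inputs recorded for each replayed step.
function reenact(serverState, serverStep, currentStep, stepDt) {
  let state = serverState;
  for (let s = serverStep; s < currentStep; s++) {
    state = extrapolate(state, stepDt);
  }
  return state;
}

// Step 4: move the rendered position part of the way toward the corrected
// position. The fraction is derived from the time elapsed since the last
// render, so the correction speed does not depend on the frame rate.
function bendToward(rendered, corrected, dt, bendingPerSecond = 0.9) {
  // Fraction of the remaining error removed after dt seconds.
  const t = 1 - Math.pow(1 - bendingPerSecond, dt);
  return {
    ...rendered,
    x: rendered.x + (corrected.x - rendered.x) * t,
    y: rendered.y + (corrected.y - rendered.y) * t,
  };
}
```

With this choice of formula, two render events of half a second each remove the same share of the error as one render event of a full second, which is the property the last bullet asks for.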

Challenge 3: Shadow objects.

The Symptom: “I fire a missile, but it doesn’t show up right away”

Client prediction is usually interpreted as a method for extrapolating the positions of objects. But in truth, plenty of other things happen in a game, and predicting them is an unavoidable part of game-play prediction. You may even need to predict that objects will be created on the server!

In our case, part of prediction involved the shooting of a new missile. If a player hits the ‘fire’ button, it’s unacceptable for the player to wait 200ms until the server-created missile appears. To solve this, upon detecting ‘fire’ input, the client must predict that a new missile will be created on the server. For this purpose we defined the concept of shadow objects. A shadow object is an ephemeral object which is created on the client, and will soon exist on the server as well. Once the server copy arrives in a future broadcast, it must become the definitive object, replacing the ephemeral client object.

This feature required us to create a separate object id range which is private to each client. We also needed a mechanism to associate the shadow with the corresponding true object, which we implemented by marking each object with the id of the input which caused it.
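
Here is a rough sketch of how the two mechanisms could fit together. The ClientWorld class, the id-range scheme and the playerId field are hypothetical and chosen only for illustration. The sketch assumes the server tags each object with both the player and the input that caused it, since input ids from different clients could otherwise collide.

```javascript
// Hypothetical registry of the game objects known to one client.
class ClientWorld {
  constructor(playerId) {
    this.playerId = playerId;
    // Assumed scheme: server ids stay below 1,000,000, and each client
    // owns a private block of ids above that for its shadow objects.
    this.nextShadowId = 1000000 * (playerId + 1);
    this.objects = new Map(); // id -> object
  }

  // Called on 'fire' input: create the missile locally without waiting
  // for the server.
  createShadow(inputId, props) {
    const shadow = { ...props, id: this.nextShadowId++, inputId, isShadow: true };
    this.objects.set(shadow.id, shadow);
    return shadow;
  }

  // Called for each object in a server broadcast. If the object was caused
  // by one of this client's inputs, it replaces the matching shadow.
  addServerObject(serverObj) {
    if (serverObj.playerId === this.playerId) {
      for (const [id, obj] of this.objects) {
        if (obj.isShadow && obj.inputId === serverObj.inputId) {
          this.objects.delete(id);
          break;
        }
      }
    }
    this.objects.set(serverObj.id, { ...serverObj, isShadow: false });
  }
}
```

A fuller version would probably also expire shadows whose server copy never arrives, for example when the server rejects the input.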

It’s not the size that counts. It’s how you use it. (source: Flickr — Creative Commons)

Challenge 4: Game Loop.

The Symptom: “Every time I fire a missile, the spaceship makes a weird jump”

A game executes as a series of “game steps.” And the loop which executes each step is called the Game Loop. By the end of each game step, the game rules have been applied — this includes movement of objects, shooting of missiles, or application of user input. Contrast this with the Render Loop, where each step of the loop involves communicating with the graphics card and drawing the current game image on the user’s screen, green monsters and all.

The game loop executes both on the server and on each client, independently. However the server only sends synchronization broadcasts at a certain interval — for example every 6 server game steps. Maybe we should call this the Client Update Loop.

Even though the game loop and render loop are both running at 60Hz on the client, there is no real guarantee that they are running at a constant phase difference. No such luck. The unfortunate implication is that the renderer needs access to interpolated positions based on the current time. Running both of these loops at 60Hz (i.e. 60 times per second) in the game engine can be done in a relatively precise way, both on the server and the client. And the render loop does not depend on the game loop, just so long as it can get object coordinates when it needs them.
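
One way to give the renderer positions for an arbitrary time is to interpolate between the two most recent game steps. The sketch below assumes each stored position carries the time of its game step. The positionAt name and the { time, x, y } shape are made up for this example, and this is only one of several possible approaches.

```javascript
// prev and next hold an object's position at the two most recent game steps,
// each with the time (in ms) of its step. renderTime is the time of the
// current render event, on the same clock.
function positionAt(prev, next, renderTime) {
  const span = next.time - prev.time;
  // Clamp to [0, 1] so that a late render never overshoots the latest step.
  const t = span > 0 ? Math.min(1, Math.max(0, (renderTime - prev.time) / span)) : 1;
  return {
    x: prev.x + (next.x - prev.x) * t,
    y: prev.y + (next.y - prev.y) * t,
  };
}
```

Whatever the phase difference between the two loops, the renderer only needs the two stored steps and its own clock to compute the coordinates it draws.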

However, there are reasons which require temporary adjustment of the loop interval: for example, if even a minor drift has occurred between the server and the client, if the server has slowed down due to heavy load, or to handle a network spike in a smoother way.

Jittery when pressed

One specific issue we noticed, in the form of a poltergeist, was a weird jump every time a missile was fired on an Android device. Looking at the traces, we found that our game loop was not being executed for 150ms. At 60Hz, that’s about 9 full game steps which are simply not happening.

The reason? Functions registered using a scheduled timeout (in JavaScript this is setTimeout()) will not run during the first 150ms of a touch (press down) state on Chrome running on Android devices. Who knew? The alternative option was to schedule the game step using the browser’s equally unreliable render event (in JavaScript this is requestAnimationFrame()).

Since both setTimeout() and requestAnimationFrame() are unreliable by definition, but in different situations, we decided to run both loops, giving both loops the ability to execute a game step. Additionally, we added drift adjustment factors, allowing the next client game step execution to be delayed or hurried as necessary, when the client is estimated to be running too fast or too slow.
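
A minimal sketch of this idea follows. It is not the engine’s actual scheduler: the StepScheduler class, its method names, and the 4ms polling delay are assumptions for illustration. Both callbacks call the same tick() method, and whichever fires first at or after the due time runs the game step.

```javascript
// Hypothetical scheduler shared by the timeout loop and the render loop.
class StepScheduler {
  constructor(stepFn, intervalMs = 1000 / 60) {
    this.stepFn = stepFn;
    this.intervalMs = intervalMs;
    this.nextDue = null; // time (ms) at which the next game step should run
  }

  // Drift adjustment: a positive value delays the next step (client running
  // too fast), a negative value hurries it. No effect before the first step.
  adjustNextStep(ms) {
    if (this.nextDue !== null) this.nextDue += ms;
  }

  // Called from both loops. Returns true if a game step was executed.
  tick(now) {
    if (this.nextDue === null) this.nextDue = now;
    if (now < this.nextDue) return false; // not due yet, or already run
    this.stepFn();
    this.nextDue += this.intervalMs;
    // After a long stall, restart the cadence instead of bursting through
    // every missed step. This is a design choice, not a requirement.
    if (this.nextDue < now) this.nextDue = now + this.intervalMs;
    return true;
  }
}

// Browser wiring: drive the same scheduler from both event sources.
function startInBrowser(scheduler) {
  const onTimeout = () => {
    scheduler.tick(performance.now());
    setTimeout(onTimeout, 4);
  };
  const onFrame = () => {
    scheduler.tick(performance.now());
    requestAnimationFrame(onFrame);
  };
  setTimeout(onTimeout, 4);
  requestAnimationFrame(onFrame);
}
```

Because tick() takes the current time as an argument, the scheduling logic can be exercised without a browser by feeding it made-up timestamps.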

A lesson learned here is that the server and the client should report any occasions when the loops are not executing at the expected intervals. This should be considered an error state and reported by the engine, because debugging these problems based on visual symptoms is going to be much harder. If the next iteration of the loop did not occur within 40% of the expected time, or if iterations have chronic lateness, consider it a bug!
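
Such a check might look like the sketch below. The LoopMonitor class and its options are hypothetical. The sketch reads the 40% rule as “the gap between iterations exceeded the expected interval by more than 40%”, and treats “chronic lateness” as a run of consecutive iterations that are each late by more than a small slack. Both readings are assumptions.

```javascript
// Hypothetical monitor: call onIteration(now) at the start of every loop
// iteration, with now in milliseconds.
class LoopMonitor {
  constructor(expectedMs, report, { tolerance = 0.4, slack = 0.1, chronicCount = 10 } = {}) {
    this.expectedMs = expectedMs;
    this.report = report;             // callback receiving an error message
    this.tolerance = tolerance;       // single-iteration lateness threshold
    this.slack = slack;               // lateness that counts toward a streak
    this.chronicCount = chronicCount; // streak length considered chronic
    this.last = null;
    this.lateStreak = 0;
  }

  onIteration(now) {
    if (this.last !== null) {
      const late = now - this.last - this.expectedMs;
      if (late > this.expectedMs * this.tolerance) {
        this.report(`loop iteration was ${late.toFixed(1)}ms late`);
      }
      this.lateStreak = late > this.expectedMs * this.slack ? this.lateStreak + 1 : 0;
      if (this.lateStreak === this.chronicCount) {
        this.report(`loop was late ${this.chronicCount} iterations in a row`);
      }
    }
    this.last = now;
  }
}
```

On the server the report callback could write to the log, and on the client it could send the message back to the server so that it shows up in the traces.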

Game Loop in Live Action! (Source: link)

Challenge 5: Event leak.

The Symptom: “when the server has been running for a while, new clients can’t connect”

We had a bit of luck with this problem, and I’m not sure why, but I decided to look at the server’s CPU usage. It was a great hunch. Sure enough, the CPU usage jumped up after about one hour on a stronger server, and after 15 minutes on a weaker server.

One of the brownie points for JavaScript programming, and few would disagree with this statement, is for the debugging tools, a.k.a. Chrome Developer Tools. In a couple of seconds I brought up the node inspector, connected to it with Chrome, and was ready to capture a performance trace.

Looking at the trace data, I noticed event-emitter code taking up a significant amount of CPU, which made no sense. Digging deeper, I discovered an event handler leak — handlers which were registered when new ships were created, but never cleaned up. We were seriously surprised that such a bug could have such an impact on the game, but there you have it.

The lesson learned is that multiplayer games can run for long periods of time, so in such cases you will need to pay close attention to leaks of all types, and if possible, maintain counters of resources which are not expected to change over time.
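
To illustrate the kind of leak described here, the sketch below uses Node’s built-in EventEmitter. The Ship class, the 'step' event and the checkForLeak helper are invented for this example and are not the game’s actual code.

```javascript
const { EventEmitter } = require('events');

const gameEvents = new EventEmitter();

class Ship {
  constructor(id) {
    this.id = id;
    this.steps = 0;
    // Keep a reference to the handler so that it can be removed later.
    this.onStep = () => { this.steps++; };
    gameEvents.on('step', this.onStep);
  }

  destroy() {
    // Without this line every destroyed ship leaves its handler behind,
    // and each 'step' event gets a little more expensive to emit.
    gameEvents.off('step', this.onStep);
  }
}

// Resource counter: there should be exactly one 'step' handler per live ship.
function checkForLeak(liveShipCount) {
  return gameEvents.listenerCount('step') === liveShipCount;
}
```

Calling a check like checkForLeak() periodically on the server, and logging when it fails, could flag this class of bug long before the CPU usage becomes visible.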

Challenge 6: Serialization leak.

The Symptom: “when the server has been running for a while, a new connection sees hundreds of missiles for a couple of seconds before the game starts”

Again we will see a bug with unexpected consequences. In general it has been very tricky to try and guess the root cause of a problem based on the symptom. Maybe with time one develops a better intuition in this respect but mine hasn’t been very impressive.

In this case we needed to isolate the problem by adding new trace points, until we narrowed down the problem to the serialization of the objects on the server. Serialization is the process of taking deep JavaScript objects, and packing them into data frames that can be sent over the network. At a set interval, the server serializes game objects and broadcasts them to all the clients. The server only needs to send the objects which have changed since the last broadcast. The client receives these frames and performs the inverse process (deserialization) to get the new object states.

Our bug was that the serialization event queue was not cleared of “atomic” events, such as object creation events, during periods of time when no clients were connected. The events piled up and were all sent to the next client to connect. It sounds like a mouthful, but as in most JavaScript cases, the fix was a two-liner.
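
The sketch below shows the general shape of this kind of bug and fix. It is not the actual two-liner: the Broadcaster class, the JSON frame format and the send() method on clients are all assumptions made for illustration.

```javascript
// Hypothetical broadcaster. Each client is assumed to expose send(string).
class Broadcaster {
  constructor() {
    this.clients = new Set();
    this.atomicEvents = []; // one-off events, e.g. object creation
  }

  queueAtomic(event) {
    this.atomicEvents.push(event);
  }

  // Called at every broadcast interval.
  broadcast() {
    // An early "if (this.clients.size === 0) return;" at this point would
    // skip the reset below, so events would pile up while nobody is
    // connected and flood the next client to join.
    const frame = JSON.stringify({ events: this.atomicEvents });
    for (const client of this.clients) {
      client.send(frame);
    }
    // Reset unconditionally, whether or not anyone was listening.
    this.atomicEvents = [];
  }
}
```

A client that connects after an idle period then receives only the events queued since the last broadcast interval, rather than the whole backlog.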

In this article we covered four unpleasant surprises, errr… ok, bugs. Part 3 will discuss the general architecture of a multiplayer game, and provide some guidelines for the aspiring multiplayer game developer.

Next: Part 3 — Your turn to write a multiplayer game.
