“black Microsoft Xbox 360 controller” by Arturo Rey on Unsplash

How does an MMO game backend work?

This article combines my own write-up with information from numerous blog posts, forums, and wikis.

There are two complex components in any MMO game:

  1. Client-side / game architecture
  2. Server-side / backend architecture

You might be wondering why a backend developer needs to know about client-side architecture.

Ignoring the client is fine if you are designing a web app, but it won't help you when you are a game backend developer.

If you would rather watch than read, you can watch the video series below.

How does any game work?

First, you have the “game loop”. This is really just a nearly infinite loop where inside the loop you are checking user input, updating the game state, and rendering the graphics and sound to represent the game state.

It looks something like this:

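while (true) {
    processInput();     // check user input (function names here are placeholders)
    updateGameState();  // advance the game state
    render();           // draw graphics and play sound for the new state
}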

And that is at a high level what a game is doing. Now, for each of those three things, you’ll get hideous levels of complexity.

Designing a complex game system is itself a vast subject, and I am not going to touch any of that!

There are 3 different types of games:

  1. Strategic games
  2. Slow turn games
  3. First person games

Designing the first two is much easier because they are not real-time: everyone has their own turn to play, and data can be synced easily.

But games like PUBG, CS, Call of Duty, NFS, or Asphalt need other players' data as quickly as possible. Can we sync information in real time? Not really!

Say, for example, you're in San Francisco, connected to a server in New York. That's approximately 4,000 km, or 2,500 miles.

Nothing can travel faster than light, not even bytes on the Internet (which at the lower level are pulses of light, electrons in a cable, or electromagnetic waves). Light travels at approximately 300,000 km/s, so it takes 13 ms to travel 4,000 km.

But in the real world our packets won't take the shortest, straight path. They have to pass through routers and repeaters, so in the end the one-way latency comes to around 50 ms, depending on the route and network conditions. What the hell!

So if I press the space bar, it takes 50 ms to reach the server, the server takes ~100 ms to process it, and it takes another 50 ms to travel to the second player.

That's a total of 200 ms to send the information from player 1 to player 2.

So if the red player shoots a black ball, the blue player sees it 200 ms later. Will our games work with this much latency?
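As a rough check of the numbers above, here is a small Python sketch. The 50 ms one-way latency and the 100 ms server processing time are the illustrative figures used in this article, not measurements, and the function names are made up for this example.

```python
SPEED_OF_LIGHT_KM_PER_S = 300_000  # approximate value used in the text


def propagation_delay_ms(distance_km: float) -> float:
    """Lower bound on one-way delay: a straight line at the speed of light."""
    return distance_km / SPEED_OF_LIGHT_KM_PER_S * 1000


def player_to_player_delay_ms(one_way_ms: float, server_processing_ms: float) -> float:
    """Client -> server, server processing, server -> other client."""
    return one_way_ms + server_processing_ms + one_way_ms


print(round(propagation_delay_ms(4000), 1))   # 13.3
print(player_to_player_delay_ms(50, 100))     # 200
```

The gap between the 13 ms lower bound and the 50 ms figure is the cost of real routes and network equipment.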

Now that you understand how difficult it is to send information between two players in real time, think about MMO games with hundreds of people playing simultaneously. In PUBG, for example, 100 people are dropped onto the island at once, so these 100 people from different locations should be able to play together without any glitches.

So in this session, I will explain how the game app should work in tandem with the game server to make the MMO experience better.

An idea!

To reduce latency, we could make the phones talk to each other directly, without the help of servers, for example over a LAN. You have the game installed on your phone, and so do all the other players. What happens if there is no server and the apps talk directly to each other and update the game data themselves?

The problem of cheating

As a game developer, you usually don’t care whether a player cheats in your single-player game — his actions don’t affect anyone but him. A cheating player may not experience the game exactly as you planned, but since it’s their game, they have the right to play it in any way they please.

Multiplayer games are different, though. In any competitive game, a cheating player isn’t just making the experience better for himself, he’s also making the experience worse for the other players. As the developer, you probably want to avoid that, since it tends to drive players away from your game.

There are many things that can be done to prevent cheating, but the most important one (and probably the only really meaningful one) is simple: don't trust the player. Always assume the worst: that players will try to cheat.

Authoritative servers and dumb clients

This leads to a seemingly simple solution — you make everything in your game happen in a central server under your control, and make the clients just privileged spectators of the game. In other words, your game client sends inputs (key presses, commands) to the server, the server runs the game, and you send the results back to the clients. This is usually called using an authoritative server, because the one and only authority regarding everything that happens in the world is the server.

Of course, your server can be exploited for vulnerabilities, but that’s out of the scope of this series of articles. Using an authoritative server does prevent a wide range of hacks, though. For example, you don’t trust the client with the health of the player; a hacked client can modify its local copy of that value and tell the player it has 10000% health, but the server knows it only has 10% — when the player is attacked it will die, regardless of what a hacked client may think.

You also don’t trust the player with its position in the world. If you did, a hacked client would tell the server “I’m at (10,10)” and a second later “I’m at (20,10)”, possibly going through a wall or moving faster than the other players. Instead, the server knows the player is at (10,10), the client tells the server “I want to move one square to the right”, the server updates its internal state with the new player position at (11,10), and then replies to the player “You’re at (11, 10)”:

A simple client-server interaction.

In summary: the game state is managed by the server alone. Clients send their actions to the server. The server updates the game state periodically, and then sends the new game state back to clients, who just render it on the screen.
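A minimal sketch of that idea in Python, assuming a hypothetical grid world: the client only sends a direction, and the server decides where the player ends up. The class name, the command names, and the wall at (12, 10) are all invented for illustration.

```python
from dataclasses import dataclass, field

# Clients request a direction, never a position.
MOVES = {"left": (-1, 0), "right": (1, 0), "up": (0, -1), "down": (0, 1)}


@dataclass
class AuthoritativeServer:
    width: int
    height: int
    walls: set = field(default_factory=set)
    positions: dict = field(default_factory=dict)  # player id -> (x, y)

    def handle_input(self, player_id: str, command: str) -> tuple:
        """Apply one client command and return the authoritative position."""
        x, y = self.positions[player_id]
        dx, dy = MOVES.get(command, (0, 0))  # unknown commands are ignored
        nx, ny = x + dx, y + dy
        in_bounds = 0 <= nx < self.width and 0 <= ny < self.height
        if in_bounds and (nx, ny) not in self.walls:
            self.positions[player_id] = (nx, ny)
        return self.positions[player_id]


server = AuthoritativeServer(width=50, height=50, walls={(12, 10)})
server.positions["p1"] = (10, 10)
print(server.handle_input("p1", "right"))  # (11, 10)
print(server.handle_input("p1", "right"))  # blocked by the wall, still (11, 10)
```

A hacked client can send whatever it likes, but the only thing the server accepts is a request that it validates itself.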

Dealing with networks

Remember, we already learned that going through the server to reach another player adds about 200 ms of latency.

Networked multiplayer games are incredibly fun, but introduce a whole new class of challenges. The authoritative server architecture is pretty good at stopping most cheats, but a straightforward implementation may make games quite unresponsive to the player.

Now let's look at the different ways we can send data between game clients and servers, and how to optimize bandwidth utilization. The goal is to reduce both latency and bandwidth utilization.

The techniques below are illustrated with animation videos by Glenn Fiedler that I found online.

Deterministic lockstep:

Deterministic lockstep is a method of networking a system from one computer to another by sending only the inputs that control that system, rather than the state of that system. In the context of networking a physics simulation, this means we send across a small amount of input, while avoiding sending state like position, orientation, linear velocity and angular velocity per-object.

The benefit is that bandwidth is proportional to the size of the input, not the number of objects in the simulation. Yes, with deterministic lockstep you can network a physics simulation of one million objects with the same bandwidth as just one.
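To make this concrete, here is a toy Python sketch, not Glenn Fiedler's implementation. It assumes a fully deterministic simulation (integer math, fixed time step) and shows that replaying the same inputs from the same initial state reproduces the same result, so only the inputs need to travel over the network. The `Simulation` class and the input names are invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Simulation:
    """A toy deterministic simulation: integer math only, fixed time step."""
    x: int = 0
    velocity: int = 0

    def step(self, inputs: dict) -> None:
        if inputs.get("left"):
            self.velocity -= 1
        if inputs.get("right"):
            self.velocity += 1
        self.x += self.velocity


# One dict of pressed keys per frame: the only data sent over the network.
input_stream = [{"right": True}, {"right": True}, {}, {"left": True}, {}]

local, remote = Simulation(), Simulation()
for frame_inputs in input_stream:
    local.step(frame_inputs)    # runs on the sender
for frame_inputs in input_stream:
    remote.step(frame_inputs)   # replayed on the receiver, frame by frame

print(local == remote)  # True
print(remote.x)         # 7
```

The bandwidth here depends only on the size of `input_stream`, no matter how much state the simulation holds.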

In all of the upcoming demos, a cube moves, rolls, jumps, and also makes other small cubes stick to it, to show the different behaviors.

So let's see the deterministic lockstep demo: https://gafferongames.com/videos/deterministic_lockstep_desync.mp4

Above you can see a simulation that is almost deterministic. The simulation on the left is controlled by the player. The simulation on the right has exactly the same inputs applied with a two second delay (just to make the difference visible), starting from the same initial condition.

So what information was sent over the network, i.e. from server to client?

Only the state of the keys that affect the simulation, that is, the keys pressed, is sent along with a timestamp. For the demo, we send that input from the left simulation to the right simulation in a way that lets the simulation on the right know that the input belongs to frame n.

To stop jittery lag when packets are delayed or lost:

Playout Delay Buffer: What you’re doing here is similar to what Netflix does when you stream a video. You pause a little bit initially so you have a buffer in case some packets arrive late and then once the delay has elapsed video frames are presented spaced the correct time apart. If your buffer isn’t large enough then the video playback will be hitchy. With deterministic lockstep your simulation behaves exactly the same way.
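Here is a minimal sketch of a playout delay buffer in Python. It assumes time is measured in whole frames and that every input is tagged with its frame number; the class and method names are made up for illustration.

```python
class PlayoutDelayBuffer:
    """Holds inputs keyed by frame and releases them only after an initial delay."""

    def __init__(self, delay_frames: int):
        self.delay_frames = delay_frames
        self.pending = {}            # frame number -> input
        self.first_frame_time = None
        self.next_frame = 0

    def add(self, frame: int, data: str, now: int) -> None:
        if self.first_frame_time is None:
            self.first_frame_time = now  # start the clock on the first arrival
        self.pending[frame] = data

    def pop_ready(self, now: int):
        """Return the next frame's input if it is due and has arrived, else None."""
        if self.first_frame_time is None:
            return None
        due_time = self.first_frame_time + self.delay_frames + self.next_frame
        if now < due_time or self.next_frame not in self.pending:
            return None  # too early, or the packet is late: playback stalls
        data = self.pending.pop(self.next_frame)
        self.next_frame += 1
        return data


buf = PlayoutDelayBuffer(delay_frames=3)
buf.add(0, "A", now=0)
buf.add(2, "C", now=2)  # arrives out of order
buf.add(1, "B", now=3)  # late, but still inside the delay window
played = [buf.pop_ready(t) for t in range(7)]
print(played)  # [None, None, None, 'A', 'B', 'C', None]
```

The three frames of delay absorb the late and out-of-order arrivals, so frames A, B, and C still play back evenly spaced.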


TCP is a connection-oriented protocol. It is sometimes called a “reliable” network service because it guarantees that data will arrive in the proper sequence.

If any packet is lost, the game has to wait roughly 2 * RTT for that packet to be retransmitted before it can use anything that came after it, so your game will not be smooth.

250ms latency and 5% packet loss: https://gafferongames.com/videos/deterministic_lockstep_tcp_250ms_5pc.mp4

Use UDP instead of TCP

Here's the trick. We need to ensure that all inputs arrive reliably and in order. But if we send inputs in UDP packets, some of those packets will be lost. What if, instead of detecting packet loss after the fact and resending lost packets, we redundantly include all inputs in each UDP packet until we know for sure the other side has received them? Even if packets do not arrive in order, we can always use the timestamp to tell which frame an input belongs to and which data is the latest.
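The sketch below simulates that idea in Python without real sockets. Packet loss is faked with a hard-coded set of lost frames, and the `InputSender` and `InputReceiver` classes are hypothetical names for this example. Each packet carries every input that has not been acknowledged yet, so a later packet repairs earlier losses without a resend request.

```python
class InputSender:
    """Keeps every input that the receiver has not acknowledged yet."""

    def __init__(self):
        self.unacked = {}  # frame -> input

    def build_packet(self, frame: int, data: str) -> dict:
        self.unacked[frame] = data
        return dict(self.unacked)  # the packet carries all unacked inputs

    def on_ack(self, acked_frame: int) -> None:
        # Drop everything confirmed up to and including acked_frame.
        self.unacked = {f: d for f, d in self.unacked.items() if f > acked_frame}


class InputReceiver:
    def __init__(self):
        self.received = {}  # frame -> input

    def on_packet(self, packet: dict) -> int:
        self.received.update(packet)
        # Ack the latest frame for which all earlier frames are present.
        frame = -1
        while frame + 1 in self.received:
            frame += 1
        return frame


sender, receiver = InputSender(), InputReceiver()
lost_frames = {1, 2}  # pretend these packets never arrive
for frame, data in enumerate(["up", "up", "fire", "left"]):
    packet = sender.build_packet(frame, data)
    if frame in lost_frames:
        continue
    sender.on_ack(receiver.on_packet(packet))

print([receiver.received[f] for f in sorted(receiver.received)])
# ['up', 'up', 'fire', 'left']
```

The packet for frame 3 also contains the inputs for frames 1 and 2, which is why nothing is missing at the end even though two packets were dropped.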

Even with 25% packet loss and a 2 second delay, the result still holds up. Compare that with TCP, which did worse with only 5% packet loss and 250 ms latency.


Client-side Prediction for Smooth Multiplayer Gameplay

When you’re developing an online server-authoritative game, you want your player movement to feel responsive for both low and high latency players. To achieve this, you’ll need to implement client-side prediction and smoothing.

INTERPOLATION: In the mathematical field of numerical analysis, interpolation is a method of constructing new data points within the range of a discrete set of known data points.

Instead of sending key presses, we capture a snapshot of all relevant state on the server, and on the game side we use those snapshots to reconstruct a visual approximation of all the other players. Because snapshots arrive at discrete intervals, we need to do some prediction on the client side to make player movement look smooth.

There are many variants of client-side prediction, but the basic idea is always the same: the client responds to player input by moving the player before the server processes the input and tells the client where the player should be. This of course means that where a player sees themselves and where they actually are on the server can be different.

You'll need a way to prevent the client and server from diverging too far. There are many ways of handling this, and which works best for you will depend on your game.
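One common way to keep the two in sync is to number every input, keep the unconfirmed ones on the client, and replay them on top of each server update. The Python sketch below shows that variant for a one-dimensional position. It is only one of the many possible approaches mentioned above, and all names in it are assumptions made for this example.

```python
class PredictingClient:
    def __init__(self):
        self.x = 0
        self.sequence = 0
        self.pending = []  # (sequence, dx) inputs the server has not confirmed yet

    def apply_input(self, dx: int) -> tuple:
        """Move immediately (prediction) and return the message for the server."""
        self.sequence += 1
        self.x += dx
        self.pending.append((self.sequence, dx))
        return (self.sequence, dx)

    def on_server_state(self, server_x: int, last_processed: int) -> None:
        """Reconcile: accept the server position, then replay unconfirmed inputs."""
        self.pending = [(s, dx) for s, dx in self.pending if s > last_processed]
        self.x = server_x
        for _, dx in self.pending:
            self.x += dx


client = PredictingClient()
client.apply_input(1)
client.apply_input(1)
print(client.x)  # 2, shown to the player before the server has replied

# The server has processed input 1 only; input 2 is replayed on top of it.
client.on_server_state(server_x=1, last_processed=1)
print(client.x)  # 2
```

If the server disagrees with the prediction, the same method snaps the client back to the authoritative position, which is where smoothing would come in.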

Interpolation methods used to predict the next state:
  • Linear Interpolation
  • Hermite Interpolation
  • Polar Interpolation
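As a small illustration of the first two methods, here is a Python sketch that interpolates a single coordinate between two snapshots. It assumes `t` runs from 0 at the older snapshot to 1 at the newer one, and that the Hermite tangents are velocities already scaled by the snapshot interval. The function names are mine.

```python
def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def hermite(p0: float, v0: float, p1: float, v1: float, t: float) -> float:
    """Cubic Hermite interpolation from positions p0, p1 and tangents v0, v1."""
    t2, t3 = t * t, t * t * t
    return ((2 * t3 - 3 * t2 + 1) * p0 + (t3 - 2 * t2 + t) * v0
            + (-2 * t3 + 3 * t2) * p1 + (t3 - t2) * v1)


print(lerp(10.0, 20.0, 0.25))             # 12.5
print(hermite(0.0, 0.0, 10.0, 0.0, 0.5))  # 5.0
```

Linear interpolation only needs positions, while Hermite interpolation also uses velocity, so it can follow curved motion between snapshots more closely.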

How to handle game characters?

A better approach is to have each state be its own class, generally with a base ‘state’ class that they inherit from.

The state pattern is a behavioral software design pattern that implements a state machine in an object-oriented way. … This pattern is used in computer programming to encapsulate varying behavior for the same object based on its internal state.

This way, the character or object only needs to have a reference to its current state and a call to make that state update. When we want to add a new state, we just create a new class.
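A minimal Python sketch of the state pattern for a game character follows. The states (`Idle`, `Running`, `Jumping`) and the command names are invented for illustration, and a real game would have many more.

```python
class State:
    """Base class: every concrete state decides what happens on update."""
    name = "base"

    def update(self, character: "Character", command: str) -> "State":
        return self


class Idle(State):
    name = "idle"

    def update(self, character, command):
        return Running() if command == "move" else self


class Running(State):
    name = "running"

    def update(self, character, command):
        if command == "jump":
            return Jumping()
        return Idle() if command == "stop" else self


class Jumping(State):
    name = "jumping"

    def update(self, character, command):
        return Idle() if command == "land" else self


class Character:
    def __init__(self):
        self.state: State = Idle()

    def update(self, command: str) -> None:
        # The character only delegates to whatever state it is currently in.
        self.state = self.state.update(self, command)


hero = Character()
for command in ["move", "jump", "land"]:
    hero.update(command)
    print(hero.state.name)  # running, jumping, idle
```

Adding a new behavior, say swimming, means adding one class and the transitions into it, without touching `Character`.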

Backend system design

Architecture for session-based multiplayer games:

  1. Players connect to a World service, which acts as the matchmaker and pairs them together, using Redis to help facilitate this.
  2. Once two players are joined for a game session, the matchmaker talks to the game server manager, to get it to provide a game server on our cluster of machines.
  3. The game server manager creates a new instance of a game server that runs on one of the machines in the cluster.
  4. The game server manager also grabs the IP address and the port that the game server is running on, and passes that back to the matchmaker service.
  5. The matchmaker service passes the IP and port back to the players’ clients.
  6. Finally, the players connect directly to the game server and can now start playing the game against each other.
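The steps above can be sketched in a few lines of Python. This is an in-memory stand-in under loud assumptions: a `deque` plays the role of Redis, "starting a game server" just hands out the next port number, and the host address and class names are made up.

```python
import itertools
from collections import deque


class GameServerManager:
    """Stand-in for the component that starts game server processes."""

    def __init__(self, host: str, first_port: int):
        self.host = host
        self._ports = itertools.count(first_port)

    def allocate(self) -> tuple:
        # Pretend a new game server instance now listens on this address.
        return (self.host, next(self._ports))


class Matchmaker:
    def __init__(self, manager: GameServerManager):
        self.manager = manager
        self.waiting = deque()  # the article uses Redis here; a deque stands in

    def join(self, player: str):
        """Queue a player; once two are waiting, return (players, address)."""
        self.waiting.append(player)
        if len(self.waiting) < 2:
            return None
        pair = (self.waiting.popleft(), self.waiting.popleft())
        return pair, self.manager.allocate()


matchmaker = Matchmaker(GameServerManager("10.0.0.5", 7000))
print(matchmaker.join("alice"))  # None, still waiting for an opponent
print(matchmaker.join("bob"))    # (('alice', 'bob'), ('10.0.0.5', 7000))
```

After the second call, both clients would receive the address and connect to the game server directly, so the matchmaker is no longer in the path of gameplay traffic.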


Creating TCP connections incurs both CPU and memory overhead. This overhead increases linearly with each connected client, or O(N) where N is the number of clients. Decoding compressed data streams, and especially encrypted data streams, adds more overhead. That cost increases more like O(N*M) where N is the number of clients and M is the number of packets received from each client.

Most MMO games involve walking, running, jumping, combat, and other activities that require repeated player input. This generates a constant stream of network messages from each client, so that the M part of our O(N*M) becomes pretty significant. Trying to handle that load in the same process that handles the game logic (and physics, AI, crafting, etc.) will degrade the player’s experience. If we decouple that work from the rest of the game, however, we have both more capacity and greater flexibility for handling that load.

World server

The world server tracks player character location and orchestrates high level cross-region activity.

Now we have added a new World Server, which manages all the Game Servers. When a Game Server is added (at runtime) or fails, the World Server will know about it and act accordingly. Clients will keep a constant connection to the World Server and to one of the Game Servers.

Develop a map-centric game server strategy that concentrates core game play activity with the maps in which it occurs. Do this by creating two server types, area server and world server. The game’s server cluster contains many area server processes connected to a single central world server process.

The world server serves as a central controller for area servers. It assigns region maps to area servers and coordinates movement of player characters between region maps and their area servers.

The world server is the central authority on a player character’s location within both the virtual world and the area server process nodes that make up the game server cluster. When a player character moves between area servers, the world server updates the routing data used by the connection servers so that client messages for a given player character reach the correct area server.
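Here is a small Python sketch of the bookkeeping described above, assuming the world server keeps two simple lookup tables. The class, the region names, and the area server ids are hypothetical.

```python
class WorldServer:
    """Tracks which area server owns each region map and where each character is."""

    def __init__(self):
        self.region_to_area = {}    # region map id -> area server id
        self.character_region = {}  # character id -> region map id

    def assign_region(self, region: str, area_server: str) -> None:
        self.region_to_area[region] = area_server

    def move_character(self, character: str, region: str) -> str:
        """Record the move and return the area server that now owns the character."""
        if region not in self.region_to_area:
            raise KeyError(f"region {region!r} has no area server")
        self.character_region[character] = region
        return self.region_to_area[region]

    def route(self, character: str) -> str:
        """What a connection server would ask: where do this character's messages go?"""
        return self.region_to_area[self.character_region[character]]


world = WorldServer()
world.assign_region("forest", "area-1")
world.assign_region("desert", "area-2")
world.move_character("hero", "forest")
print(world.route("hero"))  # area-1
world.move_character("hero", "desert")
print(world.route("hero"))  # area-2
```

In a real cluster the routing data would be pushed to the connection servers rather than looked up on every message, but the mapping itself is the same.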

Area Servers

Area servers manage region maps and control all game behavior that occurs on them. Each region map occupies a single cell of the large uniform grid that comprises the entire virtual world map.

The simplest application of this pattern assigns a single region map to a single area server, but this may be inefficient.

Players usually congregate around points of interest (POIs) like towns, vendors, spawn points, and along common travel routes. For the sake of interest and aesthetics, POI distribution may not be uniform across regions. As a result, some region maps will experience higher game play load than others.

The Map-Centric Game Server pattern implements a straightforward way to address this. That is, an area server process may host multiple region map instances. Simply group several lightly loaded region maps together in the same area server process. More heavily loaded regions would call for a dedicated area server.
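One possible way to do the grouping is a greedy first-fit pass over the regions, heaviest first. The Python sketch below is my own illustration rather than part of the pattern. The load numbers and the capacity of 1000 are invented, and load is assumed to be a single number per region.

```python
def pack_regions(region_load: dict, capacity: int) -> list:
    """Greedy first-fit-decreasing: heaviest regions first, each placed on the
    first area server that still has room."""
    servers = []  # each entry: {"regions": [...], "load": int}
    for region, load in sorted(region_load.items(), key=lambda kv: -kv[1]):
        for server in servers:
            if server["load"] + load <= capacity:
                server["regions"].append(region)
                server["load"] += load
                break
        else:
            # No existing server has room, so start a new one.
            servers.append({"regions": [region], "load": load})
    return servers


loads = {"capital": 980, "forest": 150, "swamp": 100, "desert": 300, "cave": 50}
for server in pack_regions(loads, capacity=1000):
    print(server)
# {'regions': ['capital'], 'load': 980}
# {'regions': ['desert', 'forest', 'swamp', 'cave'], 'load': 600}
```

The busy capital ends up with a dedicated area server, while the four quiet regions share one process.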

A data structure called a map template defines the terrain, geometry and other content for a given map. At run time, an area server creates one or more map instances from a given map template. These map instances provide a context within which the core game play for the map executes. Each area server may host multiple instances of one or more map templates.

Large maps designed as common spaces for many players usually exist as singletons. That is, only one instance of that map exists in the entire game. The area server for one of these maps creates its instance at server startup, and keeps it running perpetually.

Smaller maps exist for running scripted experiences with smaller groups of players. These maps are ephemeral. The area server creates instances of these maps on demand, and shuts them down when no longer needed.

When a player’s character moves between maps in the virtual world, the game moves the character’s state from the old map instance to the new one. If the map instances are in different area server processes, the origin area server saves and unloads the state, and the destination area server loads it into the new map instance.

The Map-Centric Game Server pattern specifies that when a player’s character moves between maps in the virtual world, the character’s state moves from the old map instance to the new one. That pattern implements the more general use case, where characters travel between two maps that are not adjacent region maps.

Region Boundaries

More complex use cases arise when player characters cross a region boundary or interact with game objects on the other side of a region boundary. These use cases are the key motivators for this pattern. This pattern’s goal is to simulate a vast contiguous world, hiding the detail that the world is really just a composite of smaller maps.

To do this, the game must make movement and interactions across region map boundaries as fluid as possible. Ideally, this means, in order of importance:

  1. Players experience no lag or hitching when moving their characters across region boundaries.
  2. Players must be able to see other player characters, NPCs and game objects on the other side of a boundary.
  3. Players must be able to interact with other player characters, NPCs and game objects on the other side of a boundary. Here, interact means a subset of game play that could normally happen between characters and game objects on the same map.

Seeing Across Boundaries

To support the ability for players to see game objects across region map boundaries:

  1. Enhance the server visibility system to include game objects on a map adjacent to the player character’s current map that should be visible to that character. This is the subsystem that determines which game objects are visible to each other.
  2. This means that the area server for each region map must send position, orientation, and state update messages to the area servers of adjacent maps when game objects are within some visibility threshold of a region boundary.
  3. In turn, these adjacent area servers must track these remote game objects in their visibility graphs as if they were local, but located beyond the normal map extents.
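As a sketch of the second step, the Python code below works out which adjacent regions should be told about an object, assuming square region maps on a uniform grid. The region size of 1000 units and the visibility threshold of 100 units are made-up values, and so are the function names.

```python
REGION_SIZE = 1000.0  # assumed side length of one square region map
VISIBILITY = 100.0    # assumed visibility threshold near a boundary


def region_of(x: float, y: float) -> tuple:
    """Grid cell (column, row) that contains a world position."""
    return (int(x // REGION_SIZE), int(y // REGION_SIZE))


def neighbours_to_notify(x: float, y: float) -> set:
    """Adjacent regions whose area servers should get updates for an object at (x, y)."""
    home = region_of(x, y)
    result = set()
    for dx in (-VISIBILITY, 0.0, VISIBILITY):
        for dy in (-VISIBILITY, 0.0, VISIBILITY):
            cell = region_of(x + dx, y + dy)
            if cell != home:
                result.add(cell)
    return result


print(neighbours_to_notify(500.0, 500.0))  # set(): far from every boundary
print(neighbours_to_notify(950.0, 500.0))  # {(1, 0)}: close to the eastern boundary
```

An object in a corner of its region can be visible from up to three neighbours at once, which the nested loop handles by checking the diagonals too.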

Scaling Area servers

There is a lot of horizontal scaling, but instead of firing up servers on demand, we pre-allocate them and tend to divide them geographically — both in terms of real world location so as to be closer to players, and in terms of in-game locations, so that characters that are co-located also share the same game process. This would seem to require more effort on the game developer’s part but also imposes several extra limitations, such as making it harder to play with friends located overseas on different shards, requiring each game server to have different configuration and data, etc. Then there is the idea of ‘instancing’ a zone, which could be thought of as another geographical partition except in an invisible 4th dimension (and that is how I have implemented it in the past).

But in general, MMO game servers are limited in their capacity, so that you can typically only get 500–1500 players in one place. You can change the definition of ‘place’ by adding instancing and shards, you can make the world seem to hold more characters by seamlessly linking servers together at the boundaries, and you can increase concurrency a bit more via farming out tasks to special servers.

Partly, the current MMO server architecture seems to be born out of habit. What started off as servers designed to accommodate a small number of people grew and grew until we have what we see today. But the underlying assumption is that a game server should (in most cases) be able to take a request from a client, process it atomically and synchronously, and alter the game state instantly, often replying at the same time. We keep all game information in RAM because that is the only way we can effectively handle the request synchronously. And we keep all co-located entities in the same RAM because that's the only way we can easily handle multiple-entity transactions (e.g. trading gold for items). But does this need to be the case?

In my experience, it's better to have servers set up to handle their own zones, but written in a way so that in the future, it's easy to modify a particularly busy zone to run multiple load-balanced servers. At that point, each server on the zone should be handling around 1000–2000 concurrent connections, and propagating player data between the nodes using something like Redis.

Another common solution I've seen (particularly on popular Minecraft servers) is to tweak the actual game design to limit the number of players on a single server.

To do this, have multiple load-balanced servers for the same zone, but allow the player to see what server they’re connected to, and only render player data within that server. The result is multiple “realities”, where each server independently has a different crowd of players in the same apparent area, but this reduces server stress and gives the flexibility of allowing the player to switch freely between servers if they want. If you do allow them to switch (in order to meet with a friend or something), impose a maximum player count per server, and you’ll have a hard limit of how much each server should ever have to handle.
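A rough Python sketch of that design, assuming each zone is hosted by a fixed list of servers with the same player cap. The names and the cap of 2 are illustrative only.

```python
class ZoneInstances:
    """Several servers host the same zone; each accepts at most max_players."""

    def __init__(self, server_names: list, max_players: int):
        self.max_players = max_players
        self.players = {name: set() for name in server_names}

    def join(self, player: str, preferred: str = None) -> str:
        """Use the preferred server if it has room, otherwise the emptiest one."""
        candidates = [preferred] if preferred in self.players else []
        candidates += sorted(self.players, key=lambda n: len(self.players[n]))
        for name in candidates:
            if len(self.players[name]) < self.max_players:
                self.players[name].add(player)
                return name
        raise RuntimeError("zone is full")


zone = ZoneInstances(["zone-a", "zone-b"], max_players=2)
placed = [
    zone.join("p1"),
    zone.join("p2"),
    zone.join("p3", preferred="zone-a"),  # wants to meet a friend on zone-a
    zone.join("p4", preferred="zone-a"),  # zone-a is full, falls back to zone-b
]
print(placed)  # ['zone-a', 'zone-b', 'zone-a', 'zone-b']
```

Because the cap is enforced at join time, no server in the zone ever holds more than `max_players`, whatever the players ask for.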

You should at least get a working game and run some simulation tests with a few hundred or thousand client connections to get actual performance specs before worrying where you should go from there. It’s possible that there might not be an issue at all.

Also, before you concern yourself with the massive numbers of players you’re asking about, consider that your game might not ever be that popular. I mean no offense, it’s just relatively rare to develop a “hit.” So instead, focus on more important issues like good game design, so that you’ll be able to worry about this later on.