How to fix Titanfall

p0358
19 min read · Jul 27, 2021

I have spent a countless, way-beyond-reasonable number of hours reverse engineering this game and researching its design and associated caveats.

In this article I am going to list most of the exploits that I know of and ideas for mitigating flaws in the system design of the Titanfall engine. Although I will be focusing on Titanfall 1, some of these things will apply to the newer games based on this engine as well.

If you have a Titanfall game that you want to fix, you’ve come to the right place. I am certain the game will be reasonably safe and sound after implementing all of the measures outlined here. All of the things, without skipping anything.

I’m publishing this because I am frustrated at how Respawn struggles to fix everything (I would fix this all within a few days had they hired me :D). And because I know they would not manage to find and fix everything from here on their own.

Now seems to be a good chance, since they’ve finally decided to actually update those games. And if they don’t do so, people can keep poking them with this. “Sorry, it’s hard” cannot be an excuse anymore when all the info is outlined here on a plate.

Whenever I list several fixes, they are not mutually exclusive and all of them should be implemented, unless some of them are client-side options that have a server-side-only alternative provided. (I do recognise that you want to avoid a client-side update at all costs.)

For the record:

  • Type 0 server is a lobby server — one that only has mp_lobby and hosts the lobbies that you get onto when you press “PLAY”
  • Type 1 is a full server — the one that hosts the actual matchmaking games or private matches
  • Type 2 is a training-only server — at least that’s how they’re assigned in Titanfall 1.

Some of the fixes advised should be followed carefully in order to properly implement them and avoid exploitation in the future.

One might be wondering though… wait a second, are you just publishing all of this out there? Can’t it make the situation worse? No, it can’t. The games were literally unplayable anyway, i.e. it can’t get any worse than that. And everything I publish here is already well known to the attacker and has been exploited in the wild for months at least. Anything that isn’t known to attackers will be sent to Respawn directly and privately, and the article will be updated once I have confirmed that they fixed those issues too.

Name is too long

For whatever reason, the name length check for connecting players was removed. Believe it or not, you could join a game with a name over 330 characters long and the server wouldn’t even blink. The ones that would blink were the other clients.

Data between the client and server is exchanged over the Source engine’s network protocol. To put things simply, all the data is exchanged using network messages. These are carried either in packets sent over the netchannel (packets sent in both directions at a constant interval, each containing an outgoing sequence number and the sequence number of the last packet received from the other party) or in the so-called “net data blocks”, Respawn’s invention designed to rapidly send bigger amounts of data from the server to the client without the limitations of the netchannel. We’ll come back to those later.

Network messages themselves consist of classes that are responsible for writing them to and reading them from the packet buffer, and later processing them. The first 6 bits are read by higher-level code to determine the message type ID, then the appropriate class is used to handle reading the data from the buffer. It is assumed that the message will be read successfully.

Now what’s the issue with long names? One of the message types is a game event message. It consists of 11 bits for the data length, followed by said data, whose format is 9 bits for the event ID and the rest being data defined by the event res files; the contents are of no interest to us here. We are interested in the 11 bits for the data length. That’s a maximum of 2047, but the value is in bits, so… 255 bytes. You can probably already see the issue with name lengths exceeding 330 bytes. It’s going to overflow!
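To make the arithmetic concrete, here is a minimal C++ sketch. It assumes that an oversized length is simply truncated to its low 11 bits, which is an assumption about the failure mode rather than something taken from the engine code, and the constant and function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Width of the length field of a game event message, as described above.
constexpr unsigned kLengthFieldBits = 11;
constexpr std::uint32_t kMaxLengthValue = (1u << kLengthFieldBits) - 1;  // 2047 (bits)

// Largest number of whole bytes that an 11-bit length counted in bits can describe.
constexpr std::uint32_t MaxPayloadBytes() { return kMaxLengthValue / 8; }

// ASSUMPTION: a payload size that does not fit is silently cut to its low 11 bits.
constexpr std::uint32_t LengthAsSeenByReader(std::uint32_t payloadBits) {
    return payloadBits & kMaxLengthValue;
}
```

Under that assumption, a 330-byte name alone (2640 bits) would be announced as 592 bits, so the reader stops early and the leftover bits are parsed as if they were the next network message.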

As a result, less data is read from the wire before returning to the message reader. Random garbage data, starting with the contents of the user-provided name, is now being read by the client as network messages. What can potentially happen is that the bad actor can do anything to the client that a server can do. Of course, with garbage data it’s most likely going to either do nothing or break something, which is what our “hacker” was doing with his bots. He spent a lot of time trying various random names and seeing what side effects they can cause. And indeed there were many distinct side effects: clients getting a message that brand new DLC packs are waiting to be installed, a disconnect with random bytes as the reason, just popping back to the main menu, or the game closing via either a “Too many proxies for datatable” error message or a plain crash.

It’s worth noting that in my private tests I was able to drop clients with a custom disconnect message embedded in the malicious client name. The thing wasn’t too reliable however, probably due to how the reliable subchannels are implemented (ironically?).

How to fix?

Just check the max name length. Even the OriginSDK’s struct allows for a max of 64 bytes (63 + null terminator). If you want to be on the safe side in case they increase their length limits, just drop anyone with a name length of over 63 and that will fix this issue.
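As a rough C++ sketch of that check (the helper name is hypothetical, and where exactly the server parses the name out of the connect request is not shown here):

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// 63 characters plus the null terminator fit the 64-byte field mentioned above.
constexpr std::size_t kMaxPlayerNameLength = 63;

// Returns true if the connecting client's name is short enough to be accepted.
// Hypothetical helper: a client failing this check would simply be dropped.
bool IsPlayerNameLengthOk(const std::string& name) {
    return name.size() <= kMaxPlayerNameLength;
}
```

The important part is that this runs on the server before the name is ever put into a game event that gets networked to other clients.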

Ideally you should check the docs for the current Origin APIs. Back in 2014 they were pretty barebones, but nowadays there most likely is an endpoint to verify the player name from their token. Just enforce that name then. Otherwise identity theft is still a concern, and it has happened before. Verifying the name will make things like this harder.

To be on the safe side one could also increase the amount of bits for event message data up from 11 to something bigger, however that would require a client update, which is probably something that Respawn wants to avoid at all cost. Worth noting nonetheless in case they decide to perform a client update.

Why did that even happen?

Respawn wanted to ensure that the player name is always what comes from Origin. However, the implementation turned out to be very poor. The code did ensure at every connection phase that the client is connecting with the name returned by an OriginSDK function. Nobody checked what that function does, though. It returned a name from a struct sitting at a constant place in memory, which was populated only once on client startup and never reverified. In fact, this made name manipulation trivially easy for any tinkerer. In addition to that, the name length check somehow disappeared on the server. I didn’t check whether it was in fact removed or changed to some insanely high number, but why even bother changing it? Source had it capped at 128, but Origin names are 20-ish characters at most. If it ain’t broke, don’t change it. Other than this length check, client-side name checking is ultimately futile anyway. This is why I suggest looking into whether the name can be enforced server-side nowadays. I heard the names are enforced in Apex, so it should be possible.

Speaking of events…

The events are divided into client events and server events, defined in res files under the resource folder. Server events were normally meant to be used by different parts of the engine and notably external plugins and were not networked to clients. In Titanfall however all event types are sent to the clients. Including the player_connect event, containing the field address. Yes, client IP addresses are networked to everyone. In other words, the game leaks IP addresses of every connecting player to all the other players. Weren’t the dedicated servers supposed to protect against this exact thing?

How to fix?

The field cannot be removed without client update. Just empty the value of the field where the event is generated, or set it to “[::ffff:0.0.0.0]:0” to ensure nothing is broken in case it is parsed anywhere.
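A minimal C++ sketch of the idea follows. The `GameEvent` struct is a hypothetical stand-in for however the engine represents an event before it is serialized; only the event name, the field name and the placeholder value come from the text above.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a game event: a name plus string fields.
struct GameEvent {
    std::string name;
    std::map<std::string, std::string> fields;
};

// Placeholder suggested above; keeps the field parseable without revealing anything.
const char* const kScrubbedAddress = "[::ffff:0.0.0.0]:0";

// Overwrites the "address" field of player_connect events before they are networked.
// Other events and other fields are left untouched.
void ScrubPlayerAddress(GameEvent& event) {
    if (event.name != "player_connect") return;
    auto it = event.fields.find("address");
    if (it != event.fields.end()) it->second = kScrubbedAddress;
}
```

Since the field itself stays in place, clients that expect it keep working without an update.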

The famous so-called “DDoS”

Let’s be real. It’s not a DDoS, it’s a DoS. And not one that can be blocked by any “external partner”, because the amount of data sent is no bigger than what someone downloading a file of a few megabytes would generate.

The attacker is exploiting things on the application level, simply sending a bunch of data that is expensive to process in a certain way (in this case over connectionless packets, which don’t even require the client to be connected to the server with an established netchannel).

So you might think, is this something new, unique, revolutionary, something that nobody ever dealt with before Respawn? Of course not! In fact the protection against this already existed in Source engine for several years.

But Respawn… disabled it.

Okay, they had a reason for it, but the way they did it was really bad. Remember the “net data block” I mentioned before? It was designed to rapidly send some data from server to client, bypassing the netchannel and its constant interval and data size limitations. Over this they could send things like the list of playlists or the definitions for persistent player storage. These need to be synced with clients and can be updated server-side without a client update; the client caches them in memory until they’re updated again. For example, they can add a timed special playlist or reorder the matchmaking queue, all by just updating a single txt file on their S3 bucket. There are some other things sent over the netblock too.

Anyway, the max length of data sent in a single packet is around 1200 bytes. So to send all this data, the server may end up sending around 100 packets to the client at once. And the client needs to send 100 acknowledgements back to the server, otherwise the server is going to keep resending all this data over and over again. They disabled the rate limiting to ensure that the client will never get rate limited during that.

How to fix?

Now of course the solution to this is very simple. Just exclude the netblock connectionless packet type from the rate limiting!!! It’s that easy.

Can the entirety of “DDoS” be fixed in 5 lines of code?

With this quite simple-looking (well, it took some time to reverse all the needed things, compared to just having access to the source code) function hook I was able to confirm that it indeed fixes the issue. Here I simply took advantage of the cvar sv_limit_queries that Respawn introduced to toggle the protection, and just flipped it on or off, depending on whether the packet about to be processed is the net block ACK coming from the client, or anything else.

While I was never able to fully reproduce the flood that the attacker uses most frequently, notably on Titanfall 2 servers, I was able to confirm that this approach will solve the “Disconnect: <player> overflowed reliable channel.” thing.

Literally, just implement this and the issue disappears.

The overflow thing is caused by a bot repeatedly connecting to and disconnecting from a server. The rate limiting will stop it from reconnecting very quickly: after ~100 attempts it will be blocked, and the clients won’t notice that anything happened at all. The bot would need about a few thousand attempts for this attack to succeed.

The default settings for this thing are good enough, but they can be tightened a bit just to be safe. There’s no reason to send that many connectionless packets outside of netblock stuff unless the server is lagging and not responding. One may count how many packets a client sends to a dead server before giving up, and over how much time, and use those values as the configuration. Under no circumstances should any user send more connectionless packets than that.

So to do that, just look into the “CBaseClientState::ProcessConnectionlessPacket” function in engine/baseclientstate.cpp. There you will probably find that it calls a function named “CheckConnectionLessRateLimits” to check for the limits. Below that it reads the char from the message. Just move the rate limit function call below that read, pass the char as an extra parameter, and inside the function return true if the message type char is one of the net data block ones. In addition to that, you should probably also exempt the messages used for reservations/transferring persistent data to other servers, since these can be quite spammy too. Don’t forget to enable the cvar sv_limit_queries by setting its default value to “1” now!
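The shape of that change can be sketched in standalone C++ as below. This is not the engine’s code: the message type bytes are placeholders (the real values are not given in this article), and `ConnectionlessLimiter` is a deliberately tiny stand-in for the existing limiter, only there to show that the type byte is inspected before anything is counted.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// HYPOTHETICAL message type bytes; the real values are not given in this article.
constexpr char kNetBlockAck  = 'A';
constexpr char kNetBlockData = 'B';

// True for connectionless packet types that must never be rate limited.
bool IsExemptFromRateLimit(char messageType) {
    return messageType == kNetBlockAck || messageType == kNetBlockData;
}

// Very small per-IP counter standing in for the engine's existing limiter.
class ConnectionlessLimiter {
public:
    explicit ConnectionlessLimiter(int maxPerWindow) : maxPerWindow_(maxPerWindow) {}

    // Returns true if the packet should be processed.
    // The type byte is checked first, so net data block traffic bypasses the counter.
    bool Allow(const std::string& ip, char messageType) {
        if (IsExemptFromRateLimit(messageType)) return true;
        return ++counts_[ip] <= maxPerWindow_;
    }

    // Call once per rate-limit window to reset the counters.
    void NextWindow() { counts_.clear(); }

private:
    int maxPerWindow_;
    std::unordered_map<std::string, int> counts_;
};
```

With this ordering, a client acknowledging a hundred net data block packets never touches its quota, while everything else stays limited per IP.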

Evil connections and bad client authorization design

Titanfall 1 private lobbies are a pool of servers that can each hold a single physical party at once, compared to Titanfall 2 servers that can hold something like 32 random players at once, with parties being virtual and handled on another layer. At the same time, Titanfall 1 does not use cloud scaling. The number of servers is fixed. That means that enough players trying to get into the game will quickly fill all of these servers (like 10 of them per region currently?), and new players will be sitting at the main menu with “No servers found” (fix this!!! either increase the number of servers or make it scale dynamically; there used to be dedicated servers for just private lobbies that could host 35 of them each).

Not only can real players “exploit” this, but malicious bots can take spots on all available servers and cause the exact same effect artificially. Bots aren’t even real players and don’t supply the Origin client key. They just send random packets and stay in this half-connected state forever, avoiding the 3 minutes limitation imposed on regular clients (designed for if they die and don’t send any packets, because a loading Source engine client can lag severely on a low-end machine). Even if they send such a packet, they can probably stay in this state forever.

How to fix?

(of course besides dynamic scaling/adding more servers)
(the items listed do not exclude each other — all of them should be implemented)

  1. Have a fixed, short amount of time in which the client needs to supply the Origin client token. In addition to that, take note of a client exceeding the standard 3 minutes to fully connect to a server. If either limit is exceeded, ban the IP for 10 minutes or 1 hour. Even 1 hour or more is perfectly fine for a type 0 lobby server, because if a client did this accidentally, they won’t even notice on reconnect: the game will silently attempt to join a different server after the banned one rejects them. And bots will get timed out relatively quickly and then be kept out for some time, while others will be able to join those servers.
    Keep in mind that a bot could eventually be modified to supply a valid Origin token as well, so the general limitations must still be applied even if a token was supplied.
    The solution isn’t that ideal, but a proper fix would require sending the Origin token right in C2S_CONNECT, and verifying it before even letting them in, something that cannot be done without a client update.
    A thing that can be done, however, would be proxy servers, these would verify the key and everything else matching up, then create a reservation on a real lobby server, and redirect the client to join this way, disallowing direct join on empty servers. Though sadly I don’t think Respawn would go out of their way to reimplement the basic protocol in an external client this way…
  2. Do not allow more than one partial connection from a single IP. If someone is trying to connect, check whether there’s already an existing client from that IP in a partial, non-full connection state. This won’t affect people playing from a single location much, as joining lobbies is fairly quick. The limitation should only apply to joining with an invite, not to joining via reservation, so that an existing party can be migrated to a different server all at once without any disruption. It’s important to exclude reserved joins from this limitation!
  3. In the emptyservers.php endpoint, actually verify with Origin servers the token field that is being sent for type 0 searches (edit: it seems this might have been done two days ago already). Don’t return more than 2–4 servers at once.
  4. For the above and the game itself, just outright ban the following Origin IDs.
    These are known accounts of the attacker and his team, or confirmed cheaters. Just never let them onto the servers, and never let them request emptyservers.php once the token is verified to belong to them. They won’t be able to afford an infinite number of accounts to attack the game. Please note that in the case of hackers the former FairFight ban is not enough, because it still allows them to connect to the servers. FairFight bans are good enough only for cheaters who are not trying to destroy the whole game.
    The list to ban:

1006229610560
1006016068778
1008485087761
1004976664278
1007073858111
1006634630029
1002401300395
2463174216
2429556846
2351570509
1006412806000
1009153717786
1001130189292
1012407959430
1006999558148
1006876777968
2321555144
1007277372415
1006617398848
1006670789617
1006917184640
2265083847
1000744388918
1003605538230
1005213644049
1000454955614
2284244365
2809754266
1008725587542
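The connection rules from the list above (token deadline, the 3-minute limit, temporary IP bans, one partial connection per IP with reserved joins exempt, and the account ban list) could be combined roughly like this. It is only a C++ sketch under stated assumptions: the class, its method names and the 15-second token window are invented for illustration, and times are plain seconds passed in by the caller.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>

// All names and time values below are illustrative assumptions.
constexpr std::int64_t kTokenDeadlineSec   = 15;   // short window for the Origin token
constexpr std::int64_t kConnectDeadlineSec = 180;  // the standard 3 minutes
constexpr std::int64_t kBanDurationSec     = 600;  // 10-minute temporary IP ban

struct PendingClient {
    std::int64_t startedAt = 0;
    bool tokenReceived = false;
};

class LobbyGate {
public:
    explicit LobbyGate(std::unordered_set<std::uint64_t> bannedOriginIds)
        : bannedOriginIds_(std::move(bannedOriginIds)) {}

    // New connection attempt. Joins via reservation skip the
    // one-partial-connection-per-IP rule and are not tracked in this sketch.
    bool TryBeginConnect(const std::string& ip, bool viaReservation, std::int64_t now) {
        auto ban = bannedUntil_.find(ip);
        if (ban != bannedUntil_.end() && now < ban->second) return false;
        if (viaReservation) return true;
        if (pending_.count(ip) != 0) return false;  // already half-connected
        PendingClient client;
        client.startedAt = now;
        pending_[ip] = client;
        return true;
    }

    // The client supplied its Origin token and the account ID was resolved.
    bool OnToken(const std::string& ip, std::uint64_t originId) {
        if (bannedOriginIds_.count(originId) != 0) {
            pending_.erase(ip);
            return false;  // permanently banned account
        }
        auto it = pending_.find(ip);
        if (it != pending_.end()) it->second.tokenReceived = true;
        return true;
    }

    void OnFullyConnected(const std::string& ip) { pending_.erase(ip); }

    // Periodic sweep: drop and temporarily ban clients that missed a deadline.
    void Tick(std::int64_t now) {
        for (auto it = pending_.begin(); it != pending_.end();) {
            const std::int64_t age = now - it->second.startedAt;
            const bool late = age > kConnectDeadlineSec ||
                              (!it->second.tokenReceived && age > kTokenDeadlineSec);
            if (late) {
                bannedUntil_[it->first] = now + kBanDurationSec;
                it = pending_.erase(it);
            } else {
                ++it;
            }
        }
    }

private:
    std::unordered_set<std::uint64_t> bannedOriginIds_;
    std::unordered_map<std::string, PendingClient> pending_;
    std::unordered_map<std::string, std::int64_t> bannedUntil_;
};
```

Note that supplying a token only lifts the short deadline; the 3-minute limit still applies, in line with the remark that a bot could eventually learn to send a valid token.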

Parties are cool, as long as you’re invited

The second biggest flaw, next to the Origin token being sent after the connection is already made, is that parties can be joined by anyone. It’s another thing that cannot be fixed without a client-side update and can only be mitigated in some ways. With a client-side update, each client could be assigned an individual random number that would be part of the joinsecret command and would allow only the actually invited users to join. Currently you can join a server by just specifying the user ID of the person that you want to join. Other than that, the servers would be a relatively secure garden: you already cannot join a random matchmaking server “just like that”. You can only join empty servers, join with a reservation if a trusted server decides to send you there, or join with an invite.

But you may think, and so what? If someone whom you don’t like joins your party, then you can just leave and it’s not a big deal. Okay, you know what’s a bigger deal in this situation? The party leader can leave the party while pulling everyone else out with this little neat function in the script engine:

Basically the dialog box calls the console command “LeaveMatchWithParty”, and that executes this callback function. Do you see something missing in this function?

Yes, anyone can use it. So a bad actor can join anyone, and then pull them out of the game, just like that. Pretty annoying, isn’t it?

How to fix?

Just take an example from the other functions and check whether the player who executed that command is the party leader.

Besides that, one could implement some configurable settings for players that control whether they want anyone to be able to join their party, and whether they want to get pulled out with their party while in-game and not in a lobby. Even if those are only added as server-side cvars or concommands, players could still configure them with a mod, no client update needed. Not much else can be done about it.
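The game logic for this lives in Squirrel scripts, so the following C++ is only a language-neutral sketch of the rule, not the game’s code. The party model, the opt-out flag and the choice that a non-leader removes only themselves are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified party model.
struct PartyMember {
    std::uint64_t id;
    bool followLeaderOutOfMatch;  // the opt-out setting suggested above
};

struct Party {
    std::uint64_t leaderId;
    std::vector<PartyMember> members;
};

// Returns the IDs of players to pull out of the match when `callerId`
// runs LeaveMatchWithParty. A non-leader only ever removes themselves.
std::vector<std::uint64_t> PlayersLeavingWithParty(const Party& party,
                                                   std::uint64_t callerId) {
    std::vector<std::uint64_t> leaving{callerId};
    if (callerId != party.leaderId) return leaving;
    for (const PartyMember& m : party.members) {
        if (m.id != callerId && m.followLeaderOutOfMatch) leaving.push_back(m.id);
    }
    return leaving;
}
```

With this rule, a stranger who joins a party and runs the command leaves alone, and even the real leader cannot pull out members who opted out.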

I don’t have DLCs

Every client, while connecting, tells the server which DLC packs it has installed. In practice most people have all three installed, or zero if they forgot to download them for some reason. All playlists except campaign require all 3 to be installed. This is another thing that the attacker took advantage of. If you got into a private lobby with one of his bots in it, the bot would prevent you from searching for games on any playlists that require them.

How to fix?

  1. Do not take into account the DLC options of players who are not fully connected to the server as a map restriction. Especially in type 0 servers, if you start a search for a match, only players who were already connected at the moment the search has begun will be taken to the found match. People who joined in the meantime will be left over. So it doesn’t make sense to count their DLC preferences for the search. (assuming it can be done server-side; it should)
  2. If a client-side update is decided on, the client should only warn about the DLC; if the check fails, it should still allow starting the search, and just leave these players behind.
  3. In server-side variant, for type 0 servers, if someone is joining an existing party, and people in that party have DLCs, do not let anyone without DLCs connect to that server. Disconnect them with a descriptive message that encourages them to download the DLCs.
    But! Do let people without any DLCs join into an empty type 0 server, so that they can reach the campaign carousel meanwhile.
  4. Forcing a playlist override is a useful and pretty harmless feature (zero irony here), since it only works on empty servers. However, it’s only really useful on type 1 servers, and rather useless on type 0 servers. Especially with the bug of forcing the “private_match” playlist: apparently forcing the “private_match” playlist on a type 0 server does nothing besides saying that some party member doesn’t have DLCs and locking out all the playlists which require them!
    So there are only two options to go about this: either make it so that forcing the “private_match” playlist actually does force the type 1 server into a private match one, as if they had selected that option from the main menu, or prevent forcing the “private_match” playlist altogether (if the first option is implemented, still prevent “private_match” from being overridden on type 0 and 2 servers).
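The first item of this list can be illustrated with a small C++ sketch. Representing the installed DLC packs as a 3-bit mask and the two-state connection enum are assumptions made for the example; the point is only that half-connected clients are skipped when the restriction is computed.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

enum class ConnectState { Connecting, FullyConnected };

// ASSUMPTION: installed DLC packs are modelled as a bitmask, one bit per pack.
struct LobbyPlayer {
    ConnectState state;
    std::uint8_t dlcMask;  // bit i set = DLC pack i installed
};

constexpr std::uint8_t kAllDlc = 0x07;  // all three packs

// DLC packs shared by every fully connected player.
// Clients that are still connecting are ignored on purpose.
std::uint8_t CommonDlcMask(const std::vector<LobbyPlayer>& players) {
    std::uint8_t mask = kAllDlc;
    for (const LobbyPlayer& p : players) {
        if (p.state != ConnectState::FullyConnected) continue;
        mask = static_cast<std::uint8_t>(mask & p.dlcMask);
    }
    return mask;
}
```

A bot stuck in the connecting state and claiming to have no DLCs then has no effect on which playlists the lobby can search.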

Too many commands, give me a break

A client can easily DoS the server by burning its CPU with a flood of console commands. Newer Source engine games have a protection against this, but the branch Respawn forked was too old to include it.

How to fix?

Just ratelimit bruh. The quota should be implemented in the “CGameClient::ExecuteStringCommand” function inside of engine/sv_client.cpp. Add a cvar named “sv_quota_stringcmdspersecond”; a default value of 40 guards against DoS and is more than enough for any legit use case. The game seems to never send more than 1–2 per second, but we want to be on the safe side here with 40. Then just follow what Valve did.
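A per-client quota of that kind can be sketched in C++ as follows. This is an illustration of the idea, not Valve’s implementation; the class name is made up, the limit is hard-coded where the engine would read the cvar, and the caller decides what to do with a rejected command.

```cpp
#include <cassert>

// Default taken from the text above; in the engine this would be the cvar value.
constexpr int kQuotaStringCmdsPerSecond = 40;

// Per-client counter over a one-second window. Hypothetical type.
class StringCmdQuota {
public:
    // `nowSec` is the server time in seconds.
    // Returns false once the client has exceeded its quota for the current window.
    bool Allow(double nowSec) {
        if (nowSec - windowStart_ >= 1.0) {
            windowStart_ = nowSec;  // start a fresh window
            count_ = 0;
        }
        return ++count_ <= kQuotaStringCmdsPerSecond;
    }

private:
    double windowStart_ = 0.0;
    int count_ = 0;
};
```

One instance would live on each connected client object and be consulted at the top of the string command handler, before any parsing takes place.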

In addition to that, you can try implementing a limit on how much time the game server can spend processing packets for each client per tick, which can probably be implemented in ProcessMessages. This third counter-DoS change should completely close the window for potential DoS attacks. However, the exact time value would need to be tested and determined for real clients before implementing. If the time is exceeded, then just kick the player.
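As a sketch of such a per-tick budget in C++ (the class is hypothetical, and the budget passed in is a placeholder, since the right value has to be measured on real clients first):

```cpp
#include <cassert>
#include <chrono>

// Per-client budget for packet processing within one server tick.
class ProcessingBudget {
public:
    explicit ProcessingBudget(std::chrono::microseconds perTick) : perTick_(perTick) {}

    // Call at the start of every server tick.
    void BeginTick() { spent_ = std::chrono::microseconds{0}; }

    // Adds the time spent on one message.
    // Returns false once the budget is exceeded, i.e. the client should be kicked.
    bool Charge(std::chrono::microseconds cost) {
        spent_ += cost;
        return spent_ <= perTick_;
    }

private:
    std::chrono::microseconds perTick_;
    std::chrono::microseconds spent_{0};
};
```

The caller would measure the time around each message handler and feed the result to `Charge`.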

Some quality of life changes

While we’re at fixing things, we would appreciate some tiny quality of life changes that can be implemented using the opportunity of already changing things :)
These are not really security fixes at all, but it would be nice if they could be implemented as well.
Don’t worry about exposing this functionality to clients without a client update; it could be used with keyboard binds and client mods. It’s enough to just enable them on the server.

  1. In _consts.nut, find pmSettingsMap[“pm_ai_lethality”] and change the second “0” to “1” to allow people to disable NPCs in private matches completely.
    0 stands for all, 1 for none, 2 for grunts only, 3 for spectres only

  2. To enum ePrivateMatchModes, add some extra gamemodes, at least “mfdp” (not sure if others wouldn’t be bugged).

  3. Add deadly ground to allowed private match options. Add pmSettingsMap[“pm_floorislava”] <- [0, 1] and playlistVarMap[“pm_floorislava”] <- “riff_floorislava” to _consts.nut, and in menu/_lobby.nut, find case “pm_ai_lethality”:, and append case “pm_floorislava”: beneath it.

  4. Allow any player to start a private match game on their own, because why not let people explore the maps alone? This seems to be a heavily desired feature; many players keep asking us whether it’s possible.
    Go into menu/_lobby.nut and find where it uses the IsAnyPlayerMMDebug() function, then make it always do “file.teamReady[team] = true” in that place.
    To complement this change properly, you will need to go to mp/_gamestate.nut, find the function “DoneWaitingForPlayers()” and change

if ( GetDeveloperLevel() == 1 || ( IsPrivateMatch() && IsAnyPlayerMMDebug() ) )

to

if ( GetDeveloperLevel() == 1 || IsPrivateMatch() )

Otherwise they would be able to start the game alone, but would be stuck on “Waiting for players…” after loading it in.

  5. Allow people to leave the party without leaving the current match. This will probably need a new Squirrel-native function added in code in server.dll and called from _menu_callbacks.nut, with the client command named something like “LeaveParty”.
    The reason for this is that many of the old players would like to join each other in order to be in one match, but they’d like to play on opposite teams or let the game balance the teams on its own. If the good players are forced to stay on one team, they’ll end up stomping the other team, and if they don’t use parties, the game matchmakes them into different games and then they have no chance to fight each other.

Short summary

  1. Fix too long names (≥64), verify names with Origin if possible
  2. Do not leak player IP addresses in “player_connect” game event
  3. Re-enable the built-in Source’s protection against DoS for connectionless packets, excluding the net data block packet types from rate limiting
  4. Be aggressive about players who don’t send their Origin auth token (which is done on one of the earlier connection phases) or are stuck connecting for just too long (>~3 minutes?), IP ban them temporarily on the particular server
  5. Do not allow more than one connecting client from a single IP on a server at once. Players from a single IP need to connect to a server one after another, when the previous player has finished loading. This limitation must only apply to connect with invite and must not be applied to reservations.
  6. Verify the token in emptyservers.php, return fewer servers to players at once (2–4?)
  7. Outright ban the known bad Origin IDs and prevent them from connecting, on both the game servers and the emptyservers.php endpoint. Do not rely on the FairFight banning feature; it’s effective against plain cheaters, not against hackers
  8. Only allow the party leader to run “LeaveMatchWithParty”, not any random person
  9. Ignore player DLC choices before they’re fully connected. Don’t let players without DLCs onto type 0 servers with people on there who already have DLCs (the other way around is ok). Fix force playlist override of “private_match” not working and having the side effect as if someone without DLCs was on the server.
  10. Implement a quota for string commands, a value of 40 per second is more than fine and gives enough safe room
  11. Try implementing a max limit on the time a client’s netchan processing can take per tick; it will require some adjustments to find a good value, but may protect against potential future DoS attacks…
  12. Spin up more servers, the current number is far too small
  13. As the last thing, use the opportunity to make some quality of life changes

Closing words

I would appreciate some communication from Respawn’s side, letting us know when these are implemented, and also spinning up a second dev/int environment where I could test and confirm in isolation whether the bugs were already fixed, possibly with the banning protections disabled, so as not to risk getting banned on production.

Speaking of which, I got IP-banned on production while testing whether one thing is now verified, please unban me… :/
[edit: to clarify, the ban is on IP and on Stryder, not on Origin account]

Once we are really sure everything is fully fixed, the game’s population could probably be restored back to a healthy state by putting it on a free weekend, assuming that there are enough servers to handle all these players (which is currently not the case).

The article may be updated in the future with new things, especially when something not yet public gets fixed.

PayPal link in case anyone wants to give a tip for this work: https://paypal.me/p0358donate
