An Introduction to Facebook’s System Architecture: Social Graph and TAO

How your social media data on Facebook are organized and stored

Published in

The Startup

6 min readSep 7, 2020

There is no need for an introduction to Facebook. Facebook has more than 1 billion active users who record their relationships, share their interests, upload and comment on text, images and videos. In this blog, I will mainly discuss two aspects of Facebook’s backend system:

How the social graph is modeled and stored, that is, the database schema adopted by Facebook.
How Facebook scaled its infrastructure, including servers, cache and databases, to serve 1 billion reads and millions of writes per second.

The Data Model For Social Graph

Facebook stores majority, if not all, of users’ data, such as profiles, friends, posts and comments, inside a single giant social graph. There are two elements inside a social graph, nodes and edges.

A node represents an entity, such a user, a post, a comment and a location.
An edge represents the relationships between the nodes. For example, one edge could mean that a particular user created a particular post.

For example, let’s imagine that the user Alice creates a post and tags the user Bob, as illustrated below. The user Cathey comments on the post, which David likes.

Alice: It was a great party! @ Bob
— — Cathey: Thanks for coming. (David likes this)

The post, comment and like will result a few changes to the social graph, as illustrated in the figured below. Assume the users, Alice, Bob, Cathey and David are already friends. Inside the social graph, 4 nodes already exist, 1 for each user. For instance, the node representing the user Alice has ID 105 and otype as User. (otype is an abbreviation for object type. Inside the Facebook system, node is referred to as object.) There are 2 edges between each pair of the users that represents one is a friend of the other. (Edges that represent friendship between Alice and Bob etc. are left out for brevity.)

When the user Alice creates the post, the following changes will be applied to the social graph. A new node representing the Post is added to the social graph. The node is assigned a unique ID of 632. 4 new edges are added to the social graph as well. The edges between Alice and the post represents that Alice authored the post. The edges between Bob and the post means that the post tagged the user Bob.

When the user Cathey adds a comment to the post, it will also change the social graph. A new node the represents the comment is added to the social graph. 3 new edges are created. The edge between the post and the comment shows that the comment node is the comment to the post. The nodes between Cathey and the comment indicates that Cathey is the author.

Finally, when David likes the comments, 2 new edges are added to the social graph that shows David likes the comment.

The database design for the social graph

Surprisingly, Facebook uses only two database tables to represent the social graph that captures the activities of its one billion users, object table and association table.

In the following sections, object and node mean the same thing. So are association and edge.

Object table

Object table has a very simple schema. It has 3 columns. The id column stores the unique id of the object. otype stores the object type. Additionally, each object/node can have a list of key-value pairs. otype specifies the possible keys and value type. For instance, otype of User means there could be a key name with value type string. The list of key-value pairs are serialized and stored in the data column.

id: int
otype: string
data: byte

For the post, the following row will be stored to the object table

632 | “Post” | {“text”: “It was a great party! @ Bob”}

Association table

The schema of the association table is similar. It has 4 columns. id1 and id2 represents the source and destination of the edge. atype is the edge type and data stores the optional list of key-value pairs associated with the edge.

id1: int
id2: int
atype: string
data: byte

The edge from the post to the comment will result in the following row in the association table

632 | 731 | Comment | null

The highly scalable backends that serve the social graph

There are two major components in Facebook’s backend system, TAO and the database.

TAO

TAO is Facebook’s distributed data store. It serves two primary purposes:

Define data access API

TAO exposes a list of APIs to query and mutate objects and associations. Facebook’s application servers will talk to TAO instead of the database. The APIs can be classified into 2 categories, object related APIs and association related APIs.

* Object APIs

Object APIs provides operations to allocate a new object and id, and to retrieve, update or delete the object associated with an id.

* Association APIs

Association APIs provides similar operations to add, modify and delete an association. However it provides a much richer set of APIs to query for associations. Some examples are

assoc_get(id1, atype) — Returns all the associations originating from id1 and with type, atype.
assoc_count(id1, atype) — Returns the size of the association list for (id1, atype)

To render the post and comment on the timeline, the Facebook application server would first call obj_get(632) to get the content of the post and assoc_get(632, “Comment”) to get ids of all the comments. It will then fetch the content of comments with obj_get(comment_id) and likes of the comments with assoc_get(comment_id, “Like”).

2. Cache

Additionally, TAO behaves as a write-through cache. It aggressively caches the objects and associations to reduce latency and load on the database system.

However, as we can imagine, there are many challenges to engineer a cache for petabytes of data that are geographically distributed. How do we maintain the consistency? How do we manage and promptly invalidate the cache stored in America for a post that is stored in Europe? How do we handle failures properly? TAO as a distributed cache for petabytes of data is therefore quite sophisticated and we will have a dedicated blog to discuss it. So stay tuned!

Database

Facebook’s database is based on MySQL database. It is obviously not possible for a MySQL database to serve tens of petabytes of data. Facebook made two modifications to the database

Database Sharding

Facebook partitions the objects and associations into logical shards. Each database instance will host one shard. It is TAO’s responsibility to decide which shard a new data should go into and remember which shard to query for an existing data.

An optimization Facebook made is to always store the association on the same shard of its id1. This is a result of the common traffic patterns of the Facebook. For example, when we view a post, Facebook will render its comments. By co-locating objects (the post) and edges from the object (comments of the post), we can get the data by querying a single database instance.

2. LSM tree storage engine

This is another database optimization Facebook did. Originally, Facebook uses InnoDB engine. InnoDB engine uses B+ trees. However B+ tree leads to index fragmentation (the storage is wasted. It is not holding any useful data nor can it be used for new data). This issue becomes more and more serious as Facebook replaces HDD by Flash or SSD (wasted space is more expensive). Facebook developed a new storage engine, MyRocks DB, based on Log Structure Merge (LSM) tree. It helped reduced storage usage by 50% and reduced database latency. MyRocks DB is an involved topic and I will write a separate blog for it. Stay tuned.

References

Venkataramani, Venkateshwaran, et al. “Tao: how facebook serves the social graph.” Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data. 2012.

Sharma, Yogeshwer, et al. “Wormhole: Reliable pub-sub to support geo-replicated internet services.” 12th {USENIX} Symposium on Networked Systems Design and Implementation ({NSDI} 15). 2015.

Dong, Siying, et al. “Optimizing Space Amplification in RocksDB.” CIDR. Vol. 3. 2017.