Database Design for a system like LinkedIn.

Ayush Dixit
Towards Data Engineering
5 min readJan 29, 2023

If you’re preparing for Data engineering interviews at big product-based companies, then you must have to appear in the data modeling or database designing round in order to get your dream job!!. Today let’s discuss the same with a real interview problem. So Let’s design the database for a system like Linkedin...

A database design for a system like LinkedIn would likely include several different tables to store information about users, their profiles, connections, and various types of content they can post or share. Here is an example of how the tables might be structured:

  1. Users: This table would store basic information about users such as their name, email, password, and registration date.
  2. Profiles: Each user would have a unique profile, which would contain information such as their current job title, industry, location, and summary. The profile table would also include a foreign key linking it to the user who owns it.
  3. Connections: This table would store information about the relationships between users, including the user ID of the person making the connection and the user ID of the person they are connecting with. This table would also have a status column to indicate whether the connection is pending, accepted, or blocked.
  4. Education: This table would store information about the user’s education like school name, degree, field of study, and start and end date. This table would also include a foreign key linking it to the user who owns it.
  5. Experience: This table would store information about the user’s work experience company name, job title, location, and start and end date. This table would also include a foreign key linking it to the user who owns it.
  6. Skills: This table would store information about the user’s skills. This table would also include a foreign key linking it to the user who owns it.
  7. Posts: This table would store information about posts made by users, including the user ID of the person who made the post, the post’s content, and the date it was posted.
  8. Comments: This table would store information about comments made on posts, including the user ID of the person who made the comment, the comment’s content, and the date it was posted. It would also include a foreign key linking it to the post it is commenting on.
  9. Likes: This table would store information about likes on posts, including the user ID of the person who made the like and the date it was posted. It would also include a foreign key linking it to the post it is liking on.
  10. Shares: This table would store information about shares on posts, including the user ID of the person who made the share and the date it was posted. It would also include a foreign key linking it to the post it is sharing.
  11. Groups: This table would store information about groups created by users, including the group name, description, and the user ID of the person who created the group.
  12. Group_members: This table would store information about the relationship between groups and users, including the user ID of the person and the group ID they are joining. It would also have a status column to indicate whether the request is pending, accepted, or blocked.

It’s important to note that these tables and fields are examples and may vary depending on the specific requirements of the system.

Relationships between the tables:

Here is an example of how the relationships between the tables in the LinkedIn database design might be structured:

  1. One-to-one relationship between Users and Profiles, where one user can have only one profile, but one profile can belong to only one user.
  2. One-to-many relationship between Users and Connections, where one user can have multiple connections, but one connection can belong to only one user.
  3. One-to-many relationship between Users and Education, where one user can have multiple education, but one education can belong to only one user.
  4. One-to-many relationship between Users and Experience, where one user can have multiple experiences, but one experience can belong to only one user.
  5. One-to-many relationship between Users and Skills, where one user can have multiple skills, but one skill can belong to only one user.
  6. One-to-many relationship between Users and Posts, where one user can have multiple posts, but one post can belong to only one user.
  7. One-to-many relationship between Posts and Comments, where one post can have multiple comments, but one comment can belong to only one post.
  8. One-to-many relationship between Posts and Likes, where one post can have multiple likes, but one like can belong to only one post.
  9. One-to-many relationship between Posts and Shares, where one post can have multiple shares, but one share can belong to only one post.
  10. One-to-many relationship between Users and Groups, where one user can have multiple groups, but one group can belong to only one user.
  11. Many-to-many relationship between Users and Group_members, where one user can join multiple groups and one group can have multiple members.

It’s important to note that these relationships are examples and may vary depending on the specific requirements of the system.

Structure of all the tables:

Here is an example of the table structures for a LinkedIn-like system:

1. Users:
(i) user_id (primary key)
(ii) email
(iii) password
(iv) name
(v) location
(vi) join_date

2. Profiles:
(i) profile_id (primary key)
(ii) user_id (foreign key referencing Users)
(iii) headline
(iv) summary
(v) industry
(vi) website

3. Connections:
(i) connection_id (primary key)
(ii) user_id (foreign key referencing Users)
(iii) connection_user_id (foreign key referencing Users)
(iv) connection_status

4. Education:
(i) education_id (primary key)
(ii) user_id (foreign key referencing Users)
(iii) school_name
(iv) degree
(v) field_of_study
(vi) start_date
(vii) end_date

5. Experience:
(i) experience_id (primary key)
(ii) user_id (foreign key referencing Users)
(iii) company_name
(iv) title
(v) location
(vi) start_date
(vii) end_date

6. Skills:
(i) skill_id (primary key)
(ii) user_id (foreign key referencing Users)
(iii) skill_name

7. Posts:
(i) post_id (primary key)
(ii) user_id (foreign key referencing Users)
(iii) content
(iv) post_date

8. Comments:
(i) comment_id (primary key)
(ii) post_id (foreign key referencing Posts)
(iii) user_id (foreign key referencing Users)
(iv) content
(v) comment_date

9. Likes:
(i) like_id (primary key)
(ii) post_id (foreign key referencing Posts)
(iii) user_id (foreign key referencing Users)

10. Shares:
(i) share_id (primary key)
(ii) post_id (foreign key referencing Posts)
(iii) user_id (foreign key referencing Users)

11. Groups:
(i) group_id (primary key)
(ii) user_id (foreign key referencing Users)
(iii) group_name
(iv) description
(v) created_date

12. Group_members:
(i) group_member_id (primary key)
(ii) user_id (foreign key referencing Users)
(iii) group_id (foreign key referencing Groups)

Here is a data model for a system like LinkedIn for your reference:

Data Model

It’s important to note that these table structures are examples and may vary depending on the specific requirements of the system. Additionally, Depending on the complexity of the system and data, there could be many more tables and fields required.

Reach out to me

If that article got you interested in reaching out to me, then this section is for you. You can reach out to me on Linked, or If you’d like to discuss more such interview problems, text me on LinkedIn…

Let’s connect on LinkedIn:

https://www.linkedin.com/in/ayush-dixit-dataengineer/

--

--

Ayush Dixit
Towards Data Engineering

Data Enthusiast | Loves to explore | Learning something new every day